Open Conference Systems, CLADAG2023

Visualizing Anomalies in Circular Data
Davide Buttarazzi, Giovanni C Porzio

Last modified: 2023-06-01

Abstract


Anomaly detection has a long history in Statistics, with one of the most effective approaches being robustness. First, a model describing the majority of the data is assumed. Second, its parameters are robustly estimated. Then, the distance of all the points from such a model is evaluated. Finally, extremely far (i.e., unlikely) observations are flagged as outliers. Visually, this procedure is well described by Tukey’s well-known box-and-whisker plot. Thanks to its robustness properties, it is probably the most widely adopted graphical tool for highlighting anomalies in univariate data sets.

This work investigates whether the same strategy can be exploited in circular data analysis, i.e., for data lying on the boundary of the unit circle. For this kind of data, a specific boxplot has been designed. However, its first formulation did not focus on anomaly detection. It was rather conceived as an exploratory tool to display the main features of a circular data set. Relying on a non-robust estimate of the data dispersion, it would simply be misleading if used to visualize anomalies. A robust circular boxplot is then introduced. It will be able to correctly identify circular outliers under a specific parametric model.
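As a rough illustration of the general strategy (robust center, robust spread, distance, flag), the following Python sketch applies a median-based rule to angles in radians. It is not the robust circular boxplot proposed in this work: the function names, the use of the median arc distance as a spread estimate, and the multiplier `k` are all assumptions chosen only for the example.

```python
import math
from statistics import median


def angular_distance(a, b):
    """Shortest arc length between two angles (radians), in [0, pi]."""
    d = abs(a - b) % (2 * math.pi)
    return min(d, 2 * math.pi - d)


def circular_median(angles):
    """Sample angle that minimizes the sum of arc distances to all angles."""
    return min(angles, key=lambda c: sum(angular_distance(c, a) for a in angles))


def flag_circular_outliers(angles, k=3.0):
    """Flag angles farther than k times the median arc distance from the center.

    k is a hypothetical tuning constant, not a value taken from the abstract.
    """
    center = circular_median(angles)                       # robust location
    dists = [angular_distance(a, center) for a in angles]  # distance from the model
    spread = median(dists)                                 # robust dispersion
    return [a for a, d in zip(angles, dists) if d > k * spread]


# Angles clustered around 0 (wrapping past 2*pi) plus one distant point.
sample = [0.0, 0.1, 0.2, 0.3, 6.2, 6.1, 3.0]
print(flag_circular_outliers(sample))
```

Note that the angles 6.1 and 6.2 are treated as close to 0 because distances are measured along the circle, which is the main difference from applying a linear boxplot rule to the raw values.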