A Multivariate Extreme Value Theory Approach to Anomaly Clustering and Visualization - Laboratoire Traitement et Communication de l'Information
Journal article, Computational Statistics, Year: 2020

Abstract

In a wide variety of situations, anomalies in the behaviour of a complex system, whose health is monitored through the observation of a random vector $X = (X_1, \ldots, X_d)$ valued in $\mathbb{R}^d$, correspond to the simultaneous occurrence of extreme values for certain subgroups $\alpha \subset \{1, \ldots, d\}$ of variables $X_j$. Under the heavy-tail assumption, which is appropriate for modeling these phenomena, statistical methods relying on multivariate extreme value theory have been developed in the past few years for identifying such events/subgroups. This paper takes this approach further by means of a novel mixture model that describes the distribution of extremal observations and in which the anomaly type $\alpha$ is viewed as a latent variable. One may then take advantage of the model by assigning to any extreme point a posterior probability for each anomaly type $\alpha$, implicitly defining a similarity measure between anomalies. It is explained at length how this similarity measure makes it possible to cluster extreme observations and to obtain an informative planar representation of anomalies using standard graph-mining tools. The relevance and usefulness of the resulting clustering and 2-d visual display are illustrated on simulated datasets as well as on real observations from the aeronautics domain.
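To make the idea of a posterior-based similarity concrete, here is a minimal sketch. It is not the paper's implementation: the abstract does not spell out the similarity measure or the graph-mining tool, so the code assumes one natural choice, s(i, j) = sum over anomaly types of p_i(type) * p_j(type), and uses connected components of a thresholded similarity graph as a stand-in clustering step. The function names, the threshold value, and the posterior values are all hypothetical.

```python
def posterior_similarity(post):
    """Pairwise similarity between extreme points.

    post[i][k] is an ASSUMED posterior probability that extreme point i
    has anomaly type k. The similarity used here (inner product of the
    posterior vectors) is an illustrative choice, not taken from the paper.
    """
    return [[sum(a * b for a, b in zip(pi, pj)) for pj in post] for pi in post]


def threshold_clusters(sim, threshold=0.5):
    """Label the connected components of the graph that links points i and j
    whenever sim[i][j] >= threshold (a simple stand-in for a graph-mining tool)."""
    n = len(sim)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        # Depth-first traversal from an unlabeled point.
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and sim[i][j] >= threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Hypothetical posteriors for 4 extreme points and 2 anomaly types.
post = [
    [0.9, 0.1],
    [0.8, 0.2],
    [0.1, 0.9],
    [0.2, 0.8],
]
sim = posterior_similarity(post)
labels = threshold_clusters(sim, threshold=0.5)
print(labels)
```

Points whose posterior mass concentrates on the same anomaly type receive a high similarity and end up in the same component. In practice the threshold, or the clustering and 2-d layout algorithm that replaces it, would have to be chosen for the data at hand.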
Main file
Anomaly_VisualDisplay_Long.pdf (501.4 KB)
Origin: files produced by the author(s)

Dates and versions

hal-02185060, version 1 (16-07-2019)

Cite

Maël Chiapino, Stéphan Clémençon, Vincent Feuillard, Anne Sabourin. A Multivariate Extreme Value Theory Approach to Anomaly Clustering and Visualization. Computational Statistics, 2020, 35 (2), pp.607-628. ⟨10.1007/s00180-019-00913-y⟩. ⟨hal-02185060⟩