Scoring anomalies: a M-estimation formulation

Abstract : The purpose of this paper is to formulate the issue of scoring multivariate observations according to their degree of abnormality/novelty as an unsupervised learning task. In the 1-d situation, this problem can be dealt with by means of tail estimation techniques, observations being viewed as all the more "abnormal" as they are located far in the tail(s) of the underlying probability distribution. In a wide variety of applications, however, it is desirable to have at one's disposal a scalar-valued "scoring" function that allows comparing the degree of abnormality of multivariate observations. Here we formulate the issue of scoring anomalies as an M-estimation problem. A (functional) performance criterion is proposed, whose optimal elements are, as expected, nondecreasing transforms of the density. The question of empirical estimation of this criterion is tackled, and preliminary statistical results related to the accuracy of partition-based techniques for optimizing empirical estimates of the performance measure are established.
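As a rough illustration of the idea that good scoring functions are nondecreasing transforms of the density, the sketch below builds a piecewise-constant score on a regular grid partition: each observation is scored by the empirical mass of its cell divided by the cell volume, so lower scores indicate more abnormal points. This is not the procedure studied in the paper. The function name `fit_histogram_scorer`, the regular grid, the cell width, and the simulated Gaussian sample are all assumptions made for the example.

```python
import math
import random
from collections import Counter


def fit_histogram_scorer(sample, cell_width):
    """Return a scoring function s(x) that is constant on each grid cell.

    Hypothetical helper for illustration only: s(x) is the fraction of the
    sample falling in the cell containing x, divided by the cell volume
    (a histogram density estimate). Low scores flag abnormal points.
    """
    n = len(sample)
    dim = len(sample[0])
    volume = cell_width ** dim

    def cell_of(x):
        # Index of the grid cell containing x, one integer per coordinate.
        return tuple(math.floor(c / cell_width) for c in x)

    counts = Counter(cell_of(x) for x in sample)

    def score(x):
        # Cells that contain no sample point get score 0.
        return counts.get(cell_of(x), 0) / (n * volume)

    return score


# Simulated 2-d data (an assumption; any list of equal-length tuples works).
rng = random.Random(0)
sample = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(2000)]

score = fit_histogram_scorer(sample, cell_width=0.5)
central = score((0.1, 0.1))  # point near the mode of the distribution
remote = score((6.0, 6.0))   # point far in the tail
print(central, remote)
```

Ranking observations by increasing score then orders them from most to least abnormal. The cell width acts as a smoothing parameter: cells that are too small leave most of them empty, while cells that are too large blur the distinction between central and remote regions.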

https://hal.telecom-paristech.fr/hal-02107392
Contributor : Stephan Clémençon
Submitted on : Tuesday, April 23, 2019 - 4:21:54 PM
Last modification on : Thursday, October 17, 2019 - 12:36:55 PM

File

clemencon13a.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-02107392, version 1

Citation

Stéphan Clémençon, Jérémie Jakubowicz. Scoring anomalies: a M-estimation formulation. AISTATS 2013: 16th International Conference on Artificial Intelligence and Statistics, Apr 2013, Scottsdale, AZ, United States. ⟨hal-02107392⟩
