[CSUR21] Outlier Detection: Methods, Models, and Classification (VI) - High-Dimensional Data

Author: Steven Date: May 27, 2022 Updated On: May 31, 2022

Boukerche, Azzedine, Lining Zheng, and Omar Alfandi. “Outlier Detection: Methods, Models, and Classification.” ACM Computing Surveys 53, no. 3 (May 31, 2021): 1–37. https://doi.org/10.1145/3381028.

The paper is long, so these notes are split across several posts.

Arthur Zimek et al. [1] identify two efficiency challenges for outlier detection in high-dimensional settings:

  1. Computational cost
  2. Performance problems with existing techniques such as sampling, pruning, ranking, and indexing

On the effectiveness side, the question is whether meaningful outliers can be found at all:

distance concentration: The distances for all pairs of data points tend to become almost uniform. Thus all the regions in the dataset become almost equally sparse, and the distinction between outliers and normal instances is hard to capture.
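The effect can be checked numerically. Below is a minimal sketch, not taken from the paper: it assumes points drawn uniformly from the unit hypercube and Euclidean distance, and the helper name `relative_contrast` is made up for illustration. It measures (d_max - d_min) / d_min for the distances from one query point to a random sample; as the dimension grows this ratio should shrink, which is the "almost uniform distances" behaviour described above.

```python
import numpy as np


def relative_contrast(n_points: int, dim: int, seed: int = 0) -> float:
    """Return (d_max - d_min) / d_min for distances from one query point
    to n_points samples, all drawn uniformly from [0, 1)^dim.

    Hypothetical helper for illustration; uniform data is an assumption.
    """
    rng = np.random.default_rng(seed)
    data = rng.random((n_points, dim))   # sample points
    query = rng.random(dim)              # a single query point
    dists = np.linalg.norm(data - query, axis=1)  # Euclidean distances
    return float((dists.max() - dists.min()) / dists.min())


if __name__ == "__main__":
    # Print the contrast for a few dimensionalities.
    for dim in (2, 10, 100, 1000):
        print(dim, relative_contrast(1000, dim))
```

A small ratio means the nearest and farthest neighbours sit at nearly the same distance, so distance-based outlier scores lose their discriminative power. Real datasets with correlated or irrelevant attributes may behave differently from this uniform toy setting.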

Figure: Overview of methods


Further reading


  1. Arthur Zimek, Erich Schubert, and Hans-Peter Kriegel. 2012. A survey on unsupervised outlier detection in high-dimensional numerical data. Stat. Anal. Data Min. 5, 5 (2012), 363–387.