Electronic data

  • MVCAPA_JCGS_Revision-6

    Rights statement: This is an Accepted Manuscript of an article published by Taylor & Francis in Journal of Computational and Graphical Statistics on 19/11/2021, available online: https://www.tandfonline.com/doi/full/10.1080/10618600.2021.1987257

    Accepted author manuscript, 2.76 MB, PDF document

    Available under license: CC BY-NC: Creative Commons Attribution-NonCommercial 4.0 International License


Subset Multivariate Collective And Point Anomaly Detection

Research output: Contribution to Journal/Magazine › Journal article › peer-review

Published
Journal publication date: 30/06/2022
Journal: Journal of Computational and Graphical Statistics
Issue number: 2
Volume: 31
Number of pages: 12
Pages (from-to): 574-585
Publication status: Published
Early online date: 19/11/2021
Original language: English

Abstract

In recent years, there has been growing interest in identifying anomalous structure within multivariate data sequences. We consider the problem of detecting collective anomalies, corresponding to intervals where one or more of the data sequences behave anomalously. We first develop a test for a single collective anomaly that has power to simultaneously detect anomalies that are either rare (that is, affecting few data sequences) or common. We then show how to detect multiple anomalies in a way that is computationally efficient but avoids the approximations inherent in binary segmentation-like approaches. This approach is shown to consistently estimate the number and location of the collective anomalies, a property that has not previously been shown for competing methods. Our approach can be made robust to point anomalies and can allow for the anomalies to be imperfectly aligned. We show the practical usefulness of allowing for imperfect alignments through a resulting increase in power to detect regions of copy number variation.
