Electronic data

  • MVCAPA_JCGS_Revision-6

    Rights statement: This is an Accepted Manuscript of an article published by Taylor & Francis in Journal of Computational and Graphical Statistics on 19/11/2021, available online: https://www.tandfonline.com/doi/full/10.1080/10618600.2021.1987257

    Accepted author manuscript, 2.76 MB, PDF document

    Available under license: CC BY: Creative Commons Attribution 4.0 International License

Subset Multivariate Collective And Point Anomaly Detection

Research output: Contribution to journal, journal article, peer-reviewed

Journal publication date: 19/11/2021
Journal: Journal of Computational and Graphical Statistics
Publication status: E-pub ahead of print
Early online date: 19/11/2021
Original language: English

Abstract

In recent years, there has been growing interest in identifying anomalous structure within multivariate data sequences. We consider the problem of detecting collective anomalies, corresponding to intervals where one or more of the data sequences behave anomalously. We first develop a test for a single collective anomaly that has power to simultaneously detect anomalies that are either rare (that is, affecting few data sequences) or common. We then show how to detect multiple anomalies in a way that is computationally efficient but avoids the approximations inherent in binary segmentation-like approaches. This approach is shown to consistently estimate the number and location of the collective anomalies, a property that has not previously been shown for competing methods. Our approach can be made robust to point anomalies and can allow for the anomalies to be imperfectly aligned. We show the practical usefulness of allowing for imperfect alignments through a resulting increase in power to detect regions of copy number variation.
