On optimal multiple changepoint algorithms for large data

Text available via DOI:

https://doi.org/10.1007/s11222-016-9636-3
Final published version
Available under license: CC BY: Creative Commons Attribution 4.0 International License

Keywords

Breakpoints, Dynamic Programming, FPOP, SNIP, Optimal Partitioning, pDPA, PELT, Segment Neighbourhood

Research output: Contribution to Journal/Magazine › Journal article › peer-review

Published

Standard

On optimal multiple changepoint algorithms for large data. / Maidstone, Robert; Hocking, Toby; Rigaill, Guillem et al.
In: Statistics and Computing, Vol. 27, 31.03.2017, p. 519-533.

Harvard

Maidstone, R, Hocking, T, Rigaill, G & Fearnhead, P 2017, 'On optimal multiple changepoint algorithms for large data', Statistics and Computing, vol. 27, pp. 519-533. https://doi.org/10.1007/s11222-016-9636-3

APA

Maidstone, R., Hocking, T., Rigaill, G., & Fearnhead, P. (2017). On optimal multiple changepoint algorithms for large data. Statistics and Computing, 27, 519-533. https://doi.org/10.1007/s11222-016-9636-3

Vancouver

Maidstone R, Hocking T, Rigaill G, Fearnhead P. On optimal multiple changepoint algorithms for large data. Statistics and Computing. 2017 Mar 31;27:519-533. Epub 2016 Feb 15. doi: 10.1007/s11222-016-9636-3

Author

Maidstone, Robert ; Hocking, Toby ; Rigaill, Guillem et al. / On optimal multiple changepoint algorithms for large data. In: Statistics and Computing. 2017 ; Vol. 27. pp. 519-533.

Bibtex

@article{4173288320224b3dbaa1d5ab2fa54917,
  title = "On optimal multiple changepoint algorithms for large data",
  abstract = "Many common approaches to detecting changepoints, for example based on statistical criteria such as penalised likelihood or minimum description length, can be formulated in terms of minimising a cost over segmentations. We focus on a class of dynamic programming algorithms that can solve the resulting minimisation problem exactly, and thus find the optimal segmentation under the given statistical criteria. The standard implementation of these dynamic programming methods has a computational cost that scales at least quadratically in the length of the time-series. Recently pruning ideas have been suggested that can speed up the dynamic programming algorithms, whilst still being guaranteed to be optimal, in that they find the true minimum of the cost function. Here we extend these pruning methods, and introduce two new algorithms for segmenting data: FPOP and SNIP. Empirical results show that FPOP is substantially faster than existing dynamic programming methods, and unlike the existing methods its computational efficiency is robust to the number of changepoints in the data. We evaluate the method for detecting copy number variations and observe that FPOP has a computational cost that is even competitive with that of binary segmentation, but can give much more accurate segmentations.",
  keywords = "Breakpoints, Dynamic Programming, FPOP, SNIP, Optimal Partitioning, pDPA, PELT, Segment Neighbourhood",
  author = "Robert Maidstone and Toby Hocking and Guillem Rigaill and Paul Fearnhead",
  year = "2017",
  month = mar,
  day = "31",
  doi = "10.1007/s11222-016-9636-3",
  language = "English",
  volume = "27",
  pages = "519--533",
  journal = "Statistics and Computing",
  issn = "0960-3174",
  publisher = "Springer Netherlands",
}

RIS

TY  - JOUR
T1  - On optimal multiple changepoint algorithms for large data
AU  - Maidstone, Robert
AU  - Hocking, Toby
AU  - Rigaill, Guillem
AU  - Fearnhead, Paul
PY  - 2017/3/31
Y1  - 2017/3/31
N2  - Many common approaches to detecting changepoints, for example based on statistical criteria such as penalised likelihood or minimum description length, can be formulated in terms of minimising a cost over segmentations. We focus on a class of dynamic programming algorithms that can solve the resulting minimisation problem exactly, and thus find the optimal segmentation under the given statistical criteria. The standard implementation of these dynamic programming methods has a computational cost that scales at least quadratically in the length of the time-series. Recently pruning ideas have been suggested that can speed up the dynamic programming algorithms, whilst still being guaranteed to be optimal, in that they find the true minimum of the cost function. Here we extend these pruning methods, and introduce two new algorithms for segmenting data: FPOP and SNIP. Empirical results show that FPOP is substantially faster than existing dynamic programming methods, and unlike the existing methods its computational efficiency is robust to the number of changepoints in the data. We evaluate the method for detecting copy number variations and observe that FPOP has a computational cost that is even competitive with that of binary segmentation, but can give much more accurate segmentations.
AB  - Many common approaches to detecting changepoints, for example based on statistical criteria such as penalised likelihood or minimum description length, can be formulated in terms of minimising a cost over segmentations. We focus on a class of dynamic programming algorithms that can solve the resulting minimisation problem exactly, and thus find the optimal segmentation under the given statistical criteria. The standard implementation of these dynamic programming methods has a computational cost that scales at least quadratically in the length of the time-series. Recently pruning ideas have been suggested that can speed up the dynamic programming algorithms, whilst still being guaranteed to be optimal, in that they find the true minimum of the cost function. Here we extend these pruning methods, and introduce two new algorithms for segmenting data: FPOP and SNIP. Empirical results show that FPOP is substantially faster than existing dynamic programming methods, and unlike the existing methods its computational efficiency is robust to the number of changepoints in the data. We evaluate the method for detecting copy number variations and observe that FPOP has a computational cost that is even competitive with that of binary segmentation, but can give much more accurate segmentations.
KW  - Breakpoints
KW  - Dynamic Programming
KW  - FPOP
KW  - SNIP
KW  - Optimal Partitioning
KW  - pDPA
KW  - PELT
KW  - Segment Neighbourhood
U2  - 10.1007/s11222-016-9636-3
DO  - 10.1007/s11222-016-9636-3
M3  - Journal article
VL  - 27
SP  - 519
EP  - 533
JO  - Statistics and Computing
JF  - Statistics and Computing
SN  - 0960-3174
ER  -
