A Comparison of Single and Multiple Changepoint Techniques for Time Series Data

Electronic data

Changepoint_Comparison_AcceptedVersion
Rights statement: This is the author’s version of a work that was accepted for publication in Computational Statistics & Data Analysis. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Computational Statistics & Data Analysis, 170, 2022 DOI: 10.1016/j.csda.2022.107433
Accepted author manuscript, 1.66 MB, PDF document
Available under license: CC BY-NC-ND: Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License

Text available via DOI:

https://doi.org/10.1016/j.csda.2022.107433
Final published version

Keywords

stat.ME, stat.CO

Research output: Contribution to Journal/Magazine › Journal article › peer-review

Published

Standard

A Comparison of Single and Multiple Changepoint Techniques for Time Series Data. / Shi, Xueheng; Gallagher, Colin; Lund, Robert et al.
In: Computational Statistics and Data Analysis, Vol. 170, 107433, 30.06.2022.

Harvard

Shi, X, Gallagher, C, Lund, R & Killick, R 2022, 'A Comparison of Single and Multiple Changepoint Techniques for Time Series Data', Computational Statistics and Data Analysis, vol. 170, 107433. https://doi.org/10.1016/j.csda.2022.107433

APA

Shi, X., Gallagher, C., Lund, R., & Killick, R. (2022). A Comparison of Single and Multiple Changepoint Techniques for Time Series Data. Computational Statistics and Data Analysis, 170, Article 107433. https://doi.org/10.1016/j.csda.2022.107433

Vancouver

Shi X, Gallagher C, Lund R, Killick R. A Comparison of Single and Multiple Changepoint Techniques for Time Series Data. Computational Statistics and Data Analysis. 2022 Jun 30;170:107433. Epub 2022 Feb 8. doi: 10.1016/j.csda.2022.107433

Author

Shi, Xueheng ; Gallagher, Colin ; Lund, Robert et al. / A Comparison of Single and Multiple Changepoint Techniques for Time Series Data. In: Computational Statistics and Data Analysis. 2022 ; Vol. 170.

Bibtex

@article{eb9ee2cd57cd4310aa625fc84d72d7b6,

title = "A Comparison of Single and Multiple Changepoint Techniques for Time Series Data",

abstract = " This paper describes and compares several prominent single and multiple changepoint techniques for time series data. Due to their importance in inferential matters, changepoint research on correlated data has accelerated recently. Unfortunately, small perturbations in model assumptions can drastically alter changepoint conclusions; for example, heavy positive correlation in a time series can be misattributed to a mean shift should correlation be ignored. This paper considers both single and multiple changepoint techniques. The paper begins by examining cumulative sum (CUSUM) and likelihood ratio tests and their variants for the single changepoint problem; here, various statistics, boundary cropping scenarios, and scaling methods (e.g., scaling to an extreme value or Brownian Bridge limit) are compared. A recently developed test based on summing squared CUSUM statistics over all times is shown to have realistic Type I errors and superior detection power. The paper then turns to the multiple changepoint setting. Here, penalized likelihoods drive the discourse, with AIC, BIC, mBIC, and MDL penalties being considered. Binary and wild binary segmentation techniques are also compared. We introduce a new distance metric specifically designed to compare two multiple changepoint segmentations. Algorithmic and computational concerns are discussed and simulations are provided to support all conclusions. In the end, the multiple changepoint setting admits no clear methodological winner, performance depending on the particular scenario. Nonetheless, some practical guidance will emerge. ",

keywords = "stat.ME, stat.CO",

author = "Xueheng Shi and Colin Gallagher and Robert Lund and Rebecca Killick",

note = "This is the author{\textquoteright}s version of a work that was accepted for publication in Computational Statistics \& Data Analysis. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Computational Statistics \& Data Analysis, 170, 2022 DOI: 10.1016/j.csda.2022.107433",

year = "2022",

month = jun,

day = "30",

doi = "10.1016/j.csda.2022.107433",

language = "English",

volume = "170",

journal = "Computational Statistics and Data Analysis",

issn = "0167-9473",

publisher = "Elsevier",

}

RIS

TY - JOUR

T1 - A Comparison of Single and Multiple Changepoint Techniques for Time Series Data

AU - Shi, Xueheng

AU - Gallagher, Colin

AU - Lund, Robert

AU - Killick, Rebecca

N1 - This is the author’s version of a work that was accepted for publication in Computational Statistics & Data Analysis. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Computational Statistics & Data Analysis, 170, 2022 DOI: 10.1016/j.csda.2022.107433

PY - 2022/6/30

Y1 - 2022/6/30

N2 - This paper describes and compares several prominent single and multiple changepoint techniques for time series data. Due to their importance in inferential matters, changepoint research on correlated data has accelerated recently. Unfortunately, small perturbations in model assumptions can drastically alter changepoint conclusions; for example, heavy positive correlation in a time series can be misattributed to a mean shift should correlation be ignored. This paper considers both single and multiple changepoint techniques. The paper begins by examining cumulative sum (CUSUM) and likelihood ratio tests and their variants for the single changepoint problem; here, various statistics, boundary cropping scenarios, and scaling methods (e.g., scaling to an extreme value or Brownian Bridge limit) are compared. A recently developed test based on summing squared CUSUM statistics over all times is shown to have realistic Type I errors and superior detection power. The paper then turns to the multiple changepoint setting. Here, penalized likelihoods drive the discourse, with AIC, BIC, mBIC, and MDL penalties being considered. Binary and wild binary segmentation techniques are also compared. We introduce a new distance metric specifically designed to compare two multiple changepoint segmentations. Algorithmic and computational concerns are discussed and simulations are provided to support all conclusions. In the end, the multiple changepoint setting admits no clear methodological winner, performance depending on the particular scenario. Nonetheless, some practical guidance will emerge.

AB - This paper describes and compares several prominent single and multiple changepoint techniques for time series data. Due to their importance in inferential matters, changepoint research on correlated data has accelerated recently. Unfortunately, small perturbations in model assumptions can drastically alter changepoint conclusions; for example, heavy positive correlation in a time series can be misattributed to a mean shift should correlation be ignored. This paper considers both single and multiple changepoint techniques. The paper begins by examining cumulative sum (CUSUM) and likelihood ratio tests and their variants for the single changepoint problem; here, various statistics, boundary cropping scenarios, and scaling methods (e.g., scaling to an extreme value or Brownian Bridge limit) are compared. A recently developed test based on summing squared CUSUM statistics over all times is shown to have realistic Type I errors and superior detection power. The paper then turns to the multiple changepoint setting. Here, penalized likelihoods drive the discourse, with AIC, BIC, mBIC, and MDL penalties being considered. Binary and wild binary segmentation techniques are also compared. We introduce a new distance metric specifically designed to compare two multiple changepoint segmentations. Algorithmic and computational concerns are discussed and simulations are provided to support all conclusions. In the end, the multiple changepoint setting admits no clear methodological winner, performance depending on the particular scenario. Nonetheless, some practical guidance will emerge.

KW - stat.ME

KW - stat.CO

U2 - 10.1016/j.csda.2022.107433

DO - 10.1016/j.csda.2022.107433

M3 - Journal article

VL - 170

JO - Computational Statistics and Data Analysis

JF - Computational Statistics and Data Analysis

SN - 0167-9473

M1 - 107433

ER -
