Changepoint detection for data intensive settings

School Of Mathematical Sciences

Associated organisational units

Electronic data

2020ticklephd
Final published version, 2.51 MB, PDF document
Available under license: CC BY-NC-ND: Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License

Text available via DOI:

https://doi.org/10.17635/lancaster/thesis/823
Final published version

Research output: Thesis › Doctoral Thesis

Published

Standard

Changepoint detection for data intensive settings. / Tickle, Sam.
Lancaster University, 2020. 226 p.

Harvard

Tickle, S 2020, 'Changepoint detection for data intensive settings', PhD, Lancaster University. https://doi.org/10.17635/lancaster/thesis/823

APA

Tickle, S. (2020). Changepoint detection for data intensive settings. [Doctoral Thesis, Lancaster University]. Lancaster University. https://doi.org/10.17635/lancaster/thesis/823

Vancouver

Tickle S. Changepoint detection for data intensive settings. Lancaster University, 2020. 226 p. doi: 10.17635/lancaster/thesis/823

Author

Tickle, Sam. / Changepoint detection for data intensive settings. Lancaster University, 2020. 226 p.

Bibtex

@phdthesis{c661e8e5b19a46ba9ecf7cb8f74cdcfc,

title = "Changepoint detection for data intensive settings",

abstract = "Detecting a point in a data sequence where the behaviour alters abruptly, otherwise known as a changepoint, has been an active area of interest for decades. More recently, with the advent of the data intensive era, the need for automated and computationally efficient changepoint methods has grown. We here introduce several new techniques for doing this which address many of the issues inherent in detecting changes in a streaming setting. In short, these new methods, which may be viewed as non-trivial extensions of existing classical procedures, are intended to be as useful in as wide a set of situations as possible, while retaining important theoretical guarantees and ease of implementation. The first novel contribution concerns two methods for parallelising existing dynamic programming based approaches to changepoint detection in the single variate setting. We demonstrate that these methods can result in near quadratic computational gains, while retaining important theoretical guarantees. Our next area of focus is the multivariate setting. We introduce two new methods for data intensive scenarios with a fixed, but possibly large, number of dimensions. The first of these is an offline method which detects one change at a time using a new test statistic. We demonstrate that this test statistic has competitive power in a variety of possible settings for a given changepoint, while allowing the method to be versatile across a range of possible modelling assumptions. The other method we introduce for multivariate data is also suitable in the streaming setting. In addition, it is able to relax many standard modelling assumptions. We discuss the empirical properties of the procedure, especially insofar as they relate to a desired false alarm error rate.",

author = "Sam Tickle",

year = "2020",

doi = "10.17635/lancaster/thesis/823",

language = "English",

publisher = "Lancaster University",

school = "Lancaster University",

}

RIS

TY - BOOK

T1 - Changepoint detection for data intensive settings

AU - Tickle, Sam

PY - 2020

Y1 - 2020

N2 - Detecting a point in a data sequence where the behaviour alters abruptly, otherwise known as a changepoint, has been an active area of interest for decades. More recently, with the advent of the data intensive era, the need for automated and computationally efficient changepoint methods has grown. We here introduce several new techniques for doing this which address many of the issues inherent in detecting changes in a streaming setting. In short, these new methods, which may be viewed as non-trivial extensions of existing classical procedures, are intended to be as useful in as wide a set of situations as possible, while retaining important theoretical guarantees and ease of implementation. The first novel contribution concerns two methods for parallelising existing dynamic programming based approaches to changepoint detection in the single variate setting. We demonstrate that these methods can result in near quadratic computational gains, while retaining important theoretical guarantees. Our next area of focus is the multivariate setting. We introduce two new methods for data intensive scenarios with a fixed, but possibly large, number of dimensions. The first of these is an offline method which detects one change at a time using a new test statistic. We demonstrate that this test statistic has competitive power in a variety of possible settings for a given changepoint, while allowing the method to be versatile across a range of possible modelling assumptions. The other method we introduce for multivariate data is also suitable in the streaming setting. In addition, it is able to relax many standard modelling assumptions. We discuss the empirical properties of the procedure, especially insofar as they relate to a desired false alarm error rate.

U2 - 10.17635/lancaster/thesis/823

DO - 10.17635/lancaster/thesis/823

M3 - Doctoral Thesis

PB - Lancaster University

ER -
