Final published version, 5.47 MB, PDF document
Available under license: None
Research output: Thesis › Doctoral Thesis
}
TY - BOOK
T1 - Detecting changepoints in multivariate data
AU - Ryan, Sean
PY - 2020
Y1 - 2020
N2 - In this thesis, we propose new methodology for detecting changepoints in multivariate data, focusing on the setting where the number of variables and the length of the data can be very large. We begin by considering the problem of detecting changepoints where only a subset of the variables are affected by the change. Previous work demonstrated that the changepoint locations and affected variables can be simultaneously estimated by solving a discrete optimisation problem. We propose two new methods, PSMOP (Pruned Subset Multivariate Optimal Partitioning) and SPOT (Subset Partitioning Optimal Time), for solving this problem. PSMOP uses novel search space reduction techniques to efficiently compute an exact solution for data of moderate size. SPOT is an approximate method, which gives near-optimal solutions at a very low computational cost and can be applied to very large datasets. We use this new methodology to study changes in sales data due to the effect of promotions. We then examine the problem of detecting changes in the covariance structure of high-dimensional data. Using results from Random Matrix Theory, we introduce a novel test statistic for detecting such changes. Importantly, under the null hypothesis of no change, the distribution of this test statistic is independent of the underlying covariance matrix. We utilise this test statistic to study changes in the amount of water on the surface of a plot of soil.
AB - In this thesis, we propose new methodology for detecting changepoints in multivariate data, focusing on the setting where the number of variables and the length of the data can be very large. We begin by considering the problem of detecting changepoints where only a subset of the variables are affected by the change. Previous work demonstrated that the changepoint locations and affected variables can be simultaneously estimated by solving a discrete optimisation problem. We propose two new methods, PSMOP (Pruned Subset Multivariate Optimal Partitioning) and SPOT (Subset Partitioning Optimal Time), for solving this problem. PSMOP uses novel search space reduction techniques to efficiently compute an exact solution for data of moderate size. SPOT is an approximate method, which gives near-optimal solutions at a very low computational cost and can be applied to very large datasets. We use this new methodology to study changes in sales data due to the effect of promotions. We then examine the problem of detecting changes in the covariance structure of high-dimensional data. Using results from Random Matrix Theory, we introduce a novel test statistic for detecting such changes. Importantly, under the null hypothesis of no change, the distribution of this test statistic is independent of the underlying covariance matrix. We utilise this test statistic to study changes in the amount of water on the surface of a plot of soil.
U2 - 10.17635/lancaster/thesis/1141
DO - 10.17635/lancaster/thesis/1141
M3 - Doctoral Thesis
PB - Lancaster University
ER -