Accepted author manuscript, 26.8 MB, PDF document
Available under license: CC BY: Creative Commons Attribution 4.0 International License
Research output: Contribution to Journal/Magazine › Journal article › peer-review
}
TY - JOUR
T1 - Online Multivariate Changepoint Detection
T2 - Leveraging Links With Computational Geometry
AU - Pishchagina, Liudmila
AU - Romano, Gaetano
AU - Fearnhead, Paul
AU - Runge, Vincent
AU - Rigaill, Guillem
PY - 2025/6/24
Y1 - 2025/6/24
AB - The increasing volume of data streams poses significant computational challenges for detecting changepoints online. Likelihood-based methods are effective, but a naive sequential implementation becomes impractical online due to high computational costs. We develop an online algorithm that exactly calculates the likelihood ratio test for a single changepoint in $p$-dimensional data streams by leveraging a fascinating connection with computational geometry. This connection straightforwardly allows us to exactly recover sparse likelihood ratio statistics, that is, statistics that assume only a subset of the dimensions is changing. Our algorithm is straightforward, fast, and apparently quasi-linear. A dyadic variant of our algorithm is provably quasi-linear, being $\mathcal{O}(n\log(n)^{p+1})$ for $n$ data points and $p$ less than $3$, but slower in practice. These algorithms are computationally impractical when $p$ is larger than $5$, and we provide an approximate algorithm suitable for such $p$ which is $\mathcal{O}(np\log(n)^{\tilde{p}+1})$ for some user-specified $\tilde{p} \leq 5$. We derive statistical guarantees for the proposed procedures in the Gaussian case, and confirm the good computational and statistical performance, and the usefulness, of the algorithms on both empirical data and NBA data.
KW - stat.CO
M3 - Journal article
JO - Journal of the Royal Statistical Society: Series B (Statistical Methodology)
JF - Journal of the Royal Statistical Society: Series B (Statistical Methodology)
SN - 1369-7412
ER -