
Electronic data

  • 2301.05636v2

    Submitted manuscript, 5.6 MB, PDF document

    Available under license: CC BY: Creative Commons Attribution 4.0 International License


Improving Power by Conditioning on Less in Post-selection Inference for Changepoints

Research output: Working paper › Preprint

Published
Publication date: 13/01/2023
Publisher: arXiv
Original language: English

Abstract

Post-selection inference has recently been proposed as a way of quantifying uncertainty about detected changepoints. The idea is to run a changepoint detection algorithm, and then re-use the same data to perform a test for a change near each of the detected changes. By defining the p-value for the test appropriately, so that it is conditional on the information used to choose the test, this approach produces valid p-values. We show how to improve the power of these procedures by conditioning on less information. This gives rise to an ideal selective p-value that is intractable but can be approximated by Monte Carlo. We show that this procedure produces valid p-values for any Monte Carlo sample size, and empirically that a noticeable increase in power is possible with only very modest Monte Carlo sample sizes. Our procedure is easy to implement given existing post-selection inference methods: we just need to generate perturbations of the data set and re-apply the post-selection method to each of them. On genomic data consisting of human GC content, our procedure increases the number of significant changepoints detected from, for example, 17 to 27 compared to existing methods.
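The Monte Carlo strategy described in the abstract, comparing an observed statistic against statistics recomputed on perturbed copies of the data, can be illustrated with a minimal sketch. This is not the paper's method: the changepoint statistic (a simple CUSUM scan), the perturbation scheme (fresh Gaussian noise at an estimated scale), and the function names are all simplifying assumptions made for illustration. The `(1 + count) / (B + 1)` form of the p-value is what makes the test valid for any Monte Carlo sample size B.

```python
import numpy as np

def cusum_stat(x):
    """Maximum absolute standardized CUSUM statistic over candidate split points."""
    n = len(x)
    return max(
        abs(x[:t].mean() - x[t:].mean()) * np.sqrt(t * (n - t) / n)
        for t in range(1, n)
    )

def mc_pvalue(x, n_mc=199, seed=0):
    """Toy Monte Carlo p-value for 'is there a changepoint in x?'.

    Draws n_mc perturbed (here: pure-noise) datasets, recomputes the
    statistic on each, and counts how often it matches or exceeds the
    observed value.  The (1 + count) / (n_mc + 1) estimator gives a
    valid p-value for any Monte Carlo sample size.
    """
    rng = np.random.default_rng(seed)
    obs = cusum_stat(x)
    # Rough noise-scale estimate from first differences (illustrative only).
    sigma = np.std(np.diff(x)) / np.sqrt(2)
    count = sum(
        cusum_stat(rng.normal(0.0, sigma, size=len(x))) >= obs
        for _ in range(n_mc)
    )
    return (1 + count) / (n_mc + 1)
```

A dataset with a clear mean shift yields a small p-value, while the estimator can never return exactly zero, which is what preserves validity at small Monte Carlo sample sizes.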

Bibliographic note

33 pages, 13 figures