The Zig-Zag Process and Super-Efficient Sampling for Bayesian Analysis of Big Data

Associated organisational units

School of Mathematical Sciences

Electronic data

zigzagRev4
Accepted author manuscript, 912 KB, PDF document
Available under license: CC BY: Creative Commons Attribution 4.0 International License

Text available via DOI:

https://doi.org/10.1214/18-AOS1715
Final published version

Keywords

stat.CO, math.PR, 65C60, 65C05, 62F15, 60J25

Research output: Contribution to Journal/Magazine › Journal article › peer-review

Published

Standard

The Zig-Zag Process and Super-Efficient Sampling for Bayesian Analysis of Big Data. / Bierkens, Joris; Fearnhead, Paul; Roberts, Gareth.
In: Annals of Statistics, Vol. 47, No. 3, 13.02.2019, p. 1288-1320.

Harvard

Bierkens, J, Fearnhead, P & Roberts, G 2019, 'The Zig-Zag Process and Super-Efficient Sampling for Bayesian Analysis of Big Data', Annals of Statistics, vol. 47, no. 3, pp. 1288-1320. https://doi.org/10.1214/18-AOS1715

APA

Bierkens, J., Fearnhead, P., & Roberts, G. (2019). The Zig-Zag Process and Super-Efficient Sampling for Bayesian Analysis of Big Data. Annals of Statistics, 47(3), 1288-1320. https://doi.org/10.1214/18-AOS1715

Vancouver

Bierkens J, Fearnhead P, Roberts G. The Zig-Zag Process and Super-Efficient Sampling for Bayesian Analysis of Big Data. Annals of Statistics. 2019 Feb 13;47(3):1288-1320. doi: 10.1214/18-AOS1715

Author

Bierkens, Joris ; Fearnhead, Paul ; Roberts, Gareth. / The Zig-Zag Process and Super-Efficient Sampling for Bayesian Analysis of Big Data. In: Annals of Statistics. 2019 ; Vol. 47, No. 3. pp. 1288-1320.

Bibtex

@article{a1aaab3995594706853c8506278e8e2a,
  title = "The Zig-Zag Process and Super-Efficient Sampling for Bayesian Analysis of Big Data",
  abstract = "Standard MCMC methods can scale poorly to big data settings due to the need to evaluate the likelihood at each iteration. There have been a number of approximate MCMC algorithms that use sub-sampling ideas to reduce this computational burden, but with the drawback that these algorithms no longer target the true posterior distribution. We introduce a new family of Monte Carlo methods based upon a multi-dimensional version of the Zig-Zag process of Bierkens and Roberts (2015), a continuous time piecewise deterministic Markov process. While traditional MCMC methods are reversible by construction (a property which is known to inhibit rapid convergence) the Zig-Zag process offers a flexible non-reversible alternative which we observe to often have favourable convergence properties. We show how the Zig-Zag process can be simulated without discretisation error, and give conditions for the process to be ergodic. Most importantly, we introduce a sub-sampling version of the Zig-Zag process that is an example of an exact approximate scheme, i.e. the resulting approximate process still has the posterior as its stationary distribution. Furthermore, if we use a control-variate idea to reduce the variance of our unbiased estimator, then the Zig-Zag process can be super-efficient: after an initial pre-processing step, essentially independent samples from the posterior distribution are obtained at a computational cost which does not depend on the size of the data.",
  keywords = "stat.CO, math.PR, 65C60, 65C05, 62F15, 60J25",
  author = "Joris Bierkens and Paul Fearnhead and Gareth Roberts",
  year = "2019",
  month = feb,
  day = "13",
  doi = "10.1214/18-AOS1715",
  language = "English",
  volume = "47",
  pages = "1288--1320",
  journal = "Annals of Statistics",
  issn = "0090-5364",
  publisher = "Institute of Mathematical Statistics",
  number = "3",
}

RIS

TY  - JOUR
T1  - The Zig-Zag Process and Super-Efficient Sampling for Bayesian Analysis of Big Data
AU  - Bierkens, Joris
AU  - Fearnhead, Paul
AU  - Roberts, Gareth
PY  - 2019/2/13
Y1  - 2019/2/13
N2  - Standard MCMC methods can scale poorly to big data settings due to the need to evaluate the likelihood at each iteration. There have been a number of approximate MCMC algorithms that use sub-sampling ideas to reduce this computational burden, but with the drawback that these algorithms no longer target the true posterior distribution. We introduce a new family of Monte Carlo methods based upon a multi-dimensional version of the Zig-Zag process of Bierkens and Roberts (2015), a continuous time piecewise deterministic Markov process. While traditional MCMC methods are reversible by construction (a property which is known to inhibit rapid convergence) the Zig-Zag process offers a flexible non-reversible alternative which we observe to often have favourable convergence properties. We show how the Zig-Zag process can be simulated without discretisation error, and give conditions for the process to be ergodic. Most importantly, we introduce a sub-sampling version of the Zig-Zag process that is an example of an exact approximate scheme, i.e. the resulting approximate process still has the posterior as its stationary distribution. Furthermore, if we use a control-variate idea to reduce the variance of our unbiased estimator, then the Zig-Zag process can be super-efficient: after an initial pre-processing step, essentially independent samples from the posterior distribution are obtained at a computational cost which does not depend on the size of the data.
AB  - Standard MCMC methods can scale poorly to big data settings due to the need to evaluate the likelihood at each iteration. There have been a number of approximate MCMC algorithms that use sub-sampling ideas to reduce this computational burden, but with the drawback that these algorithms no longer target the true posterior distribution. We introduce a new family of Monte Carlo methods based upon a multi-dimensional version of the Zig-Zag process of Bierkens and Roberts (2015), a continuous time piecewise deterministic Markov process. While traditional MCMC methods are reversible by construction (a property which is known to inhibit rapid convergence) the Zig-Zag process offers a flexible non-reversible alternative which we observe to often have favourable convergence properties. We show how the Zig-Zag process can be simulated without discretisation error, and give conditions for the process to be ergodic. Most importantly, we introduce a sub-sampling version of the Zig-Zag process that is an example of an exact approximate scheme, i.e. the resulting approximate process still has the posterior as its stationary distribution. Furthermore, if we use a control-variate idea to reduce the variance of our unbiased estimator, then the Zig-Zag process can be super-efficient: after an initial pre-processing step, essentially independent samples from the posterior distribution are obtained at a computational cost which does not depend on the size of the data.
KW  - stat.CO
KW  - math.PR
KW  - 65C60, 65C05, 62F15, 60J25
U2  - 10.1214/18-AOS1715
DO  - 10.1214/18-AOS1715
M3  - Journal article
VL  - 47
SP  - 1288
EP  - 1320
JO  - Annals of Statistics
JF  - Annals of Statistics
SN  - 0090-5364
IS  - 3
ER  -
