Control variates for stochastic gradient MCMC

Electronic data

1706.05439.pd
Rights statement: The final publication is available at Springer via https://doi.org/10.1007/s11222-018-9826-2
Accepted author manuscript, 1.05 MB, PDF document
Available under license: CC BY: Creative Commons Attribution 4.0 International License

Text available via DOI:

https://doi.org/10.1007/s11222-018-9826-2
Final published version
Available under license: CC BY: Creative Commons Attribution 4.0 International License

Keywords

stat.CO, cs.LG, stat.ML

Research output: Contribution to Journal/Magazine › Journal article › peer-review

Published

Standard

Control variates for stochastic gradient MCMC. / Baker, Jack; Fearnhead, Paul; Fox, Emily B. et al.
In: Statistics and Computing, Vol. 29, No. 3, 01.05.2019, p. 599-615.

Harvard

Baker, J, Fearnhead, P, Fox, EB & Nemeth, C 2019, 'Control variates for stochastic gradient MCMC', Statistics and Computing, vol. 29, no. 3, pp. 599-615. https://doi.org/10.1007/s11222-018-9826-2

APA

Baker, J., Fearnhead, P., Fox, E. B., & Nemeth, C. (2019). Control variates for stochastic gradient MCMC. Statistics and Computing, 29(3), 599-615. https://doi.org/10.1007/s11222-018-9826-2

Vancouver

Baker J, Fearnhead P, Fox EB, Nemeth C. Control variates for stochastic gradient MCMC. Statistics and Computing. 2019 May 1;29(3):599-615. Epub 2018 Aug 4. doi: 10.1007/s11222-018-9826-2

Author

Baker, Jack; Fearnhead, Paul; Fox, Emily B. et al. / Control variates for stochastic gradient MCMC. In: Statistics and Computing. 2019; Vol. 29, No. 3. pp. 599-615.

Bibtex

@article{3a28f6f7d040407099af39cd9d6540a2,

title = "Control variates for stochastic gradient MCMC",

abstract = "It is well known that Markov chain Monte Carlo (MCMC) methods scale poorly with dataset size. A popular class of methods for solving this issue is stochastic gradient MCMC (SGMCMC). These methods use a noisy estimate of the gradient of the log-posterior, which reduces the per iteration computational cost of the algorithm. Despite this, there are a number of results suggesting that stochastic gradient Langevin dynamics (SGLD), probably the most popular of these methods, still has computational cost proportional to the dataset size. We suggest an alternative log-posterior gradient estimate for stochastic gradient MCMC which uses control variates to reduce the variance. We analyse SGLD using this gradient estimate, and show that, under log-concavity assumptions on the target distribution, the computational cost required for a given level of accuracy is independent of the dataset size. Next we show that a different control variate technique, known as zero variance control variates, can be applied to SGMCMC algorithms for free. This post-processing step improves the inference of the algorithm by reducing the variance of the MCMC output. Zero variance control variates rely on the gradient of the log-posterior; we explore how the variance reduction is affected by replacing this with the noisy gradient estimate calculated by SGMCMC.",

keywords = "stat.CO, cs.LG, stat.ML",

author = "Jack Baker and Paul Fearnhead and Fox, {Emily B.} and Christopher Nemeth",

note = "The final publication is available at Springer via https://doi.org/10.1007/s11222-018-9826-2",

year = "2019",

month = may,

day = "1",

doi = "10.1007/s11222-018-9826-2",

language = "English",

volume = "29",

pages = "599--615",

journal = "Statistics and Computing",

issn = "0960-3174",

publisher = "Springer Netherlands",

number = "3",

}

RIS

TY - JOUR

T1 - Control variates for stochastic gradient MCMC

AU - Baker, Jack

AU - Fearnhead, Paul

AU - Fox, Emily B.

AU - Nemeth, Christopher

N1 - The final publication is available at Springer via https://doi.org/10.1007/s11222-018-9826-2

PY - 2019/5/1

Y1 - 2019/5/1

N2 - It is well known that Markov chain Monte Carlo (MCMC) methods scale poorly with dataset size. A popular class of methods for solving this issue is stochastic gradient MCMC (SGMCMC). These methods use a noisy estimate of the gradient of the log-posterior, which reduces the per iteration computational cost of the algorithm. Despite this, there are a number of results suggesting that stochastic gradient Langevin dynamics (SGLD), probably the most popular of these methods, still has computational cost proportional to the dataset size. We suggest an alternative log-posterior gradient estimate for stochastic gradient MCMC which uses control variates to reduce the variance. We analyse SGLD using this gradient estimate, and show that, under log-concavity assumptions on the target distribution, the computational cost required for a given level of accuracy is independent of the dataset size. Next we show that a different control variate technique, known as zero variance control variates, can be applied to SGMCMC algorithms for free. This post-processing step improves the inference of the algorithm by reducing the variance of the MCMC output. Zero variance control variates rely on the gradient of the log-posterior; we explore how the variance reduction is affected by replacing this with the noisy gradient estimate calculated by SGMCMC.

AB - It is well known that Markov chain Monte Carlo (MCMC) methods scale poorly with dataset size. A popular class of methods for solving this issue is stochastic gradient MCMC (SGMCMC). These methods use a noisy estimate of the gradient of the log-posterior, which reduces the per iteration computational cost of the algorithm. Despite this, there are a number of results suggesting that stochastic gradient Langevin dynamics (SGLD), probably the most popular of these methods, still has computational cost proportional to the dataset size. We suggest an alternative log-posterior gradient estimate for stochastic gradient MCMC which uses control variates to reduce the variance. We analyse SGLD using this gradient estimate, and show that, under log-concavity assumptions on the target distribution, the computational cost required for a given level of accuracy is independent of the dataset size. Next we show that a different control variate technique, known as zero variance control variates, can be applied to SGMCMC algorithms for free. This post-processing step improves the inference of the algorithm by reducing the variance of the MCMC output. Zero variance control variates rely on the gradient of the log-posterior; we explore how the variance reduction is affected by replacing this with the noisy gradient estimate calculated by SGMCMC.

KW - stat.CO

KW - cs.LG

KW - stat.ML

U2 - 10.1007/s11222-018-9826-2

DO - 10.1007/s11222-018-9826-2

M3 - Journal article

VL - 29

SP - 599

EP - 615

JO - Statistics and Computing

JF - Statistics and Computing

SN - 0960-3174

IS - 3

ER -
