

Merging MCMC subposteriors through Gaussian-Process Approximations

Research output: Contribution to Journal/Magazine › Journal article › peer-review

Published

Standard

Merging MCMC subposteriors through Gaussian-Process Approximations. / Nemeth, Christopher; Sherlock, Christopher Gerrard.
In: Bayesian Analysis, Vol. 13, No. 2, 03.2018, p. 507-530.

Vancouver

Nemeth C, Sherlock CG. Merging MCMC subposteriors through Gaussian-Process Approximations. Bayesian Analysis. 2018 Mar;13(2):507-530. Epub 2017 Aug 9. doi: 10.1214/17-BA1063

Bibtex

@article{ebdb2177e94d4a81947ae2fe63c500b3,
title = "Merging MCMC subposteriors through Gaussian-Process Approximations",
abstract = "Markov chain Monte Carlo (MCMC) algorithms have become powerful tools for Bayesian inference. However, they do not scale well to large-data problems. Divide-and-conquer strategies, which split the data into batches and, for each batch, run independent MCMC algorithms targeting the corresponding subposterior, can spread the computational burden across a number of separate computer cores. The challenge with such strategies is in recombining the subposteriors to approximate the full posterior. By creating a Gaussian-process approximation for each log-subposterior density we create a tractable approximation for the full posterior. This approximation is exploited through three methodologies: firstly, a Hamiltonian Monte Carlo algorithm targeting the expectation of the posterior density provides a sample from an approximation to the posterior; secondly, evaluating the true posterior at the sampled points leads to an importance sampler that, asymptotically, targets the true posterior expectations; finally, an alternative importance sampler uses the full Gaussian-process distribution of the approximation to the log-posterior density to re-weight any initial sample and provide both an estimate of the posterior expectation and a measure of the uncertainty in it.",
keywords = "stat.CO, stat.ML, Big data, Markov chain Monte Carlo, Gaussian processes, distributed importance sampling",
author = "Christopher Nemeth and Sherlock, {Christopher Gerrard}",
year = "2018",
month = mar,
doi = "10.1214/17-BA1063",
language = "English",
volume = "13",
pages = "507--530",
journal = "Bayesian Analysis",
issn = "1936-0975",
publisher = "Carnegie Mellon University",
number = "2",
}
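The recombination idea described in the abstract can be illustrated numerically. The following is a minimal, hypothetical sketch (not the authors' code): each log-subposterior is evaluated at a few design points, interpolated with a simple squared-exponential GP, and the GP posterior means are summed to approximate the full log-posterior, which is then normalised on a grid. The toy model, design-point placement, kernel hyperparameters, and all function names are assumptions made for illustration; the paper itself uses MCMC draws as design points and develops three more refined estimators on top of this approximation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy model: y_i ~ N(theta, 1), with prior theta ~ N(0, 10), data split into B batches.
theta_true = 1.5
y = rng.normal(theta_true, 1.0, size=200)
batches = np.array_split(y, 2)
B = len(batches)

def log_subpost(theta, batch):
    """Log-subposterior: fractionated prior p(theta)^(1/B) times the batch likelihood."""
    log_prior = -0.5 * theta**2 / 10.0 / B
    log_lik = -0.5 * ((batch[:, None] - theta[None, :])**2).sum(axis=0)
    return log_prior + log_lik

def gp_predict(x_train, f_train, x_test, ell=0.5, sf=100.0, jitter=1e-6):
    """Posterior mean of a zero-mean GP with a squared-exponential kernel."""
    def k(a, b):
        return sf**2 * np.exp(-0.5 * (a[:, None] - b[None, :])**2 / ell**2)
    K = k(x_train, x_train) + jitter * sf**2 * np.eye(len(x_train))
    return k(x_test, x_train) @ np.linalg.solve(K, f_train)

grid = np.linspace(0.5, 2.5, 400)    # evaluation grid for the merged posterior
design = np.linspace(0.5, 2.5, 15)   # design points (MCMC draws in the paper)

# Sum the GP approximations of the log-subposteriors (centred for numerical stability).
approx_log_post = np.zeros_like(grid)
for batch in batches:
    f = log_subpost(design, batch)
    m = f.mean()
    approx_log_post += gp_predict(design, f - m, grid) + m

# Normalise on the grid and read off the approximate posterior mean.
w = np.exp(approx_log_post - approx_log_post.max())
w /= w.sum()
approx_mean = float((grid * w).sum())

# Exact conjugate-Gaussian posterior mean for comparison.
exact_mean = y.sum() / (1.0 / 10.0 + len(y))
print(approx_mean, exact_mean)
```

Because the model is conjugate, the merged GP approximation can be checked against the exact posterior mean; in the paper's setting the true posterior is intractable, which is where the importance-sampling corrections come in.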

RIS

TY - JOUR

T1 - Merging MCMC subposteriors through Gaussian-Process Approximations

AU - Nemeth, Christopher

AU - Sherlock, Christopher Gerrard

PY - 2018/3

Y1 - 2018/3

N2 - Markov chain Monte Carlo (MCMC) algorithms have become powerful tools for Bayesian inference. However, they do not scale well to large-data problems. Divide-and-conquer strategies, which split the data into batches and, for each batch, run independent MCMC algorithms targeting the corresponding subposterior, can spread the computational burden across a number of separate computer cores. The challenge with such strategies is in recombining the subposteriors to approximate the full posterior. By creating a Gaussian-process approximation for each log-subposterior density we create a tractable approximation for the full posterior. This approximation is exploited through three methodologies: firstly, a Hamiltonian Monte Carlo algorithm targeting the expectation of the posterior density provides a sample from an approximation to the posterior; secondly, evaluating the true posterior at the sampled points leads to an importance sampler that, asymptotically, targets the true posterior expectations; finally, an alternative importance sampler uses the full Gaussian-process distribution of the approximation to the log-posterior density to re-weight any initial sample and provide both an estimate of the posterior expectation and a measure of the uncertainty in it.

AB - Markov chain Monte Carlo (MCMC) algorithms have become powerful tools for Bayesian inference. However, they do not scale well to large-data problems. Divide-and-conquer strategies, which split the data into batches and, for each batch, run independent MCMC algorithms targeting the corresponding subposterior, can spread the computational burden across a number of separate computer cores. The challenge with such strategies is in recombining the subposteriors to approximate the full posterior. By creating a Gaussian-process approximation for each log-subposterior density we create a tractable approximation for the full posterior. This approximation is exploited through three methodologies: firstly, a Hamiltonian Monte Carlo algorithm targeting the expectation of the posterior density provides a sample from an approximation to the posterior; secondly, evaluating the true posterior at the sampled points leads to an importance sampler that, asymptotically, targets the true posterior expectations; finally, an alternative importance sampler uses the full Gaussian-process distribution of the approximation to the log-posterior density to re-weight any initial sample and provide both an estimate of the posterior expectation and a measure of the uncertainty in it.

KW - stat.CO

KW - stat.ML

KW - Big data

KW - Markov chain Monte Carlo

KW - Gaussian processes

KW - distributed importance sampling

U2 - 10.1214/17-BA1063

DO - 10.1214/17-BA1063

M3 - Journal article

VL - 13

SP - 507

EP - 530

JO - Bayesian Analysis

JF - Bayesian Analysis

SN - 1936-0975

IS - 2

ER -