Improved variational Bayes inference for transcript expression estimation

Research output: Contribution to Journal/Magazine › Journal article › peer-review

Published

Standard

Improved variational Bayes inference for transcript expression estimation. / Papastamoulis, Panagiotis; Hensman, James; Glaus, Peter et al.
In: Statistical Applications in Genetics and Molecular Biology, Vol. 13, No. 2, 04.2014, p. 203-216.

Harvard

Papastamoulis, P, Hensman, J, Glaus, P & Rattray, M 2014, 'Improved variational Bayes inference for transcript expression estimation', Statistical Applications in Genetics and Molecular Biology, vol. 13, no. 2, pp. 203-216. https://doi.org/10.1515/sagmb-2013-0054

APA

Papastamoulis, P., Hensman, J., Glaus, P., & Rattray, M. (2014). Improved variational Bayes inference for transcript expression estimation. Statistical Applications in Genetics and Molecular Biology, 13(2), 203-216. https://doi.org/10.1515/sagmb-2013-0054

Vancouver

Papastamoulis P, Hensman J, Glaus P, Rattray M. Improved variational Bayes inference for transcript expression estimation. Statistical Applications in Genetics and Molecular Biology. 2014 Apr;13(2):203-216. Epub 2014 Jan 10. doi: 10.1515/sagmb-2013-0054

Author

Papastamoulis, Panagiotis ; Hensman, James ; Glaus, Peter et al. / Improved variational Bayes inference for transcript expression estimation. In: Statistical Applications in Genetics and Molecular Biology. 2014 ; Vol. 13, No. 2. pp. 203-216.

Bibtex

@article{5c2e44279c104a15971e54276e9d51d2,
title = "Improved variational Bayes inference for transcript expression estimation",
abstract = "RNA-seq studies allow for the quantification of transcript expression by aligning millions of short reads to a reference genome. However, transcripts share much of their sequence, so that many reads map to more than one place and their origin remains uncertain. This problem can be dealt with using mixtures of distributions and transcript expression reduces to estimating the weights of the mixture. In this paper, variational Bayesian (VB) techniques are used in order to approximate the posterior distribution of transcript expression. VB has previously been shown to be more computationally efficient for this problem than Markov chain Monte Carlo. VB methodology can precisely estimate the posterior means, but leads to variance underestimation. For this reason, a novel approach is introduced which integrates the latent allocation variables out of the VB approximation. It is shown that this modification leads to a better marginal likelihood bound and improved estimate of the posterior variance. A set of simulation studies and application to real RNA-seq datasets highlight the improved performance of the proposed method.",
keywords = "BitSeq, Generalized Dirichlet distribution, Kullback-Leibler divergence, Marginal likelihood bound, Mixture model",
author = "Panagiotis Papastamoulis and James Hensman and Peter Glaus and Magnus Rattray",
year = "2014",
month = apr,
doi = "10.1515/sagmb-2013-0054",
language = "English",
volume = "13",
pages = "203--216",
journal = "Statistical Applications in Genetics and Molecular Biology",
issn = "2194-6302",
publisher = "Berkeley Electronic Press",
number = "2",
}

RIS

TY - JOUR

T1 - Improved variational Bayes inference for transcript expression estimation

AU - Papastamoulis, Panagiotis

AU - Hensman, James

AU - Glaus, Peter

AU - Rattray, Magnus

PY - 2014/4

Y1 - 2014/4

N2 - RNA-seq studies allow for the quantification of transcript expression by aligning millions of short reads to a reference genome. However, transcripts share much of their sequence, so that many reads map to more than one place and their origin remains uncertain. This problem can be dealt with using mixtures of distributions and transcript expression reduces to estimating the weights of the mixture. In this paper, variational Bayesian (VB) techniques are used in order to approximate the posterior distribution of transcript expression. VB has previously been shown to be more computationally efficient for this problem than Markov chain Monte Carlo. VB methodology can precisely estimate the posterior means, but leads to variance underestimation. For this reason, a novel approach is introduced which integrates the latent allocation variables out of the VB approximation. It is shown that this modification leads to a better marginal likelihood bound and improved estimate of the posterior variance. A set of simulation studies and application to real RNA-seq datasets highlight the improved performance of the proposed method.

AB - RNA-seq studies allow for the quantification of transcript expression by aligning millions of short reads to a reference genome. However, transcripts share much of their sequence, so that many reads map to more than one place and their origin remains uncertain. This problem can be dealt with using mixtures of distributions and transcript expression reduces to estimating the weights of the mixture. In this paper, variational Bayesian (VB) techniques are used in order to approximate the posterior distribution of transcript expression. VB has previously been shown to be more computationally efficient for this problem than Markov chain Monte Carlo. VB methodology can precisely estimate the posterior means, but leads to variance underestimation. For this reason, a novel approach is introduced which integrates the latent allocation variables out of the VB approximation. It is shown that this modification leads to a better marginal likelihood bound and improved estimate of the posterior variance. A set of simulation studies and application to real RNA-seq datasets highlight the improved performance of the proposed method.

KW - BitSeq

KW - Generalized Dirichlet distribution

KW - Kullback-Leibler divergence

KW - Marginal likelihood bound

KW - Mixture model

U2 - 10.1515/sagmb-2013-0054

DO - 10.1515/sagmb-2013-0054

M3 - Journal article

AN - SCOPUS:84898659983

VL - 13

SP - 203

EP - 216

JO - Statistical Applications in Genetics and Molecular Biology

JF - Statistical Applications in Genetics and Molecular Biology

SN - 2194-6302

IS - 2

ER -