Fast and accurate approximate inference of transcript expression from RNA-seq data

Data Science Institute

Text available via DOI: https://doi.org/10.1093/bioinformatics/btv483 (final published version)

Research output: Contribution to Journal/Magazine › Journal article › peer-review

Published

Standard

Fast and accurate approximate inference of transcript expression from RNA-seq data. / Hensman, James; Papastamoulis, Panagiotis; Glaus, Peter et al.
In: Bioinformatics, Vol. 31, No. 24, 15.12.2015, p. 3881-3889.

Harvard

Hensman, J, Papastamoulis, P, Glaus, P, Honkela, A & Rattray, M 2015, 'Fast and accurate approximate inference of transcript expression from RNA-seq data', Bioinformatics, vol. 31, no. 24, pp. 3881-3889. https://doi.org/10.1093/bioinformatics/btv483

APA

Hensman, J., Papastamoulis, P., Glaus, P., Honkela, A., & Rattray, M. (2015). Fast and accurate approximate inference of transcript expression from RNA-seq data. Bioinformatics, 31(24), 3881-3889. https://doi.org/10.1093/bioinformatics/btv483

Vancouver

Hensman J, Papastamoulis P, Glaus P, Honkela A, Rattray M. Fast and accurate approximate inference of transcript expression from RNA-seq data. Bioinformatics. 2015 Dec 15;31(24):3881-3889. Epub 2015 Aug 26. doi: 10.1093/bioinformatics/btv483

Author

Hensman, James ; Papastamoulis, Panagiotis ; Glaus, Peter et al. / Fast and accurate approximate inference of transcript expression from RNA-seq data. In: Bioinformatics. 2015 ; Vol. 31, No. 24. pp. 3881-3889.

Bibtex

@article{7cd119f287b841eea78afaa77032e959,

title = "Fast and accurate approximate inference of transcript expression from RNA-seq data",

abstract = "Motivation: Assigning RNA-seq reads to their transcript of origin is a fundamental task in transcript expression estimation. Where ambiguities in assignments exist due to transcripts sharing sequence, e.g. alternative isoforms or alleles, the problem can be solved through probabilistic inference. Bayesian methods have been shown to provide accurate transcript abundance estimates compared with competing methods. However, exact Bayesian inference is intractable and approximate methods such as Markov chain Monte Carlo and Variational Bayes (VB) are typically used. While providing a high degree of accuracy and modelling flexibility, standard implementations can be prohibitively slow for large datasets and complex transcriptome annotations. Results: We propose a novel approximate inference scheme based on VB and apply it to an existing model of transcript expression inference from RNA-seq data. Recent advances in VB algorithmics are used to improve the convergence of the algorithm beyond the standard Variational Bayes Expectation Maximization algorithm. We apply our algorithm to simulated and biological datasets, demonstrating a significant increase in speed with only very small loss in accuracy of expression level estimation. We carry out a comparative study against seven popular alternative methods and demonstrate that our new algorithm provides excellent accuracy and inter-replicate consistency while remaining competitive in computation time.",

author = "James Hensman and Panagiotis Papastamoulis and Peter Glaus and Antti Honkela and Magnus Rattray",

year = "2015",

month = dec,

day = "15",

doi = "10.1093/bioinformatics/btv483",

language = "English",

volume = "31",

pages = "3881--3889",

journal = "Bioinformatics",

issn = "1367-4803",

publisher = "Oxford University Press",

number = "24",

}

RIS

TY - JOUR

T1 - Fast and accurate approximate inference of transcript expression from RNA-seq data

AU - Hensman, James

AU - Papastamoulis, Panagiotis

AU - Glaus, Peter

AU - Honkela, Antti

AU - Rattray, Magnus

PY - 2015/12/15

Y1 - 2015/12/15

N2 - Motivation: Assigning RNA-seq reads to their transcript of origin is a fundamental task in transcript expression estimation. Where ambiguities in assignments exist due to transcripts sharing sequence, e.g. alternative isoforms or alleles, the problem can be solved through probabilistic inference. Bayesian methods have been shown to provide accurate transcript abundance estimates compared with competing methods. However, exact Bayesian inference is intractable and approximate methods such as Markov chain Monte Carlo and Variational Bayes (VB) are typically used. While providing a high degree of accuracy and modelling flexibility, standard implementations can be prohibitively slow for large datasets and complex transcriptome annotations. Results: We propose a novel approximate inference scheme based on VB and apply it to an existing model of transcript expression inference from RNA-seq data. Recent advances in VB algorithmics are used to improve the convergence of the algorithm beyond the standard Variational Bayes Expectation Maximization algorithm. We apply our algorithm to simulated and biological datasets, demonstrating a significant increase in speed with only very small loss in accuracy of expression level estimation. We carry out a comparative study against seven popular alternative methods and demonstrate that our new algorithm provides excellent accuracy and inter-replicate consistency while remaining competitive in computation time.

AB - Motivation: Assigning RNA-seq reads to their transcript of origin is a fundamental task in transcript expression estimation. Where ambiguities in assignments exist due to transcripts sharing sequence, e.g. alternative isoforms or alleles, the problem can be solved through probabilistic inference. Bayesian methods have been shown to provide accurate transcript abundance estimates compared with competing methods. However, exact Bayesian inference is intractable and approximate methods such as Markov chain Monte Carlo and Variational Bayes (VB) are typically used. While providing a high degree of accuracy and modelling flexibility, standard implementations can be prohibitively slow for large datasets and complex transcriptome annotations. Results: We propose a novel approximate inference scheme based on VB and apply it to an existing model of transcript expression inference from RNA-seq data. Recent advances in VB algorithmics are used to improve the convergence of the algorithm beyond the standard Variational Bayes Expectation Maximization algorithm. We apply our algorithm to simulated and biological datasets, demonstrating a significant increase in speed with only very small loss in accuracy of expression level estimation. We carry out a comparative study against seven popular alternative methods and demonstrate that our new algorithm provides excellent accuracy and inter-replicate consistency while remaining competitive in computation time.

U2 - 10.1093/bioinformatics/btv483

DO - 10.1093/bioinformatics/btv483

M3 - Journal article

AN - SCOPUS:84950238720

VL - 31

SP - 3881

EP - 3889

JO - Bioinformatics

JF - Bioinformatics

SN - 1367-4803

IS - 24

ER -
