Electronic data

  • 10.1371-journal.pone.0098122

    Rights statement: © 2014 Gagliano et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

    Final published version, 1.29 MB, PDF document

    Available under license: CC BY: Creative Commons Attribution 4.0 International License

Links

Text available via DOI: 10.1371/journal.pone.0098122

A Bayesian method to incorporate hundreds of functional characteristics with association evidence to improve variant prioritization

Research output: Contribution to journal › Journal article › peer-review

Published

Standard

A Bayesian method to incorporate hundreds of functional characteristics with association evidence to improve variant prioritization. / Gagliano, Sarah A.; Barnes, Michael R.; Weale, Michael E.; Knight, Jo.

In: PLoS ONE, Vol. 9, No. 5, e98122, 20.05.2014.

Author

Gagliano, Sarah A.; Barnes, Michael R.; Weale, Michael E.; Knight, Jo. / A Bayesian method to incorporate hundreds of functional characteristics with association evidence to improve variant prioritization. In: PLoS ONE. 2014; Vol. 9, No. 5.

Bibtex

@article{7823c054627c463b9aa5a450b0f089b5,
title = "A Bayesian method to incorporate hundreds of functional characteristics with association evidence to improve variant prioritization",
abstract = "The increasing quantity and quality of functional genomic information motivate the assessment and integration of these data with association data, including data originating from genome-wide association studies (GWAS). We used previously described GWAS signals ({"}hits{"}) to train a regularized logistic model in order to predict SNP causality on the basis of a large multivariate functional dataset. We show how this model can be used to derive Bayes factors for integrating functional and association data into a combined Bayesian analysis. Functional characteristics were obtained from the Encyclopedia of DNA Elements (ENCODE), from published expression quantitative trait loci (eQTL), and from other sources of genome-wide characteristics. We trained the model using all GWAS signals combined, and also using phenotype specific signals for autoimmune, brain-related, cancer, and cardiovascular disorders. The non-phenotype specific and the autoimmune GWAS signals gave the most reliable results. We found SNPs with higher probabilities of causality from functional characteristics showed an enrichment of more significant p-values compared to all GWAS SNPs in three large GWAS studies of complex traits. We investigated the ability of our Bayesian method to improve the identification of true causal signals in a psoriasis GWAS dataset and found that combining functional data with association data improves the ability to prioritise novel hits. We used the predictions from the penalized logistic regression model to calculate Bayes factors relating to functional characteristics and supply these online alongside resources to integrate these data with association data.",
keywords = "Bayes Theorem, Cluster Analysis, Computational Biology, Databases, Genetic, Genome-Wide Association Study, Genomics, Humans, Models, Theoretical, Phenotype, Polymorphism, Single Nucleotide, Quantitative Trait, Heritable, ROC Curve, Reproducibility of Results",
author = "Gagliano, {Sarah A.} and Barnes, {Michael R.} and Weale, {Michael E.} and Knight, Jo",
note = "{\textcopyright} 2014 Gagliano et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.",
year = "2014",
month = may,
day = "20",
doi = "10.1371/journal.pone.0098122",
language = "English",
volume = "9",
journal = "PLoS ONE",
issn = "1932-6203",
publisher = "Public Library of Science",
number = "5",
pages = "e98122",
}

RIS

TY - JOUR

T1 - A Bayesian method to incorporate hundreds of functional characteristics with association evidence to improve variant prioritization

AU - Gagliano, Sarah A.

AU - Barnes, Michael R.

AU - Weale, Michael E.

AU - Knight, Jo

N1 - © 2014 Gagliano et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

PY - 2014/5/20

Y1 - 2014/5/20

AB - The increasing quantity and quality of functional genomic information motivate the assessment and integration of these data with association data, including data originating from genome-wide association studies (GWAS). We used previously described GWAS signals ("hits") to train a regularized logistic model in order to predict SNP causality on the basis of a large multivariate functional dataset. We show how this model can be used to derive Bayes factors for integrating functional and association data into a combined Bayesian analysis. Functional characteristics were obtained from the Encyclopedia of DNA Elements (ENCODE), from published expression quantitative trait loci (eQTL), and from other sources of genome-wide characteristics. We trained the model using all GWAS signals combined, and also using phenotype specific signals for autoimmune, brain-related, cancer, and cardiovascular disorders. The non-phenotype specific and the autoimmune GWAS signals gave the most reliable results. We found SNPs with higher probabilities of causality from functional characteristics showed an enrichment of more significant p-values compared to all GWAS SNPs in three large GWAS studies of complex traits. We investigated the ability of our Bayesian method to improve the identification of true causal signals in a psoriasis GWAS dataset and found that combining functional data with association data improves the ability to prioritise novel hits. We used the predictions from the penalized logistic regression model to calculate Bayes factors relating to functional characteristics and supply these online alongside resources to integrate these data with association data.

KW - Bayes Theorem

KW - Cluster Analysis

KW - Computational Biology

KW - Databases, Genetic

KW - Genome-Wide Association Study

KW - Genomics

KW - Humans

KW - Models, Theoretical

KW - Phenotype

KW - Polymorphism, Single Nucleotide

KW - Quantitative Trait, Heritable

KW - ROC Curve

KW - Reproducibility of Results

U2 - 10.1371/journal.pone.0098122

DO - 10.1371/journal.pone.0098122

M3 - Journal article

C2 - 24844982

VL - 9

JO - PLoS ONE

JF - PLoS ONE

SN - 1932-6203

IS - 5

M1 - e98122

ER -