Electronic data

  • Final Version

    Rights statement: This is the peer reviewed version of the following article: Zheng, C., Ferrari, D., Zhang, M. and Baird, P. (2019), Ranking the importance of genetic factors by variable-selection confidence sets. J. R. Stat. Soc. C, 68: 727-749. doi:10.1111/rssc.12337, which has been published in final form at https://rss.onlinelibrary.wiley.com/doi/full/10.1111/rssc.12337. This article may be used for non-commercial purposes in accordance with Wiley Terms and Conditions for self-archiving.

    Accepted author manuscript, 402 KB, PDF document

    Available under license: CC BY-NC: Creative Commons Attribution-NonCommercial 4.0 International License

Ranking the importance of genetic factors by variable-selection confidence sets

Research output: Contribution to Journal/Magazine › Journal article › peer-review

Published

Standard

Ranking the importance of genetic factors by variable-selection confidence sets. / Zheng, Chao; Ferrari, Davide; Zhang, Michael; Baird, Paul.

In: Journal of the Royal Statistical Society: Series C (Applied Statistics), Vol. 68, No. 3, 01.04.2019, p. 727-749.

Harvard

Zheng, C, Ferrari, D, Zhang, M & Baird, P 2019, 'Ranking the importance of genetic factors by variable-selection confidence sets', Journal of the Royal Statistical Society: Series C (Applied Statistics), vol. 68, no. 3, pp. 727-749. https://doi.org/10.1111/rssc.12337

APA

Zheng, C., Ferrari, D., Zhang, M., & Baird, P. (2019). Ranking the importance of genetic factors by variable-selection confidence sets. Journal of the Royal Statistical Society: Series C (Applied Statistics), 68(3), 727-749. https://doi.org/10.1111/rssc.12337

Vancouver

Zheng C, Ferrari D, Zhang M, Baird P. Ranking the importance of genetic factors by variable-selection confidence sets. Journal of the Royal Statistical Society: Series C (Applied Statistics). 2019 Apr 1;68(3):727-749. https://doi.org/10.1111/rssc.12337

Author

Zheng, Chao; Ferrari, Davide; Zhang, Michael; Baird, Paul. / Ranking the importance of genetic factors by variable-selection confidence sets. In: Journal of the Royal Statistical Society: Series C (Applied Statistics). 2019; Vol. 68, No. 3. pp. 727-749.

Bibtex

@article{221a83063f5443078bc7d05a85804139,
title = "Ranking the importance of genetic factors by variable-selection confidence sets",
abstract = "The widespread use of generalized linear models in case–control genetic studies has helped to identify many disease-associated risk factors typically defined as DNA variants, or single-nucleotide polymorphisms (SNPs). Up to now, most literature has focused on selecting a unique best subset of SNPs based on some statistical perspective. When the noise is large compared with the signal, however, multiple biological paths are often found to be supported by a given data set. We address the ambiguity related to SNP selection by constructing a list of models—called a variable-selection confidence set (VSCS)—which contains the collection of all well-supported SNP combinations at a user-specified confidence level. The VSCS extends the familiar notion of confidence intervals in the variable-selection setting and provides the practitioner with new tools aiding the variable-selection activity beyond trusting a single model. On the basis of the VSCS, we consider natural graphical and numerical statistics measuring the inclusion importance of an SNP based on its frequency in the most parsimonious VSCS models. This work is motivated by available case–control genetic data on age-related macular degeneration, which is a widespread disease and leading cause of loss of vision. {\textcopyright} 2019 Royal Statistical Society",
keywords = "Age-related macular degeneration, Case–control genotype data, Likelihood ratio test, Predictor ranking, Variable-selection confidence set",
author = "Chao Zheng and Davide Ferrari and Michael Zhang and Paul Baird",
note = "This is the peer reviewed version of the following article: Zheng, C., Ferrari, D., Zhang, M. and Baird, P. (2019), Ranking the importance of genetic factors by variable-selection confidence sets. J. R. Stat. Soc. C, 68: 727-749. doi:10.1111/rssc.12337, which has been published in final form at https://rss.onlinelibrary.wiley.com/doi/full/10.1111/rssc.12337. This article may be used for non-commercial purposes in accordance with Wiley Terms and Conditions for self-archiving.",
year = "2019",
month = apr,
day = "1",
doi = "10.1111/rssc.12337",
language = "English",
volume = "68",
pages = "727--749",
journal = "Journal of the Royal Statistical Society: Series C (Applied Statistics)",
issn = "0035-9254",
publisher = "Wiley-Blackwell",
number = "3",
}

RIS

TY - JOUR

T1 - Ranking the importance of genetic factors by variable-selection confidence sets

AU - Zheng, Chao

AU - Ferrari, Davide

AU - Zhang, Michael

AU - Baird, Paul

N1 - This is the peer reviewed version of the following article: Zheng, C., Ferrari, D., Zhang, M. and Baird, P. (2019), Ranking the importance of genetic factors by variable-selection confidence sets. J. R. Stat. Soc. C, 68: 727-749. doi:10.1111/rssc.12337, which has been published in final form at https://rss.onlinelibrary.wiley.com/doi/full/10.1111/rssc.12337. This article may be used for non-commercial purposes in accordance with Wiley Terms and Conditions for self-archiving.

PY - 2019/4/1

Y1 - 2019/4/1

AB - The widespread use of generalized linear models in case–control genetic studies has helped to identify many disease-associated risk factors typically defined as DNA variants, or single-nucleotide polymorphisms (SNPs). Up to now, most literature has focused on selecting a unique best subset of SNPs based on some statistical perspective. When the noise is large compared with the signal, however, multiple biological paths are often found to be supported by a given data set. We address the ambiguity related to SNP selection by constructing a list of models—called a variable-selection confidence set (VSCS)—which contains the collection of all well-supported SNP combinations at a user-specified confidence level. The VSCS extends the familiar notion of confidence intervals in the variable-selection setting and provides the practitioner with new tools aiding the variable-selection activity beyond trusting a single model. On the basis of the VSCS, we consider natural graphical and numerical statistics measuring the inclusion importance of an SNP based on its frequency in the most parsimonious VSCS models. This work is motivated by available case–control genetic data on age-related macular degeneration, which is a widespread disease and leading cause of loss of vision. © 2019 Royal Statistical Society

KW - Age-related macular degeneration

KW - Case–control genotype data

KW - Likelihood ratio test

KW - Predictor ranking

KW - Variable-selection confidence set

U2 - 10.1111/rssc.12337

DO - 10.1111/rssc.12337

M3 - Journal article

VL - 68

SP - 727

EP - 749

JO - Journal of the Royal Statistical Society: Series C (Applied Statistics)

JF - Journal of the Royal Statistical Society: Series C (Applied Statistics)

SN - 0035-9254

IS - 3

ER -