Rights statement: This is the author’s version of a work that was accepted for publication in American Journal of Human Genetics. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in American Journal of Human Genetics, 102 (6), 2018, DOI: 10.1016/j.ajhg.2018.03.021
Accepted author manuscript, 777 KB, PDF document
Available under license: CC BY-NC-ND
Research output: Contribution to Journal/Magazine › Journal article › peer-review
TY - JOUR
T1 - Estimation of Genetic Correlation via Linkage Disequilibrium Score Regression and Genomic Restricted Maximum Likelihood
AU - Ni, Guiyan
AU - Moser, Gerhard
AU - Wray, Naomi R
AU - Lee, S Hong
AU - Knight, Jo
AU - Schizophrenia Working Group of the Psychiatric Genomics Consortium
N1 - This is the author’s version of a work that was accepted for publication in American Journal of Human Genetics. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in American Journal of Human Genetics, 102 (6), 2018, DOI: 10.1016/j.ajhg.2018.03.021
PY - 2018/6/7
Y1 - 2018/6/7
N2 - Genetic correlation is a key population parameter that describes the shared genetic architecture of complex traits and diseases. It can be estimated by current state-of-the-art methods, i.e., linkage disequilibrium score regression (LDSC) and genomic restricted maximum likelihood (GREML). The massively reduced computing burden of LDSC compared to GREML makes it an attractive tool, although the accuracy (i.e., magnitude of standard errors) of LDSC estimates has not been thoroughly studied. In simulation, we show that the accuracy of GREML is generally higher than that of LDSC. When there is genetic heterogeneity between the actual sample and the reference data from which LD scores are estimated, the accuracy of LDSC decreases further. In real data analyses estimating the genetic correlation between schizophrenia (SCZ) and body mass index, we show that GREML estimates based on ∼150,000 individuals give a higher accuracy than LDSC estimates based on ∼400,000 individuals (from combined meta-data). A GREML genomic partitioning analysis reveals that the genetic correlation between SCZ and height is significantly negative for regulatory regions, which a whole-genome or LDSC approach has less power to detect. We conclude that LDSC estimates should be carefully interpreted, as there can be uncertainty about homogeneity among combined meta-datasets. We suggest that any interesting findings from massive LDSC analysis for a large number of complex traits should be followed up, where possible, with more detailed analyses with GREML methods, even if sample sizes are smaller.
KW - linkage disequilibrium score regression
KW - genomic restricted maximum likelihood
KW - genetic correlation
KW - schizophrenia
KW - body mass index
KW - height
KW - SNP heritability
KW - accuracy
KW - biasedness
KW - genome-wide SNPs
U2 - 10.1016/j.ajhg.2018.03.021
DO - 10.1016/j.ajhg.2018.03.021
M3 - Journal article
C2 - 29754766
VL - 102
SP - 1185
EP - 1194
JO - American Journal of Human Genetics
JF - American Journal of Human Genetics
SN - 0002-9297
IS - 6
ER -