Document Type
Article
Publication Date
1-3-2024
Department
Department of Mathematical Sciences
Abstract
Genome-wide association studies (GWAS) have successfully revealed many disease-associated genetic variants. In a case-control study, an association test can achieve adequate power with a large sample size, but genotyping large samples is expensive. A cost-effective strategy to boost power is to integrate external control samples from publicly available genotyped data. However, naive integration of external controls may inflate type I error rates if systematic differences between studies (batch effects) are ignored, such as differences in sequencing platforms, genotype-calling procedures, and population stratification. To account for the batch effect, we propose an approach for case-control association studies that integrates External Controls into the Association Test by Regression Calibration (iECAT-RC). Extensive simulation studies show that iECAT-RC not only controls type I error rates but also boosts statistical power in all models. We also apply iECAT-RC to the UK Biobank data for M72 Fibroblastic disorders, treating genotype calling as the batch effect. Four SNPs associated with fibroblastic disorders were detected by iECAT-RC and by the two comparison methods, iECAT-Score and Internal. However, our method has a higher probability of identifying these significant SNPs when the case-control association study is unbalanced.
Publication Title
Genes
Recommended Citation
Zhu, L., Yan, S., Cao, X., Zhang, S., & Sha, Q. (2024). Integrating External Controls by Regression Calibration for Genome-Wide Association Study. Genes, 15(1). http://doi.org/10.3390/genes15010067
Retrieved from: https://digitalcommons.mtu.edu/michigantech-p2/426
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.
Version
Publisher's PDF