Network construction using sparse Gaussian graphical model based on GWAS summary statistics
Document Type
Article
Publication Date
12-1-2025
Abstract
In genome-wide association studies (GWAS), thousands of genetic variants are tested for association with a phenotype. GWAS have identified many genetic variants strongly associated with phenotypes and have greatly enhanced our understanding of the genetic architecture of complex phenotypes and diseases. Joint analysis of multiple phenotypes can increase the overall statistical power to detect genetic associations and allows for the identification of pleiotropic loci. A phenotype-phenotype network (PPN) represents phenotypes as nodes and the relationships between them as edges. This representation allows complex relationships between phenotypes to be visualized, making it easier to identify clusters and to grasp intuitively how different phenotypes are related. In this study, we propose a new method to construct a PPN using the sparse Gaussian graphical model (sGGM) based on GWAS summary statistics. Because this approach isolates the direct relationship between phenotypes conditional on the other phenotypes, it makes it easier to identify clusters of phenotypes that reflect more meaningful biological or functional connections. We then apply a community detection method to partition phenotypes into disjoint modules based on the partial correlation matrix of phenotypes. For each module, various multiple phenotype association tests can be employed to test the association between a SNP and the phenotypes in that module. We conducted a comprehensive simulation study to compare the performance of several multiple phenotype association tests using the network modules obtained from sGGM, the network modules obtained from the correlation matrix, and all phenotypes without modular segmentation.
The simulation results demonstrated that most of the multiple phenotype association tests based on network modules from sGGM not only effectively control the Type I error rate but also exhibit higher power than the same tests applied to network modules derived from the correlation matrix or to all phenotypes without modular segmentation. We applied this method to the GWAS summary statistics of 92 phenotypes derived from Chapter IX (Diseases of the circulatory system) of the ICD-10 codes in the UK Biobank. The results showed that multiple phenotype association tests using network modules from sGGM detected more significant SNPs than the same tests using network modules from the correlation matrix.
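As a rough illustration of the network step described in the abstract, the sketch below converts a sparse precision (inverse covariance) matrix into partial correlations using the standard identity rho_ij = -omega_ij / sqrt(omega_ii * omega_jj), then groups phenotypes into disjoint modules. It rests on several assumptions that are not taken from the article: the precision matrix is treated as already estimated (for instance by a graphical lasso), its values are invented for five hypothetical phenotypes, and connected components are used as a simple stand-in for the community detection method, which the abstract does not name. The function names are hypothetical.

```python
import math


def partial_correlations(precision):
    """Convert a precision matrix into a partial correlation matrix."""
    n = len(precision)
    pcor = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                pcor[i][j] = 1.0
            else:
                scale = math.sqrt(precision[i][i] * precision[j][j])
                pcor[i][j] = -precision[i][j] / scale
    return pcor


def modules_from_network(pcor, tol=1e-8):
    """Group nodes into connected components of the partial correlation graph.

    Assumption: connected components stand in for a real community
    detection method, which would also split densely connected graphs.
    """
    n = len(pcor)
    seen = set()
    modules = []
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        component = []
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbor in range(n):
                # An edge exists when the partial correlation is non-zero.
                if neighbor not in seen and abs(pcor[node][neighbor]) > tol:
                    seen.add(neighbor)
                    stack.append(neighbor)
        modules.append(sorted(component))
    return modules


# Hypothetical sparse precision matrix for five phenotypes (invented values).
precision = [
    [2.0, -0.8, 0.0, 0.0, 0.0],
    [-0.8, 2.0, -0.5, 0.0, 0.0],
    [0.0, -0.5, 2.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.5, 0.6],
    [0.0, 0.0, 0.0, 0.6, 1.5],
]

pcor = partial_correlations(precision)
modules = modules_from_network(pcor)
print(modules)
```

In this toy input, phenotypes 0 to 2 form one module and phenotypes 3 and 4 form another, because no non-zero partial correlation links the two groups. An association test would then be run separately within each module.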
Publication Title
Scientific Reports
Recommended Citation
Subedi, M., Cao, X., Kim, B., & Sha, Q. (2025). Network construction using sparse Gaussian graphical model based on GWAS summary statistics. Scientific Reports, 15(1). http://doi.org/10.1038/s41598-025-22475-4
Retrieved from: https://digitalcommons.mtu.edu/michigantech-p2/2142