Date of Award
2025
Document Type
Campus Access Dissertation
Degree Name
Doctor of Philosophy in Mathematical Sciences (PhD)
Administrative Home Department
Department of Mathematical Sciences
Advisor 1
Qiuying Sha
Committee Member 1
Kui Zhang
Committee Member 2
Xiao Zhang
Committee Member 3
Dukka KC
Abstract
In chapter one, we propose a novel multiple-phenotype association test for testing the association between multiple phenotypes and a SNP. Genome-wide association studies (GWAS) have identified many genetic variants that are strongly associated with phenotypes and have greatly enhanced our understanding of the genetic architecture of complex phenotypes and diseases. In this study, we propose a new method that constructs a Phenotype-Phenotype Network from GWAS summary statistics using the sparse Gaussian Graphical Model (sGGM). This approach isolates the direct relationship between each pair of phenotypes, conditional on the other phenotypes, making it easier to identify clusters of phenotypes that reflect more meaningful biological or functional connections. We then apply a community detection method to partition the phenotypes into disjoint modules based on their partial correlation matrix, apply multiple-phenotype test methods to each module, and combine the resulting p-values to obtain the final p-value. A comprehensive simulation study showed that our method controls the type I error rate effectively and achieves the highest power among the methods compared. Applying this method to GWAS summary statistics for 92 phenotypes from the UK Biobank identified more significant SNPs than the other methods. Downstream analysis of the significant SNPs shows their functional and biological importance.
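As a rough illustration of why a partial correlation matrix isolates direct relationships, the sketch below converts a precision (inverse covariance) matrix into partial correlations using the standard identity rho_ij = -omega_ij / sqrt(omega_ii * omega_jj). It is not the dissertation's implementation: the function name `partial_correlations` and the toy matrix are assumptions made for illustration, and the sparse estimation step of the sGGM is skipped in favor of a hand-written precision matrix.

```python
import numpy as np


def partial_correlations(precision):
    """Convert a precision (inverse covariance) matrix to partial correlations.

    Uses rho_ij = -omega_ij / sqrt(omega_ii * omega_jj) for i != j,
    with the diagonal set to 1.
    """
    precision = np.asarray(precision, dtype=float)
    scale = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(scale, scale)
    np.fill_diagonal(pcor, 1.0)
    return pcor


# Toy precision matrix (illustrative values only): phenotypes 1 and 3 are
# linked only through phenotype 2, so their precision entry is zero.
toy_precision = np.array([
    [2.0, -1.0, 0.0],
    [-1.0, 2.0, -1.0],
    [0.0, -1.0, 2.0],
])

# Marginal correlations implied by the same model, for comparison.
toy_covariance = np.linalg.inv(toy_precision)
sd = np.sqrt(np.diag(toy_covariance))
marginal_cor = toy_covariance / np.outer(sd, sd)

pcor = partial_correlations(toy_precision)

print("marginal correlations:\n", np.round(marginal_cor, 3))
print("partial correlations:\n", np.round(pcor, 3))
```

In this toy example, phenotypes 1 and 3 are marginally correlated, yet their partial correlation is zero, so a network built on partial correlations would not connect them directly.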
Complex traits often result from the combined effect of multiple genetic variants, each with a small individual effect, and these small effects may be missed because of the stringent p-value thresholds used in GWAS. Gene-based association tests combine the effects of multiple variants within a gene, providing more interpretable associations and enabling researchers to detect associations that might be missed in single-SNP analyses. Gene-based tests increase statistical power by considering the cumulative effect of several variants in a gene, reducing the multiple testing burden, and prioritizing variants based on their likelihood of contributing to the phenotype. The power of a gene-based test depends on the genetic architecture of the trait and on the weights assigned to the SNPs in a gene. The genetic architecture of a complex trait is not known in advance, and the power of gene-based association tests decreases if the weights are mis-specified. Equal weighting assumes that all SNPs within the gene have an equal effect on the trait. In contrast, minor allele frequency (MAF)-based weights, such as the inverse of the MAF or a beta density evaluated at the MAF, upweight rare variants under the assumption that they are more likely to be deleterious. Both weighting schemes ignore the underlying biological function of SNPs, which, if integrated, can enhance the power of gene-based tests by prioritizing functionally relevant SNPs. In this study, we propose a novel gene-based test approach (H2-Gene) that weights SNPs according to their heritability, estimated using multiple functional annotations. Comprehensive simulation studies showed that this approach controls the false discovery rate well and yields higher power than MAF-based and equal weighting.
When applied to GWAS summary statistics for three traits (schizophrenia, bipolar disorder, and attention deficit hyperactivity disorder), H2-Gene identified more significant genes than the equal and MAF-based weighting schemes. Downstream analysis of the significant genes showed their biological and functional importance.
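To make the role of SNP weights concrete, the sketch below computes a generic weighted burden-style gene statistic from summary-level Z-scores: T = w'z / sqrt(w'Rw), which is standard normal under the null when z ~ N(0, R) and R is the LD (correlation) matrix. This is a swapped-in textbook statistic, not the H2-Gene test itself, whose exact form is not given here. The function names, the toy Z-scores, MAFs, LD matrix, and the `annotation_weights` vector standing in for heritability-based weights are all hypothetical.

```python
import math

import numpy as np


def weighted_burden_test(z, ld, weights):
    """Weighted burden-style test from summary statistics.

    T = w'z / sqrt(w'Rw) is N(0, 1) under the null when z ~ N(0, R).
    Returns the statistic and its two-sided p-value.
    """
    z = np.asarray(z, dtype=float)
    ld = np.asarray(ld, dtype=float)
    w = np.asarray(weights, dtype=float)
    stat = float(w @ z) / math.sqrt(float(w @ ld @ w))
    p_value = math.erfc(abs(stat) / math.sqrt(2.0))
    return stat, p_value


def beta_maf_weights(maf, a=1.0, b=25.0):
    """Beta(a, b) density at each MAF, up to a constant factor.

    The constant is dropped because the test statistic does not change
    when all weights are multiplied by the same positive number.
    """
    maf = np.asarray(maf, dtype=float)
    return maf ** (a - 1.0) * (1.0 - maf) ** (b - 1.0)


# Toy gene with three SNPs (all values are made up for illustration).
z_scores = np.array([2.5, 1.0, 0.2])
maf = np.array([0.30, 0.05, 0.01])
ld = np.array([
    [1.0, 0.2, 0.0],
    [0.2, 1.0, 0.1],
    [0.0, 0.1, 1.0],
])

# Hypothetical stand-in for heritability-based weights from annotations.
annotation_weights = np.array([0.7, 0.2, 0.1])

schemes = {
    "equal": np.ones_like(z_scores),
    "beta(MAF; 1, 25)": beta_maf_weights(maf),
    "annotation-based (hypothetical)": annotation_weights,
}
for name, w in schemes.items():
    stat, p = weighted_burden_test(z_scores, ld, w)
    print(f"{name}: T = {stat:.3f}, p = {p:.4g}")
```

Running the three schemes on the same Z-scores shows how the gene-level result shifts with the weights: a scheme that upweights the SNP carrying the signal gives a larger statistic, while one that upweights uninformative SNPs dilutes it.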
Recommended Citation
Subedi, Megh Raj, "Novel Statistical Methods for Multiple Phenotype and Gene Based Association Tests", Campus Access Dissertation, Michigan Technological University, 2025.
https://digitalcommons.mtu.edu/etdr/1975