Date of Award

2025

Document Type

Campus Access Dissertation

Degree Name

Doctor of Philosophy in Statistics (PhD)

Administrative Home Department

Department of Mathematical Sciences

Advisor 1

Kui Zhang

Committee Member 1

Xiao Zhang

Committee Member 2

Qiuying Sha

Committee Member 3

Hairong Wei

Abstract

This dissertation consists of three chapters, with a brief overview of each chapter provided below.

In Chapter One, we proposed a latent variable model with a pairwise likelihood to efficiently and jointly model genotypes, multiple types of phenotypes, and covariates. We also proposed a new Wald-type test statistic to test the conditional independence between genotypes and phenotypes after adjusting for covariates. This method preserves the ordinal nature of genotypes and ordinal phenotypes and does not require assumptions commonly used in association studies to detect genetic variants associated with phenotypes related to human complex diseases, such as continuity and linearity, and thus provides enhanced flexibility. Additionally, it explores covariance structures and can efficiently handle large-scale data. Simulations demonstrated that the proposed method had well-controlled Type I error rates and higher power than existing methods. Real data analysis based on the Genetics of Kidneys in Diabetes (GoKinD) study identified novel genetic variants that are associated with type 1 diabetes.

In Chapter Two, we developed a robust and flexible Tweedie-imputation method to tackle zero inflation and high taxa variability in microbiome data. Our approach preserves index variability, imputes missing values using dataset-derived averages, and adjusts for sequencing depth. Simulations showed its superiority over existing methods and its robustness. Real data analysis identified more significant taxa than other existing methods.

In Chapter Three, we proposed a regression-based method that uses the latent variable model and the pairwise likelihood to detect genetic variants associated with phenotypes related to human complex diseases. This method can efficiently model a large number of phenotypes of multiple types, along with genotypes and covariates. Compared with the method proposed in Chapter One, this method is easier to interpret because it can quantify the genetic effect on phenotypes. Simulations confirmed that the proposed method has well-controlled Type I error rates and high power.
