Michigan Tech Publications, Part 1

An autoencoder-based deep learning method for genotype imputation

Document Type

Article

Publication Date

11-3-2022

Department

College of Computing

Abstract

Genotype imputation has a wide range of applications in genome-wide association studies (GWAS), including increasing the statistical power of association tests, discovering trait-associated loci in meta-analyses, and prioritizing causal variants with fine-mapping. In recent years, deep learning (DL)-based methods, such as the sparse convolutional denoising autoencoder (SCDA), have been developed for genotype imputation. However, optimizing the learning process in DL-based methods to achieve high imputation accuracy remains challenging. To address this challenge, we have developed a convolutional autoencoder (AE) model for genotype imputation and implemented a customized training loop that uses a single batch loss rather than the average loss over batches. This modified AE imputation model was evaluated using a yeast dataset, the human leukocyte antigen (HLA) data from the 1000 Genomes Project (1KGP), and our in-house genotype data from the Louisiana Osteoporosis Study (LOS). In all three datasets, our modified AE imputation model achieved performance comparable to or better than that of the existing SCDA model on four evaluation metrics: the concordance rate (CR), the Hellinger score, the scaled Euclidean norm (SEN) score, and the imputation quality score (IQS). Taking the imputation results from the HLA data as an example, the AE model achieved an average CR of 0.9468 and 0.9459, Hellinger score of 0.9765 and 0.9518, SEN score of 0.9977 and 0.9953, and IQS of 0.9515 and 0.9044 at missing ratios of 10% and 20%, respectively. For the LOS data, it achieved an average CR of 0.9005, Hellinger score of 0.9384, SEN score of 0.9940, and IQS of 0.8681 at a missing ratio of 20%. In summary, our proposed method for genotype imputation has great potential to increase the statistical power of GWAS and improve downstream post-GWAS analyses.
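To make two of the reported metrics concrete, the following is a minimal sketch of how a concordance rate and an imputation quality score might be computed over masked genotypes. It is not the authors' code. It assumes genotypes are coded as discrete classes (for example 0, 1, 2), that a 0/1 mask marks the positions that were hidden and then imputed, and that IQS takes the usual chance-corrected form (Po - Pc) / (1 - Pc). The function names and the toy data are hypothetical.

```python
from collections import Counter


def masked_pairs(true_g, imputed_g, mask):
    """Keep only (true, imputed) pairs at positions that were masked out."""
    return [(t, p) for t, p, m in zip(true_g, imputed_g, mask) if m]


def concordance_rate(true_g, imputed_g, mask):
    """Fraction of masked genotypes that were imputed exactly right."""
    pairs = masked_pairs(true_g, imputed_g, mask)
    return sum(t == p for t, p in pairs) / len(pairs)


def imputation_quality_score(true_g, imputed_g, mask):
    """Chance-corrected agreement, assumed here to be (Po - Pc) / (1 - Pc)."""
    pairs = masked_pairs(true_g, imputed_g, mask)
    n = len(pairs)
    po = sum(t == p for t, p in pairs) / n  # observed agreement
    true_counts = Counter(t for t, _ in pairs)
    imputed_counts = Counter(p for _, p in pairs)
    # Agreement expected by chance from the two marginal class frequencies.
    pc = sum(true_counts[g] * imputed_counts[g] for g in true_counts) / n**2
    if pc == 1:
        return float("nan")  # undefined when only one class is present
    return (po - pc) / (1 - pc)


# Toy data (hypothetical): eight genotypes, seven of them masked and imputed.
true_g = [0, 1, 2, 1, 0, 2, 1, 0]
imputed_g = [0, 1, 2, 0, 0, 2, 1, 1]
mask = [1, 1, 1, 1, 0, 1, 1, 1]

cr = concordance_rate(true_g, imputed_g, mask)
iqs = imputation_quality_score(true_g, imputed_g, mask)
print(f"CR = {cr:.4f}, IQS = {iqs:.4f}")
```

In this toy case five of the seven masked genotypes match, so CR is 5/7. IQS is lower because part of that agreement would be expected by chance, which is why the two metrics can diverge, as they do in the LOS results above.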

Publication Title

Frontiers in Artificial Intelligence

Recommended Citation

Song, M., Greenbaum, J., Luttrell, J., Zhou, W., Wu, C., Luo, Z., Qiu, C., Zhao, L., Su, K., Tian, Q., Shen, H., Hong, H., Gong, P., Shi, X., Deng, H., & Zhang, C. (2022). An autoencoder-based deep learning method for genotype imputation. Frontiers in Artificial Intelligence, 5. http://doi.org/10.3389/frai.2022.1028978
Retrieved from: https://digitalcommons.mtu.edu/michigantech-p/16579

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.

Version

Publisher's PDF

Included in

Computer Sciences Commons
