Michigan Tech Publications, Part 2

TGPred: efficient methods for predicting target genes of a transcription factor by integrating statistics, machine learning and optimization

Xuewei Cao, Michigan Technological University
Ling Zhang, Michigan Technological University
Md Khairul Islam, Michigan Technological University
Mingxia Zhao, Kansas State University
Cheng He, Kansas State University
Kui Zhang, Michigan Technological University
Sanzhen Liu, Kansas State University
Qiuying Sha, Michigan Technological University
Hairong Wei, Michigan Technological University

Document Type

Article

Publication Date

9-13-2023

Department

Department of Mathematical Sciences; College of Forest Resources and Environmental Science

Abstract

Four statistical selection methods for inferring transcription factor (TF)-target gene (TG) pairs were developed by coupling a mean squared error (MSE) or Huber loss function with an elastic net (ENET) or least absolute shrinkage and selection operator (Lasso) penalty. Two further methods were developed for inferring pathway gene regulatory networks (GRNs) by combining the Huber or MSE loss function with a network (Net)-based penalty. To solve these regressions, we improved an accelerated proximal gradient descent (APGD) algorithm to optimize the parameter selection process, yielding an algorithm that is equally effective but much faster than the commonly used convex optimization solver. Synthetic data generated in a general setting were used to test the four TF-TG identification methods; the ENET-based methods performed better than the Lasso-based methods. Synthetic data generated from two network settings were used to test Huber-Net and MSE-Net, which outperformed all other methods. The TF-TG identification methods were also tested with SND1 and overexpression transcriptomic data; Huber-ENET and MSE-ENET outperformed all other methods when genome-wide predictions were performed. The TF-TG identification methods fill a gap, namely the lack of a method for genome-wide TG prediction of a TF, and have potential for validating ChIP/DAP-seq results, while the two Net-based methods are instrumental for predicting pathway GRNs.
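
To make the idea of coupling a Huber loss with an ENET penalty concrete, the following is a minimal sketch of a generic FISTA-style accelerated proximal gradient loop. It is not the authors' improved APGD algorithm or the TGPred implementation. The function names, the parameterization of the penalty (lam, alpha), the Huber threshold delta, and the fixed step size are all assumptions made for illustration, and the abstract does not specify how expression data map onto the response y and the predictor matrix X.

```python
import numpy as np


def soft_threshold(v, thresh):
    """Proximal operator of thresh * ||.||_1 (elementwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - thresh, 0.0)


def huber_enet_apgd(X, y, lam=0.1, alpha=0.9, delta=1.0, n_iter=500):
    """Hypothetical helper: minimize
        (1/n) * sum_i huber_delta(y_i - x_i^T b)
        + lam * (alpha * ||b||_1 + (1 - alpha) / 2 * ||b||_2^2)
    with a generic FISTA-style accelerated proximal gradient loop."""
    n, p = X.shape
    # Step size 1/L, where L bounds the Lipschitz constant of the gradient of
    # the smooth part (Huber loss plus the ridge term of the elastic net).
    L = np.linalg.norm(X, 2) ** 2 / n + lam * (1.0 - alpha)
    step = 1.0 / L
    b = np.zeros(p)
    z = b.copy()  # extrapolated point used for the gradient step
    t = 1.0
    for _ in range(n_iter):
        r = y - X @ z
        psi = np.clip(r, -delta, delta)  # derivative of the Huber loss in r
        grad = -(X.T @ psi) / n + lam * (1.0 - alpha) * z
        # Proximal step handles the non-smooth L1 part of the penalty.
        b_new = soft_threshold(z - step * grad, step * lam * alpha)
        # Nesterov-style momentum update.
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        z = b_new + ((t - 1.0) / t_new) * (b_new - b)
        b, t = b_new, t_new
    return b


# Illustrative synthetic data (made up for this sketch, not from the paper).
rng = np.random.default_rng(0)
X_demo = rng.standard_normal((200, 20))
beta_true = np.zeros(20)
beta_true[:3] = [2.0, -3.0, 1.5]
y_demo = X_demo @ beta_true + 0.1 * rng.standard_normal(200)
y_demo[:5] += 15.0  # a few outliers, which the Huber loss down-weights
beta_hat = huber_enet_apgd(X_demo, y_demo)
selected = np.flatnonzero(beta_hat)  # indices with nonzero coefficients
```

In this sketch, setting alpha to 1 reduces the penalty to Lasso, and replacing the clipped residual psi with the raw residual r turns the Huber loss into a squared-error loss (scaled by one half), which gives the other three loss and penalty combinations named in the abstract.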

Publication Title

NAR genomics and bioinformatics

Recommended Citation

Cao, X., Zhang, L., Islam, M. K., Zhao, M., He, C., Zhang, K., Liu, S., Sha, Q., & Wei, H. (2023). TGPred: efficient methods for predicting target genes of a transcription factor by integrating statistics, machine learning and optimization. NAR genomics and bioinformatics, 5(3). http://doi.org/10.1093/nargab/lqad083
Retrieved from: https://digitalcommons.mtu.edu/michigantech-p2/154

