Date of Award
2026
Document Type
Open Access Dissertation
Degree Name
Doctor of Philosophy in Computational Science and Engineering (PhD)
Administrative Home Department
College of Forest Resources and Environmental Science
Advisor 1
Hairong Wei
Committee Member 1
Kui Zhang
Committee Member 2
Qiuying Sha
Committee Member 3
Weihua Zhou
Abstract
This dissertation presents computational and AI-driven frameworks for identifying key regulatory genes and their downstream targets across plant and human biological systems. Three studies address distinct challenges in genomic regulation using advanced machine learning and bioinformatics approaches.
The first study introduces DyGAF (Dynamic Gene Attention Focus), a dual-attention transformer framework that identifies and ranks disease-relevant biomarker genes by simultaneously modeling independent molecular responses and interdependent regulatory network behavior. Two attention models provide complementary perspectives on gene importance and are fused through a novel combination metric. Applied to COVID-19 nasopharyngeal swab profiles, the attention-weighted representations achieved 94.23% classification accuracy, high sensitivity, and a cumulative mutual information of 17.13 nats across the selected feature set. These results outperformed conventional combination metrics and confirm that the learned biomarker weights capture biologically distinct transcriptional signatures. Pathway and functional enrichment analyses further validated the relevance of the selected genes to COVID-19 pathogenesis, and DyGAF outperformed differential expression and Random Forest-based methods.
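As an illustration of the mutual information figure reported above, the following Python sketch sums per-gene mutual information with the class label, in nats. It is not the dissertation's implementation: the abstract does not say how expression values were discretized or which estimator was used, so the median binarization, the plug-in estimator, the toy data, and all function names here are assumptions.

```python
import math
from collections import Counter


def mutual_information_nats(xs, ys):
    """Plug-in estimate of I(X;Y) in nats for two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * ln(p(x,y) / (p(x) * p(y))), written in terms of counts
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi


def binarize_at_median(values):
    """Assumed discretization: 1 if above the sample median, else 0."""
    s = sorted(values)
    mid = len(s) // 2
    median = s[mid] if len(s) % 2 else (s[mid - 1] + s[mid]) / 2
    return [1 if v > median else 0 for v in values]


def cumulative_mi(expression_by_gene, labels):
    """Sum of per-gene mutual information with the class labels."""
    return sum(
        mutual_information_nats(binarize_at_median(values), labels)
        for values in expression_by_gene.values()
    )


# Hypothetical toy data: geneA separates the classes, geneB does not.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [1.0, 3.0, 2.0, 4.0],
}
labels = [0, 0, 1, 1]
print(round(cumulative_mi(expression, labels), 4))  # 0.6931
```

Under this binarization and balanced binary labels, a single gene can contribute at most ln 2 ≈ 0.693 nats, so a cumulative value grows with the number of informative genes in the selected set.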
The second study introduces SignalPath-Finder, an AI-driven framework designed to identify downstream target genes of signaling complexes from heterogeneous public RNA-seq datasets without requiring targeted perturbation experiments. The framework applies a pseudo-peak transformation that reorders transcriptomic samples using TOR complex anchor genes as references, converting heterogeneous expression profiles into aligned bell-shaped patterns. Genes sharing distributional and structural similarity with TOR anchors are grouped via unsupervised clustering, and a cluster-wise autoencoder-based representation learning module ranks candidate downstream genes by their contribution to the learned latent manifold. Applied to 628 Populus trichocarpa RNA-seq samples across three tissue types, SignalPath-Finder recovered known TOR downstream genes with significantly higher enrichment than conventional methods, including Spearman correlation, GENIE3, and TIGRESS, and identified novel downstream gene candidates supported by literature evidence across all tissues.
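One plausible reading of the pseudo-peak transformation described above can be sketched in Python as follows. The abstract does not give the actual procedure, so this is an assumption rather than the SignalPath-Finder method: samples are ranked by a single anchor gene, then interleaved so that the anchor rises to a central peak and falls again, and the same sample order is applied to every other gene. The function names and toy values are hypothetical.

```python
def pseudo_peak_order(anchor_values):
    """Return sample indices ordered so the anchor rises to a peak, then falls."""
    # Rank samples by anchor expression, lowest first.
    ranked = sorted(range(len(anchor_values)), key=lambda i: anchor_values[i])
    left = ranked[0::2]          # every other sample, ascending (rising side)
    right = ranked[1::2][::-1]   # remaining samples, descending (falling side)
    return left + right


def reorder(profile, order):
    """Apply a sample ordering to one gene's expression profile."""
    return [profile[i] for i in order]


# Hypothetical expression values across five samples.
anchor = [5, 1, 4, 2, 3]       # anchor gene
candidate = [50, 12, 41, 19, 33]  # a gene that tracks the anchor

order = pseudo_peak_order(anchor)
print(order)                      # [1, 4, 0, 2, 3]
print(reorder(anchor, order))     # [1, 3, 5, 4, 2]
print(reorder(candidate, order))  # [12, 33, 50, 41, 19]
```

In this sketch a gene that co-varies with the anchor inherits the same bell shape after reordering, while an unrelated gene stays irregular, which is the kind of shape similarity a later clustering step could exploit.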
The third study addresses transcription factor regulation of regeneration in Arabidopsis thaliana. Using CollaborativeNet and Triple-Gene Mutual Interaction analysis on 78 RNA-seq samples, nine regeneration-associated subnetworks were identified and refined to three, from which six candidate transcription factors (WOX9A, LEC2, PGA37, WIP5, PEI1, and AIL1) were prioritized for their roles in somatic embryogenesis and regeneration.
Together, these studies advance scalable, AI-powered genomic frameworks for biomarker discovery, signaling pathway analysis, and regulatory network inference across diverse biological systems.
Creative Commons License

This work is licensed under a Creative Commons Attribution-Noncommercial 4.0 License
Recommended Citation
Islam, Md Khairul, "COMPUTATIONAL AND AI FRAMEWORKS FOR IDENTIFYING KEY REGULATORY GENES AND THEIR TARGET GENES IN PLANTS AND HUMANS", Open Access Dissertation, Michigan Technological University, 2026.
Included in
Bioinformatics Commons, Computational Biology Commons, Computational Engineering Commons, Data Science Commons, Genetics Commons, Statistical Models Commons, Systems Biology Commons