Abstract:
Biological processes depend on intricate interactions among many genes, which form diverse networks such as protein-protein interaction, gene regulatory, gene co-expression, and metabolic networks. Graph-theoretic algorithms analyze these networks to uncover complex biological interactions. Among them, the Random Walk with Restart (RWR) algorithm has been extended to multiplex and heterogeneous networks, allowing it to explore several layers of gene and protein interactions, such as protein-protein interactions and co-expression correlations, and to move into networks that reflect phenotype similarities among genes. We developed a method that uses genomic data to uncover the mechanisms underlying a phenotype: RWR on multiplex networks (RWR-M) identifies functionally associated genes, and pathway analysis of those genes then identifies the pivotal pathways. To validate the method, we conducted a single-cell bottleneck experiment in which the wild-type E. coli strain MG1655 was grown under two different conditions with increasing sublethal antibiotic pressure. We performed whole-genome sequencing at every time point for both growth conditions, identified the variants, and used the variant-carrying genes as seed genes for RWR-M. The RWR-M network was constructed from STRING, EcoliNet, and a Weighted Gene Co-expression Network Analysis (WGCNA) network. We then ran pathway analysis on the resulting gene set to pinpoint the key pathways, and their associated genes, responsible for the phenotypic differences observed between the two environmental conditions. We supported our findings with protein language models (PLMs) and a literature survey. Combining these methodologies yields a comprehensive framework for pinpointing the critical factors responsible for observed phenotypic changes.
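
To make the core idea concrete, the following is a minimal sketch of RWR on a single-layer network, not the RWR-M implementation used in this work. It iterates p_next = (1 - r) * W * p + r * p0, where W is the column-normalized adjacency matrix and p0 is uniform over the seed nodes. The function name, the toy five-node path graph, the restart probability of 0.7, and the convergence tolerance are all illustrative assumptions.

```python
import numpy as np

def random_walk_with_restart(adjacency, seed_indices, restart_prob=0.7,
                             tol=1e-10, max_iter=1000):
    """Return steady-state visiting probabilities for a single-layer network.

    Hypothetical helper for illustration; all parameter defaults are assumptions.
    """
    adjacency = np.asarray(adjacency, dtype=float)
    col_sums = adjacency.sum(axis=0)
    if np.any(col_sums == 0):
        raise ValueError("every node needs at least one edge")
    # Column-normalize so each column is a probability distribution
    # over the neighbors of that node.
    transition = adjacency / col_sums

    # Restart distribution: uniform over the seed nodes.
    seeds = list(seed_indices)
    p0 = np.zeros(adjacency.shape[0])
    p0[seeds] = 1.0 / len(seeds)

    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1.0 - restart_prob) * (transition @ p) + restart_prob * p0
        if np.abs(p_next - p).sum() < tol:  # L1 change below tolerance
            return p_next
        p = p_next
    return p

# Toy network (assumed): a path 0 - 1 - 2 - 3 - 4, with node 0 as the only seed.
toy_adjacency = np.array([
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
])
scores = random_walk_with_restart(toy_adjacency, seed_indices=[0])
ranking = np.argsort(-scores)  # nodes ordered by proximity to the seed
```

In this toy case the scores sum to one and fall off with distance from the seed, which is the property used to rank candidate genes. A multiplex variant would additionally let the walker jump between layers that share the same nodes; that step is not shown here.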