IIIT-Delhi Institutional Repository

Unveiling E. coli adaptation dynamics with protein language model and random walk


dc.contributor.author Jaiswal, Nancy
dc.contributor.author Sengupta, Debarkar (Advisor)
dc.date.accessioned 2024-09-14T06:54:01Z
dc.date.available 2024-09-14T06:54:01Z
dc.date.issued 2024-05-21
dc.identifier.uri http://repository.iiitd.edu.in/xmlui/handle/123456789/1659
dc.description.abstract Biological processes rely on intricate interactions among multiple genes, which form diverse networks such as protein-protein interaction networks, gene regulation networks, gene co-expression networks, and metabolic networks. Graph-theoretic algorithms analyze these networks to uncover complex biological interactions. The Random Walk with Restart (RWR) algorithm has been extended to multiplex and heterogeneous networks, allowing it to explore several layers of gene and protein interactions, such as protein-protein interactions and co-expression correlations, and to move into networks that reflect phenotype similarities among genes. We developed a method that deciphers intrinsic phenotype mechanisms from genomic data: RWR on multiplex networks (RWR-M) identifies functionally associated genes, and pathway analysis of those genes then identifies the pivotal pathways. To validate the method, we conducted a single-cell bottleneck experiment. We grew the wild-type E. coli strain MG1655 under two different growth conditions while subjecting the cultures to increasing sublethal antibiotic pressure. We then performed whole-genome sequencing at every time point for both growth conditions. Variants were identified, and the affected genes were used as seed genes in RWR-M. The RWR-M network was constructed from STRING, EcoliNet, and Weighted Gene Co-expression Network Analysis (WGCNA). We then ran pathway analysis on the resulting gene set to pinpoint the crucial pathways, and their associated genes, responsible for the phenotypic differences observed between the two environmental conditions. We supported our findings using protein language models (PLMs) and a literature survey. By combining these methodologies, we established a comprehensive framework for pinpointing the critical factors responsible for the observed phenotypic changes. en_US
dc.language.iso en_US en_US
dc.publisher IIIT-Delhi en_US
dc.subject Algorithm en_US
dc.subject Data quality control en_US
dc.subject Bioinformatic pipeline for VCF calling en_US
dc.title Unveiling E. coli adaptation dynamics with protein language model and random walk en_US
dc.type Thesis en_US
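
The abstract describes seeding a random walk with restart from variant-carrying genes and ranking the remaining genes by proximity. The following is a minimal sketch of that idea on a single-layer network only; it is not the thesis's RWR-M implementation, which works on a multiplex network built from STRING, EcoliNet, and WGCNA. The gene names, edge weights, and the restart probability of 0.7 are illustrative assumptions, not values taken from the thesis.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.7, tol=1e-10, max_iter=1000):
    """Single-layer RWR: p_next = restart * p0 + (1 - restart) * W * p.

    adjacency: dict mapping node -> {neighbor: weight}, symmetric (undirected).
    seeds: set of seed nodes; the walker restarts uniformly among them.
    Returns a dict mapping node -> steady-state visiting probability.
    """
    nodes = sorted(adjacency)
    # Restart distribution p0: uniform over the seed genes.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    # Total edge weight per node, used to normalize transition probabilities.
    out_weight = {n: sum(adjacency[n].values()) for n in nodes}

    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            if out_weight[u] == 0:
                # Isolated node: send its walking mass back to the seeds.
                for n in nodes:
                    nxt[n] += (1 - restart) * p[u] * p0[n]
                continue
            for v, w in adjacency[u].items():
                nxt[v] += (1 - restart) * p[u] * w / out_weight[u]
        diff = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if diff < tol:
            break
    return p


# Hypothetical toy network: a chain geneA - geneB - geneC - geneD.
toy_network = {
    "geneA": {"geneB": 1.0},
    "geneB": {"geneA": 1.0, "geneC": 1.0},
    "geneC": {"geneB": 1.0, "geneD": 1.0},
    "geneD": {"geneC": 1.0},
}

# geneA plays the role of a variant-carrying seed gene.
scores = random_walk_with_restart(toy_network, seeds={"geneA"})
ranking = sorted(scores, key=scores.get, reverse=True)
```

Genes closer to the seed receive higher scores, so `ranking` orders candidates by network proximity to the seed set. In a multiplex setting the walker can additionally jump between layers that share the same genes, which this sketch leaves out.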

