IIIT-Delhi Institutional Repository

Advancing gene signature discovery with generative models: a case study ovarian cancer

dc.contributor.author Anand, Alok
dc.contributor.author Sethi, Tavpritesh (Advisor)
dc.date.accessioned 2026-04-18T07:28:19Z
dc.date.available 2026-04-18T07:28:19Z
dc.date.issued 2025-06
dc.identifier.uri http://repository.iiitd.edu.in/xmlui/handle/123456789/1931
dc.description.abstract Ovarian cancer is a highly aggressive malignancy with poor survival rates, largely due to late detection and extensive tumor heterogeneity. This study introduces a computational framework, NIGAM (Normalize, Identify, Generate, Authenticate, Meta-analyze), to overcome the limitations posed by small sample sizes in transcriptomic datasets. Gene expression data from ovarian cancer microarray studies were preprocessed (normalized), key underlying dimensions were identified, and synthetic data were generated and then authenticated via statistical and biological enrichment. Finally, the key findings were meta-analyzed to derive signatures. In the ovarian cancer case study, a comparative analysis of original and augmented data revealed significant improvements in detecting biologically relevant signals. Pathways emerging only after augmentation included those associated with key cancer hallmarks such as uncontrolled proliferation, genomic instability, and angiogenesis. From these datasets, eight genes (AURKA, DAPK1, MCM2, WNT2B, CNRIP1, CXXC5, PEX5L, and SEL1L2) were identified as novel candidates, with 50% of them supported by existing literature and pathway databases. AURKA and MCM2, in particular, showed strong alignment with known ovarian cancer biology. The remaining genes, lacking prior association, may represent unexplored therapeutic or diagnostic targets. Observed trends in p-values and fold changes confirmed that increasing the augmented sample size enhanced statistical robustness. NIGAM and its application to ovarian cancer demonstrate how an end-to-end framework incorporating AI, biology, and data science can enable biomarker discovery, speed up the discovery of diagnostic panels, and lead to precision treatment. en_US
dc.language.iso en_US en_US
dc.publisher IIIT-Delhi en_US
dc.subject Machine learning en_US
dc.subject Gaussian Mixture Model en_US
dc.subject Gene Expression Omnibus en_US
dc.subject Benjamini-Hochberg en_US
dc.title Advancing gene signature discovery with generative models: a case study ovarian cancer en_US
dc.type Thesis en_US
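
The record lists Benjamini-Hochberg as a subject keyword and the abstract refers to trends in p-values, but it does not say how the correction was applied in the thesis. As a minimal sketch only, the standard Benjamini-Hochberg adjustment of per-gene p-values could look like the following. The function name and the example p-values are hypothetical illustrations, not the thesis code or its results.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the input order.

    Hypothetical helper for illustration; not taken from the thesis.
    """
    m = len(p_values)
    if m == 0:
        return []

    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])

    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, so each adjusted value is the
    # minimum of p * m / rank over all ranks at or above the current one.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = p_values[idx] * m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


# Made-up p-values for four unnamed genes (not results from the study).
example_p = [0.01, 0.04, 0.03, 0.005]
example_adjusted = benjamini_hochberg(example_p)
# Genes whose adjusted p-value is at or below a chosen false discovery rate.
selected = [i for i, q in enumerate(example_adjusted) if q <= 0.03]
```

With a false discovery rate threshold of 0.03 (an arbitrary choice for this example), `selected` holds the positions of the genes that would be retained.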


Files in this item

There are no files associated with this item.
