Please use this identifier to cite or link to this item:
http://repository.iiitd.edu.in/xmlui/handle/123456789/944

| Title: | Statistical and machine learning-based approaches to precise characterization of cellular phenotypes |
| Authors: | Gupta, Krishan; Sengupta, Debarka (Advisor); Ghosh, Abhik (Advisor); Ahuja, Gaurav (Advisor) |
| Keywords: | RNA species; Single-cell expression studies; Pseudotemporal analysis; Single Circulating Tumor Cells (CTCs) |
| Issue Date: | Oct-2021 |
| Publisher: | IIIT-Delhi |
| Abstract: | Delineating the complex layers of biological systems requires a cumulative effort from multiple scientific disciplines. The present thesis takes an interdisciplinary approach, combining the automation and accuracy of computation with in-depth concepts from biology. In this thesis, I address three fundamental biological problems. In one of my initial projects, I developed a computational framework that uses a machine learning-based approach to build a classification model for the detection of Circulating Tumor Cells (CTCs). I validated the model on a large number of publicly available scRNA-seq datasets and on a newly generated CTC dataset of breast tumour cells, captured using a newly developed microfluidic system for label-free enrichment of CTCs. In my second project, I used a single-cell genomics approach coupled with stringent statistical and structural biology frameworks to dissect the cellular basis of the loss of smell in COVID-19 patients. Of note, loss of smell and taste was a prevalent but largely ignored symptom during the early COVID-19 pandemic. Our work used the known information about the viral entry proteins and the viral-human protein-protein interaction map. Our integrative analysis suggests that the non-sensory cell types (sustentacular cells, globose basal cells and Bowman’s gland cells) are vulnerable to SARS-CoV-2 infection. In my third project, I explored the potential of modelling expression ranks as robust surrogates for transcript abundance. Here I examined the performance of the Discrete Generalized Beta Distribution (DGBD) on real data and devised a Wald-type test to compare gene expression between two phenotypically divergent groups of single cells. We carried out a comprehensive assessment of the proposed method to understand its advantages over some of the current best-practice approaches. In addition to striking a reasonable balance between Type 1 and Type 2 errors, we concluded that, with increasing sample size, Rank Order-Sequencing (ROSeq), the proposed differential expression test, is remarkably robust to expression noise and scales rapidly. |
| URI: | http://repository.iiitd.edu.in/xmlui/handle/123456789/944 |
| Appears in Collections: | Year-2021 |
Files in This Item:
| File | Description | Size | Format | |
|---|---|---|---|---|
| KRISHAN_PhD16008_IIITD_THESIS_REVISED-1.pdf | | 9.49 MB | Adobe PDF | View/Open |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.