Abstract:
Human cancers are composed of cells with varying genotypes, epigenetic states, and gene expression profiles. Intra- and inter-tumor clonal heterogeneity is now recognized as one of the major obstacles to therapeutic progress in medical oncology, because this high degree of variability hampers clinical treatment of the disease. For instance, when the majority of cancer cells die from the toxicity of a particular anti-cancer treatment, a small subpopulation may become tolerant and continue to grow and multiply. It is therefore crucial to determine which clones share ancestry with the tolerant clone, which can be achieved by reconstructing the clonal phylogeny within the tumor. In this study, we attempt to model clonal phylogeny in cancer using single-cell RNA sequencing (scRNA-seq) data. scRNA-seq is an emerging technology for profiling the gene expression of thousands of cells at single-cell resolution. This throughput enables researchers to determine which genes are expressed, in what quantities, and how expression differs across thousands of cells within a heterogeneous sample. It can reveal complex and rare cell populations, uncover regulatory relationships between genes, and track the trajectories of distinct cell lineages in development, helping to explain how a single-celled embryo gives rise to the various cell types that are organized into complex tissues and organs. However, scRNA-seq analysis poses its own challenges, such as batch effects between datasets, limited availability of computational resources, and sharing restrictions on raw data. To address these problems, it has recently become common to learn from large-scale reference datasets and then transfer that knowledge to smaller query datasets, an approach known as transfer learning.
In the second part of our study, we developed tranSCend, a web server that hosts pre-trained models accessible through a user-friendly interface for carrying out single-cell analysis tasks. These tasks include data harmonization, batch effect correction, normalization, visualization, clustering, cell-type classification, and differential gene expression analysis.