Abstract:
In an era where machine learning (ML) is changing the landscape of financial markets, education, security and privacy, the retail sector, and many other crucial aspects of human life, it is only fitting that we harness its potential for personalized medicine. Combining precision medicine with statistical analysis and machine learning techniques may shape the future of disease treatment. Personalized, or precision, medicine uses knowledge specific to a patient, such as biomarkers, genomic information, demographics, or lifestyle characteristics, to best treat their ailment, rather than relying on generic best practices. For a given clinical scenario, ML can help predict the best treatment plan for the patient and supply clinicians with high-confidence hypotheses that support complex decision-making on an individual basis. Systems that provide this kind of assistance are called clinical decision support systems (CDSS). Because cancer is so heterogeneous in nature, it is essential that each patient’s treatment be individually tailored and targeted rather than following a standard protocol. Key goals of clinical decision-making include improving treatment efficiency, reducing adverse effects, lowering costs for patients and care providers, and diagnosing the disease early. To study, design, analyze and interpret such multidisciplinary aspects of clinical and translational cancer research, we drew on both statistical and machine-learning-based methods. Below is a summary of our key contributions, which bring together machine learning, genomics and patient-focused healthcare. We start the journey at the cellular level, where we propose a method that helps reveal the factors contributing to cellular heterogeneity in single-cell datasets. By identifying influential genes that contribute to cellular heterogeneity, our proposed method, InGene, lays the groundwork for personalized medicine.
Single-cell RNA sequencing (scRNA-seq) provides a powerful means of characterizing transcriptional heterogeneity among cells of seemingly identical phenotype. Because scRNA-seq data are highly variable, high-dimensional, and sparse, traditional feature selection methods fall short in this task. Recently, non-linear dimensionality reduction techniques have made inroads into scRNA-seq analysis, as they help assess local and global cellular arrangement. However, these techniques are used primarily for visualization, since they shed no light on the identity of the individual genes that influence the non-linear transformation. We developed InGene, a first-of-its-kind non-linear unsupervised method, to overcome this limitation. Our method can also be used as an alternative to state-of-the-art methods for finding differential genes, and the genes it identifies can serve as a reliable targeted sequencing panel for scRNA-seq, substantially reducing sequencing cost. A cost-effective scRNA-seq solution can advance personalized therapy recommendation and make the clinical decision-making process more effective. Next, we expand the scope from cellular insights to a broader patient-centric approach. In oncology, there is a critical need for diagnostic methodologies that are both efficacious and patient-friendly. Here, we contribute to improving cancer diagnosis by proposing an affordable, non-invasive diagnostic method based on liquid biopsy. Although tissue biopsy is widely used to diagnose cancer, it has drawbacks, particularly when repeated sampling is necessary. Tumour-educated platelets (TEPs) have recently attracted interest because they can precisely identify the presence and subtype of tumours.
The majority of research involving TEPs has relied on marker panels comprising hundreds of genes, which can be expensive and impede adoption of the diagnostic method. To address this issue, we investigated publicly available TEP expression profiles and discovered a signature of 11 platelet genes that can effectively differentiate between malignant and normal samples. Next, we move from disease detection to disease management. We aim to improve outcomes for Multiple Myeloma (MM) patients and to help clinicians optimize each patient’s treatment plan. Patient stratification and prediction of disease recurrence are further important aspects of personalized therapy. To determine the probability of recurrence in MM patients receiving Autologous Stem Cell Transplantation (ASCT), we developed a stratification model to improve prognosis estimation and treatment efficacy. For many practical reasons, it is crucial to identify whether a patient undergoing ASCT is at high risk of recurrence (likely to relapse within 36 months). Our model, which consists of a 3-factor multivariate 2-stage staging system, is highly decisive in predicting the outcome of stem cell rescue. Prompt detection of cancer is essential to managing cancer patients effectively. In conclusion, this thesis harmonizes molecular insights, diagnostic innovations and clinical management in oncology.