Abstract:
With the advent of next-generation sequencing, sequencing costs have fallen while accuracy has improved. Liquid biopsy can take advantage of this to identify and monitor cancer at different stages; unlike conventional techniques such as tissue extraction, it is non-invasive and allows the disease to be followed over time. RNA detected in a liquid biopsy can aid both the diagnosis and the prognosis of cancer. The challenge is that liquid biopsy data are scarce and largely not publicly accessible. To overcome this, we simulated liquid biopsy data by combining publicly available TCGA pan-cancer data with GTEx whole blood data at different dilutions. We then trained machine learning models on the simulated data and used them to identify the best markers for six cancer types. We identified multiple marker genes across the six simulated cancer types, as well as enhancer RNAs in BRCA, and validated them against previously reported markers and the literature. Further analysis of these markers should provide more insight. We believe that our work can help in designing cancer gene panels for the detection and prognosis of cancer.
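The dilution step can be illustrated with a minimal sketch. The abstract does not specify how the tumor and blood profiles were combined, so the linear mixing below is an assumption, as are the function name `simulate_liquid_biopsy`, the `tumor_fraction` parameter, and the toy expression values; it is not the pipeline used in this work.

```python
import numpy as np

def simulate_liquid_biopsy(tumor_expr, blood_expr, tumor_fraction):
    """Linearly mix a tumor profile into a whole-blood profile.

    Assumption: both inputs are per-gene expression vectors on the same
    linear scale (for example TPM) with genes in the same order, and
    tumor_fraction is the share of tumor-derived RNA (0 to 1).
    """
    tumor_expr = np.asarray(tumor_expr, dtype=float)
    blood_expr = np.asarray(blood_expr, dtype=float)
    if tumor_expr.shape != blood_expr.shape:
        raise ValueError("profiles must cover the same genes")
    if not 0.0 <= tumor_fraction <= 1.0:
        raise ValueError("tumor_fraction must be between 0 and 1")
    return tumor_fraction * tumor_expr + (1.0 - tumor_fraction) * blood_expr

# Toy values for three hypothetical genes, not real TCGA or GTEx data.
tumor = np.array([100.0, 0.0, 50.0])
blood = np.array([0.0, 200.0, 50.0])

# One simulated sample per assumed dilution level.
dilution_series = {
    f: simulate_liquid_biopsy(tumor, blood, f) for f in (0.5, 0.1, 0.01)
}
```

In a sketch like this, lower values of `tumor_fraction` give samples dominated by the blood background, which is the regime where a marker has to remain detectable to be useful.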