Abstract:
Tumor development is rooted at the genetic level, with abnormalities in gene expression serving as important biomarkers. Clear cell renal cell carcinoma, a histological subtype of renal cell carcinoma, is one of the most common forms of adult kidney cancer. It is resistant to conventional chemotherapy and radiotherapy, so it is important to keep improving our understanding of the disease and to identify further molecular markers that can improve diagnostic outcomes. Analysing gene-level variation for insight into cancer detection is common practice, with gene expression data as its basis. Several attempts have been made to identify genes with differential patterns using basic statistical measures. The aim of this study is to uncover the information hidden in gene expression data by (a) exploiting advanced statistical techniques, (b) analysing its structural form (the gene correlation network) and (c) discovering relevant segments of the distribution of gene expressions using frequent itemset mining. These approaches have been modelled to reflect previously unseen aspects of gene expression and to put them to use for better and more robust renal cell cancer classification. An average classification accuracy of 79.5% is reported on unseen test data. The inferred results were mapped back to the literature, and the evidence validates the relevance of the proposed feature engineering strategies.