Abstract:
Alzheimer’s disease (AD) is becoming the most prevalent neurological disorder worldwide and is the most common cause of dementia in ageing societies. Artificial neural networks (ANNs) are learning algorithms inspired by the human brain that correct themselves as they receive input. Despite these benefits, they are not widely used in classification problems involving single-cell genomics. Many recent studies have reported that machine learning (ML) models are effective at predicting disease from single-cell genomics data, but their sample sizes were small. Here, we compare an ANN with other ML models for prediction and biomarker identification on a large dataset: 169,496 cells of RNA-seq data from the prefrontal cortex of normal human subjects (NC) and AD patients, of which 90,713 were labelled AD and 78,783 were labelled NC. Two feature sets were selected, and classification accuracy was determined for the ANN, Logistic Regression (LR), Random Forest (RF) and other models. The ANN showed the highest performance with both feature sets, reaching accuracies of 82% with 100 genes and 74% with 35 genes. Interestingly, reducing the feature set to 35 genes caused only a small decline (7-8%) in ANN accuracy rather than a drastic drop. This indicates that these 35 conserved genes can be used to predict AD and could serve as potential biomarkers for AD diagnosis and screening. Finally, we have developed a Python package named "AlzScPred" based on this study to serve the scientific community.
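To put the reported accuracies in context, the short sketch below computes the majority-class baseline implied by the class counts given above, that is, the accuracy of a trivial classifier that always predicts the larger class. It uses only the numbers stated in the abstract; the function and variable names are illustrative assumptions and are not part of AlzScPred.

```python
# Cell counts as reported in the abstract.
N_AD = 90_713      # cells labelled AD
N_NC = 78_783      # cells labelled NC
N_TOTAL = 169_496  # total cells reported


def majority_baseline(n_pos: int, n_neg: int) -> float:
    """Accuracy of a classifier that always predicts the larger class."""
    return max(n_pos, n_neg) / (n_pos + n_neg)


# Sanity check: the two class counts add up to the reported total.
assert N_AD + N_NC == N_TOTAL

baseline = majority_baseline(N_AD, N_NC)

# Reported ANN accuracies (rounded values from the abstract).
reported = {"100 genes": 0.82, "35 genes": 0.74}

# Margin of each reported accuracy over the trivial baseline.
margins = {name: acc - baseline for name, acc in reported.items()}

print(f"majority-class baseline: {baseline:.3f}")
for name, margin in margins.items():
    print(f"ANN, {name}: {margin:+.3f} over baseline")
```

Because the two classes are close to balanced, the baseline sits only slightly above 50%, so both reported ANN accuracies exceed it by a clear margin.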