Abstract:
Hormones play a crucial role in communicating information between cells and organs and are responsible for regulating almost all physiological processes of organisms. It is therefore important to collect, compile, and mine hormone-associated information. Firstly, a repository, Hmrbase2, has been developed to maintain comprehensive information on hormones and their receptors; it is an update of Hmrbase. The information was compiled from the literature and from public repositories such as HMDB, UniProt, HORDB, ENDONET, and PubChem. It contains a total of 12,056 entries, including 7,406 entries for peptide hormones, 753 entries for non-peptide hormones, and 3,897 entries for hormone receptors. The database also includes 5,662 hormone-receptor pairs. The database is freely available to the scientific community (https://webs.iiitd.edu.in/raghava/hmrbase2/). Secondly, a systematic attempt has been made to develop a method for predicting peptide hormones using data mining techniques. All models in this study were trained, tested, and evaluated on a dataset of 1,174 hormonal and 1,174 non-hormonal peptides. A wide range of machine learning and deep learning techniques were implemented to discriminate hormones from non-hormones with high precision. The best-performing model, based on logistic regression, achieved a maximum AUC of 0.93. Finally, a hybrid method has been developed that combines the logistic regression model (an alignment-free method) with BLAST/motif (an alignment-based method); it achieved an AUC of 0.96 with an MCC of 0.8 on an independent validation dataset. To serve the research community, a web server, HOPPred, has been developed (https://webs.iiitd.edu.in/raghava/hoppred/).
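The idea of a hybrid method, combining an alignment-free score with an alignment-based one, can be illustrated with a minimal Python sketch. This is not the HOPPred implementation: the function names, the additive combination rule, the `weight` of 0.5, and the decision threshold of 0.5 are all assumptions chosen for illustration, since the abstract does not specify how the two scores are merged. The alignment result is passed in as a plain label rather than computed, so no BLAST or motif tool is invoked here.

```python
from typing import Optional


def hybrid_score(ml_probability: float,
                 alignment_hit: Optional[str],
                 weight: float = 0.5) -> float:
    """Combine an alignment-free probability with an alignment-based hit.

    ml_probability: probability from a classifier such as logistic regression.
    alignment_hit:  "hormone" or "non-hormone" if a similarity/motif search
                    found a significant match, or None if there was no hit.
    weight:         hypothetical bonus/penalty applied for a hit (assumption).
    """
    if not 0.0 <= ml_probability <= 1.0:
        raise ValueError("ml_probability must lie in [0, 1]")
    if alignment_hit == "hormone":
        return ml_probability + weight   # a hit to a known hormone raises the score
    if alignment_hit == "non-hormone":
        return ml_probability - weight   # a hit to a known non-hormone lowers it
    if alignment_hit is None:
        return ml_probability            # no hit: rely on the model alone
    raise ValueError("alignment_hit must be 'hormone', 'non-hormone', or None")


def classify(score: float, threshold: float = 0.5) -> str:
    """Label a peptide from its hybrid score (threshold is an assumption)."""
    return "hormone" if score >= threshold else "non-hormone"


if __name__ == "__main__":
    # A borderline model prediction rescued by an alignment hit.
    print(classify(hybrid_score(0.4, "hormone")))
    # A confident model prediction overridden by a hit to a non-hormone.
    print(classify(hybrid_score(0.7, "non-hormone")))
    # No hit: the model's probability decides.
    print(classify(hybrid_score(0.7, None)))
```

Under this scheme the alignment-based evidence only shifts the score when a significant match exists, so peptides without detectable similarity to known sequences are still classified by the alignment-free model.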