IIIT-Delhi Institutional Repository

ThpPred: an ML based tool for predicting therapeutic proteins/peptides

dc.contributor.author Gupta, Srijanee
dc.contributor.author Raghava, Gajendra Pal Singh (Advisor)
dc.date.accessioned 2023-12-18T14:51:02Z
dc.date.available 2023-12-18T14:51:02Z
dc.date.issued 2023-07
dc.identifier.uri http://repository.iiitd.edu.in/xmlui/handle/123456789/1345
dc.description.abstract ThpPred is a web-based tool for predicting druggable proteins/peptides. The main dataset used in this study contained 356 therapeutic proteins/peptides and 356 random proteins/peptides, curated from DrugBank, UniProt and other sources. To provide a fair assessment, internal validation was performed on 80% of the data and external validation on the remaining 20%. Three approaches to predicting the druggability of proteins/peptides were implemented: i) machine learning models on features selected using SVC-L1, Variance Threshold, and correlation coefficient; ii) machine learning models on a single feature type (AAC, DPC or TPC); and iii) MERCI-based motif search. The goal was to train models on the protein sequences of existing medications, identify the best model, and deploy it on a web server. On the alternate dataset, consisting of 356 positive sequences and 3560 negative sequences, the XGB-based model on AAC features outperformed the other models, with a maximum AUC of 0.91 on both the training and validation datasets. On the main dataset, the RF-based model performed well on DPC features, with maximum AUCs of 0.91 and 0.89 on the training and validation datasets, respectively. For both datasets, the AUC and accuracy improved when motif labels were combined with the ML-predicted labels. ThpPred therefore combines motif search with the RF and XGB models to determine whether a protein is therapeutic. It is provided as a free web server and a standalone package to help the scientific community design more effective protein-based medicines. Overall, the results indicate that ThpPred has the potential to aid the development of pharmaceuticals and protein-based treatments for numerous diseases. en_US
dc.language.iso en_US en_US
dc.publisher IIIT-Delhi en_US
dc.subject Machine learning en_US
dc.subject Motif scan model en_US
dc.subject Design module en_US
dc.title ThpPred: an ML based tool for predicting therapeutic proteins/peptides en_US
dc.type Thesis en_US
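
The composition features named in the abstract can be illustrated with a minimal sketch. It assumes AAC and DPC denote amino acid composition and dipeptide composition in their usual sense (the percentage of each of the 20 standard residues, and of each of the 400 ordered residue pairs, in a sequence). The function names `aac` and `dpc`, the percentage scaling, and the handling of non-standard residues are illustrative assumptions, not details taken from the thesis.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aac(sequence):
    """Amino acid composition: percentage of each standard residue.

    Non-standard characters are ignored (an assumption of this sketch).
    """
    residues = [r for r in sequence.upper() if r in AMINO_ACIDS]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    counts = Counter(residues)
    return {a: 100.0 * counts[a] / len(residues) for a in AMINO_ACIDS}


def dpc(sequence):
    """Dipeptide composition: percentage of each ordered residue pair.

    Only adjacent pairs of two standard residues are counted.
    """
    seq = sequence.upper()
    pairs = [
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if seq[i] in AMINO_ACIDS and seq[i + 1] in AMINO_ACIDS
    ]
    if not pairs:
        raise ValueError("sequence contains no standard dipeptides")
    counts = Counter(pairs)
    return {
        a + b: 100.0 * counts[a + b] / len(pairs)
        for a, b in product(AMINO_ACIDS, repeat=2)
    }


if __name__ == "__main__":
    peptide = "ACAC"  # toy sequence, not from the thesis datasets
    print(aac(peptide)["A"])   # share of alanine
    print(dpc(peptide)["AC"])  # share of the A-C dipeptide
```

Each sequence thus maps to a fixed-length vector (20 values for AAC, 400 for DPC), which is the kind of input a classifier such as a random forest or gradient-boosted model can be trained on.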