Please use this identifier to cite or link to this item: http://repository.iiitd.edu.in/xmlui/handle/123456789/1345
Full metadata record
DC Field | Value | Language
dc.contributor.author | Gupta, Srijanee | -
dc.contributor.author | Raghava, Gajendra Pal Singh (Advisor) | -
dc.date.accessioned | 2023-12-18T14:51:02Z | -
dc.date.available | 2023-12-18T14:51:02Z | -
dc.date.issued | 2023-07 | -
dc.identifier.uri | http://repository.iiitd.edu.in/xmlui/handle/123456789/1345 | -
dc.description.abstract | ThpPred is a web-based tool for predicting druggable proteins/peptides. The main dataset used in this study contained 356 therapeutic proteins/peptides and 356 random proteins/peptides, curated from DrugBank, UniProt and other sources. To provide a fair assessment, we performed internal validation on 80% of the data and external validation on the remaining 20%. We implemented the following methods for predicting the druggability of proteins/peptides: i) machine learning models on features selected using SVC-L1, Variance Threshold, and correlation coefficient; ii) machine learning models on a single feature type (AAC, DPC and TPC); and iii) MERCI-based motif search. The goal was to construct the best model by training it on protein sequences of existing medications and to deploy it on a web server. On the alternate dataset, consisting of 356 positive sequences and 3560 negative sequences, the XGB-based model performed best on AAC features, with maximum AUCs of 0.91 and 0.91 on the training and validation datasets, respectively. On the main dataset, the RF-based model performed well on DPC features, with maximum AUCs of 0.91 and 0.89 on the training and validation datasets, respectively. The AUC score and accuracy for both datasets improved when motif labels were added to the ML-predicted labels. ThpPred determines whether a protein is therapeutic by combining motif search with RF and XGB models. The platform, available as a free web server and a standalone package, is intended to help the scientific community create more effective protein-based medicines. Overall, the results indicate that ThpPred has the potential to improve the development of pharmaceuticals and protein-based treatments for numerous diseases. | en_US
dc.language.iso | en_US | en_US
dc.publisher | IIIT-Delhi | en_US
dc.subject | Machine learning | en_US
dc.subject | Motif scan model | en_US
dc.subject | Design module | en_US
dc.title | ThpPred: an ML based tool for predicting therapeutic proteins/peptides | en_US
dc.type | Thesis | en_US
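
Illustrative note on the AAC and DPC features named in the abstract: the record does not give their formulas, so the sketch below assumes the commonly used definitions, where amino acid composition (AAC) is the fraction of each of the 20 standard residues in a sequence and dipeptide composition (DPC) is the fraction of each of the 400 adjacent residue pairs. The function names `aac` and `dpc` and the example sequence are hypothetical and are not taken from the thesis or the ThpPred package; the thesis may scale or normalise these features differently (for example, as percentages).

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aac(sequence):
    """Amino acid composition: fraction of each standard residue (20 values).

    Assumption: the denominator is the full sequence length, so any
    non-standard characters lower the total below 1.
    """
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    return {aa: counts[aa] / len(seq) for aa in AMINO_ACIDS}


def dpc(sequence):
    """Dipeptide composition: fraction of each adjacent residue pair (400 values)."""
    seq = sequence.upper()
    if len(seq) < 2:
        raise ValueError("sequence must contain at least two residues")
    # Overlapping pairs: a sequence of length L yields L - 1 dipeptides.
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    counts = Counter(pairs)
    return {
        a + b: counts[a + b] / len(pairs)
        for a, b in product(AMINO_ACIDS, repeat=2)
    }


if __name__ == "__main__":
    example = "ACAC"  # hypothetical toy sequence, not from the thesis dataset
    print(aac(example)["A"], dpc(example)["AC"])
```

Feature vectors of this kind (20 values for AAC, 400 for DPC) are what a classifier such as a random forest or gradient-boosted model would take as input; how the thesis actually builds and selects its features is described only at the level of the abstract above.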
Appears in Collections: Year-2023

Files in This Item:
File | Description | Size | Format
Thesis_Srinjee_MT21231.pdf | - | 2.53 MB | Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.