Please use this identifier to cite or link to this item: http://repository.iiitd.edu.in/xmlui/handle/123456789/1668
Full metadata record
DC Field | Value | Language
dc.contributor.author | Singh, Surabhi |
dc.contributor.author | Murugan, N Arul (Advisor) |
dc.date.accessioned | 2024-09-18T13:55:47Z |
dc.date.available | 2024-09-18T13:55:47Z |
dc.date.issued | 2024-05-01 |
dc.identifier.uri | http://repository.iiitd.edu.in/xmlui/handle/123456789/1668 |
dc.description.abstract | The accurate prediction of protein stability temperatures is essential for numerous applications in bioinformatics and biotechnology. In this study, we used a multifaceted computational approach to develop predictive models for protein stability temperatures and to determine which modeling technique yields the best results. We began by using the Pfeature package to compute a set of 16 different features for a dataset comprising 31,470 protein sequences. These features encompassed various aspects, including amino acid composition, physicochemical properties, and structural characteristics. Subsequently, the dataset was standardized using StandardScaler to prepare it for analysis. Next, we employed an array of modeling techniques, including Artificial Neural Networks (ANN), Linear Regression, Decision Trees, and Random Forests, to establish predictive models. Each model was trained on the concatenated dataset of protein features and evaluated using standard regression metrics such as root mean square error (RMSE), mean absolute error (MAE), and R^2 score. Furthermore, we used MODELLER for homology modeling to generate three-dimensional structures for a subset of 15,000 proteins, selected based on sequence similarity. The Graphein package facilitated the analysis of protein structures by computing various types of bonds within the proteins. Additionally, using AmberTools, we computed various energy components for each protein structure, including bond energy, angle energy, and solvation energy. These energy values were integrated into a dataset alongside the corresponding stability temperatures. Finally, we assessed the different modeling techniques by evaluating their predictive accuracy using the aforementioned regression metrics. By systematically assessing the performance of each model, we sought to identify the most effective approach for predicting protein stability temperatures. This comprehensive computational study offers insights into the prediction of protein stability temperatures and provides a solid foundation for future research in this domain. | en_US
dc.language.iso | en_US | en_US
dc.publisher | IIIT-Delhi | en_US
dc.subject | minimization of energy | en_US
dc.subject | machine learning | en_US
dc.subject | deep learning | en_US
dc.title | Developing machine learning and deep learning models for predicting the thermal stability of proteins | en_US
dc.type | Thesis | en_US
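
The standardization and evaluation steps named in the abstract can be illustrated with a minimal sketch. This is not the thesis code: the function names `standardize` and `regression_metrics` and the toy temperature values are hypothetical, and the scaling is written out in plain Python on the assumption that it should match the default behaviour of scikit-learn's StandardScaler (zero mean, unit variance, population standard deviation).

```python
import math


def standardize(rows):
    """Scale each feature column to zero mean and unit variance.

    Assumed to mirror StandardScaler's defaults: the standard
    deviation divides by n, and constant columns are left unscaled.
    """
    n = len(rows)
    n_features = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(n_features)]
    stds = []
    for j in range(n_features):
        var = sum((r[j] - means[j]) ** 2 for r in rows) / n
        # A zero standard deviation is replaced by 1.0 to avoid dividing by zero.
        stds.append(math.sqrt(var) or 1.0)
    return [[(r[j] - means[j]) / stds[j] for j in range(n_features)] for r in rows]


def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE, and R^2 for paired true and predicted values."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)            # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    return {
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
        "r2": 1.0 - ss_res / ss_tot,
    }


# Hypothetical stability temperatures, for illustration only.
observed = [50.0, 60.0, 70.0, 80.0]
predicted = [52.0, 58.0, 71.0, 79.0]
print(regression_metrics(observed, predicted))
```

Lower RMSE and MAE and an R^2 closer to 1 indicate a better fit, which is the basis on which the abstract compares the ANN, Linear Regression, Decision Tree, and Random Forest models.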
Appears in Collections: Year-2024

Files in This Item:
File | Description | Size | Format
thesis_report_surabhi_Singh MT22207.pdf | | 692.37 kB | Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.