IIIT-Delhi Institutional Repository

Developing machine learning and deep learning models for predicting the thermal stability of proteins

dc.contributor.author Singh, Surabhi
dc.contributor.author Murugan, N Arul (Advisor)
dc.date.accessioned 2024-09-18T13:55:47Z
dc.date.available 2024-09-18T13:55:47Z
dc.date.issued 2024-05-01
dc.identifier.uri http://repository.iiitd.edu.in/xmlui/handle/123456789/1668
dc.description.abstract Accurate prediction of protein stability temperatures is essential for numerous applications in bioinformatics and biotechnology. In this study, we used a multifaceted computational approach to develop predictive models for protein stability temperatures and to determine which modeling technique yields the best results. We began by using the Pfeature package to compute a set of 16 different features for a dataset comprising 31,470 protein sequences. These features covered amino acid composition, physicochemical properties, and structural characteristics. The dataset was then standardized using StandardScaler to prepare it for analysis. Next, we applied several modeling techniques, namely Artificial Neural Networks (ANN), Linear Regression, Decision Trees, and Random Forests, to build predictive models. Each model was trained on the concatenated dataset of protein features and evaluated using standard regression metrics: root mean square error (RMSE), mean absolute error (MAE), and R^2 score. Furthermore, we used MODELLER for homology modeling to generate three-dimensional structures for a subset of 15,000 proteins, selected based on sequence similarity. The Graphein package was used to analyze these structures by computing the various types of bonds within each protein. Additionally, using Amber Tools, we computed several energy components for each protein structure, including bond energy, angle energy, and solvation energy. These energy values were combined into a dataset alongside the corresponding stability temperatures. Finally, we assessed the predictive accuracy of the different modeling techniques using the aforementioned regression metrics. By systematically comparing the performance of each model, we sought to identify the most effective approach for predicting protein stability temperatures.
This computational study offers insights into the prediction of protein stability temperatures and provides a solid foundation for future research in this domain. en_US
dc.language.iso en_US en_US
dc.publisher IIIT-Delhi en_US
dc.subject minimization of energy en_US
dc.subject machine learning en_US
dc.subject deep learning en_US
dc.title Developing machine learning and deep learning models for predicting the thermal stability of proteins en_US
dc.type Thesis en_US
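
The abstract mentions standardizing features and scoring models with RMSE, MAE, and R^2. As a minimal sketch of what those steps compute, the following pure-Python example z-scores a feature column (the transformation StandardScaler applies per feature) and evaluates the three metrics. The function names and the toy temperature values are hypothetical and are not taken from the thesis or its data.

```python
import math


def standardize(column):
    """Z-score one feature column: subtract the mean, divide by the population std."""
    n = len(column)
    mean = sum(column) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in column) / n)
    if std == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(x - mean) / std for x in column]


def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE, and R^2 for paired true and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)              # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    return {
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
        "r2": 1.0 - ss_res / ss_tot,
    }


# Toy values for illustration only (hypothetical temperatures).
y_true = [50.0, 60.0, 70.0, 80.0]
y_pred = [52.0, 58.0, 71.0, 79.0]
metrics = regression_metrics(y_true, y_pred)
scaled = standardize([1.0, 2.0, 3.0])
```

Lower RMSE and MAE and a higher R^2 indicate a better fit, which is how such metrics would let the four model families be compared on equal terms.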

