IIIT-Delhi Institutional Repository

Machine learning and deep learning models for prediction of protein-ligand binding affinity

dc.contributor.author Kaur, Parneet
dc.contributor.author Murugan, N Arul (Advisor)
dc.date.accessioned 2024-09-17T13:37:50Z
dc.date.available 2024-09-17T13:37:50Z
dc.date.issued 2024-05-01
dc.identifier.uri http://repository.iiitd.edu.in/xmlui/handle/123456789/1665
dc.description.abstract In recent years, there has been significant interest in using Machine Learning (ML) and Deep Learning (DL) to predict protein-ligand binding affinity, driven by the rapid growth of computational approaches in drug discovery. Binding affinity prediction is useful in the virtual screening and drug screening optimization steps of drug discovery. ML- and DL-based approaches have shown notable improvements over conventional approaches, which are time-consuming, complex, and challenging, and the introduction of computational approaches has expedited the drug discovery timeline. In this study, we aim to develop Machine Learning models and benchmark some Deep Learning models for predicting protein-ligand binding affinity. We used the refined set of the PDBbind database (version 2020) to obtain the protein-ligand structural data and binding affinity data. For the machine learning models, we featurized the protein-ligand complexes from this dataset using tools such as RDKit/Mordred and Pfeature, followed by feature selection. Models such as SVM, Random Forest, and Multiple Linear Regression were used to predict the binding affinity of protein-ligand (PL) complexes. Of all the ML models we tested, Random Forest performed best, with an R-squared value of 0.6. Further, we benchmarked CNN-based Deep Learning models such as Pafnucy and OnionNet-2, using the refined set of PDBbind as the benchmarking test dataset. The OnionNet-2 model showed better predictive performance, with an R-squared value of 0.85, than the Pafnucy model, with an R-squared value of 0.46. We discuss this relative performance in our study. Hence, of all the approaches we used, the highest R-squared value on the PDBbind refined set was obtained with the OnionNet-2 model. We also discuss the reasons for the variation and the future scope of the study. en_US
dc.language.iso en_US en_US
dc.publisher IIIT-Delhi en_US
dc.subject Binding affinity en_US
dc.subject protein-ligand complex en_US
dc.subject PDBbind en_US
dc.subject machine learning en_US
dc.subject deep learning en_US
dc.title Machine learning and deep learning models for prediction of protein-ligand binding affinity en_US
dc.type Thesis en_US
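
The abstract compares models by their R-squared values (0.6, 0.85, 0.46). As a reading aid, the following is a minimal sketch of how such a value can be computed, assuming the usual coefficient-of-determination definition, 1 - SS_res / SS_tot; the record does not state which definition the thesis uses, and this is not the author's code. The function name `r_squared` and the affinity values are hypothetical, chosen only for illustration.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Assumed definition; the thesis may use a different one
    (for example, the squared Pearson correlation).
    """
    if len(y_true) != len(y_pred) or len(y_true) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error of the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when all true values are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical measured and predicted binding affinities (made-up numbers,
# not taken from PDBbind or from the thesis).
measured = [4.0, 5.0, 6.0, 7.0, 8.0]
predicted = [4.5, 4.5, 6.0, 7.5, 7.5]
print(round(r_squared(measured, predicted), 3))
```

Under this definition, a value of 1 means the predictions match the measurements exactly, and a value of 0 means the model does no better than always predicting the mean affinity.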

