IIIT-Delhi Institutional Repository

Deep learning-based image-to-SMILES conversion of 2D chemical structures

dc.contributor.author Alisha
dc.contributor.author Murugan, N. Arul (Advisor)
dc.date.accessioned 2026-04-17T12:58:22Z
dc.date.available 2026-04-17T12:58:22Z
dc.date.issued 2025-07
dc.identifier.uri http://repository.iiitd.edu.in/xmlui/handle/123456789/1916
dc.description.abstract Chemical information science faces a major bottleneck: millions of chemical structures are locked in visual formats throughout the scientific literature and patents, which makes them inaccessible to automatic analysis and large-scale data mining. Traditional optical chemical structure recognition (OCSR) methods depend on rule-based approaches that show limited robustness on the diversity of real-world literature, while current deep learning approaches require large-scale computational resources and remain impractical for broad deployment. This research addresses these limitations with an integrated three-phase deep learning pipeline that (1) adapts a Faster R-CNN with a ResNet-50 backbone and Feature Pyramid Network architecture to chemical structure detection, handling diverse molecular configurations across 15 chemical elements and 4 bond types (19 classes total); (2) applies spatial connectivity analysis with K-D tree algorithms to generate adjacency and bond-order matrices for molecular graph representation; and (3) performs multi-strategy SMILES generation with progressive RDKit sanitization, fragment linking, and domain-aware validation. Key technical innovations include chemical-aware anchor generation, class-specific confidence thresholds, a focal loss implementation, and training strategies that address severe class imbalance. Evaluated on 14,997 test images, the system produced 612,371 total detections (99.7% detection rate, 40.83 detections per image), 99.2% successful molecular graph conversion, 98.1% correct bond connectivity, and SMILES generation with 41.2% valid output. Training for 25 epochs on the full 100K dataset converged to a loss of 0.8877. The system achieves an mAP of 74.9%, and 88.1% of successfully generated molecules receive high-quality scores (80) on the comprehensive verification metrics. 
The framework is optimized for standard computational infrastructure with efficient memory use. en_US
dc.language.iso en_US en_US
dc.publisher IIIT-Delhi en_US
dc.subject Deep Learning en_US
dc.subject -SMILES en_US
dc.title Deep learning-based image-to-SMILES conversion of 2D chemical structures en_US
dc.type Thesis en_US
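
The second phase described in the abstract (K-D tree spatial analysis producing an adjacency matrix) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the thesis: the function names (build_kdtree, radius_query, adjacency_from_centers), the use of atom box centers as input, and the simple rule that two atoms are connected when their centers lie within max_bond_length of each other. The actual pipeline also detects bond classes and builds a bond-order matrix, which this sketch does not attempt.

```python
from math import hypot


def build_kdtree(points, indices=None, depth=0):
    """Recursively build a 2-D k-d tree over indices into `points`."""
    if indices is None:
        indices = list(range(len(points)))
    if not indices:
        return None
    axis = depth % 2  # alternate between x (0) and y (1)
    indices = sorted(indices, key=lambda i: points[i][axis])
    mid = len(indices) // 2
    return {
        "index": indices[mid],
        "axis": axis,
        "left": build_kdtree(points, indices[:mid], depth + 1),
        "right": build_kdtree(points, indices[mid + 1:], depth + 1),
    }


def radius_query(node, points, target, radius, found):
    """Append to `found` every point index within `radius` of `target`."""
    if node is None:
        return
    i, axis = node["index"], node["axis"]
    px, py = points[i]
    if hypot(px - target[0], py - target[1]) <= radius:
        found.append(i)
    diff = target[axis] - points[i][axis]
    # Always search the side that contains the target.
    if diff < 0:
        near, far = node["left"], node["right"]
    else:
        near, far = node["right"], node["left"]
    radius_query(near, points, target, radius, found)
    # Visit the other side only if the splitting line is within reach.
    if abs(diff) <= radius:
        radius_query(far, points, target, radius, found)


def adjacency_from_centers(centers, max_bond_length):
    """Hypothetical linking rule: connect atoms closer than max_bond_length."""
    n = len(centers)
    tree = build_kdtree(centers)
    adjacency = [[0] * n for _ in range(n)]
    for i, center in enumerate(centers):
        found = []
        radius_query(tree, centers, center, max_bond_length, found)
        for j in found:
            if j != i:
                adjacency[i][j] = adjacency[j][i] = 1
    return adjacency


# Three collinear atom centers, one unit apart (made-up coordinates).
centers = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
print(adjacency_from_centers(centers, max_bond_length=1.2))
# prints [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
```

The tree lets each neighbor lookup skip regions of the image that cannot contain a partner atom, which matters at roughly 40 detections per image across thousands of images. A production pipeline would more likely rely on an existing K-D tree implementation than on a hand-written one.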

