Please use this identifier to cite or link to this item: http://repository.iiitd.edu.in/xmlui/handle/123456789/1912
Full metadata record
DC Field | Value | Language
dc.contributor.author | S, Prasanna Kumar | -
dc.contributor.author | Sethi, Tavpritesh (Advisor) | -
dc.date.accessioned | 2026-04-17T10:21:58Z | -
dc.date.available | 2026-04-17T10:21:58Z | -
dc.date.issued | 2025-06 | -
dc.identifier.uri | http://repository.iiitd.edu.in/xmlui/handle/123456789/1912 | -
dc.description.abstract | The exponential growth of biomedical data promises new insights, but semantic heterogeneity and inconsistent metadata limit reuse. In practice, many publicly available datasets (e.g., tabular datasets from Figshare or Zenodo) are annotated with non-standardized field names, violating the Findable, Accessible, Interoperable, Reusable (FAIR) criteria. To bridge this gap, we propose a framework for FAIR assessment using ontology mapping and large language models (LLMs) that assesses and enhances the interoperability of such “not-so-FAIR” datasets. First, we quantify dataset FAIRness by mapping variables to standard clinical terms in the Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT), a comprehensive ontology widely used for semantic interoperability. Then we explore the use of LLMs, specifically Mistral and LLaMA, to improve SNOMED CT term mapping coverage and disambiguation for dataset fields. We prompt these models with field context and compare their predicted SNOMED CT terms to ground-truth concepts (baseline: Medical Concept Annotation Tool). Our experiments on diverse clinical datasets show that LLMs can significantly augment automated ontology mapping and reduce semantic mismatches. Taken together, this work presents a principled approach that integrates ontology-based FAIR assessment with LLM-driven harmonization to close the semantic gap in biomedical data integration. | en_US
dc.language.iso | en_US | en_US
dc.publisher | IIIT-Delhi | en_US
dc.subject | Biomedical Data | en_US
dc.subject | Semantic Interoperability | en_US
dc.subject | FAIR Data Principles | en_US
dc.subject | Data Standardization | en_US
dc.subject | SNOMED CT | en_US
dc.subject | Ontology Mapping | en_US
dc.subject | Large Language Models | en_US
dc.title | LLM-assisted ontology mapping for semantic interoperability in structured biomedical data | en_US
dc.type | Thesis | en_US
Appears in Collections:Year-2025

Files in This Item:
File | Description | Size | Format
MT23234_Prasanna Kumar S.pdf |  | 693.42 kB | Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.