Please use this identifier to cite or link to this item:
http://repository.iiitd.edu.in/xmlui/handle/123456789/1912
Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | S, Prasanna Kumar | - |
| dc.contributor.author | Sethi, Tavpritesh (Advisor) | - |
| dc.date.accessioned | 2026-04-17T10:21:58Z | - |
| dc.date.available | 2026-04-17T10:21:58Z | - |
| dc.date.issued | 2025-06 | - |
| dc.identifier.uri | http://repository.iiitd.edu.in/xmlui/handle/123456789/1912 | - |
| dc.description.abstract | The exponential growth of biomedical data promises new insights, but semantic heterogeneity and inconsistent metadata limit reuse. In practice, many publicly available datasets (e.g., tabular datasets from Figshare or Zenodo) are annotated with non-standardized field names, violating the Findable, Accessible, Interoperable, Reusable (FAIR) criteria. To bridge this gap, we propose a framework for FAIR Assessment using Ontology Mapping and large language models (LLMs) that assesses and enhances the interoperability of such “not-so-FAIR” datasets. First, we quantify dataset FAIRness by mapping variables to standard clinical terms from the Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT), a comprehensive ontology widely used for semantic interoperability. Then we explore the use of LLMs, specifically Mistral and LLaMA, to improve SNOMED CT term mapping coverage and disambiguation for dataset fields. We prompt these models with field context and compare their predicted SNOMED CT terms to ground-truth concepts (baseline: Medical Concept Annotation Tool). Our experiments on diverse clinical datasets show that LLMs can significantly augment automated ontology mapping and reduce semantic mismatches. Taken together, this work presents a principled approach that integrates ontology-based FAIR assessment with LLM-driven harmonization to close the semantic gap in biomedical data integration. | en_US |
| dc.language.iso | en_US | en_US |
| dc.publisher | IIIT-Delhi | en_US |
| dc.subject | Biomedical Data | en_US |
| dc.subject | Semantic Interoperability | en_US |
| dc.subject | FAIR Data Principles | en_US |
| dc.subject | Data Standardization | en_US |
| dc.subject | SNOMED CT | en_US |
| dc.subject | Ontology Mapping | en_US |
| dc.subject | Large Language Models | en_US |
| dc.title | LLM-assisted ontology mapping for semantic interoperability in structured biomedical data | en_US |
| dc.type | Thesis | en_US |
| Appears in Collections: | Year-2025 | |
Files in This Item:
| File | Description | Size | Format | |
|---|---|---|---|---|
| MT23234_Prasanna Kumar S.pdf | | 693.42 kB | Adobe PDF | View/Open |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.