Please use this identifier to cite or link to this item:
http://repository.iiitd.edu.in/xmlui/handle/123456789/1962
| Title: | Applications of NLP in recipe texts |
| Authors: | Neelu Vaikundam, Gurupriya Upadhyay, Rituj Bagler, Ganesh (Advisor) |
| Keywords: | Recipe Classification; Text Preprocessing; XGBoost; Food Analytics |
| Issue Date: | 27-Jul-2025 |
| Publisher: | IIIT-Delhi |
| Abstract: | This study addresses the challenge of large-scale, multi-label recipe classification using a real-world dataset of over 600,000 recipes collected from heterogeneous sources. The raw data exhibited significant noise, duplication, and label imbalance, motivating a comprehensive, multi-stage cleaning and preprocessing framework. Key steps included ingredient normalization, instructions standardization, multi-label parsing, deduplication, and semantic category mapping into hierarchical supercategories. For modeling, we implemented a modular pipeline combining TF-IDF feature extraction, classical classifiers, XGBoost, and fine-tuned BERT models to capture both statistical and contextual signals. By adopting a per-supercategory strategy, we minimized cross-domain interference and achieved strong performance, with the fine-tuned BERT classifier attaining a weighted F1-score of 0.7996 and high accuracy on dominant labels. This work demonstrates how rigorous data preparation and modular modeling can enable fine-grained, interpretable recipe classification at scale, providing a robust foundation for downstream culinary applications such as personalized meal planning and intelligent search. |
| URI: | http://repository.iiitd.edu.in/xmlui/handle/123456789/1962 |
| Appears in Collections: | Year-2025 |
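
The abstract reports a weighted F1-score under a per-supercategory strategy. As a rough illustration of what that metric measures, the sketch below computes a support-weighted F1 for multi-label predictions, separately for each supercategory. It is not the authors' evaluation code: the function names, the supercategory names ("cuisine", "course"), and the toy labels are all hypothetical, and it makes no attempt to reproduce the 0.7996 figure.

```python
from collections import Counter, defaultdict


def weighted_f1(true_sets, pred_sets):
    """Support-weighted mean of per-label F1 over multi-label predictions.

    true_sets / pred_sets: parallel lists of label sets, one pair per recipe.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for true, pred in zip(true_sets, pred_sets):
        tp.update(true & pred)   # labels predicted correctly
        fp.update(pred - true)   # labels predicted but not present
        fn.update(true - pred)   # labels present but missed

    # Support = number of recipes that truly carry the label.
    support = {label: tp[label] + fn[label] for label in set(tp) | set(fn)}
    total = sum(support.values())
    if total == 0:
        return 0.0

    score = 0.0
    for label, n in support.items():
        f1 = 2 * tp[label] / (2 * tp[label] + fp[label] + fn[label])
        score += (n / total) * f1
    return score


def f1_per_supercategory(rows):
    """rows: iterable of (supercategory, true_label_set, predicted_label_set)."""
    grouped = defaultdict(lambda: ([], []))
    for supercategory, true, pred in rows:
        grouped[supercategory][0].append(true)
        grouped[supercategory][1].append(pred)
    return {name: weighted_f1(t, p) for name, (t, p) in grouped.items()}


# Hypothetical toy predictions, grouped by an assumed supercategory.
rows = [
    ("cuisine", {"indian"}, {"indian"}),
    ("cuisine", {"indian", "fusion"}, {"indian"}),
    ("cuisine", {"italian"}, {"indian"}),
    ("cuisine", {"italian"}, {"italian"}),
    ("course", {"dessert"}, {"dessert"}),
    ("course", {"main"}, {"main"}),
]
scores = f1_per_supercategory(rows)
```

Because each label's F1 is weighted by its support, frequent labels dominate the score, which is why a weighted F1 can look strong on an imbalanced dataset even when rare labels (here the missed "fusion" label) are predicted poorly.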
Files in This Item:
| File | Description | Size | Format | |
|---|---|---|---|---|
| BTP_Poster_Summer - Gurupriya Vaikundam.pdf (Restricted Access) | | 1.13 MB | Adobe PDF | |