Please use this identifier to cite or link to this item: http://repository.iiitd.edu.in/xmlui/handle/123456789/1962
Title: Applications of NLP in recipe texts
Authors: Neelu
Vaikundam, Gurupriya
Upadhyay, Rituj
Bagler, Ganesh (Advisor)
Keywords: Recipe Classification
Text Preprocessing
XGBoost
Food Analytics
Issue Date: 27-Jul-2025
Publisher: IIIT-Delhi
Abstract: This study addresses the challenge of large-scale, multi-label recipe classification using a real-world dataset of over 600,000 recipes collected from heterogeneous sources. The raw data exhibited significant noise, duplication, and label imbalance, motivating a comprehensive, multi-stage cleaning and preprocessing framework. Key steps included ingredient normalization, instructions standardization, multi-label parsing, deduplication, and semantic category mapping into hierarchical supercategories. For modeling, we implemented a modular pipeline combining TF-IDF feature extraction, classical classifiers, XGBoost, and fine-tuned BERT models to capture both statistical and contextual signals. By adopting a per-supercategory strategy, we minimized cross-domain interference and achieved strong performance, with the fine-tuned BERT classifier attaining a weighted F1-score of 0.7996 and high accuracy on dominant labels. This work demonstrates how rigorous data preparation and modular modeling can enable fine-grained, interpretable recipe classification at scale, providing a robust foundation for downstream culinary applications such as personalized meal planning and intelligent search.
Appears in Collections: Year-2025

Files in This Item:
BTP_Poster_Summer - Gurupriya Vaikundam.pdf (Adobe PDF, 1.13 MB, restricted access)

