Please use this identifier to cite or link to this item: http://repository.iiitd.edu.in/xmlui/handle/123456789/1090
Full metadata record
DC Field | Value | Language
dc.contributor.author | Singh, Himanshu | -
dc.contributor.author | Shah, Rajiv Ratn (Advisor) | -
dc.date.accessioned | 2023-04-05T11:15:40Z | -
dc.date.available | 2023-04-05T11:15:40Z | -
dc.date.issued | 2022-07 | -
dc.identifier.uri | http://repository.iiitd.edu.in/xmlui/handle/123456789/1090 | -
dc.description.abstract | In this paper, we worked on several aspects of NLP for the Tamil language (dataset, annotation guidelines, annotation platform, and models) to build a complete ecosystem and make a significant contribution to the field. We focused on morpho-syntactic relations in Tamil text. A more diverse dataset was curated from 5 sources to form a treebank of 10,000 sentences annotated in CoNLL-U format. Detailed annotation guidelines were developed to guide both annotators and users. We proposed hierarchical tag sets for the POS and NER tasks after testing various available tag sets for the Tamil language. To carry out CoNLL-U format annotation efficiently, we introduce CoNLL-U GSheets. This annotation platform builds on the highly accessible and easy-to-use Google Sheets and equips it with all the tools needed for annotation. The research also focused on developing the pipeline and the models for each task in the morpho-syntactic analysis, and we addressed the language-specific issues that arise in each of these tasks. We also made design decisions that promote flexibility in applications and assist in later NLP tasks. | en_US
dc.language.iso | en_US | en_US
dc.publisher | IIIT-Delhi | en_US
dc.subject | Low resource language processing | en_US
dc.subject | NLP | en_US
dc.subject | Dataset | en_US
dc.subject | Annotation guidelines | en_US
dc.subject | Morpho-syntactic relations | en_US
dc.subject | Tamil language | en_US
dc.title | TamilNLP: low resource language processing | en_US
dc.type | Thesis | en_US
Appears in Collections: Year-2022

Files in This Item:
File | Description | Size | Format
MTech_Thesis__Himanshu.pdf | | 6.29 MB | Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.