IIIT-Delhi Institutional Repository

Analysis and modeling of content and coherence for automatic scoring of L2 speakers

dc.contributor.author Aggarwal, Gaurav
dc.contributor.author Shah, Rajiv Ratn (Advisor)
dc.date.accessioned 2022-03-30T11:54:00Z
dc.date.available 2022-03-30T11:54:00Z
dc.date.issued 2021-05
dc.identifier.uri http://repository.iiitd.edu.in/xmlui/handle/123456789/985
dc.description.abstract This study focuses on understanding content and coherence for Automated Oral Proficiency Scoring and Feedback Generating System in the context of spontaneous speech of non-native (L2) English learners. We aim to understand and introduce a new dataset with verbal responses from Simulated Oral Proficiency Interview annotated with coherence and content scores. In this report, guidelines explicitly tailored to our needs have been provided, followed by annotating the spoken responses. An agreeable inter-annotator score of κ = 0.770 (Cohen's kappa) and α = 0.884 (Krippendorff's alpha) is obtained. However, the skewness of the data forced us to re-sample the dataset balanced across multiple dimensions. The time and labour to manually transcribe and annotate this new data proved a bottleneck in the content modeling. We limited ourselves to content-relevance modeling and started analyzing a similar common dataset. We provided various data augmentation techniques to build training data samples and provided a deep-neural network model architecture for this task. The results obtained proved to be promising. A thorough analysis of the model, data augmentations, and the results was done, which gave us insight into their effectiveness and the problems that need to be addressed. We later suggested a few techniques and changes which can be investigated in future to boost the scores. en_US
dc.language.iso en_US en_US
dc.publisher IIIT-Delhi en_US
dc.subject Natural Language Processing en_US
dc.subject Content en_US
dc.subject Automated Scoring Systems en_US
dc.subject Deep Learning en_US
dc.subject Relevance en_US
dc.title Analysis and modeling of content and coherence for automatic scoring of L2 speakers en_US
dc.type Other en_US

