IIIT-Delhi Institutional Repository

Learning content curation and enrichment

dc.contributor.author V, Venktesh
dc.contributor.author Mohania, Mukesh (Advisor)
dc.contributor.author Goyal, Vikram (Advisor)
dc.date.accessioned 2023-08-24T12:10:48Z
dc.date.available 2023-08-24T12:10:48Z
dc.date.issued 2023-03
dc.identifier.uri http://repository.iiitd.edu.in/xmlui/handle/123456789/1304
dc.description.abstract Education in traditional classroom settings was restricted to static textbook content and assumed that all learners progress at a similar pace. Online learning platforms have shifted this paradigm and made “learning from anywhere” possible. They have also made it possible to scale learning to millions of users across demographics. Such learning platforms curate content from multiple sources. They also employ dedicated academicians to aid in the creation and curation of learning content such as videos, lecture transcripts, and assessments. For simplicity, we refer to these categories collectively as learning content. Manual creation of content is cumbersome. Additionally, content on-boarded from other sources has to be pre-processed to follow the organizational standard of the learning management system. To cater to learners at different stages, such platforms also have to link related content to facilitate the effective distribution of knowledge. To deliver learning content at scale and cater to the individual needs of learners, the content on such platforms can no longer be static and must adapt to each learner’s interaction with the system. This involves on-boarding new content from other sources, organizing it for ease of access, and enriching existing content to generate diverse content for learners. In our work, we build content curation and enrichment tools to assist academicians. We achieve this by proposing novel tasks and by relating these tasks to existing work in the literature. We first look at the problem of organizing content according to a standardized hierarchical learning taxonomy of the form (subject - chapter - topic - sub-topic) to aid applications such as faceted search. Effective organization of learning content is a prerequisite for recommending appropriate learning content. 
However, the label space of a hierarchical learning taxonomy is large and has a heavy class imbalance at the topic level of the taxonomy. To tackle this, we propose a novel reformulation of the classic HMLTC (Hierarchical Multi-Label Text Classification) task as a dense retrieval task. We further augment this approach with an efficient cross-attention mechanism with theoretical bounds to induce label-aware learning content representations. Second, we demonstrate that the content can be further tagged with fine-grained academic concepts to facilitate the linkage of related content and granular content recommendations. To accomplish this, we map the problem to the tasks of unsupervised keyphrase extraction and set expansion. We propose a novel random-walk-based approach with new measures that leverage contextualized and topical representations of candidate phrases and content. The fine-grained concepts extracted aid in indexing content and enriching the existing learning taxonomy. We further enrich the extracted concepts with additional concepts from knowledge bases (KBs) such as Wikipedia to aid in linking related content. Since e-learning platforms have diverse learners who learn at different paces, the assessments conducted should also be adaptive. The first step in adaptive assessment is the organization of content according to well-established difficulty levels and a pedagogical cognitive taxonomy. We adopt Bloom’s taxonomy and related difficulty levels and propose a tool for automated categorization of content into the appropriate levels. We propose a multi-task learning system that automatically tags the difficulty level of questions while simultaneously predicting their Bloom’s taxonomy levels. We propose a novel interactive attention mechanism that leverages the affinity between the two tasks. Tagging questions with difficulty levels aids in adapting questions to learners’ abilities in automated assessments. 
The difficulty-adapted questions would help a learner understand their skill gaps and the concepts to focus on while learning. Further, we propose a framework for the novel task of paraphrasing Math Word Problems (MWPs) to enrich the existing question repository. This yields a diverse array of questions for academicians to choose from when formulating assessments. It also facilitates adaptive assessments that enhance the learning experience. In summary, the core contributions of this work are: a) pipelines to aid academicians in the automated curation of content at scale and b) data enrichment pipelines to provide diverse learning content to the users of the learning platform. en_US
dc.language.iso en_US en_US
dc.publisher IIIT-Delhi en_US
dc.subject Hierarchical Taxonomy based Tagging of Learning Content en_US
dc.subject Text Data Augmentation en_US
dc.subject Adaptive hard-negative sampling en_US
dc.subject Zero-shot performance en_US
dc.subject Self-supervised and Unsupervised Paraphrasing Approaches en_US
dc.title Learning content curation and enrichment en_US
dc.type Thesis en_US

