IIIT-Delhi Institutional Repository

Advancing text summarization with conscience, comprehension, and multimodality

dc.contributor.author Kumar, Yash
dc.contributor.author Goyal, Vikram (Advisor)
dc.contributor.author Chakraborty, Tanmoy (Advisor)
dc.date.accessioned 2024-06-19T12:17:15Z
dc.date.available 2024-06-19T12:17:15Z
dc.date.issued 2024-01
dc.identifier.uri http://repository.iiitd.edu.in/xmlui/handle/123456789/1640
dc.description.abstract Summarization, an essential technique for condensing extensive textual content into concise versions, has become increasingly valuable amid the information deluge of the modern digital era. This thesis goes beyond the traditional scope: it pushes the boundaries of summarization techniques, covering both unimodal and multimodal approaches for standard and extreme summarization tasks, and it addresses the critical issue of bias within summarization systems. In addition to proposing methods for bias identification, the thesis introduces mechanisms to control and mitigate biases, contributing to a more comprehensive and equitable approach to information condensation and knowledge extraction. The first section of this thesis investigates unimodal summarization, focusing on the task of transforming extensive textual content into succinct and coherent summaries. Existing datasets and systems often exhibit biases, both intrinsic (stemming from data) and extrinsic (introduced during training), leading to unfaithful and hallucinated summaries. To tackle these biases, this work proposes novel methods to generate coherent and faithful summaries in both general and extreme summarization settings. The next section of the thesis explores multimodal summarization, integrating video, audio, and text to generate comprehensive and coherent summaries. The literature on multimodal summarization is still in its early stages, with very few datasets and systems available. This is especially true for the multimodal summarization of scientific videos. To address this challenge, we introduce the problems of multimodal summarization and extreme multimodal summarization of scientific videos.
Multimodal summarization generates concise and coherent summaries that capture the key points of a video in 5-7 lines, while extreme multimodal summarization generates extremely short summaries of 2-3 lines. In the subsequent section, we examine biases in existing summarization systems by evaluating both datasets and models. Employing diverse intrinsic and extrinsic metrics, we systematically identify biases, gaining a nuanced understanding of the limitations of current summarization datasets and methodologies. Building on these insights, we introduce a novel method designed to counteract biases and enhance coherence and faithfulness while preserving crucial information. This method is a significant step toward more reliable summarization systems. The research findings of this thesis have significant implications for future applications of summarization. By improving unimodal summarization, the proposed methods promise coherent and faithful models for domains such as news aggregation and decision-making. Advances in multimodal summarization can benefit fields rich in multimodal data, such as education and entertainment: automatically generating comprehensive summaries from varied sources lets users access and comprehend multimedia content efficiently. Additionally, bias identification and mitigation methods are crucial for ensuring fairness and inclusivity in summarization technologies. en_US
dc.language.iso en_US en_US
dc.publisher IIIT-Delhi en_US
dc.subject Baseline Systems en_US
dc.subject Evaluation Setup en_US
dc.subject Human Evaluation Setup en_US
dc.subject REISA vs ChatGPT en_US
dc.title Advancing text summarization with conscience, comprehension, and multimodality en_US
dc.type Thesis en_US

