Corpora evaluation and system bias detection in multi document summarization

Dey, Alvin; Chakraborty, Tanmoy (Advisor)

Please use this identifier to cite or link to this item: http://repository.iiitd.edu.in/xmlui/handle/123456789/852

Full metadata record

DC Field	Value	Language
dc.contributor.author	Dey, Alvin
dc.contributor.author	Chakraborty, Tanmoy (Advisor)
dc.date.accessioned	2021-03-24T07:14:01Z
dc.date.available	2021-03-24T07:14:01Z
dc.date.issued	2020-06
dc.identifier.uri	http://repository.iiitd.edu.in/xmlui/handle/123456789/852
dc.description.abstract	Multi-document summarization (MDS) is the task of reflecting key points from any set of documents into a concise text paragraph. In the past, it has been used to aggregate news, tweets, product reviews, etc. from various sources. Because the task has no standard definition, existing datasets vary widely in the level of overlap and conflict between participating documents. There is also no standard regarding what constitutes summary information in MDS. Adding to the challenge, new systems report results on a chosen set of datasets, which might not correlate with their performance on other datasets. In this paper, we study this heterogeneous task with the help of a few widely used MDS corpora and a suite of state-of-the-art models. We attempt to quantify the quality of a summarization corpus and prescribe a list of points to consider when proposing a new MDS corpus. Next, we analyze why no MDS system achieves superior performance across all corpora. We then observe the extent to which corpus properties influence system metrics and propagate bias.	en_US
dc.language.iso	en_US	en_US
dc.publisher	IIIT-Delhi	en_US
dc.subject	DUC, TAC, TextRank, LexRank	en_US
dc.title	Corpora evaluation and system bias detection in multi document summarization	en_US
dc.type	Thesis	en_US
Appears in Collections:	Year-2020

Files in This Item:

File	Description	Size	Format
MT18066_Alvin Dey.pdf		856.48 kB	Adobe PDF
