Long document keyphrase extraction

Please use this identifier to cite or link to this item: http://repository.iiitd.edu.in/xmlui/handle/123456789/1145

Title:	Long document keyphrase extraction
Authors:	Gautam, Dibya; Agrawal, Navneet
Keywords:	keyphrase extraction; long document dataset; unsupervised keyphrase extraction; semi-supervised keyphrase extraction
Issue Date:	Nov-2021
Publisher:	IIIT-Delhi
Abstract:	Keyphrase extraction is the task of automatically extracting a set of phrases from a document that represent its overall content. Such keyphrases can be used in information retrieval, recommendation systems, document clustering, and similar applications. For scientific papers, all existing work on keyphrase extraction uses datasets consisting of titles and abstracts. However, since the title and abstract do not fully represent an entire paper, these datasets have a major limitation. To overcome this limitation, we introduce a dataset of over 1.3M full-body scientific papers with their keyphrases that can be used for automatic keyphrase extraction tasks. We also present the results of initial experiments on this dataset using popular unsupervised and supervised techniques. In addition, we experimented with a new semi-supervised approach on this dataset.
URI:	http://repository.iiitd.edu.in/xmlui/handle/123456789/1145
Appears in Collections:	Year-2021

Files in This Item:

File	Description	Size	Format
BTP_report_Navnet_Dibya_.pdf (Restricted Access)		741.09 kB	Adobe PDF
