IIIT-Delhi Institutional Repository

Long document keyphrase extraction

dc.contributor.author Gautam, Dibya
dc.contributor.author Agrawal, Navneet
dc.date.accessioned 2023-04-14T10:56:57Z
dc.date.available 2023-04-14T10:56:57Z
dc.date.issued 2021-11
dc.identifier.uri http://repository.iiitd.edu.in/xmlui/handle/123456789/1145
dc.description.abstract Keyphrase extraction is the task of automatically extracting a set of phrases from a document that represent the overall context of that document. Such keyphrases can be used in many ways, for example in information retrieval, recommendation systems, and document clustering. For scientific papers, all existing work on keyphrase extraction uses datasets consisting of titles and abstracts. However, since the title and abstract are not a complete representation of an entire paper, these datasets have a major limitation. To overcome this limitation, we introduce a dataset of over 1.3M full-body scientific papers with their keyphrases that can be used for automatic keyphrase extraction tasks. We also present the results of initial experiments using popular unsupervised and supervised techniques on this dataset, and we experiment with a new semi-supervised approach on it. en_US
dc.language.iso en_US en_US
dc.publisher IIIT-Delhi en_US
dc.subject keyphrase extraction en_US
dc.subject long document dataset en_US
dc.subject unsupervised keyphrase extraction en_US
dc.subject semi-supervised keyphrase extraction en_US
dc.title Long document keyphrase extraction en_US

