Please use this identifier to cite or link to this item: http://repository.iiitd.edu.in/xmlui/handle/123456789/205
Full metadata record
DC Field | Value | Language
dc.contributor.author | Arora, Kanchan | -
dc.contributor.author | Bedathur, Srikanta (Advisor) | -
dc.date.accessioned | 2014-12-12T07:07:16Z | -
dc.date.available | 2014-12-12T07:07:16Z | -
dc.date.issued | 2014-12-12T07:07:16Z | -
dc.identifier.uri | https://repository.iiitd.edu.in/jspui/handle/123456789/205 | -
dc.description.abstract | Identifying semantic similarity between two texts has many applications in NLP, including information extraction and retrieval, word sense disambiguation, text summarization and type classification. Similarity between texts is commonly determined using a taxonomy-based approach, but the limited scalability of existing taxonomies has led recent research to use Wikipedia's encyclopaedic knowledge base to find similarity or relatedness. In this thesis, we propose Hierarchical Semantic Analysis (HSA), a method which represents the semantics of a text in a high-dimensional space of Wikipedia concepts and category hierarchies. We represent the meaning of any text excerpt as a weighted vector of Wikipedia-based resources. To evaluate the similarity of texts in this space, we compare the corresponding vectors using conventional metrics (e.g. cosine). Compared with the previous state of the art, use of HSA results in substantial improvements in the correlation of computed similarity scores with human judgements, from r = 0.873 to 0.901 for short sentence pairs and from r = 0.72 to 0.863 for paragraph pairs. | en_US
dc.language.iso | en_US | en_US
dc.subject | Wikipedia | en_US
dc.subject | Semantic Similarity | en_US
dc.subject | Hierarchical Abstraction | en_US
dc.title | Semantic similarity through hierarchical abstraction of knowledge | en_US
dc.type | Thesis | en_US
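
The abstract describes comparing weighted vectors of Wikipedia-based resources with a conventional metric such as cosine. The following is a minimal sketch of that comparison step only, assuming each text has already been mapped to a sparse vector. The function name `cosine_similarity`, the dictionary representation, and the example concept names and weights are illustrative assumptions, not taken from the thesis.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two sparse weighted vectors.

    Each vector is a dict mapping a resource name (hypothetical
    Wikipedia concept or category label) to a numeric weight.
    """
    # Only resources present in both vectors contribute to the dot product.
    dot = sum(weight * v[key] for key, weight in u.items() if key in v)
    norm_u = math.sqrt(sum(weight * weight for weight in u.values()))
    norm_v = math.sqrt(sum(weight * weight for weight in v.values()))
    # An empty or all-zero vector has no direction; treat similarity as 0.
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Made-up vectors for two text excerpts (weights are illustrative only).
text_a = {"Dog": 0.8, "Pet": 0.5, "Category:Mammals": 0.3}
text_b = {"Cat": 0.7, "Pet": 0.6, "Category:Mammals": 0.4}

score = cosine_similarity(text_a, text_b)
print(f"similarity: {score:.3f}")
```

In this sketch the two excerpts share no article-level concept other than "Pet", so the shared category entry is what raises the score. How the thesis actually weights concepts and categories is not stated in this record.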
Appears in Collections: Year-2014

Files in This Item:
File | Description | Size | Format
MT12039.pdf | | 358.13 kB | Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.