IIIT-Delhi Institutional Repository

Semantic similarity through hierarchical abstraction of knowledge

dc.contributor.author Arora, Kanchan
dc.contributor.author Bedathur, Srikanta (Advisor)
dc.date.accessioned 2014-12-12T07:07:16Z
dc.date.available 2014-12-12T07:07:16Z
dc.date.issued 2014-12-12T07:07:16Z
dc.identifier.uri https://repository.iiitd.edu.in/jspui/handle/123456789/205
dc.description.abstract Identifying semantic similarity between two texts has many applications in NLP, including information extraction and retrieval, word sense disambiguation, text summarization and type classification. Similarity between texts is commonly determined using a taxonomy-based approach, but the limited scalability of existing taxonomies has led recent research to use Wikipedia's encyclopaedic knowledge base to find similarity or relatedness. In this thesis, we propose Hierarchical Semantic Analysis (HSA), a method that represents the semantics of a text in a high-dimensional space of Wikipedia concepts and category hierarchies. We represent the meaning of any text excerpt as a weighted vector of Wikipedia-based resources. To evaluate the similarity of texts in this space, we compare the corresponding vectors using conventional metrics (e.g. cosine). Compared with the previous state of the art, HSA substantially improves the correlation of computed similarity scores with human judgements: from r = 0.873 to r = 0.901 for short sentence pairs and from r = 0.72 to r = 0.863 for paragraph pairs. en_US
dc.language.iso en_US en_US
dc.subject Wikipedia en_US
dc.subject Semantic Similarity en_US
dc.subject Hierarchical Abstraction en_US
dc.title Semantic similarity through hierarchical abstraction of knowledge en_US
dc.type Thesis en_US
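
The abstract describes comparing weighted vectors of Wikipedia-based resources with a conventional metric such as cosine. The following is a minimal sketch of that comparison step only, assuming each text has already been mapped to a sparse vector stored as a dictionary from resource name to weight. The function name cosine_similarity, the resource names and the weights are hypothetical illustrations; they are not taken from the thesis and do not reproduce how HSA builds or weights its vectors.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two sparse vectors stored as {key: weight} dicts."""
    # Iterate over the smaller vector; keys missing from the other count as 0.
    if len(u) > len(v):
        u, v = v, u
    dot = sum(weight * v.get(key, 0.0) for key, weight in u.items())
    norm_u = math.sqrt(sum(weight * weight for weight in u.values()))
    norm_v = math.sqrt(sum(weight * weight for weight in v.values()))
    # An all-zero or empty vector has no direction; treat similarity as 0.
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical weights over Wikipedia articles and categories (illustration only).
text_a = {"Dog": 0.8, "Pet": 0.5, "Category:Mammals": 0.3}
text_b = {"Cat": 0.7, "Pet": 0.6, "Category:Mammals": 0.4}

score = cosine_similarity(text_a, text_b)
print(f"similarity = {score:.3f}")
```

In this toy example the two texts share no article-level concept other than "Pet", yet the shared category entry still contributes to the score, which is the kind of effect that adding category hierarchies to the vector space is meant to capture.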

