Abstract:
A taxonomy is a hierarchical structure for organizing knowledge. It can be viewed as a knowledge graph in which every relation is an ‘is-a’ relation. Taxonomies can be built automatically in three ways: by creation from scratch, by completion of an existing taxonomy, or by expansion of an existing taxonomy. In taxonomy expansion, nodes can be inserted only at the leaves, while in taxonomy completion nodes can be inserted at any position in the existing taxonomy. This thesis explores taxonomy completion by introducing a polar embedding for each term. The taxonomy is viewed as a set of concentric circles, and the inheritance of the terms is encoded in the sectors formed on those circles. Each circle (or orbit) represents one level of the taxonomy, and the radius of the circle increases exponentially as we go down the taxonomy. Semantically similar terms in the first orbit are clustered to form a sector. The sectors in each following orbit inherit the children of their corresponding sectors in the previous orbit. The number of sub-sectors of a particular sector indicates the number of sectors that will be children of it. A sector can then be represented by its starting and ending points, which are angles. Each term has a unique ID represented as an r-hot vector, which is a one-hot vector whose non-zero value indicates the level of the term. In the previous literature on taxonomy completion, the parent is most often found using a scoring method. This thesis instead predicts the parent directly, and the unique ID described above is instrumental in doing so. The thesis therefore departs from the standard trend in taxonomy completion and expansion in two main ways. The first is the use of polar coordinates, where the radius indicates the level of the term and the angle indicates where the term sits within that level. The second is the use of unique IDs and direct prediction of the parent.
Predicting the parent directly would save considerable time during inference, and it would also remove the need to rank candidate parents. Ranking is a poor fit for this task because the parent of a node must be exact rather than one of several ranked options. Thus, this thesis aims to reduce inference time and to give exact information for a taxonomy.
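
The representation described above can be illustrated with a minimal sketch. It is not the thesis implementation: the function names, the base of 2 for the exponential radius, and the equal-width split of a sector into sub-sectors are all assumptions made here for illustration only.

```python
import math


def r_hot(index, level, size):
    """Build an r-hot ID: a one-hot style vector whose single
    non-zero entry holds the level of the term (hypothetical helper)."""
    vec = [0] * size
    vec[index] = level
    return vec


def orbit_radius(level, base=2.0):
    """Radius of the orbit for a level; grows exponentially with depth.
    The base value is an assumption, not taken from the thesis."""
    return base ** level


def split_sector(start, end, n_children):
    """Divide a sector (start and end angles in radians) into
    n_children sub-sectors. Equal widths are assumed for simplicity."""
    width = (end - start) / n_children
    return [(start + i * width, start + (i + 1) * width)
            for i in range(n_children)]


def to_cartesian(level, angle, base=2.0):
    """Convert a term's polar position (level, angle) to x, y."""
    r = orbit_radius(level, base)
    return (r * math.cos(angle), r * math.sin(angle))


# A sector on the first orbit spanning the upper half-plane,
# with two child sectors on the next orbit.
children = split_sector(0.0, math.pi, 2)
term_id = r_hot(index=2, level=3, size=5)
```

In this sketch a term is located by a level, which fixes the radius, and an angle that falls inside the sector inherited from its parent. The r-hot ID carries both the identity of the term (the position of the non-zero entry) and its level (the value of that entry).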