Abstract:
Community detection has gained immense popularity in recent years. Real-world networks consist of millions of nodes and edges. Groups of nodes exhibit interesting characteristics, knowledge of which can be of great help in various fields. Moreover, real-world networks are ever-evolving, and it is impossible to have complete information about a network at any given time. The aim of this thesis is to study different techniques for exploring the available incomplete network. The goal is to explore the incomplete network such that nodes belonging to the communities of the known nodes are brought into the network. We build a machine learning model to predict which node should be explored. We study four methods for selecting the node to explore, including one based on the clustering coefficient. To identify the communities of the nodes of the incomplete network, we study three algorithms: community detection by hop count, community detection by maximizing modularity, and community detection by maximizing permanence. We observe that the machine learning classifier is the best approach for maximizing recall while exploring the fewest nodes, whereas the global clustering coefficient is the best approach for maintaining high precision. BFS gives a better F1-score than the other approaches at higher budgets. The community detection algorithms perform almost equally well: by only a slight margin, the HopCount method gives better results for recall, and permanence maximization gives better results for precision.