Abstract:
Over the past decade, with the explosion of smartphones and the pervasive use of data connectivity, location-based services have become increasingly popular. As a result, Location-based Social Networks (LBSNs) such as Foursquare, Facebook Places, Brightkite, and Gowalla have emerged. These platforms allow users not only to connect, share, and interact but also to share their check-in information with their friends. These networks typically do not expose users’ check-in information due to privacy concerns. Although popular LBSNs such as Foursquare and Brightkite have made datasets publicly available, these datasets are anonymized, and only the latitude and longitude of each check-in are available. Latitude and longitude reveal only users’ spatial preferences; to understand their tastes, it is essential to know the location type, such as restaurant, multiplex, or fitness center. Previous work overlooked location categories (e.g., restaurant, multiplex) because this information was unavailable. In this thesis, we introduce a new direction: inferring categories and leveraging them for location-based applications such as Location Prediction, Location Promotion, Influence Maximization, and Community Detection. We first align the location information across different networks to infer categories at a coarse level (i.e., multiple categories per location) for the publicly available datasets. Moreover, we collect LBSN data from Foursquare through spatio-temporal Twitter posts, from which we obtain categories along with check-in location, time, and users’ social connections. The dataset is released publicly for researchers. We call this fine-grained category information, as a single category is associated with each location. The crucial task at hand is to model all this heterogeneous information (i.e., spatial, temporal, and categorical) and leverage it for location-based applications.
We propose models that jointly capture spatial, temporal, and categorical features, and show that the use of auxiliary information and joint modeling improves over state-of-the-art methods. Specifically, we propose three semi-supervised machine learning models: 1) Category Language Model for next-location prediction, 2) LoCaTe and LoCaTe+ for quantifying the influence between two users, and 3) CoLAB for detecting implicit communities and modeling the information diffusion process in LBSNs.