Abstract:
High Utility Itemset Mining (HUIM) has made significant progress in recent years.
HUIM refers to the task of finding the most relevant itemsets in a database, and it has applications in sensor data analytics, ad-click data analytics, and retail. HUIM associates a notion of utility with each item, which is not possible in frequent pattern mining (FPM). FPM considers only the presence or absence of an item in a transaction, and hence provides no flexibility when different items have different importance. Pattern mining work has so far focused mainly on static databases. However, growing data volumes and the need for timely information require existing methods to be either scaled or adapted to the streaming environment. The key concerns for streaming data are high-throughput computation with minimal time and space usage. In this thesis, we implemented and compared top-k streaming versions of the state-of-the-art algorithms T-HUDS (High Utility Itemset Mining over Data Stream), FHM (Faster High Utility Itemset Mining), and EFIM (Efficient High Utility Itemset Mining) on various databases. Our experimental results show that Stream-FHM and Stream-EFIM outperform the tree-based T-HUDS algorithm. Stream-FHM performs better on sparse databases, and Stream-EFIM performs better on dense databases.
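
To make the contrast between frequency and utility concrete, the following is a minimal sketch on a toy database. It is a brute-force illustration only, not an implementation of T-HUDS, FHM, or EFIM, and it ignores the streaming (sliding-window) setting. The transactions, the unit profits, and the function names (`support`, `utility`, `top_k_by_utility`) are all hypothetical and chosen for illustration.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> purchased quantity.
transactions = [
    {"bread": 1, "milk": 2},
    {"bread": 2, "caviar": 1},
    {"milk": 1, "caviar": 1},
    {"bread": 1, "milk": 1},
]

# Hypothetical external utility (profit per unit) of each item.
unit_profit = {"bread": 1, "milk": 2, "caviar": 50}


def support(itemset, db):
    """FPM view: number of transactions that contain every item of the itemset."""
    return sum(1 for t in db if all(i in t for i in itemset))


def utility(itemset, db, profit):
    """HUIM view: sum of quantity * unit profit over the transactions
    that contain every item of the itemset."""
    total = 0
    for t in db:
        if all(i in t for i in itemset):
            total += sum(t[i] * profit[i] for i in itemset)
    return total


def top_k_by_utility(db, profit, k):
    """Enumerate every non-empty itemset (exponential, toy sizes only)
    and return the k itemsets with the highest utility."""
    items = sorted({i for t in db for i in t})
    scored = []
    for r in range(1, len(items) + 1):
        for combo in combinations(items, r):
            u = utility(combo, db, profit)
            if u > 0:
                scored.append((u, list(combo)))
    # Highest utility first; ties broken alphabetically for determinism.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return scored[:k]


print(support(("bread", "milk"), transactions))               # 2
print(utility(("bread", "milk"), transactions, unit_profit))  # 8
print(support(("caviar",), transactions))                     # 2
print(utility(("caviar",), transactions, unit_profit))        # 100
print(top_k_by_utility(transactions, unit_profit, 2))
```

In this toy data, {bread, milk} and {caviar} have the same support (2), so FPM cannot distinguish them, whereas their utilities differ by more than an order of magnitude (8 versus 100). A top-k formulation returns the k highest-utility itemsets instead of requiring a minimum utility threshold to be fixed in advance.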