Abstract:
Process mining consists of mining business process event logs to discover run-time process models, verify process compliance, and extract
useful insights on process efficiency. Process model discovery from event logs
is one of the most important and challenging process mining tasks. Process
model discovery consists of learning a System Net (such as a Petri Net) from
an event log. The α-algorithm is the first and most widely used process discovery technique. Several extensions to the α-algorithm have been proposed, but we
use the basic α-algorithm as a baseline and benchmark algorithm for our
study. We present CQL (Cassandra Query Language) and SQL (Structured Query Language) implementations of the basic α-algorithm, translating its computations into CQL and SQL queries. Column-oriented
databases have been shown to improve the performance of several functions and
algorithms that require analytical query processing on a large dataset. We
conduct a benchmarking study consisting of a series of experiments on a
large real-world dataset to compare the performance of the α-algorithm
CQL and SQL implementations.
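
To illustrate what translating α-algorithm computations into SQL can look like, the following is a minimal sketch of only the first step of the algorithm: deriving the directly-follows, causal, and parallel ordering relations from an event log. It is not the implementation evaluated in this study. The table names (event_log, follows), the column names, the function name footprint, and the use of Python's sqlite3 module as the SQL engine are all assumptions made for illustration, and the later steps of the α-algorithm (constructing places and arcs) are omitted.

```python
import sqlite3


def footprint(traces):
    """Derive alpha-algorithm ordering relations from traces using SQL.

    `traces` is a list of activity sequences, one per case. The schema
    below is hypothetical and chosen only for this sketch.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE event_log (case_id INTEGER, pos INTEGER, activity TEXT)"
    )
    rows = [
        (case_id, pos, activity)
        for case_id, trace in enumerate(traces)
        for pos, activity in enumerate(trace)
    ]
    conn.executemany("INSERT INTO event_log VALUES (?, ?, ?)", rows)

    # Directly-follows relation: a > b if b occurs immediately after a
    # in at least one case.
    conn.execute(
        """
        CREATE TABLE follows AS
        SELECT DISTINCT x.activity AS a, y.activity AS b
        FROM event_log x
        JOIN event_log y
          ON x.case_id = y.case_id AND y.pos = x.pos + 1
        """
    )

    # Causal relation: a -> b if a > b holds and b > a does not.
    causal = conn.execute(
        """
        SELECT f.a, f.b FROM follows f
        WHERE NOT EXISTS (
            SELECT 1 FROM follows g WHERE g.a = f.b AND g.b = f.a
        )
        ORDER BY f.a, f.b
        """
    ).fetchall()

    # Parallel relation: a || b if both a > b and b > a hold.
    parallel = conn.execute(
        """
        SELECT f.a, f.b FROM follows f
        WHERE EXISTS (
            SELECT 1 FROM follows g WHERE g.a = f.b AND g.b = f.a
        )
        ORDER BY f.a, f.b
        """
    ).fetchall()

    conn.close()
    return causal, parallel
```

For a toy log with the three traces (a, b, c, d), (a, c, b, d), and (a, e, d), calling footprint returns the causal pairs a→b, a→c, a→e, b→d, c→d, e→d and the parallel pairs b||c and c||b. The self-join on consecutive positions is one possible formulation; a window function or a precomputed successor column would serve equally well.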