dc.description.abstract |
Process-Aware Information Systems (PAIS) support business processes and
generate large volumes of event logs as those processes execute.
Each event in an event log is represented as a tuple of CaseID, Timestamp, Activity,
and Actor. Process mining is an emerging field that analyzes
event logs to discover, enhance, and improve business processes
and to check conformance between run time and design time. The large volume
of event logs generated is stored in databases such as relational,
NoSQL, and NewSQL systems. While relational databases perform well for
one class of applications, there is another class of applications for
which such databases create bottlenecks (such as scalability and sharding). To
handle this class of applications, NoSQL database systems have emerged.
A relevant application of interest is the process mining task of discovering
a process model (workflow model) from event logs. The α-miner algorithm
is one of the first and most widely used Process Discovery techniques. Our
objective is to investigate which type of database (relational or NoSQL) performs
better for a Process Discovery application in Process Mining. We
implement the α-miner algorithm on a relational (row-oriented) database and a NoSQL
(column-oriented) database, using each database's query language, so that our algorithm
is tightly coupled to the database. We benchmark the performance
of the α-miner algorithm on the row-oriented database and the NoSQL column-oriented
database to determine which database can efficiently store massive
event logs and analyze them in seconds to discover a process model. |
en_US |