Abstract:
An undergraduate-level Software Engineering course generally consists of a team-based,
semester-long project and emphasizes both technical and managerial skills. Software Engineering
is a practice-oriented, applied discipline, and hence there is an emphasis on hands-on
development, process, and tool usage in addition to theory and basic concepts. We present an
approach for mining process data (process mining) from software repositories that archive the data
generated as student teams construct software in an educational setting. We
present an application of mining three software repositories: the team wiki (used during
requirements engineering), the version control system (development and maintenance), and the issue
tracking system (corrective and adaptive maintenance) in the context of an undergraduate Software
Engineering course. We propose visualizations, metrics, and algorithms to provide insight into
practices and procedures followed during various phases of a software development life-cycle.
The proposed visualizations and metrics (learning analytics) give the instructor a multi-faceted
view and serve as a feedback tool on the students' development process and quality. We mine
the event logs produced by software repositories and derive insights such as degree of individual
contributions in a team, quality of commit messages, intensity and consistency of commit
activities, bug fixing process trend and quality, component and developer entropy, process compliance
and verification. We present our empirical analysis on a software repository dataset consisting
of 19 teams of 5 members each and discuss challenges, limitations and recommendations.