Static technique for bug localization using character N-Gram based information retrieval model

Sangeeta

Please use this identifier to cite or link to this item: http://repository.iiitd.edu.in/xmlui/handle/123456789/17

Title:	Static technique for bug localization using character N-Gram based information retrieval model
Authors:	Surekha, Ashish Sangeeta
Keywords:	Bug Localization; Mining Software Repositories; Information Retrieval; Automated Software Engineering
Issue Date:	14-Mar-2012
Abstract:	Bug (or fault) localization is the process of identifying the specific location(s) or region(s) of source code (at various granularity levels such as the directory path, file, method or statement) that is faulty and needs to be modified to repair the defect. Bug localization is a routine task in software maintenance (corrective maintenance). Due to the increasing size and complexity of current software applications, automated solutions for bug localization can significantly reduce human effort and software maintenance cost. We presented a technique (which falls into the class of static techniques for bug localization) for bug localization using a character N-gram based Information Retrieval (IR) model. We framed the problem of bug localization as a relevant document(s) search task for a given query and investigated the application of character-level N-gram based textual features derived from bug reports and source-code file attributes. We implemented the proposed IR model and evaluated its performance on datasets downloaded from two popular open-source projects (JBOSS and Apache). We conducted a series of experiments to validate our hypothesis and presented evidence to demonstrate that the proposed approach is effective. The accuracy of the proposed approach is measured in terms of the standard and commonly used SCORE and MAP (Mean Average Precision) metrics for the task of bug localization. Experimental results reveal that the median value of the SCORE metric for the JBOSS and Apache datasets is 99.03% and 93.70%, respectively. We observed that for 16.16% of the bug reports in the JBOSS dataset and for 10.67% of the bug reports in the Apache dataset, the average precision value (computed at all recall levels) is between 0.9 and 1.0.
URI:	https://repository.iiitd.edu.in/jspui/handle/123456789/17
Appears in Collections:	Year-2011
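
The abstract frames bug localization as a document search: the bug report is the query, source files are the documents, and both are represented by character-level N-grams. The following is a minimal sketch of that idea, not the thesis's actual model. The trigram size, the cosine similarity weighting, the function names, and the file names used in the usage note are all assumptions made for illustration. The last function computes average precision for one ranked list, the per-query quantity that MAP averages over all bug reports.

```python
from collections import Counter
from math import sqrt


def char_ngrams(text, n=3):
    """Count the overlapping character n-grams of the lower-cased text."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine(a, b):
    """Cosine similarity between two n-gram count vectors (assumed weighting)."""
    dot = sum(count * b[gram] for gram, count in a.items() if gram in b)
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_files(bug_report, files, n=3):
    """Rank file paths by similarity to the bug report, most similar first.

    `files` maps a path to the text extracted for that file.
    Ties are broken alphabetically by path so the order is deterministic.
    """
    query = char_ngrams(bug_report, n)
    scored = [(cosine(query, char_ngrams(text, n)), path)
              for path, text in files.items()]
    return [path for _, path in sorted(scored, key=lambda s: (-s[0], s[1]))]


def average_precision(ranked, relevant):
    """Mean of the precision values at each rank where a relevant item appears."""
    hits = 0
    total = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0
```

As a usage illustration with hypothetical data: given files `{"src/ConnectionPool.java": ..., "src/Logger.java": ...}` and a report mentioning "releasing connection from pool", `rank_files` would be expected to place the connection pool file first because it shares many more trigrams with the report. Character N-grams tolerate sub-word overlap such as "releasing" versus "releaseConnection" without stemming or identifier splitting, which is one plausible motivation for the approach.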

Files in This Item:

File	Description	Size	Format
MTech-Thesis-Sangeeta-Lal-13-OCT-2011.pdf		1.24 MB	Adobe PDF
