IIIT-Delhi Institutional Repository

Static technique for bug localization using character N-Gram based information retrieval model

dc.contributor.advisor Surekha, Ashish
dc.contributor.author Sangeeta
dc.date.accessioned 2012-03-14T10:36:50Z
dc.date.available 2012-03-14T10:36:50Z
dc.date.issued 2012-03-14T10:36:50Z
dc.identifier.uri https://repository.iiitd.edu.in/jspui/handle/123456789/17
dc.description.abstract Bug or fault localization is the process of identifying the specific location(s) or region(s) of source code (at various granularity levels such as the directory path, file, method or statement) that is faulty and needs to be modified to repair the defect. Bug localization is a routine task in software maintenance (corrective maintenance). Due to the increasing size and complexity of current software applications, automated solutions for bug localization can significantly reduce human effort and software maintenance cost. We presented a static technique for bug localization using a character N-gram based Information Retrieval (IR) model. We framed the problem of bug localization as a relevant document(s) search task for a given query and investigated the application of character-level N-gram based textual features derived from bug reports and source-code file attributes. We implemented the proposed IR model and evaluated its performance on datasets downloaded from two popular open-source projects (JBOSS and Apache). We conducted a series of experiments to validate our hypothesis and presented evidence to demonstrate that the proposed approach is effective. The accuracy of the proposed approach is measured in terms of the standard and commonly used SCORE and MAP (Mean Average Precision) metrics for the task of bug localization. Experimental results reveal that the median values of the SCORE metric for the JBOSS and Apache datasets are 99.03% and 93.70% respectively. We observed that for 16.16% of the bug reports in the JBOSS dataset and for 10.67% of the bug reports in the Apache dataset, the average precision value (computed at all recall levels) is between 0.9 and 1.0. en_US
dc.language.iso en_US en_US
dc.subject Bug Localization en_US
dc.subject Mining Software repositories en_US
dc.subject Information Retrieval en_US
dc.subject Automated Software engineering en_US
dc.title Static technique for bug localization using character N-Gram based information retrieval model en_US
dc.type Thesis en_US
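
Illustrative note: the abstract frames bug localization as retrieving relevant source files for a bug-report query using character-level N-gram features, and reports average precision per bug report. The following is only a minimal sketch of that general idea, not the thesis's implementation. The cosine similarity over raw N-gram counts, the choice of n = 3, and all function names (`char_ngrams`, `cosine`, `rank_files`, `average_precision`) are assumptions made for illustration; the record does not state which similarity function or feature weighting the thesis uses.

```python
from collections import Counter
from math import sqrt


def char_ngrams(text, n=3):
    """Count overlapping character n-grams in the lowercased text."""
    s = text.lower()
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))


def cosine(a, b):
    """Cosine similarity between two n-gram count vectors (assumed measure)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_files(bug_report, files, n=3):
    """Rank file paths by similarity of their text to the bug report.

    `files` maps a file path to the text extracted from that file
    (hypothetical input format). Ties are broken by path name.
    """
    query = char_ngrams(bug_report, n)
    scored = [(cosine(query, char_ngrams(text, n)), path)
              for path, text in files.items()]
    return [path for _, path in sorted(scored, key=lambda x: (-x[0], x[1]))]


def average_precision(ranking, relevant):
    """Average precision of one ranking given the set of truly fixed files."""
    hits = 0
    total = 0.0
    for rank, path in enumerate(ranking, start=1):
        if path in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0
```

Under these assumptions, MAP would be the mean of `average_precision` over all bug reports in a dataset. Because character N-grams match on substrings, a report mentioning "parser" can still match an identifier such as `XmlParserImpl` without word tokenization or stemming.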
