IIIT-Delhi Institutional Repository

An extensive study on state-of-the-art C decompilers


dc.contributor.author Singh, Sejal
dc.contributor.author Purandare, Rahul (Advisor)
dc.contributor.author Jain, Ridhi (Advisor)
dc.date.accessioned 2023-05-29T10:51:28Z
dc.date.available 2023-05-29T10:51:28Z
dc.date.issued 2021-06
dc.identifier.uri http://repository.iiitd.edu.in/xmlui/handle/123456789/1288
dc.description.abstract Developers often turn to C decompilers when the source code is unavailable and they need to debug or understand a program. Previous studies suggest that C decompilers are not correct; however, decompiled code is mostly used to understand a program rather than to regenerate correct code. The code produced by C decompilers is often semantically the same as the original but syntactically very different. We study the syntactic differences between original and decompiled code, focusing on comprehension of the decompiled code. We use three widely used state-of-the-art open-source decompilers: RetDec by Avast and two Radare decompiler plugins, R2Dec and R2Ghidra. The study evaluates the structural dissimilarities between the original and the decompiled code and how they affect developers' performance. We will conduct a study involving developers with programming experience to validate our intuition. We compute the similarity between the Abstract Syntax Trees (ASTs) of the various versions of a program. The C programs for the study are taken from GitHub and compiled using four different optimization levels coupled with three different compile options, resulting in 12 variants of each program. These compiled binaries are then decompiled using the three decompilers. Recompiling the decompiled code is often a challenge; we manually resolved the errors and made the code compilable without changing its algorithm. Using LLVM-clang, we then generate ASTs for the source code and the decompiled code and build an AST comparator to compare them. We calculate a similarity score based on the number of matches found between the two files, and analyze the results using graphs and different statistical methods. en_US
dc.language.iso en_US en_US
dc.publisher IIIT-Delhi en_US
dc.subject Decompilers en_US
dc.subject Compilers en_US
dc.subject Abstract Syntax Trees (AST) en_US
dc.subject Clang en_US
dc.subject Optimization levels en_US
dc.title An extensive study on state-of-the-art C decompilers en_US
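
The abstract describes scoring similarity by the number of matches between the ASTs of the original and decompiled code, but it does not give the formula. The following is only a minimal sketch under stated assumptions, not the thesis's comparator: it assumes the ASTs are available as clang text dumps, reduces each dump to a multiset of node kinds, and uses a Dice-style ratio as the score. The function names `node_kinds` and `similarity` are hypothetical.

```python
import re
from collections import Counter

# Assumption: each line of a clang text AST dump starts with tree-drawing
# characters (|, `, -, spaces) followed by the node kind, e.g. "FunctionDecl".
_NODE_KIND = re.compile(r"^[\s|`-]*([A-Za-z]+)")


def node_kinds(dump_text):
    """Return the node kinds found in an AST text dump, in order."""
    kinds = []
    for line in dump_text.splitlines():
        match = _NODE_KIND.match(line)
        if match:  # lines without a leading node kind are skipped
            kinds.append(match.group(1))
    return kinds


def similarity(kinds_a, kinds_b):
    """Dice-style score in [0, 1] based on matching node kinds.

    Assumption: a "match" is a node kind present in both ASTs, counted
    with multiplicity. The thesis may define matches differently.
    """
    count_a, count_b = Counter(kinds_a), Counter(kinds_b)
    total = sum(count_a.values()) + sum(count_b.values())
    if total == 0:
        return 1.0  # two empty ASTs are treated as identical
    matches = sum((count_a & count_b).values())
    return 2 * matches / total
```

A multiset comparison ignores tree shape, so it is a coarse proxy for structural similarity; a tree edit distance would capture nesting as well, at higher cost.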

