dc.description.abstract |
This paper presents a comprehensive comparative analysis of Large Language Models (LLMs) for code documentation generation. Code documentation is an essential part of the software development process, as it enables new users to learn and build on an existing code base with relative ease. The paper evaluates the models GPT-3.5, GPT-4, Bard, Llama 2, and StarChat. Our evaluation employs a checklist-based system to minimize subjectivity, providing a more objective assessment. We find that, barring StarChat, all LLMs consistently outperform the original documentation. Notably, the closed-source models GPT-3.5, GPT-4, and Bard exhibit superior performance across various parameters compared to the open-source alternatives, namely Llama 2 and StarChat. Additionally, in terms of generation time, GPT-4 leads, followed by Llama 2 and Bard, with ChatGPT and StarChat exhibiting comparable generation times. This study contributes insights into the nuanced challenges of industry-level code documentation generation and establishes benchmarks for future research in this evolving domain. |
en_US |