Cost bias mitigation of audio deepfake detection

Kandpal, Sarthak; Gupta, Anubha (Advisor)

dc.contributor.author	Kandpal, Sarthak
dc.contributor.author	Gupta, Anubha (Advisor)
dc.date.accessioned	2026-04-18T09:59:11Z
dc.date.available	2026-04-18T09:59:11Z
dc.date.issued	2024-12-08
dc.identifier.uri	http://repository.iiitd.edu.in/xmlui/handle/123456789/1935
dc.description.abstract	With rapid advances in speech synthesis technology, distinguishing genuine audio from fake audio has become increasingly challenging. This semester, the project focused on benchmarking voice conversion models as a preparatory step toward creating a dataset for cost bias mitigation in audio deepfake detection. Owing to several constraints and the inefficiency of fine-tuning the models, only one model was benchmarked correctly. To demonstrate progress and contribute meaningfully, a dataset of 20,596 utterances, named Kalpvani, was proposed using the benchmarked model. A user study was conducted in which participants were presented with 6 fake and 6 real audio clips and evaluated the cloned audio through subjective analysis. Participants were also asked to assess whether a given cloned audio clip was closer to the source audio or to the target audio. Furthermore, speaker verification systems such as ECAPA-TDNN and ResNet-TDNN were used to calculate Equal Error Rates (EER) for target-clone and source-clone pairs, providing an objective evaluation of voice similarity. This benchmarking lays the foundation for future work on cost bias mitigation in audio deepfake detection.	en_US
dc.language.iso	en_US	en_US
dc.publisher	IIIT-Delhi	en_US
dc.subject	Voice Conversion	en_US
dc.subject	Deepfake Detection	en_US
dc.subject	Equal Error Rate	en_US
dc.title	Cost bias mitigation of audio deepfake detection	en_US
dc.type	Other	en_US
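
The abstract reports Equal Error Rates (EER) computed from speaker verification scores. As a rough illustration only, the sketch below shows one common way to estimate an EER from two lists of similarity scores: sweep a decision threshold and take the point where the false acceptance rate and false rejection rate are closest. The function name, the way trials are split into "genuine" and "impostor" lists, and the example scores are all assumptions made for illustration; the record does not state how the target-clone and source-clone trials were constructed or scored.

```python
def equal_error_rate(genuine_scores, impostor_scores):
    """Estimate the EER and its threshold from similarity scores.

    genuine_scores: scores for trials that should be accepted.
    impostor_scores: scores for trials that should be rejected.
    Higher scores are assumed to mean "more similar".
    """
    if not genuine_scores or not impostor_scores:
        raise ValueError("both score lists must be non-empty")

    best_gap, best_eer, best_threshold = None, None, None
    # Every observed score is a candidate decision threshold.
    for threshold in sorted(set(genuine_scores) | set(impostor_scores)):
        # False acceptance rate: impostor trials scored at or above the threshold.
        far = sum(s >= threshold for s in impostor_scores) / len(impostor_scores)
        # False rejection rate: genuine trials scored below the threshold.
        frr = sum(s < threshold for s in genuine_scores) / len(genuine_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap = gap
            best_eer = (far + frr) / 2
            best_threshold = threshold
    return best_eer, best_threshold


# Invented scores, purely illustrative (not results from the project).
genuine = [0.9, 0.8, 0.7, 0.4]
impostor = [0.6, 0.3, 0.2, 0.1]
eer, threshold = equal_error_rate(genuine, impostor)
print(f"EER = {eer:.2%} at threshold {threshold}")
```

With real embeddings, the scores would typically be cosine similarities between speaker embeddings of the two utterances in each trial; that scoring choice is likewise an assumption here.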