IIIT-Delhi Institutional Repository

Frequency domain gradient visualization for acoustic models

dc.contributor.author Thakran, Yash
dc.contributor.author Abrol, Vinayak (Advisor)
dc.date.accessioned 2024-05-20T08:56:13Z
dc.date.available 2024-05-20T08:56:13Z
dc.date.issued 2023-12-08
dc.identifier.uri http://repository.iiitd.edu.in/xmlui/handle/123456789/1533
dc.description.abstract Modeling raw waveforms directly with neural networks is gaining increasing attention in speech processing. Despite its varying degrees of success, a question remains: what information do such networks capture or learn from the speech signal for different tasks? Such insight is valuable not only for advancing these techniques but also for better understanding the characteristics of the speech signal. This paper takes a step in that direction by developing a gradient-based approach that estimates the relevance of each input speech sample to the output score. We show that analyzing the resulting “relevance signal” with conventional speech signal processing techniques can reveal the information modeled by the whole network. We demonstrate the potential of the proposed approach by analyzing raw-waveform CNN-based automatic speech recognition and speaker verification systems. en_US
dc.language.iso en_US en_US
dc.publisher IIIT-Delhi en_US
dc.subject deep learning en_US
dc.subject CNN visualization en_US
dc.subject gradients en_US
dc.subject raw waveforms en_US
dc.title Frequency domain gradient visualization for acoustic models en_US
dc.type Other en_US
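
The gradient-based relevance analysis described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the "network" here is a hypothetical toy model (one convolution filter, a ReLU, and sum pooling), and the names toy_score, relevance_signal, magnitude_spectrum and tuned_bin are assumptions introduced only for this example. The sketch computes the gradient of the output score with respect to every input sample, treats that gradient as a relevance signal, and inspects its magnitude spectrum with a plain DFT.

```python
import cmath
import math


def toy_score(x, w):
    """Hypothetical stand-in for a network: one conv filter, ReLU, sum pooling."""
    k = len(w)
    total = 0.0
    for t in range(len(x) - k + 1):
        act = sum(w[j] * x[t + j] for j in range(k))
        total += max(0.0, act)
    return total


def relevance_signal(x, w):
    """Analytic gradient of toy_score with respect to each input sample."""
    k = len(w)
    grad = [0.0] * len(x)
    for t in range(len(x) - k + 1):
        act = sum(w[j] * x[t + j] for j in range(k))
        if act > 0.0:  # ReLU passes gradient only where the unit is active
            for j in range(k):
                grad[t + j] += w[j]
    return grad


def magnitude_spectrum(signal):
    """Magnitude of a direct DFT for bins 0..n/2 (O(n^2), fine for a toy signal)."""
    n = len(signal)
    spectrum = []
    for f in range(n // 2 + 1):
        coeff = sum(s * cmath.exp(-2j * math.pi * f * i / n)
                    for i, s in enumerate(signal))
        spectrum.append(abs(coeff))
    return spectrum


# Assumed toy setup: a 64-sample sinusoid and a 16-tap filter tuned to DFT bin 8.
n, k, tuned_bin = 64, 16, 8
x = [math.sin(2 * math.pi * tuned_bin * i / n + 0.3) for i in range(n)]
w = [math.sin(2 * math.pi * tuned_bin * j / n) for j in range(k)]

rel = relevance_signal(x, w)      # one relevance value per input sample
spec = magnitude_spectrum(rel)    # frequency-domain view of the relevance
peak_bin = max(range(len(spec)), key=spec.__getitem__)
print("relevance spectrum peaks at bin", peak_bin)
```

In this toy setup the relevance spectrum concentrates around the frequency the filter is tuned to, which is the kind of reading the abstract describes. For a real acoustic model the gradient would normally come from an automatic differentiation framework rather than a hand-derived formula.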

