IIIT-Delhi Institutional Repository

Quaternion-enhanced neural networks : a new paradigm for audio processing efficiency

dc.contributor.author Chaudhary, Aryan
dc.contributor.author Abrol, Vinayak (Advisor)
dc.date.accessioned 2024-09-25T13:54:20Z
dc.date.available 2024-09-25T13:54:20Z
dc.date.issued 2024-05-01
dc.identifier.uri http://repository.iiitd.edu.in/xmlui/handle/123456789/1681
dc.description.abstract This thesis explores the integration of quaternion algebra into neural network architectures to improve their efficiency across diverse audio processing tasks. Quaternion-based transformations provide structural compression, reducing model size and computational demands while retaining high task accuracy and reliability and enhancing the model’s learning capabilities. The thesis presents three main studies. The first applies quaternion models to on-device keyword spotting, demonstrating that they match the performance of state-of-the-art models with a fraction of the computational footprint. The second investigates the combined use of quaternion transformations and pruning techniques in convolutional neural networks for audio tagging, achieving substantial reductions in computational demands and memory usage. The third explores the use of quaternion algebra in speech synthesis through vocoder models, enabling high-quality speech generation with significantly fewer parameters and lower computational overhead. Across these applications, the proposed quaternion models substantially reduce parameter count and computational load, making them suitable for deployment on resource-limited devices. Experimental validation on standard datasets highlights the effectiveness and versatility of these models. Together, these studies underscore the potential of quaternion-based models for advancing real-world applications on edge devices; each study matches or sets state-of-the-art performance in its domain. en_US
dc.language.iso en_US en_US
dc.publisher IIIT-Delhi en_US
dc.subject Quaternion Neural Networks en_US
dc.subject Audio Processing en_US
dc.subject Keyword Spotting en_US
dc.subject Audio Tagging en_US
dc.subject Speech Synthesis en_US
dc.subject Vocoder Models en_US
dc.subject Loss-Landscape Visualisation en_US
dc.subject Speech and Audio Applications en_US
dc.title Quaternion-enhanced neural networks : a new paradigm for audio processing efficiency en_US
dc.type Thesis en_US

