Abstract:
In the contemporary multilingual landscape of online communication, code-mixed language, the seamless integration of multiple languages within a single utterance, has become increasingly prevalent. The Transformer architecture, a revolutionary development in natural language processing, has significantly facilitated the modeling of such linguistic complexity. Despite its efficacy, however, deploying Transformer models on edge devices presents challenges. The inherent depth of Transformer models, while enhancing their learning capacity, poses obstacles for deployment on resource-constrained edge devices: characterized by limited computational capabilities, these devices struggle with the computational intensity of deep models, resulting in impractical latency. Consequently, the benefits of code-mixed language processing remain confined to internet-based usage, which restricts the accessibility and utility of Transformer models, especially in scenarios where real-time, low-latency processing is imperative. As technological advancements continue, addressing these deployment challenges and enabling the efficient implementation of Transformer models on edge devices could unlock new possibilities for seamless, multilingual communication in diverse settings.