Abstract:
Despite advances in Machine Translation (MT), translating into gender-marked languages remains challenging. The challenge is further compounded by existing models that attempt to handle co-reference resolution and translation simultaneously. Additionally, standard parallel MT datasets often default to masculine forms in Hindi, introducing a bias that limits MT models' ability to accurately learn gendered translations. This paper investigates this problem using English (En) to Hindi (Hi) translation, a language pair with distinct gender-marked grammatical structures, and addresses these challenges in the following ways: (1) It introduces the Speaker-Aware Gender Evaluation Corpus (SAGECorp), a synthetic dataset comprising 13,420 En-Hi sentence pairs, including contrastive gendered sentences for each pair. (2) To address the inefficiencies of existing models, it proposes a lightweight, plug-and-play framework that leverages a small language model (SLM) as a post-processing step to improve gender-aware translation. (3) To robustly measure the effectiveness of co-reference resolution in gender-aware models, a new metric, Weighted Gender Accuracy (WGA), is proposed. Finally, extensive benchmarking of three small multilingual language models (LLAMA-3.2, Phi-3.5-mini, Gemma 2.0) is carried out on the SAGECorp dataset across multiple metrics, including the newly introduced WGA metric. The proposed framework demonstrates a 16% average improvement over the baseline translator in gender-aware translation when evaluated on the same dataset.