{"ID":2852878,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.18898","arxiv_id":"2510.18898","title":"Transformer-Based Low-Resource Language Translation: A Study on Standard Bengali to Sylheti","abstract":"Machine Translation (MT) has advanced from rule-based and statistical methods to neural approaches based on the Transformer architecture. While these methods have achieved impressive results for high-resource languages, low-resource varieties such as Sylheti remain underexplored. In this work, we investigate Bengali-to-Sylheti translation by fine-tuning multilingual Transformer models and comparing them with zero-shot large language models (LLMs). Experimental results demonstrate that fine-tuned models significantly outperform LLMs, with mBART-50 achieving the highest translation adequacy and MarianMT showing the strongest character-level fidelity. These findings highlight the importance of task-specific adaptation for underrepresented languages and contribute to ongoing efforts toward inclusive language technologies.","short_abstract":"Machine Translation (MT) has advanced from rule-based and statistical methods to neural approaches based on the Transformer architecture. While these methods have achieved impressive results for high-resource languages, low-resource varieties such as Sylheti remain underexplored. In this work, we investigate Bengali-to...","url_abs":"https://arxiv.org/abs/2510.18898","url_pdf":"https://arxiv.org/pdf/2510.18898v1","authors":"[\"Mangsura Kabir Oni\",\"Tabia Tanzin Prama\"]","published":"2025-10-20T16:29:24Z","proceeding":"cs.CL","tasks":"[\"cs.CL\",\"cs.CY\"]","methods":"[\"Transformer\",\"Large Language Model\",\"Language Model\"]","has_code":false}
