{"ID":2872916,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.07471","arxiv_id":"2509.07471","title":"From Scarcity to Efficiency: Investigating the Effects of Data Augmentation on African Machine Translation","abstract":"The linguistic diversity across the African continent presents different challenges and opportunities for machine translation. This study explores the effects of data augmentation techniques in improving translation systems in low-resource African languages. We focus on two data augmentation techniques: sentence concatenation with back translation and switch-out, applying them across six African languages. Our experiments show significant improvements in machine translation performance, with a minimum increase of 25% in BLEU score across all six languages. We provide a comprehensive analysis and highlight the potential of these techniques to improve machine translation systems for low-resource languages, contributing to the development of more robust translation systems for under-resourced languages.","short_abstract":"The linguistic diversity across the African continent presents different challenges and opportunities for machine translation. This study explores the effects of data augmentation techniques in improving translation systems in low-resource African languages. We focus on two data augmentation techniques: sentence concat...","url_abs":"https://arxiv.org/abs/2509.07471","url_pdf":"https://arxiv.org/pdf/2509.07471v2","authors":"[\"Mardiyyah Oduwole\",\"Oluwatosin Olajide\",\"Jamiu Suleiman\",\"Faith Hunja\",\"Busayo Awobade\",\"Fatimo Adebanjo\",\"Comfort Akanni\",\"Chinonyelum Igwe\",\"Peace Ododo\",\"Promise Omoigui\",\"Abraham Owodunni\",\"Steven Kolawole\"]","published":"2025-09-09T07:49:37Z","proceeding":"cs.CL","tasks":"[\"cs.CL\"]","methods":"[]","has_code":false}
