{"ID":2874340,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.05425","arxiv_id":"2509.05425","title":"No Text Needed: Forecasting MT Quality and Inequity from Fertility and Metadata","abstract":"We show that translation quality can be predicted with surprising accuracy \\textit{without ever running the translation system itself}. Using only a handful of features, namely token fertility ratios, token counts, and basic linguistic metadata (language family, script, and region), we can forecast ChrF scores for GPT-4o translations across 203 languages in the FLORES-200 benchmark. Gradient boosting models achieve favorable performance ($R^{2}=0.66$ for XX$\\rightarrow$English and $R^{2}=0.72$ for English$\\rightarrow$XX). Feature importance analyses reveal that typological factors dominate predictions into English, while fertility plays a larger role for translations into diverse target languages. These findings suggest that translation quality is shaped by both token-level fertility and broader linguistic typology, offering new insights for multilingual evaluation and quality estimation.","short_abstract":"We show that translation quality can be predicted with surprising accuracy \\textit{without ever running the translation system itself}. Using only a handful of features, namely token fertility ratios, token counts, and basic linguistic metadata (language family, script, and region), we can forecast ChrF scores for GPT-4o tran...","url_abs":"https://arxiv.org/abs/2509.05425","url_pdf":"https://arxiv.org/pdf/2509.05425v2","authors":"[\"Jessica M. Lundin\",\"Ada Zhang\",\"David Adelani\",\"Cody Carroll\"]","published":"2025-09-05T18:11:49Z","proceeding":"cs.CL","tasks":"[\"cs.CL\",\"cs.AI\"]","methods":"[]","has_code":false}
