{"ID":2893793,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2507.11939","arxiv_id":"2507.11939","title":"POLYCHARTQA: Benchmarking Large Vision-Language Models with Multilingual Chart Question Answering","abstract":"Charts are a universally adopted medium for data communication, yet existing chart understanding benchmarks are overwhelmingly English-centric, limiting their accessibility and relevance to global audiences. To address this limitation, we introduce PolyChartQA, the first large-scale multilingual benchmark for chart question answering, comprising 22,606 charts and 26,151 QA pairs across 10 diverse languages. PolyChartQA is constructed through a scalable pipeline that enables efficient multilingual chart generation via data translation and code reuse, supported by LLM-based translation and rigorous quality control. We systematically evaluate multilingual chart understanding with PolyChartQA on state-of-the-art LVLMs and reveal a significant performance gap between English and other languages, particularly low-resource ones. Additionally, we introduce a companion multilingual chart question answering training set, PolyChartQA-Train, on which fine-tuning LVLMs yields substantial gains in multilingual chart understanding across diverse model sizes and architectures. Together, our benchmark provides a foundation for developing globally inclusive vision-language models capable of understanding charts across diverse linguistic contexts.","short_abstract":"Charts are a universally adopted medium for data communication, yet existing chart understanding benchmarks are overwhelmingly English-centric, limiting their accessibility and relevance to global audiences. To address this limitation, we introduce PolyChartQA, the first large-scale multilingual benchmark for chart que...","url_abs":"https://arxiv.org/abs/2507.11939","url_pdf":"https://arxiv.org/pdf/2507.11939v2","authors":"[\"Yichen Xu\",\"Liangyu Chen\",\"Liang Zhang\",\"Jianzhe Ma\",\"Wenxuan Wang\",\"Qin Jin\"]","published":"2025-07-16T06:09:02Z","proceeding":"cs.CL","tasks":"[\"cs.CL\",\"cs.AI\",\"cs.CV\",\"cs.MM\"]","methods":"[\"Large Language Model\",\"Language Model\"]","has_code":false}
