{"ID":2866600,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.20136","arxiv_id":"2509.20136","title":"V-GameGym: Visual Game Generation for Code Large Language Models","abstract":"Code large language models have demonstrated remarkable capabilities in programming tasks, yet current benchmarks primarily focus on single modality rather than visual game development. Most existing code-related benchmarks evaluate syntax correctness and execution accuracy, overlooking critical game-specific metrics such as playability, visual aesthetics, and user engagement that are essential for real-world deployment. To address the gap between current LLM capabilities in algorithmic problem-solving and competitive programming versus the comprehensive requirements of practical game development, we present V-GameGym, a comprehensive benchmark comprising 2,219 high-quality samples across 100 thematic clusters derived from real-world repositories, adopting a novel clustering-based curation methodology to ensure both diversity and structural completeness. Further, we introduce a multimodal evaluation framework with an automated LLM-driven pipeline for visual code synthesis using complete UI sandbox environments. Our extensive analysis reveals that V-GameGym effectively bridges the gap between code generation accuracy and practical game development workflows, providing quantifiable quality metrics for visual programming and interactive element generation.","short_abstract":"Code large language models have demonstrated remarkable capabilities in programming tasks, yet current benchmarks primarily focus on single modality rather than visual game development. Most existing code-related benchmarks evaluate syntax correctness and execution accuracy, overlooking critical game-specific metrics s...","url_abs":"https://arxiv.org/abs/2509.20136","url_pdf":"https://arxiv.org/pdf/2509.20136v1","authors":"[\"Wei Zhang\",\"Jack Yang\",\"Renshuai Tao\",\"Lingzheng Chai\",\"Shawn Guo\",\"Jiajun Wu\",\"Xiaoming Chen\",\"Ganqu Cui\",\"Ning Ding\",\"Xander Xu\",\"Hu Wei\",\"Bowen Zhou\"]","published":"2025-09-24T14:01:18Z","proceeding":"cs.SE","tasks":"[\"cs.SE\"]","methods":"[\"Large Language Model\",\"Language Model\"]","has_code":false}