{"ID":2844463,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2511.06522","arxiv_id":"2511.06522","title":"FractalBench: Diagnosing Visual-Mathematical Reasoning Through Recursive Program Synthesis","abstract":"Mathematical reasoning requires abstracting symbolic rules from visual patterns -- inferring the infinite from the finite. We investigate whether multimodal AI systems possess this capability through FractalBench, a benchmark evaluating fractal program synthesis from images. Fractals provide ideal test cases: Iterated Function Systems with only a few contraction maps generate complex self-similar patterns through simple recursive rules, requiring models to bridge visual perception with mathematical abstraction. We evaluate four leading MLLMs -- GPT-4o, Claude 3.7 Sonnet, Gemini 2.5 Flash, and Qwen 2.5-VL -- on 12 canonical fractals. Models must generate executable Python code reproducing the fractal, enabling objective evaluation. Results reveal a striking disconnect: 76% generate syntactically valid code but only 4% capture mathematical structure. Success varies systematically -- models handle geometric transformations (Koch curves: 17-21%) but fail at branching recursion (trees: \u003c2%), revealing fundamental gaps in mathematical abstraction. FractalBench provides a contamination-resistant diagnostic for visual-mathematical reasoning and is available at https://github.com/NaiveNeuron/FractalBench","short_abstract":"Mathematical reasoning requires abstracting symbolic rules from visual patterns -- inferring the infinite from the finite. We investigate whether multimodal AI systems possess this capability through FractalBench, a benchmark evaluating fractal program synthesis from images. Fractals provide ideal test cases: Iterated...","url_abs":"https://arxiv.org/abs/2511.06522","url_pdf":"https://arxiv.org/pdf/2511.06522v1","authors":"[\"Jan Ondras\",\"Marek Šuppa\"]","published":"2025-11-09T20:22:42Z","proceeding":"cs.AI","tasks":"[\"cs.AI\",\"cs.LG\"]","methods":"[\"Large Language Model\"]","has_code":false,"code_links":[{"ID":607296,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2844463,"paper_url":"https://arxiv.org/abs/2511.06522","paper_title":"FractalBench: Diagnosing Visual-Mathematical Reasoning Through Recursive Program Synthesis","repo_url":"https://github.com/NaiveNeuron/FractalBench","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}
