{"ID":2899110,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2507.01544","arxiv_id":"2507.01544","title":"MARVIS: Modality Adaptive Reasoning over VISualizations","abstract":"Predictive applications of machine learning often rely on small (sub 1 Bn parameter) specialized models tuned to particular domains or modalities. Such models often achieve excellent performance, but lack flexibility. LLMs and VLMs offer versatility, but typically underperform specialized predictors, especially on non-traditional modalities and long-tail domains. We propose MARVIS (Modality Adaptive Reasoning over VISualizations), a system that transforms latent embedding spaces into visual representations and then leverages the spatial and fine-grained reasoning skills of VLMs to interpret the visualizations and utilize them for predictions successfully. MARVIS achieves competitive performance across vision, audio, biological, and tabular domains using a single 3B parameter model, yielding results that beat Gemini 2.0 by 16% on average. MARVIS drastically reduces the gap between LLM/VLMs approaches and specialized domain-specific methods, without requiring any domain-specific training. Code and datasets are available at https://github.com/penfever/marvis.","short_abstract":"Predictive applications of machine learning often rely on small (sub 1 Bn parameter) specialized models tuned to particular domains or modalities. Such models often achieve excellent performance, but lack flexibility. LLMs and VLMs offer versatility, but typically underperform specialized predictors, especially on non-...","url_abs":"https://arxiv.org/abs/2507.01544","url_pdf":"https://arxiv.org/pdf/2507.01544v2","authors":"[\"Benjamin Feuer\",\"Lennart Purucker\",\"Oussama Elachqar\",\"Chinmay Hegde\"]","published":"2025-07-02T09:56:24Z","proceeding":"cs.LG","tasks":"[\"cs.LG\"]","methods":"[\"Large Language Model\"]","has_code":false,"code_links":[{"ID":612456,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2899110,"paper_url":"https://arxiv.org/abs/2507.01544","paper_title":"MARVIS: Modality Adaptive Reasoning over VISualizations","repo_url":"https://github.com/penfever/marvis","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}
