{"ID":2847780,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2511.00261","arxiv_id":"2511.00261","title":"Spot The Ball: A Benchmark for Visual Social Inference","abstract":"Humans excel at visual social inference, the ability to infer hidden elements of a scene from subtle behavioral cues such as other people's gaze, pose, and orientation. This ability drives everyday social reasoning in humans and is critical for developing more human-like AI agents. We introduce Spot The Ball, a challenging benchmark for evaluating visual social inference in vision-language models (VLMs) using sports as a test domain. The task is to localize a removed sports ball from soccer, basketball, and volleyball images. We present a curated evaluation set with human baselines and a scalable pipeline for generating additional test items. We evaluate four state-of-the-art VLMs (Gemini, GPT, LLaMA, Qwen) using three prompting strategies, finding that humans are consistently two to three times more accurate (20-34%) than models ($\\leq$ 17%) across all sports. Our analyses show that models rely on superficial spatial heuristics--such as guessing near the image center or nearby players--while humans leverage social cues like gaze direction and body pose. These findings reveal a persistent human-model gap in visual social reasoning and underscore the need for architectures that explicitly encode structured behavioral cues to achieve robust, human-like inference.","short_abstract":"Humans excel at visual social inference, the ability to infer hidden elements of a scene from subtle behavioral cues such as other people's gaze, pose, and orientation. This ability drives everyday social reasoning in humans and is critical for developing more human-like AI agents. \nWe introduce Spot The Ball, a challen...","url_abs":"https://arxiv.org/abs/2511.00261","url_pdf":"https://arxiv.org/pdf/2511.00261v2","authors":"[\"Neha Balamurugan\",\"Sarah Wu\",\"Adam Chun\",\"Gabe Gaw\",\"Cristobal Eyzaguirre\",\"Tobias Gerstenberg\"]","published":"2025-10-31T21:20:46Z","proceeding":"cs.CV","tasks":"[\"cs.CV\",\"cs.HC\"]","methods":"[\"Language Model\"]","has_code":false}
