{"ID":2829768,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.11393","arxiv_id":"2512.11393","title":"The N-Body Problem: Parallel Execution from Single-Person Egocentric Video","abstract":"Humans can intuitively parallelise complex activities, but can a model learn this from observing a single person? Given one egocentric video, we introduce the N-Body Problem: how N individuals, can hypothetically perform the same set of tasks observed in this video. The goal is to maximise speed-up, but naive assignment of video segments to individuals often violates real-world constraints, leading to physically impossible scenarios like two people using the same object or occupying the same space. To address this, we formalise the N-Body Problem and propose a suite of metrics to evaluate both performance (speed-up, task coverage) and feasibility (spatial collisions, object conflicts and causal constraints). We then introduce a structured prompting strategy that guides a Vision-Language Model (VLM) to reason about the 3D environment, object usage, and temporal dependencies to produce a viable parallel execution. On 100 videos from EPIC-Kitchens and HD-EPIC, our method for N = 2 boosts action coverage by 45% over a baseline prompt for Gemini 2.5 Pro, while simultaneously slashing collision rates, object and causal conflicts by 55%, 45% and 55% respectively.","short_abstract":"Humans can intuitively parallelise complex activities, but can a model learn this from observing a single person? Given one egocentric video, we introduce the N-Body Problem: how N individuals, can hypothetically perform the same set of tasks observed in this video. The goal is to maximise speed-up, but naive assignmen...","url_abs":"https://arxiv.org/abs/2512.11393","url_pdf":"https://arxiv.org/pdf/2512.11393v1","authors":"[\"Zhifan Zhu\",\"Yifei Huang\",\"Yoichi Sato\",\"Dima Damen\"]","published":"2025-12-12T09:07:21Z","proceeding":"cs.CV","tasks":"[\"cs.CV\"]","methods":"[\"Language Model\"]","has_code":false}
