{"ID":2874666,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.04276","arxiv_id":"2509.04276","title":"PAOLI: Pose-free Articulated Object Learning from Sparse-view Images","abstract":"We present a methodology to model articulated objects using a sparse set of images with unknown poses. Current methods require dense multi-view observations and ground-truth camera poses. Our approach operates with as few as four views per articulation and no camera supervision. Our central insight is to first solve a robust correspondence and alignment problem between unaligned reconstructions, before part motions can be analyzed. We first reconstruct each articulation independently using recent advances in sparse-view 3D reconstruction, then learn a deformation field that establishes dense correspondences across poses. A progressive disentanglement strategy further separates static from moving parts, enabling robust separation of camera and object motion. Finally, we optimize geometry, appearance, and kinematics jointly with a self-supervised loss that enforces cross-view and cross-pose consistency. Experiments on the standard benchmark and real-world examples demonstrate that our method produces accurate and detailed articulated object representations under significantly weaker input assumptions than existing approaches.","short_abstract":"We present a methodology to model articulated objects using a sparse set of images with unknown poses. Current methods require dense multi-view observations and ground-truth camera poses. Our approach operates with as few as four views per articulation and no camera supervision. Our central insight is to first solve a...","url_abs":"https://arxiv.org/abs/2509.04276","url_pdf":"https://arxiv.org/pdf/2509.04276v2","authors":"[\"Jianning Deng\",\"Kartic Subr\",\"Hakan Bilen\"]","published":"2025-09-04T14:51:03Z","proceeding":"cs.CV","tasks":"[\"cs.CV\"]","methods":"[]","has_code":false}