{"ID":2893073,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2507.13861","arxiv_id":"2507.13861","title":"PositionIC: Unified Position and Identity Consistency for Image Customization","abstract":"Recent subject-driven image customization excels in fidelity, yet fine-grained instance-level spatial control remains an elusive challenge, hindering real-world applications. This limitation stems from two factors: a scarcity of scalable, position-annotated datasets, and the entanglement of identity and layout by global attention mechanisms. To this end, we introduce PositionIC, a unified framework for high-fidelity, spatially controllable multi-subject customization. First, we present BMPDS, the first automatic data-synthesis pipeline for position-annotated multi-subject datasets, effectively providing crucial spatial supervision. Second, we design a lightweight, layout-aware diffusion framework that integrates a novel visibility-aware attention mechanism. This mechanism explicitly models spatial relationships via a NeRF-inspired volumetric weight regulation to effectively decouple instance-level spatial embeddings from semantic identity features, enabling precise, occlusion-aware placement of multiple subjects. Extensive experiments demonstrate PositionIC achieves state-of-the-art performance on public benchmarks, setting new records for spatial precision and identity consistency. Our work represents a significant step towards truly controllable, high-fidelity image customization in multi-entity scenarios. Code and data: https://github.com/MeiGen-AI/PositionIC.","short_abstract":"Recent subject-driven image customization excels in fidelity, yet fine-grained instance-level spatial control remains an elusive challenge, hindering real-world applications. This limitation stems from two factors: a scarcity of scalable, position-annotated datasets, and the entanglement of identity and layout by globa...","url_abs":"https://arxiv.org/abs/2507.13861","url_pdf":"https://arxiv.org/pdf/2507.13861v6","authors":"[\"Junjie Hu\",\"Tianyang Han\",\"Kai Ma\",\"Jialin Gao\",\"Song Yang\",\"Xianhua He\",\"Junfeng Luo\",\"Xiaoming Wei\",\"Wenqiang Zhang\"]","published":"2025-07-18T12:35:47Z","proceeding":"cs.CV","tasks":"[\"cs.CV\"]","methods":"[\"Diffusion Model\"]","has_code":false,"code_links":[{"ID":612033,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2893073,"paper_url":"https://arxiv.org/abs/2507.13861","paper_title":"PositionIC: Unified Position and Identity Consistency for Image Customization","repo_url":"https://github.com/MeiGen-AI/PositionIC","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}