{"ID":2887866,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2508.00428","arxiv_id":"2508.00428","title":"Sel3DCraft: Interactive Visual Prompts for User-Friendly Text-to-3D Generation","abstract":"Text-to-3D (T23D) generation has transformed digital content creation, yet remains bottlenecked by blind trial-and-error prompting processes that yield unpredictable results. While visual prompt engineering has advanced in text-to-image domains, its application to 3D generation presents unique challenges requiring multi-view consistency evaluation and spatial understanding. We present Sel3DCraft, a visual prompt engineering system for T23D that transforms unstructured exploration into a guided visual process. Our approach introduces three key innovations: a dual-branch structure combining retrieval and generation for diverse candidate exploration; a multi-view hybrid scoring approach that leverages MLLMs with innovative high-level metrics to assess 3D models with human-expert consistency; and a prompt-driven visual analytics suite that enables intuitive defect identification and refinement. Extensive testing and user studies demonstrate that Sel3DCraft surpasses other T23D systems in supporting creativity for designers.","short_abstract":"Text-to-3D (T23D) generation has transformed digital content creation, yet remains bottlenecked by blind trial-and-error prompting processes that yield unpredictable results. While visual prompt engineering has advanced in text-to-image domains, its application to 3D generation presents unique challenges requiring mult...","url_abs":"https://arxiv.org/abs/2508.00428","url_pdf":"https://arxiv.org/pdf/2508.00428v1","authors":"[\"Nan Xiang\",\"Tianyi Liang\",\"Haiwen Huang\",\"Shiqi Jiang\",\"Hao Huang\",\"Yifei Huang\",\"Liangyu Chen\",\"Changbo Wang\",\"Chenhui Li\"]","published":"2025-08-01T08:36:15Z","proceeding":"cs.GR","tasks":"[\"cs.GR\",\"cs.HC\"]","methods":"[\"Large Language Model\",\"LoRA\"]","has_code":false}
