{"ID":2838862,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2511.15984","arxiv_id":"2511.15984","title":"UniDGF: A Unified Detection-to-Generation Framework for Hierarchical Object Visual Recognition","abstract":"Achieving visual semantic understanding requires a unified framework that simultaneously handles object detection, category prediction, and attribute recognition. However, current advanced approaches rely on global similarity and struggle to capture fine-grained category distinctions and category-specific attribute diversity, especially in large-scale e-commerce scenarios. To overcome these challenges, we introduce a detection-guided generative framework that predicts hierarchical category and attribute tokens. For each detected object, we extract refined ROI-level features and employ a BART-based generator to produce semantic tokens in a coarse-to-fine sequence covering category hierarchies and property-value pairs, with support for property-conditioned attribute recognition. Experiments on both large-scale proprietary e-commerce datasets and open-source datasets demonstrate that our approach significantly outperforms existing similarity-based pipelines and multi-stage classification systems, achieving stronger fine-grained recognition and more coherent unified inference.","short_abstract":"Achieving visual semantic understanding requires a unified framework that simultaneously handles object detection, category prediction, and attribute recognition. However, current advanced approaches rely on global similarity and struggle to capture fine-grained category distinctions and category-specific attribute div...","url_abs":"https://arxiv.org/abs/2511.15984","url_pdf":"https://arxiv.org/pdf/2511.15984v1","authors":"[\"Xinyu Nan\",\"Lingtao Mao\",\"Huangyu Dai\",\"Zexin Zheng\",\"Xinyu Sun\",\"Zihan Liang\",\"Ben Chen\",\"Yuqing Ding\",\"Chenyi Lei\",\"Wenwu Ou\",\"Han Li\"]","published":"2025-11-20T02:37:43Z","proceeding":"cs.CV","tasks":"[\"cs.CV\"]","methods":"[]","has_code":false}
