{"ID":2835550,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2511.23241","arxiv_id":"2511.23241","title":"Synthetic Industrial Object Detection: GenAI vs. Feature-Based Methods","abstract":"Reducing the burden of data generation and annotation remains a major challenge for the cost-effective deployment of machine learning in industrial and robotics settings. While synthetic rendering is a promising solution, bridging the sim-to-real gap often requires expert intervention. In this work, we benchmark a range of domain randomization (DR) and domain adaptation (DA) techniques, including feature-based methods, generative AI (GenAI), and classical rendering approaches, for creating contextualized synthetic data without manual annotation. Our evaluation focuses on the effectiveness and efficiency of low-level and high-level feature alignment, as well as a controlled diffusion-based DA method guided by prompts generated from real-world contexts. We validate our methods on two datasets: a proprietary industrial dataset (automotive and logistics) and a public robotics dataset. Results show that if render-based data with enough variability is available as seed, simpler feature-based methods, such as brightness-based and perceptual hashing filtering, outperform more complex GenAI-based approaches in both accuracy and resource efficiency. Perceptual hashing consistently achieves the highest performance, with mAP50 scores of 98% and 67% on the industrial and robotics datasets, respectively. Additionally, GenAI methods present significant time overhead for data generation at no apparent improvement of sim-to-real mAP values compared to simpler methods. Our findings offer actionable insights for efficiently bridging the sim-to-real gap, enabling high real-world performance from models trained exclusively on synthetic data.","short_abstract":"Reducing the burden of data generation and annotation remains a major challenge for the cost-effective deployment of machine learning in industrial and robotics settings. While synthetic rendering is a promising solution, bridging the sim-to-real gap often requires expert intervention. In this work, we benchmark a rang...","url_abs":"https://arxiv.org/abs/2511.23241","url_pdf":"https://arxiv.org/pdf/2511.23241v1","authors":"[\"Jose Moises Araya-Martinez\",\"Adrián Sanchis Reig\",\"Gautham Mohan\",\"Sarvenaz Sardari\",\"Jens Lambrecht\",\"Jörg Krüger\"]","published":"2025-11-28T14:51:08Z","proceeding":"cs.CV","tasks":"[\"cs.CV\"]","methods":"[\"Diffusion Model\"]","has_code":false}