{"ID":2862891,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.26454","arxiv_id":"2509.26454","title":"Multi-View Camera System for Variant-Aware Autonomous Vehicle Inspection and Defect Detection","abstract":"Ensuring that every vehicle leaving a modern production line is built to the correct \\emph{variant} specification and is free from visible defects is an increasingly complex challenge. We present the \\textbf{Automated Vehicle Inspection (AVI)} platform, an end-to-end, \\emph{multi-view} perception system that couples deep-learning detectors with a semantic rule engine to deliver \\emph{variant-aware} quality control in real time. Eleven synchronized cameras capture a full 360° sweep of each vehicle; task-specific views are then routed to specialised modules: YOLOv8 for part detection, EfficientNet for ICE/EV classification, Gemini-1.5 Flash for mascot OCR, and YOLOv8-Seg for scratch-and-dent segmentation. A view-aware fusion layer standardises evidence, while a VIN-conditioned rule engine compares detected features against the expected manifest, producing an interpretable pass/fail report in \\(\\approx\\! 300\\,\\text{ms}\\). On a mixed dataset combining Original Equipment Manufacturer (OEM) vehicle data for four distinct models with public scratch/dent images, AVI achieves \\textbf{93\\%} verification accuracy and \\textbf{86\\%} defect-detection recall, and sustains \\(\\mathbf{3.3}\\) vehicles/min, surpassing single-view and no-segmentation baselines by large margins. To our knowledge, this is the first publicly reported system that unifies multi-camera feature validation with defect detection in a deployable industrial automotive setting.","short_abstract":"Ensuring that every vehicle leaving a modern production line is built to the correct \\emph{variant} specification and is free from visible defects is an increasingly complex challenge. 
We present the \\textbf{Automated Vehicle Inspection (AVI)} platform, an end-to-end, \\emph{multi-view} perception system that couples de...","url_abs":"https://arxiv.org/abs/2509.26454","url_pdf":"https://arxiv.org/pdf/2509.26454v4","authors":"[\"Yash Kulkarni\",\"Raman Jha\",\"Renu Kachhoria\"]","published":"2025-09-30T16:08:59Z","proceeding":"cs.CV","tasks":"[\"cs.CV\"]","methods":"[]","has_code":false}
