MCAQ-YOLO: Morphological Complexity-Aware Quantization for Efficient Object Detection with Curriculum Learning
Abstract
Most neural network quantization methods apply uniform bit precision across spatial regions, disregarding the heterogeneous complexity inherent in visual data. This paper introduces MCAQ-YOLO, a practical framework for tile-wise spatial mixed-precision quantization in real-time object detectors. Morphological complexity--quantified through five complementary metrics (fractal dimension, texture entropy, gradient variance, edge density, and contour complexity)--is proposed as a signal-centric predictor of spatial quantization sensitivity. A calibration-time analysis design enables spatial bit allocation with only 0.3ms inference overhead, achieving 151 FPS throughput. Additionally, a curriculum-based training scheme that progressively increases quantization difficulty is introduced to stabilize optimization and accelerate convergence. On a construction safety equipment dataset exhibiting high morphological variability, MCAQ-YOLO achieves 85.6% [email protected] with an average bit-width of 4.2 bits and a 7.6x compression ratio, outperforming uniform 4-bit quantization by 3.5 percentage points. Cross-dataset evaluation on COCO 2017 (+2.9%) and Pascal VOC 2012 (+2.3%) demonstrates consistent improvements, with performance gains correlating with within-image complexity variation.