Multi-Sensor Attention Networks for Automated Subsurface Delamination Detection in Concrete Bridge Decks

cs.CV arXiv:2512.20113
View PDF arXiv JSON

Abstract

Subsurface delaminations in concrete bridge decks remain undetectable through conventional visual inspection, necessitating automated non-destructive evaluation methods. This work introduces a deep learning framework that integrates Ground Penetrating Radar (GPR) and Infrared Thermography (IRT) through hierarchical attention mechanisms. Our architecture employs temporal self-attention to process GPR electromagnetic signals, spatial attention to analyze thermal imagery, and cross-modal attention with learnable embeddings to model inter-sensor correspondences. We integrate Monte Carlo dropout-based uncertainty quantification, decomposing prediction confidence into model uncertainty and data-driven uncertainty components. Testing across five real-world bridge datasets from the SDNET2021 benchmark reveals that our approach delivers substantial performance gains over single-sensor and concatenation-based baselines when applied to balanced or moderately imbalanced data distributions. Comprehensive ablation analysis confirms that cross-modal attention mechanisms contribute meaningful improvements beyond unimodal attention alone. Critically, we identify and characterize specific failure modes: under extreme class imbalance, attention-based architectures demonstrate susceptibility to majority class bias, indicating scenarios where simpler architectural choices may prove more robust. Our findings equip practitioners with empirically-grounded criteria for selecting appropriate fusion strategies based on dataset characteristics, rather than promoting universal architectural superiority.

PDF Viewer