{"ID":2840823,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2511.13922","arxiv_id":"2511.13922","title":"Self-Supervised Compression and Artifact Correction for Streaming Underwater Imaging Sonar","abstract":"Real-time imaging sonar is crucial for underwater monitoring where optical sensing fails, but its use is limited by low uplink bandwidth and severe sonar-specific artifacts (speckle, motion blur, reverberation, acoustic shadows) affecting up to 98% of frames. We present SCOPE, a self-supervised framework that jointly performs compression and artifact correction without clean-noise pairs or synthetic assumptions. SCOPE combines (i) Adaptive Codebook Compression (ACC), which learns frequency-encoded latent representations tailored to sonar, with (ii) Frequency-Aware Multiscale Segmentation (FAMS), which decomposes frames into low-frequency structure and sparse high-frequency dynamics while suppressing rapidly fluctuating artifacts. A hedging training strategy further guides frequency-aware learning using low-pass proxy pairs generated without labels. Evaluated on months of in-situ ARIS sonar data, SCOPE achieves a structural similarity index (SSIM) of 0.77, representing a 40% improvement over prior self-supervised denoising baselines, at bitrates down to \u003c= 0.0118 bpp. It reduces uplink bandwidth by more than 80% while improving downstream detection. The system runs in real time, with 3.1 ms encoding on an embedded GPU and 97 ms full multi-layer decoding on the server end. SCOPE has been deployed for months in three Pacific Northwest rivers to support real-time salmon enumeration and environmental monitoring in the wild. Results demonstrate that learning frequency-structured latents enables practical, low-bitrate sonar streaming with preserved signal details under real-world deployment conditions.","short_abstract":"Real-time imaging sonar is crucial for underwater monitoring where optical sensing fails, but its use is limited by low uplink bandwidth and severe sonar-specific artifacts (speckle, motion blur, reverberation, acoustic shadows) affecting up to 98% of frames. We present SCOPE, a self-supervised framework that jointly p...","url_abs":"https://arxiv.org/abs/2511.13922","url_pdf":"https://arxiv.org/pdf/2511.13922v2","authors":"[\"Rongsheng Qian\",\"Chi Xu\",\"Xiaoqiang Ma\",\"Hao Fang\",\"Yili Jin\",\"William I. Atlas\",\"Jiangchuan Liu\"]","published":"2025-11-17T21:19:15Z","proceeding":"eess.IV","tasks":"[\"eess.IV\",\"cs.CV\",\"cs.LG\",\"cs.MM\"]","methods":"[]","has_code":false}
