{"ID":2873886,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.05554","arxiv_id":"2509.05554","title":"RED: Robust Event-Guided Motion Deblurring with Modality-Specific Disentanglement","abstract":"Event-guided motion deblurring reconstructs sharp images using the high-temporal-resolution motion cues from event cameras. However, in real capture, thresholding-induced event under-reporting causes missing and fragmented motion cues, under which existing methods often degrade in performance due to two limitations: i) assumptions of dense and stable events, and ii) modality-indiscriminate extraction and fusion that fail to separate useful motion cues from disrupted events, allowing them to contaminate cross-modal representations. In this paper, we first introduce a Robustness-Oriented Perturbation Strategy (RPS) that mimics various trigger thresholds of dynamic vision sensors, exposing our model to diverse under-reporting patterns and thereby improving robustness under unknown conditions. Built upon this setting, we propose RED, a Robust Event-guided Deblurring network, following the principle of disentangle first and then fuse selectively. Specifically, the Modality-specific Representation Mechanism disentangles the inputs into image-semantic, event-motion, and cross-modal representations, capturing appearance, motion, and complementary interactions, respectively. With the reliable disentangled features, we selectively fuse modalities to enhance motion-sensitive areas in blurry images and enrich under-reported events with semantic context. Extensive experiments on synthetic and real-world datasets demonstrate RED consistently achieves state-of-the-art performance in terms of both accuracy and robustness.","short_abstract":"Event-guided motion deblurring reconstructs sharp images using the high-temporal-resolution motion cues from event cameras. However, in real capture, thresholding-induced event under-reporting causes missing and fragmented motion cues, under which existing methods often degrade in performance due to two limitations: i)...","url_abs":"https://arxiv.org/abs/2509.05554","url_pdf":"https://arxiv.org/pdf/2509.05554v3","authors":"[\"Yihong Leng\",\"Siming Zheng\",\"Jinwei Chen\",\"Bo Li\",\"Jiaojiao Li\",\"Peng-Tao Jiang\"]","published":"2025-09-06T01:07:08Z","proceeding":"cs.CV","tasks":"[\"cs.CV\",\"cs.IR\"]","methods":"[]","has_code":false}