{"ID":2863355,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.24359","arxiv_id":"2509.24359","title":"DRIFT: Divergent Response in Filtered Transformations for Robust Adversarial Defense","abstract":"Deep neural networks remain highly vulnerable to adversarial examples, and most defenses collapse once gradients can be reliably estimated. We identify \\emph{gradient consensus} -- the tendency of randomized transformations to yield aligned gradients -- as a key driver of adversarial transferability. Attackers exploit this consensus to construct perturbations that remain effective across transformations. We introduce \\textbf{DRIFT} (Divergent Response in Filtered Transformations), a stochastic ensemble of lightweight, learnable filters trained to actively disrupt gradient consensus. Unlike prior randomized defenses that rely on gradient masking, DRIFT enforces \\emph{gradient dissonance} by maximizing divergence in Jacobian- and logit-space responses while preserving natural predictions. Our contributions are threefold: (i) we formalize gradient consensus and provide a theoretical analysis linking consensus to transferability; (ii) we propose a consensus-divergence training strategy combining prediction consistency, Jacobian separation, logit-space separation, and adversarial robustness; and (iii) we show that DRIFT achieves substantial robustness gains on ImageNet across CNNs and Vision Transformers, outperforming state-of-the-art preprocessing, adversarial training, and diffusion-based defenses under adaptive white-box, transfer-based, and gradient-free attacks. DRIFT delivers these improvements with negligible runtime and memory cost, establishing gradient divergence as a practical and generalizable principle for adversarial defense.","short_abstract":"Deep neural networks remain highly vulnerable to adversarial examples, and most defenses collapse once gradients can be reliably estimated. We identify \\emph{gradient consensus} -- the tendency of randomized transformations to yield aligned gradients -- as a key driver of adversarial transferability. Attackers exploit...","url_abs":"https://arxiv.org/abs/2509.24359","url_pdf":"https://arxiv.org/pdf/2509.24359v1","authors":"[\"Amira Guesmi\",\"Muhammad Shafique\"]","published":"2025-09-29T06:57:47Z","proceeding":"cs.CV","tasks":"[\"cs.CV\",\"cs.LG\"]","methods":"[\"Vision Transformer\",\"Diffusion Model\",\"Transformer\",\"Convolutional Neural Network\"]","has_code":false}
