{"ID":2857997,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.07922","arxiv_id":"2510.07922","title":"SketchGuard: Scaling Byzantine-Robust Decentralized Federated Learning via Sketch-Based Screening","abstract":"Decentralized Federated Learning (DFL) enables privacy-preserving collaborative training without centralized servers but remains vulnerable to Byzantine attacks. Existing Byzantine-robust defenses are predicated on exchanging full, high-dimensional model vectors with every neighbor before filtering, an $O(d|\\mathcal{N}_i|)$ communication cost incurred regardless of how many neighbors are ultimately rejected. This design choice is sustainable in small-scale experimental settings but becomes a fundamental barrier to deployment as network scale or model size grows. We propose SketchGuard, a framework that decouples Byzantine filtering from aggregation via sketch-based screening. Each client compresses its $d$-dimensional model to a $k$-dimensional Count Sketch ($k \\ll d$), exchanges only sketches for neighbor screening, and fetches full models exclusively from accepted neighbors. This eliminates the pre-filtering communication waste of existing defenses: rejected Byzantine neighbors incur only $O(k)$ sketch cost rather than $O(d)$ full-model cost. Communication savings therefore scale with the Byzantine rejection rate: negligible extra overhead in benign conditions, rising to 50-70% total savings when 50-70% of neighbors are rejected. We prove convergence in both strongly convex and non-convex settings, establishing that Count Sketch's distance-preservation guarantee causes sketch-based filtering to deviate from full-precision filtering by at most a $(1+O(ε))$ factor in the effective threshold, a gap that can be made arbitrarily small. Experiments across three non-IID federated benchmarks, five network topologies, and four attack types confirm that SketchGuard matches state-of-the-art robustness (mean TER deviation $\\leq$0.5 percentage points) while reducing computation by up to 82%, with robustness remaining stable across compression ratios up to 13,000:1.","short_abstract":"Decentralized Federated Learning (DFL) enables privacy-preserving collaborative training without centralized servers but remains vulnerable to Byzantine attacks. Existing Byzantine-robust defenses are predicated on exchanging full, high-dimensional model vectors with every neighbor before filtering, an $O(d|\\mathcal{N}...","url_abs":"https://arxiv.org/abs/2510.07922","url_pdf":"https://arxiv.org/pdf/2510.07922v4","authors":"[\"Murtaza Rangwala\",\"Farag Azzedin\",\"Richard O. Sinnott\",\"Rajkumar Buyya\"]","published":"2025-10-09T08:16:32Z","proceeding":"cs.LG","tasks":"[\"cs.LG\",\"cs.DC\"]","methods":"[]","has_code":false}