{"ID":2861670,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.02470","arxiv_id":"2510.02470","title":"SAGE: Streaming Agreement-Driven Gradient Sketches for Representative Subset Selection","abstract":"Training modern neural networks on large datasets is computationally and energy intensive. We present SAGE, a streaming data-subset selection method that maintains a compact Frequent Directions (FD) sketch of gradient geometry in $O(\\ell D)$ memory and prioritizes examples whose sketched gradients align with a consensus direction. The approach eliminates $N \\times N$ pairwise similarities and explicit $N \\times \\ell$ gradient stores, yielding a simple two-pass, GPU-friendly pipeline. Leveraging FD's deterministic approximation guarantees, we analyze how agreement scoring preserves gradient energy within the principal sketched subspace. Across multiple benchmarks, SAGE trains with small kept-rate budgets while retaining competitive accuracy relative to full-data training and recent subset-selection baselines, and reduces end-to-end compute and peak memory. Overall, SAGE offers a practical, constant-memory alternative that complements pruning and model compression for efficient training.","short_abstract":"Training modern neural networks on large datasets is computationally and energy intensive. We present SAGE, a streaming data-subset selection method that maintains a compact Frequent Directions (FD) sketch of gradient geometry in $O(\\ell D)$ memory and prioritizes examples whose sketched gradients align with a consensu...","url_abs":"https://arxiv.org/abs/2510.02470","url_pdf":"https://arxiv.org/pdf/2510.02470v2","authors":"[\"Ashish Jha\",\"Salman Ahmadi-Asl\"]","published":"2025-10-02T18:22:06Z","proceeding":"cs.LG","tasks":"[\"cs.LG\"]","methods":"[]","has_code":false}