{"ID":2852122,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.18410","arxiv_id":"2510.18410","title":"Provable Generalization Bounds for Deep Neural Networks with Momentum-Adaptive Gradient Dropout","abstract":"Deep neural networks (DNNs) achieve remarkable performance but often suffer from overfitting due to their high capacity. We introduce Momentum-Adaptive Gradient Dropout (MAGDrop), a novel regularization method that dynamically adjusts dropout rates on activations based on current gradients and accumulated momentum, enhancing stability in non-convex optimization landscapes. To theoretically justify MAGDrop's effectiveness, we derive a non-asymptotic, computable PAC-Bayes generalization bound that accounts for its adaptive nature, achieving up to 29.2\\% tighter bounds compared to standard approaches by leveraging momentum-driven perturbation control. Empirically, the activation-based MAGDrop achieves competitive performance on MNIST (99.52\\%) and CIFAR-10 (92.03\\%), with generalization gaps of 0.48\\% and 6.52\\%, respectively. We provide fully reproducible code and numerical computation of our bounds to validate our theoretical claims. Our work bridges theoretical insights and practical advancements, offering a robust framework for enhancing DNN generalization, making it suitable for high-stakes applications.","short_abstract":"Deep neural networks (DNNs) achieve remarkable performance but often suffer from overfitting due to their high capacity. We introduce Momentum-Adaptive Gradient Dropout (MAGDrop), a novel regularization method that dynamically adjusts dropout rates on activations based on current gradients and accumulated momentum, enh...","url_abs":"https://arxiv.org/abs/2510.18410","url_pdf":"https://arxiv.org/pdf/2510.18410v2","authors":"[\"Adeel Safder\"]","published":"2025-10-21T08:36:56Z","proceeding":"cs.LG","tasks":"[\"cs.LG\",\"math.ST\"]","methods":"[]","has_code":false}
