{"ID":2832182,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.06288","arxiv_id":"2512.06288","title":"Theoretical Compression Bounds for Wide Multilayer Perceptrons","abstract":"Pruning and quantization techniques have been broadly successful in reducing the number of parameters needed for large neural networks, yet theoretical justification for their empirical success falls short. We consider a randomized greedy compression algorithm for pruning and quantization post-training and use it to rigorously show the existence of pruned/quantized subnetworks of multilayer perceptrons (MLPs) with competitive performance. We further extend our results to structured pruning of MLPs and convolutional neural networks (CNNs), thus providing a unified analysis of pruning in wide networks. Our results are free of data assumptions, and showcase a tradeoff between compressibility and network width. The algorithm we consider bears some similarities with Optimal Brain Damage (OBD) and can be viewed as a post-training randomized version of it. The theoretical results we derive bridge the gap between theory and application for pruning/quantization, and provide a justification for the empirical success of compression in wide multilayer perceptrons.","short_abstract":"Pruning and quantization techniques have been broadly successful in reducing the number of parameters needed for large neural networks, yet theoretical justification for their empirical success falls short. We consider a randomized greedy compression algorithm for pruning and quantization post-training and use it to ri...","url_abs":"https://arxiv.org/abs/2512.06288","url_pdf":"https://arxiv.org/pdf/2512.06288v1","authors":"[\"Houssam El Cheairi\",\"David Gamarnik\",\"Rahul Mazumder\"]","published":"2025-12-06T04:32:25Z","proceeding":"cs.LG","tasks":"[\"cs.LG\",\"math.ST\"]","methods":"[\"Convolutional Neural Network\"]","has_code":false}
