Generalization Bounds for Magnitude-Based Pruning via Sparse Matrix Sketching
Abstract
In this paper we derive a novel bound on the generalization error of Magnitude-Based pruning of overparameterized neural networks. Our work builds on the bounds in \citet{Arora2018} where the error depends on one, the approximation induced by pruning, and two, the number of parameters in the pruned model, and improves upon standard norm-based generalization bounds. The pruned estimates obtained using our new Magnitude-Based compression algorithm are close to the unpruned functions with high probability, which improves the first criteria. Using Sparse Matrix Sketching the space of the pruned matrices can be efficiently represented in the space of dense matrices of much smaller dimensions, thereby lowering the second criterion. This leads to stronger generalization bound than many state-of-the-art methods, thereby breaking new ground in the algorithm development for pruning and bounding generalization error of overparameterized models. Beyond this, we extend our results to obtain generalization bound for Iterative Pruning \citep{Frankle2018}. We empirically verify the success of this new method on ReLU-activated Feed Forward Networks on the MNIST and CIFAR10 datasets.