The paper is called “Understanding the Effect of the Long Tail on Neural Network Compression” (https://arxiv.org/abs/2306.06238).
In this paper, I used a sharding-based method (https://arxiv.org/abs/2008.03703) to estimate the influence of individual training examples on validation examples, where the influence of a training example is the expected gain in accuracy on a validation example from training with that example versus without it. We ran this estimation on several pruned and unpruned image classifiers and found differences, some of them significant, between the influence received by the examples the pruned models misclassify and by those the unpruned models misclassify. In the table below, CIE refers to examples on which the pruned and unpruned models disagree.
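For concreteness, here's a minimal sketch of that style of influence estimation, in the spirit of the subsampling estimator from the linked paper (arXiv:2008.03703). The function name and array layout here are mine, purely for illustration; the actual pipeline trains many models on random subsets of the training data and, for each (training example, validation example) pair, compares mean accuracy between the models that saw the training example and the models that didn't.

```python
import numpy as np

def estimate_influence(masks: np.ndarray, correct: np.ndarray) -> np.ndarray:
    """Subsampling-style influence estimate (hypothetical helper).

    masks:   (num_models, num_train) bool; masks[m, i] is True if training
             example i was in the random training subset of model m.
    correct: (num_models, num_val) bool; correct[m, j] is True if model m
             classified validation example j correctly.

    Returns a (num_train, num_val) array whose entry (i, j) estimates
    P(j correct | i in training set) - P(j correct | i left out),
    i.e. the expected accuracy gain on j from training with i.
    """
    m = masks.astype(float)
    c = correct.astype(float)
    # Assumes every training example appears in some subsets and is left
    # out of others; otherwise a denominator below would be zero.
    acc_with = (m.T @ c) / m.sum(axis=0)[:, None]                  # models that saw i
    acc_without = ((1 - m).T @ c) / (1 - m).sum(axis=0)[:, None]   # models that didn't
    return acc_with - acc_without
```

With, say, a few hundred models each trained on a random half of the training set, these differences in mean accuracy approximate the with-vs-without influence described above.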

The paper was published in the ICLR 2023 Workshop on Sparsity in Neural Networks.
But honestly, I’m not satisfied with it. The main problem is that more general readers (not just SNN reviewers) don’t know why they should care, and we don’t have a good reason for experimenting only with image classifiers. ¯\_(ツ)_/¯ I’m not trying to undersell it; I’m just reassuring you that I know what good papers look like, and this isn’t one of them.