Efficient Tensor Decomposition-Based Filter Pruning

Background Introduction

Network Pruning is a crucial technique for designing efficient Convolutional Neural Network (CNN) models. By reducing memory footprint and computational demands while maintaining or even improving performance, it makes deploying CNNs on resource-constrained devices (such as mobile phones or embedded systems) feasible. The underlying assumption is that modern models are over-parameterized, i.e., they contain many unnecessary or redundant parameters. Pruning these redundant parameters yields smaller and more efficient models, which not only suits resource-constrained devices but can also improve a model's generalization in some cases.

Among existing pruning methods, Filter Pruning and Weight Pruning are popular techniques. Weight Pruning is an unstructured pruning approach that prunes individual weights based on their importance without considering any specific structure or pattern. In contrast, Filter Pruning is a structured pruning method, which prunes entire filters according to certain criteria while maintaining the overall network structure.
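The contrast can be illustrated with a minimal NumPy sketch (illustrative only; the magnitude and L1-norm criteria used here are common generic importance measures, not the paper's method):

```python
import numpy as np

rng = np.random.default_rng(0)
# Weights of a toy conv layer: 8 filters, each a 4x3x3 third-order tensor.
weights = rng.standard_normal((8, 4, 3, 3))

# Weight pruning (unstructured): zero out the smallest-magnitude half of the
# individual weights; the tensor keeps its shape and merely becomes sparse.
threshold = np.quantile(np.abs(weights), 0.5)
unstructured = np.where(np.abs(weights) >= threshold, weights, 0.0)

# Filter pruning (structured): remove entire filters (here, the half with the
# smallest L1 norm); the layer genuinely shrinks.
norms = np.abs(weights).reshape(8, -1).sum(axis=1)
keep = np.sort(np.argsort(norms)[4:])
structured = weights[keep]

print(unstructured.shape)  # (8, 4, 3, 3) -- same shape, half the entries zero
print(structured.shape)    # (4, 4, 3, 3) -- a smaller dense layer
```

Structured pruning is what allows the compressed model to run faster on standard hardware without sparse-computation support, since whole filters (and the corresponding output channels) disappear.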

Early filter selection methods assessed each filter's importance in isolation, ignoring the correlations between filters and thus retaining more redundancy. Recent work has shown that exploiting the correlation or similarity between filters or feature maps to prune redundant filters brings significant benefits: similar filters generate repetitive features, so removing one of them can be compensated for by the remaining filters during subsequent fine-tuning. Despite these notable results, many state-of-the-art methods still share an unresolved limitation: flattening third-order filter tensors into two-dimensional matrices or one-dimensional vectors for similarity computation can discard spatial or temporal information.

Paper Source

The research paper titled “Efficient Tensor Decomposition-Based Filter Pruning” was written by Van Tien Pham, Yassine Zniyed, and Thanh Phuong Nguyen, all affiliated with Université de Toulon, Aix Marseille University, CNRS, LIS in France. The paper was published in the journal “Neural Networks”. The manuscript was received on October 12, 2023, revised on February 16, 2024, and accepted on May 15, 2024.

Research Details

Workflow

The paper proposes CORING (Efficient Tensor Decomposition-Based Filter Pruning), a new filter pruning method built on tensor decomposition. The specific steps are as follows:

  1. Filter Decomposition: Apply Higher-Order Singular Value Decomposition (HOSVD) to decompose the filters of each layer into low-rank representations, preserving the filters' multi-dimensional structure and key information.
  2. Similarity Measurement: Build a similarity matrix from the distances between the low-rank representations of each pair of filters, avoiding the direct use of complete filters or their reshaped versions.
  3. Filter Selection: Using the similarity matrix, iteratively remove the filter most similar to the others, taking collective importance into account, until the target sparsity is reached.
  4. Pruning Strategy: A k-shots pruning strategy splits the sparsity budget across multiple rounds of pruning and fine-tuning for finer control.
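Steps 1-3 can be sketched as follows. This is a minimal illustration, assuming a rank-1 HOSVD (the leading left singular vector of each mode unfolding) as the low-rank representation, Euclidean distance between representations, and a simplified greedy selection; the paper's collective-importance criterion is more elaborate:

```python
import numpy as np

def mode_unfold(t, mode):
    # Unfold a 3rd-order tensor along the given mode into a matrix.
    return np.moveaxis(t, mode, 0).reshape(t.shape[mode], -1)

def hosvd_rank1(filt):
    # Rank-1 HOSVD factors: the leading left singular vector of each mode
    # unfolding. (Singular vectors are sign-ambiguous; a production version
    # would need to resolve the sign consistently.)
    return [np.linalg.svd(mode_unfold(filt, m), full_matrices=False)[0][:, 0]
            for m in range(3)]

def similarity_matrix(filters):
    # Pairwise Euclidean distances between concatenated low-rank factors.
    reps = [np.concatenate(hosvd_rank1(f)) for f in filters]
    n = len(reps)
    d = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            d[i, j] = np.linalg.norm(reps[i] - reps[j])
    return d

def select_filters(filters, keep):
    # Greedily drop the filter closest to another until `keep` remain.
    d = similarity_matrix(filters)
    np.fill_diagonal(d, np.inf)
    alive = set(range(len(filters)))
    while len(alive) > keep:
        idx = sorted(alive)
        sub = d[np.ix_(idx, idx)]
        i, j = np.unravel_index(np.argmin(sub), sub.shape)
        alive.discard(idx[j])  # remove one filter of the closest pair
    return sorted(alive)

# Demo: four random 3x3x3 filters, one a near-duplicate of another;
# the near-duplicate pair loses one member.
rng = np.random.default_rng(0)
filters = [rng.standard_normal((3, 3, 3)) for _ in range(4)]
filters[1] = filters[0] + 1e-6 * rng.standard_normal((3, 3, 3))
print(select_filters(filters, keep=3))
```

Because distances are computed between short factor vectors rather than full k x k x C filters, the comparison is cheap while still reflecting each filter's per-mode structure.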

Main Results

Extensive validation was conducted on various architectures and datasets, including image classification, object detection, instance segmentation, and keypoint detection tasks. The main results are as follows:

  • On VGG-16-BN (CIFAR-10), CORING reduced parameters by 81.6% and MAC operations by 58.1%, while accuracy increased from 93.96% to 94.42%.
  • On ResNet-56 (CIFAR-10), CORING reduced parameters by 22.4% and MAC operations by 27.3%, with accuracy improving from 93.26% to 94.76%.
  • On ResNet-50 (ImageNet), CORING reduced parameters by 40.8% and MAC operations by 44.8%, while accuracy increased from 76.15% to 76.78%.

Conclusion and Significance

The CORING method achieves effective structured pruning by introducing multi-dimensional tensor decomposition and a novel filter similarity measure. Its main contributions include:

  1. Introduction of Tensor Decomposition (especially HOSVD) for Filter Pruning: Effectively reduces complexity while maintaining the multi-dimensional structure of filters.
  2. Provision of a Simple and Efficient Filter Selection Method: Distance computation on the HOSVD representations avoids the direct use of complete filters or their reshaped versions.
  3. Demonstration of Effectiveness on Multiple Computer Vision Tasks: Extensive experiments demonstrate the superiority of CORING in terms of accuracy, parameter reduction, and MAC reduction.

Highlights

  • Innovative Tensor Decomposition Method: CORING provides low-rank approximations while maintaining the multi-dimensional structure of filters, significantly reducing complexity.
  • Efficient Similarity Calculation Method: A novel and general method uses low-rank approximations provided by HOSVD for filter similarity calculation, avoiding complexities associated with using complete filters or reshaped versions.
  • Extensive Validation and Superior Performance: Extensive experiments on various architectures and datasets validate the effectiveness and superiority of the method.

Summary

CORING, a new and effective filter pruning method based on tensor decomposition, preserves the multi-dimensional structure and key information of filters and achieves superior performance across multiple computer vision tasks, demonstrating significant potential and broad applicability in model compression. The newly proposed k-shots strategy additionally offers a flexible approach to pruning while maintaining high accuracy.
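The k-shots idea can be sketched as a simple driver loop. This is a hypothetical interface: `prune_step` and `finetune` are placeholders standing in for the actual pruning and training routines, which the summary above does not spell out:

```python
def k_shots_prune(model, target_sparsity, k, prune_step, finetune):
    """Split the sparsity budget over k prune/fine-tune rounds.

    `prune_step(model, sparsity)` is assumed to prune the model up to the
    given overall sparsity; `finetune(model)` recovers accuracy afterwards.
    Both callables are illustrative placeholders.
    """
    for shot in range(1, k + 1):
        # Each shot prunes a fraction shot/k of the final budget, then
        # fine-tunes before pruning further.
        prune_step(model, target_sparsity * shot / k)
        finetune(model)
    return model
```

With k = 1 this reduces to one-shot pruning; larger k trades extra fine-tuning time for gentler, more controllable compression.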