Noninvasive Grading of Glioma by Knowledge Distillation Based Lightweight Convolutional Neural Network

Review of Non-Invasive Glioma Grading Research: Lightweight Convolutional Neural Networks Based on Knowledge Distillation

Background

Gliomas are among the most common primary tumors of the central nervous system, and early detection is crucial. The World Health Organization (WHO) classifies gliomas from grade I to IV: grades I and II are low-grade gliomas (LGG), and grades III and IV are high-grade gliomas (HGG). Accurate grading is critical for assessing a patient's survival prospects.

Magnetic Resonance Imaging (MRI) is commonly used in the diagnosis and treatment of gliomas, and many researchers now apply machine learning and deep learning methods to glioma classification. For example, Zacharaki et al. applied the Support Vector Machine (SVM) algorithm to classify gliomas in MRI images, and Fatemeh et al. used Convolutional Neural Networks (CNN) for the same task. Most of these studies focus on improving classification accuracy, yet CNN architectures with large parameter counts are difficult to deploy in practical medical environments. Conversely, because glioma datasets are small, many studies can only use CNNs with few parameters, which makes it hard to improve classification accuracy.

For smart medical diagnosis on highly integrated Field Programmable Gate Arrays (FPGA), CNNs need both better performance and, through model compression, fewer parameters and less computation. To address these problems, the paper proposes a glioma classification method based on knowledge distillation (KD), which significantly reduces model parameters and computation while maintaining high accuracy. The study selects Inception-ResNet-V2 as the teacher model and SqueezeNet as the student model, and introduces the Squeeze-and-Excitation (SE) module to further improve the student model's performance.

Source Introduction

The authors of this paper, Ai Lingmei and Bai Wenhao, are both from the School of Computer Science at Shaanxi Normal University. The paper was presented at the 4th International Conference on Advanced Electronic Materials, Computers and Software Engineering (AEMCSE) in 2021 and was published by IEEE.

Research Details

Workflow

The research workflow includes the following steps:

  1. Selecting Teacher and Student Models:

    • Compare the performance of ResNet18, ResNet34, ResNet50, and Inception-ResNet-V2, and select Inception-ResNet-V2 as the teacher model.
    • Compare ResNet18, AlexNet-v2, and SqueezeNet, and select SqueezeNet as the student model.
  2. Data Preparation:

    • Obtain MRI data from 130 patients from The Cancer Imaging Archive (TCIA) and expand the dataset to 59,878 samples through data augmentation.
    • The dataset includes various types of gliomas, such as Astrocytoma II, Oligodendroglioma II, Astrocytoma III, etc. The data are ultimately grouped into three categories: healthy, low-grade glioma, and high-grade glioma.
  3. Network Architecture Design:

    • Compare the GPU usage, number of parameters, FLOPs (floating-point operations), and model size of each candidate teacher model, and ultimately select Inception-ResNet-V2 because it achieves the best classification accuracy.
    • SqueezeNet is chosen as the student model due to its significant advantages in GPU load, number of parameters, FLOPs, and model size.
  4. Introducing SE Module for Improvement:

    • Although SqueezeNet effectively reduces computation costs, its classification accuracy is relatively low. Therefore, the study introduces the SE module (Squeeze-and-Excitation block) to improve model performance, particularly classification accuracy.
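To make the SE module mentioned in the last step more concrete, the following is a minimal sketch of a generic Squeeze-and-Excitation block in plain Python. It is not the authors' implementation: the review does not state where the block is inserted into SqueezeNet or which reduction ratio is used, so the function name `se_block`, the reduction ratio of 2, the omission of biases, and the random toy weights are all assumptions made for illustration.

```python
import math
import random


def se_block(feature_map, w_reduce, w_expand):
    """Apply a generic Squeeze-and-Excitation block to one feature map.

    feature_map: list of C channels, each an H x W list of lists.
    w_reduce:    (C // r) x C weight matrix of the bottleneck layer.
    w_expand:    C x (C // r) weight matrix of the expansion layer.
    Biases are omitted to keep the sketch short (an assumption).
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Excitation, step 1: bottleneck fully connected layer with ReLU.
    hidden = [
        max(0.0, sum(w * s for w, s in zip(weights, squeezed)))
        for weights in w_reduce
    ]
    # Excitation, step 2: expansion layer with sigmoid gives gates in (0, 1).
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(weights, hidden))))
        for weights in w_expand
    ]
    # Scale: reweight every channel by its gate.
    scaled = [
        [[gate * value for value in row] for row in channel]
        for gate, channel in zip(gates, feature_map)
    ]
    return scaled, gates


# Toy input (hypothetical): 4 channels of 2x2 activations, reduction ratio 2.
random.seed(0)
channels, reduced = 4, 2
feature_map = [[[float(c + 1)] * 2 for _ in range(2)] for c in range(channels)]
w_reduce = [[random.uniform(-1, 1) for _ in range(channels)] for _ in range(reduced)]
w_expand = [[random.uniform(-1, 1) for _ in range(reduced)] for _ in range(channels)]

scaled, gates = se_block(feature_map, w_reduce, w_expand)
print([round(g, 3) for g in gates])
```

The block adds only two small fully connected layers per insertion point, which is presumably why it suits a lightweight student: it lets the network emphasize informative channels at a modest cost in parameters.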

Experimental Process and Results

The training process is as follows:

  1. The dataset is divided into training, validation, and test sets in an 8:1:1 ratio, and the images are normalized.
  2. SqueezeNet is trained under the supervision of the teacher model, with the distillation temperature set to 5, the Adam optimizer, an initial learning rate of 0.001, a dropout rate of 0.2, 50 epochs in total, and a batch size of 64.
  3. The distilled SqueezeNet is compared with the original models on accuracy, precision, recall, and F1 score.
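The role of the temperature in the distillation step can be illustrated with a short sketch of the commonly used soft-target loss. This is an assumption about the form of the objective, since the review does not give the exact loss: it combines a temperature-softened KL divergence between teacher and student with a cross-entropy term on the true label. The temperature of 5 comes from the text, while the weighting factor `alpha`, the function names, and the example logits are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax with temperature; higher temperatures give flatter outputs."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=5.0, alpha=0.5):
    """Weighted sum of a soft-target term and a hard-label term.

    alpha is a hypothetical weighting factor; it is not given in the text.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # Soft term: KL(teacher || student) on temperature-softened distributions.
    soft = sum(pt * math.log(pt / ps)
               for pt, ps in zip(p_teacher, p_student) if pt > 0.0)
    # Hard term: cross-entropy of the unsoftened student output vs. the label.
    hard = -math.log(softmax(student_logits)[label])
    # The T^2 factor keeps the soft-term gradients on a comparable scale.
    return alpha * temperature ** 2 * soft + (1.0 - alpha) * hard


# Hypothetical logits for the classes (healthy, LGG, HGG); true label is LGG.
teacher_logits = [1.0, 4.0, 2.0]
student_logits = [0.5, 2.0, 1.5]
loss = distillation_loss(student_logits, teacher_logits, label=1)
print(round(loss, 4))
```

With a temperature above 1, the teacher's output distribution carries more information about how it ranks the incorrect classes, which is the extra signal the student learns from.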

The experimental results show that after knowledge distillation, the student model (SqueezeNet) improves significantly on all indicators. Relative to the undistilled SqueezeNet, its accuracy, precision, recall, and F1 score improve by 3.53%, 5.68%, 3.73%, and 4.69%, respectively; relative to the teacher model Inception-ResNet-V2, its GPU usage, number of parameters, FLOPs, and model size are reduced by 35%, 98.72%, 98.7%, and 98.62%, respectively.
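The indicators reported above can be computed as in the following sketch. It assumes macro averaging over the three classes (healthy, LGG, HGG), which the review does not specify, and it uses made-up labels and resource values purely for illustration; none of the numbers below come from the paper.

```python
def macro_metrics(y_true, y_pred, num_classes=3):
    """Accuracy plus macro-averaged precision, recall, and F1 score."""
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precisions, recalls, f1_scores = [], [], []
    for c in range(num_classes):
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1_scores.append(f1)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / num_classes,
        "recall": sum(recalls) / num_classes,
        "f1": sum(f1_scores) / num_classes,
    }


def relative_reduction(teacher_value, student_value):
    """Percentage by which the student is smaller than the teacher."""
    return 100.0 * (teacher_value - student_value) / teacher_value


# Made-up labels: 0 = healthy, 1 = LGG, 2 = HGG.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
metrics = macro_metrics(y_true, y_pred)
print({name: round(value, 4) for name, value in metrics.items()})

# Made-up resource figures: a cost of 200 units reduced to 50 units.
print(relative_reduction(200.0, 50.0))
```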

Conclusions and Significance

The knowledge distillation-based glioma classification method proposed in this paper not only significantly improves the accuracy of the lightweight CNN but also reduces the model's computation and parameter count, making it better suited to practical use in embedded devices and medical electronics. In future work, the authors plan to further optimize the efficiency and accuracy of the student model to advance the practical application of smart medical diagnosis.