An Attention-Guided CNN Framework for Segmentation and Grading of Glioma Using 3D MRI Scans
Gliomas are among the deadliest brain tumors in humans, and timely diagnosis is a crucial step toward effective treatment. Magnetic Resonance Imaging (MRI) provides a non-invasive means of examining brain lesions, but manual inspection of tumors in MRI scans is time-consuming and error-prone. Automatic tumor diagnosis therefore plays a critical role in the clinical management and surgical intervention of gliomas. In this study, we propose a Convolutional Neural Network (CNN)-based framework for non-invasive tumor grading from 3D MRI scans.
Background
Gliomas are common and often fatal brain tumors, graded from I to IV by invasiveness and malignancy. Low-grade tumors (Grades I-II) are generally less invasive and respond better to treatment, whereas high-grade tumors (Grades III-IV), such as glioblastoma (Grade IV), are highly invasive and have poor treatment outcomes, with only about 5% of patients surviving for 5 years.
Glioma research relies heavily on medical imaging, especially MRI, which provides high-resolution, high-contrast views of brain tissue and is therefore a pivotal tool for brain tumor analysis. In addition to imaging, glioma classification also involves genetic characteristics, such as isocitrate dehydrogenase (IDH) mutation status and 1p/19q chromosomal arm status. These molecular markers strongly influence a tumor's response to treatment.
Research Institutions and Publication Information
This study was completed by Prasun Chandra Tripathi and Soumen Bag from the Indian Institute of Technology (ISM) in Dhanbad, India. The paper was published on November 9, 2022, in the IEEE/ACM Transactions on Computational Biology and Bioinformatics.
Research Methods
The research method consists of two main steps: glioma segmentation and classification. The segmentation network adopts an encoder-decoder architecture, while the classification network uses a multi-task learning strategy.
Segmentation Network
The architecture of the segmentation network is as follows:
- Input Data: MRI images, including four modalities: T1, T1c, T2, and FLAIR.
- Encoder Part: Three downsampling layers, each halving the spatial size of the feature maps, interleaved with several residual blocks.
- Transition Part: Residual blocks for extracting deep features from encoded features.
- Decoder Part: Three upsampling layers to restore the original size, with long skip connections to preserve low-level image features.
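The downsampling/upsampling symmetry described above can be traced with a minimal shape sketch (the 128³ input size is an illustrative assumption, not the paper's exact crop size):

```python
def encode_decode_shapes(input_shape=(128, 128, 128)):
    """Trace spatial sizes through 3 downsampling and 3 upsampling stages."""
    shapes = [input_shape]
    size = input_shape
    for _ in range(3):                      # encoder: each stage halves every spatial dim
        size = tuple(d // 2 for d in size)
        shapes.append(size)
    for _ in range(3):                      # decoder: each stage doubles back to original
        size = tuple(d * 2 for d in size)
        shapes.append(size)
    return shapes
```

The long skip connections would concatenate each encoder feature map with the decoder stage of matching spatial size, so low-level detail lost to downsampling can be reused during upsampling.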
The segmentation network also incorporates spatial and channel attention mechanisms to refine feature maps: attention lets the CNN emphasize informative features and suppress irrelevant ones along both the channel and spatial dimensions.
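A minimal NumPy sketch of the two attention types, using plain sigmoid gating (the paper's actual modules presumably include learned weights, e.g. small MLPs or convolutions, which are omitted here):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat):
    """feat: (C, D, H, W). Weight each channel by a gated global descriptor."""
    desc = feat.mean(axis=(1, 2, 3))            # global average pool -> (C,)
    weights = sigmoid(desc)                      # per-channel gate in (0, 1)
    return feat * weights[:, None, None, None]   # rescale channels

def spatial_attention(feat):
    """feat: (C, D, H, W). Weight each voxel by a gated cross-channel descriptor."""
    desc = feat.mean(axis=0)                     # pool across channels -> (D, H, W)
    mask = sigmoid(desc)                         # per-voxel gate in (0, 1)
    return feat * mask[None, ...]                # rescale spatial positions
```

In both cases the output keeps the input's shape; attention only rescales activations, which is why the modules slot into an existing encoder-decoder without changing feature-map sizes.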
Multi-Task Classification Network
The architecture of the classification network is as follows:
- Input Data: The 3D tumor region obtained from the segmentation.
- Shared Backbone Network: Multiple convolution layers and residual blocks for feature extraction.
- Task-Specific Layers: Three fully connected layers responsible for low/high-grade classification, 1p/19q chromosomal status prediction, and IDH mutation status prediction.
Multi-task learning leverages information sharing between different tasks to improve classification accuracy.
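The shared-backbone/three-head layout can be sketched with plain matrix multiplies; the single shared linear layer here is an illustrative stand-in for the convolutional backbone, and the layer sizes are assumptions:

```python
import numpy as np

def multitask_forward(x, w_shared, heads):
    """x: (batch, features) pooled tumor features.
    w_shared: weights of the shared backbone layer.
    heads: dict mapping task name -> task-specific FC weights."""
    h = np.maximum(x @ w_shared, 0.0)          # shared representation (linear + ReLU)
    return {task: h @ w for task, w in heads.items()}  # one FC head per task

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 32))                   # 5 tumors, 32 pooled features
w_shared = rng.normal(size=(32, 16))
heads = {"grade": rng.normal(size=(16, 2)),    # low/high grade
         "idh": rng.normal(size=(16, 2)),      # IDH mutation status
         "1p19q": rng.normal(size=(16, 2))}    # 1p/19q status
logits = multitask_forward(x, w_shared, heads)
```

Because all three heads backpropagate through `w_shared` during training, features useful for one task (e.g. grade) can also sharpen the others (IDH, 1p/19q).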
Experimental Results
The experimental dataset combines multi-modal MRI data from BraTS 2019 and The Cancer Imaging Archive, comprising scans from 617 patients, which were used to train the model and evaluate its performance.
Segmentation Results
Segmentation performance was evaluated using metrics such as Dice Similarity Coefficient (DSC), Hausdorff distance, sensitivity, and specificity. The experimental results showed that the spatial and channel attention mechanisms significantly improved the model’s performance in segmentation tasks.
For example:
- Enhanced tumor region DSC improved from 0.7612 to 0.7712.
- Whole tumor region DSC improved from 0.8721 to 0.9002.
- Core tumor region DSC improved from 0.8090 to 0.8230.
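The Dice Similarity Coefficient reported above is straightforward to compute from binary masks; a small NumPy helper (the `eps` term, an implementation detail added here, avoids division by zero on empty masks):

```python
import numpy as np

def dice_coefficient(pred, target, eps=1e-7):
    """DSC = 2|A ∩ B| / (|A| + |B|) for binary masks of the same shape."""
    pred = np.asarray(pred).astype(bool)
    target = np.asarray(target).astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)
```

A DSC of 1.0 means the predicted and ground-truth tumor masks overlap perfectly; disjoint masks score 0.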
Classification Results
Classification performance was evaluated using classification accuracy, precision, specificity, sensitivity, and F1-score. The multi-task classification network performed strongly on all three tasks, with notable accuracy gains after adding the spatial and channel attention mechanisms.
For example:
- Low/High-grade classification accuracy improved from 91.00% to 95.86%.
- IDH status classification accuracy improved from 87.94% to 91.96%.
- 1p/19q status classification accuracy improved from 81.20% to 87.88%.
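All five reported metrics derive from the binary confusion matrix; a self-contained helper for the binary case (each of the three tasks here is a two-class problem):

```python
import numpy as np

def classification_metrics(y_true, y_pred):
    """Binary-classification metrics from 0/1 label arrays."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))  # true positives
    tn = int(np.sum((y_true == 0) & (y_pred == 0)))  # true negatives
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))  # false positives
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))  # false negatives
    accuracy = (tp + tn) / len(y_true)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0  # a.k.a. recall
    specificity = tn / (tn + fp) if tn + fp else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = (2 * precision * sensitivity / (precision + sensitivity)
          if precision + sensitivity else 0.0)
    return {"accuracy": accuracy, "sensitivity": sensitivity,
            "specificity": specificity, "precision": precision, "f1": f1}
```

Reporting sensitivity and specificity alongside accuracy matters here because the classes (e.g. IDH-mutant vs. wild-type) are typically imbalanced.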
Research Conclusions and Significance
By introducing attention mechanisms, our proposed CNN framework shows significant advantages in both glioma segmentation and classification tasks. In particular, the multi-task learning strategy enhances the robustness and effectiveness of the model in handling multiple classification tasks. This study provides a novel non-invasive method for diagnosing and assessing gliomas, presenting substantial scientific value and clinical application potential.
Future efforts will focus on developing models combining CNN-Transformer architectures to further enhance the performance of various classification and segmentation tasks, ultimately providing more effective tools for glioma diagnosis and treatment.