Auto-Segmentation of Neck Nodal Metastases Using Self-Distilled Masked Image Transformer on Longitudinal MR Images
The Potential of the Self-Distilled Masked Image Transformer for Automatic Segmentation of Cervical Lymph Node Metastases on Longitudinal MRI
Report Introduction
Accurate, rapid tumor segmentation is crucial for personalized radiotherapy, and automatic segmentation promises to speed up contouring and reduce the inter-reader variability of manual delineation. Ramesh Paudyal and colleagues at Memorial Sloan Kettering Cancer Center implemented the “Self-Distilled Masked Image Transformer” (SMIT) algorithm and evaluated its accuracy in automatically segmenting cervical lymph node metastases on longitudinal T2-weighted MRI of patients with oropharyngeal squamous cell carcinoma.
This paper was published in BJR|Artificial Intelligence, Issue 1, 2024. The study was completed by Ramesh Paudyal, Jue Jiang, James Han, Bill H. Diplas, Nadeem Riaz, Vaios Hatzoglou, Nancy Lee, Joseph O. Deasy, Harini Veeraraghavan, and Amita Shukla-Dave, all from Memorial Sloan Kettering Cancer Center in New York, USA.
Research Background
Cross-sectional imaging techniques commonly used for tumor detection and radiotherapy planning include computed tomography (CT) and magnetic resonance imaging (MRI). Precise radiotherapy depends on accurate identification and segmentation of organs at risk (OARs) and tumors. CT is limited by metal artifacts and by poor definition of the boundary between tumor and normal soft tissue, whereas MRI provides superior soft tissue contrast and is emerging as an imaging method for radiotherapy planning, delivery, and evaluation.
Traditionally, radiologists segment head and neck tumor boundaries manually, a process that is time-consuming and susceptible to inter-reader variability, especially when tumors are morphologically complex or images contain artifacts. Atlas-based segmentation methods can save time, but they still require manual editing for small target volumes in the head and neck region. More recently, deep learning techniques have shown potential for automatic segmentation with better efficiency and consistency than manual or atlas-based segmentation.
Research Methods
This study included 123 patients with human papillomavirus-positive (HPV+) oropharyngeal squamous cell carcinoma (OPSCC) who received concurrent chemoradiotherapy. Longitudinal T2-weighted (T2W) MRI was acquired before treatment (week 0) and during treatment (weeks 1-3). Cervical lymph node metastases were delineated manually and segmented automatically with the SMIT algorithm, and total tumor volume was calculated from each segmentation. SMIT and manual volumes were compared using the Wilcoxon signed-rank test (WSRT), and their association was assessed with Spearman rank correlation coefficients.
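The total tumor volume mentioned above can be derived from a binary segmentation mask by counting labeled voxels and multiplying by the voxel size. The following is a minimal sketch, not the authors' code: the function name `mask_volume_cm3`, the nested-list mask layout, and the voxel spacing passed in are all assumptions made for illustration, since the source does not report the image resolution.

```python
def mask_volume_cm3(mask, spacing_mm):
    """Return the volume of a binary 3D mask in cm^3.

    mask: nested lists indexed as [slice][row][column], values 0 or 1
          (hypothetical layout chosen for this sketch).
    spacing_mm: (dz, dy, dx) voxel size in millimetres (assumed input).
    """
    # Count every voxel labeled as tumor across all slices.
    voxel_count = sum(value for sl in mask for row in sl for value in row)
    dz, dy, dx = spacing_mm
    voxel_volume_mm3 = dz * dy * dx
    # 1 cm^3 = 1000 mm^3
    return voxel_count * voxel_volume_mm3 / 1000.0
```

Because all nodal metastases of a patient can share one mask, the same call yields the total volume across nodes.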
Data Acquisition and Processing
MRI images were acquired on a Philips 3T scanner (Ingenia). The standard acquisition protocol included multiplanar T2W imaging (echo time [TE] = 80 ms, repetition time [TR] = 4099-5939 ms). Of the 123 patients, 95 were used for training, 10 for validation, and 18 for testing.
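A patient-level split with these counts could be sketched as follows. The source does not say how patients were assigned to each subset, so the seeded random shuffle, the function name `split_patients`, and the integer patient IDs are assumptions for illustration only.

```python
import random


def split_patients(patient_ids, n_train=95, n_val=10, n_test=18, seed=0):
    """Split patient IDs into disjoint train/validation/test lists.

    Splitting by patient (not by scan) keeps all longitudinal scans of
    one patient in the same subset. The random assignment is an
    assumption; the source does not describe the procedure.
    """
    ids = list(patient_ids)
    if len(ids) != n_train + n_val + n_test:
        raise ValueError("subset sizes must add up to the number of patients")
    rng = random.Random(seed)  # fixed seed makes the split reproducible
    rng.shuffle(ids)
    train = ids[:n_train]
    val = ids[n_train:n_train + n_val]
    test = ids[n_train + n_val:]
    return train, val, test


# Hypothetical IDs 0..122 stand in for the 123 patients.
train_ids, val_ids, test_ids = split_patients(range(123))
```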
Deep Learning Automatic Segmentation Algorithm
SMIT used a pre-trained 3D Swin transformer to segment cervical lymph node metastases, with a U-Net decoder for refinement. The model was trained for 500 epochs, with a total training time of 20 hours, and fine-tuned using 5-fold cross-validation. Each inference run took about 2 seconds, including data loading and segmentation.
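The 5-fold cross-validation step can be illustrated with a small index-partitioning sketch. It shows only how cases are divided into folds, not the model training itself. The source does not state which cases were partitioned or whether they were shuffled first; applying the folds to the 95 training cases in their given order is an assumption, and `k_fold_indices` is a hypothetical helper name.

```python
def k_fold_indices(n_items, k=5):
    """Yield (train_indices, validation_indices) for each of k folds.

    Folds are contiguous and differ in size by at most one item.
    No shuffling is done here; shuffle the items beforehand if needed.
    """
    if not 2 <= k <= n_items:
        raise ValueError("k must be between 2 and n_items")
    indices = list(range(n_items))
    # Spread the remainder over the first folds so sizes stay balanced.
    fold_sizes = [n_items // k + (1 if i < n_items % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        training = indices[:start] + indices[start + size:]
        yield training, validation
        start += size


# Assumed use: 5 folds over 95 training cases, 19 validation cases per fold.
folds = list(k_fold_indices(95, k=5))
```

Each case appears in exactly one validation fold, so every case contributes to validation once across the five runs.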
Statistical Analysis
Segmentation accuracy of the SMIT algorithm was evaluated using the Dice similarity coefficient (DSC). The WSRT was used to compare manual and automatic segmentation volumes, and the Spearman rank correlation coefficient was used to assess the correlation between SMIT-derived and manually delineated tumor volumes. Statistical significance was set at p < 0.05.
Research Results
Comparison of Manual and Automatic Segmentation
There was no significant difference between SMIT-segmented and manually delineated pre-treatment (pre-tx) tumor volumes (8.68±7.15 vs 8.38±7.01 cm³, p=0.26 [WSRT]). Bland-Altman analysis showed good agreement between the two methods, and overall tumor volumes were significantly correlated (ρ=0.84-0.96, p<0.001). On the test dataset, DSC values were 0.86, 0.85, 0.77, and 0.79 at pre-treatment (week 0) and on-treatment weeks 1, 2, and 3, respectively.
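For readers unfamiliar with Bland-Altman analysis, it summarizes agreement between two measurement methods by the mean of the paired differences (the bias) and the 95% limits of agreement, bias ± 1.96 × the standard deviation of the differences. A minimal sketch follows; the function name `bland_altman_limits` is hypothetical, and the inputs would be paired SMIT and manual volumes, which are not reproduced here.

```python
from statistics import mean, stdev


def bland_altman_limits(auto_volumes, manual_volumes):
    """Return (bias, lower_limit, upper_limit) for paired measurements.

    bias is the mean of (auto - manual); the limits of agreement are
    bias +/- 1.96 sample standard deviations of the differences.
    """
    if len(auto_volumes) != len(manual_volumes):
        raise ValueError("inputs must be paired and of equal length")
    if len(auto_volumes) < 2:
        raise ValueError("at least two pairs are needed")
    differences = [a - m for a, m in zip(auto_volumes, manual_volumes)]
    bias = mean(differences)
    spread = 1.96 * stdev(differences)
    return bias, bias - spread, bias + spread
```

A bias near zero with narrow limits indicates that the automatic volumes track the manual ones closely.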
Research Highlights
- Innovation: The first evaluation of SMIT automatic segmentation on longitudinal T2W MRI in patients with HPV+ OPSCC.
- High Accuracy: SMIT segmentations agreed closely with manual segmentations (DSC of 0.77-0.86 on the test dataset, with no significant difference in pre-treatment volumes).
- Efficiency Improvement: Each inference run took about 2 seconds, a substantial reduction in time cost compared with manual delineation.
Research Significance
The SMIT algorithm evaluated in this study offers a practical automatic segmentation solution for tumor radiotherapy. It has the potential to reduce the inter-reader variability and workload of manual delineation, to improve segmentation efficiency and accuracy in clinical radiotherapy practice, and to serve as a reference for future clinical applications.
Conclusion
This study applied the SMIT algorithm to automatic segmentation of cervical lymph node metastases in patients with HPV+ OPSCC and achieved accurate, efficient segmentation, with promising prospects for future clinical radiotherapy practice. The methods and data it contributes also support open scientific research.