Residual-Dense Network for Glaucoma Prediction Using Structural Features of Optic Nerve Head

2025-02-05 Wed
glaucoma residual-dense network optic nerve head structural features deep learning optic cup optic disc
Using Residual Dense Network (RD-Net) for Glaucoma Prediction Based on Structural Features of the Optic Nerve Head
Background and Research Purpose
Glaucoma is one of the leading causes of blindness worldwide, often referred to as the “silent thief of sight.” It is characterized by the progressive degeneration of the optic nerve head (ONH), resulting in irreversible vision loss before patients become aware of visual impairment. Statistically, glaucoma is the second leading cause of blindness after cataracts. Early screening and accurate diagnosis are critical to managing disease progression and preserving visual function.
Clinically, glaucoma diagnosis primarily relies on structural and functional assessments, including intraocular pressure (IOP) measurement, structural evaluation of the ONH, and visual field tests. However, visual field assessments often require expensive equipment, making them less accessible in primary healthcare settings. By analyzing ONH structural features, such as the cup-to-disc ratio (CDR), disc damage likelihood scale (DDLS), and the ISNT rule (Inferior, Superior, Nasal, Temporal rim width relationship), early and effective disease screening can be achieved.
Although several methods have been developed to automatically detect ONH structural damage, these approaches frequently rely on CDR as the sole indicator, overlooking other critical anatomical features. Furthermore, manual assessment of the ONH is time-consuming, costly, and prone to subjectivity. Against this backdrop, the researchers aimed to develop a hybrid deep learning model, called Residual Dense Network (RD-Net), to precisely segment the optic disc (OD) and optic cup (OC) and to predict glaucoma from these segmentation results.
Source of the Paper
This research was conducted by Preity, Ashish Kumar Bhandari, Akanksha Jha, and Syed Shahnawazuddin from the Department of Electronics and Communication Engineering at the National Institute of Technology Patna, India. The study was published in Volume 6, Issue 1 of IEEE Transactions on Artificial Intelligence (January 2025). The paper demonstrates the effectiveness of the proposed model through experiments on four benchmark datasets (Drishti, RIMONE, ORIGA, and REFUGE).
Research Methods and Workflow
Overview of the Workflow
The research consists of the following three main stages:
1. Image preprocessing: Includes data augmentation and label encoding.
2. Construction and training of the RD-Net model: Enhancements to the traditional U-Net with additional deep learning modules.
3. Feature extraction and glaucoma prediction: Calculating CDR, DDLS, and ISNT values based on RD-Net segmentation results.
1. Image Preprocessing
Given the relative scarcity of retinal image datasets for glaucoma, the study used the Albumentations library to apply several augmentation operations, including random rotation, flipping, elastic distortion, optical distortion, and grid warping, and resized all images to a uniform resolution of 256×256 pixels. Data augmentation generated six different versions for each image, significantly increasing the training dataset size (e.g., the Drishti dataset expanded from 30 training images to 300).
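The key constraint in augmenting segmentation data is that every geometric transform must be applied identically to the image and to its OD/OC label mask. The following is a minimal sketch of that idea, not the paper's pipeline: it uses only NumPy flips and 90-degree rotations as stand-ins for the Albumentations transforms listed above, and the function names (`resize_nearest`, `augment_pair`, `augment_dataset`) are hypothetical.

```python
import numpy as np

def resize_nearest(arr, size=256):
    """Nearest-neighbour resize of an (H, W) or (H, W, C) array to size x size.

    Nearest-neighbour keeps mask labels intact (no blended label values).
    It is used for the image too, purely to keep the sketch short.
    """
    h, w = arr.shape[:2]
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return arr[rows][:, cols]

def augment_pair(image, mask, rng, size=256):
    """Apply one random rotation/flip to an image and its mask together."""
    k = int(rng.integers(0, 4))              # number of 90-degree turns
    image, mask = np.rot90(image, k), np.rot90(mask, k)
    if rng.random() < 0.5:                   # horizontal flip
        image, mask = image[:, ::-1], mask[:, ::-1]
    if rng.random() < 0.5:                   # vertical flip
        image, mask = image[::-1], mask[::-1]
    return resize_nearest(image, size), resize_nearest(mask, size)

def augment_dataset(images, masks, versions=6, seed=0):
    """Return `versions` augmented (image, mask) pairs per input pair."""
    rng = np.random.default_rng(seed)
    out = []
    for image, mask in zip(images, masks):
        for _ in range(versions):
            out.append(augment_pair(image, mask, rng, size=256))
    return out
```

Because the same `k` and the same flip decisions are used for both arrays, each augmented mask stays pixel-aligned with its augmented image.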
2. RD-Net Network Architecture
RD-Net is a hybrid deep network based on an enhanced U-Net, incorporating Dense Residual Blocks and Squeeze-Excitation (SE) Blocks. Key architectural details include 18 convolutional layers, 4 max-pooling layers, and skip connections between the encoder and decoder. The overall architecture is divided into two parts: encoder and decoder.
Encoder Section: Consists of five levels of convolutional modules, each including convolution, batch normalization, nonlinear activation (ReLU), and dense residual blocks. Max-pooling layers follow each convolution to extract critical features while reducing spatial dimensions.
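As a quick check on the encoder geometry, the sketch below traces the side length of the feature maps through the five encoder levels. It assumes 2×2 max-pooling with stride 2, convolutions that preserve spatial size within a level, and pooling only between levels (which is consistent with the four max-pooling layers mentioned earlier). These details and the function name are assumptions, not taken from the paper.

```python
def encoder_sizes(input_size=256, levels=5, pool=2):
    """Spatial side length at each encoder level (assumed stride-2 pooling)."""
    sizes = [input_size]
    for _ in range(levels - 1):      # one pooling step between consecutive levels
        sizes.append(sizes[-1] // pool)
    return sizes

sizes = encoder_sizes()
print(sizes)                 # [256, 128, 64, 32, 16]
print(list(reversed(sizes))) # sizes the decoder would restore: [16, 32, 64, 128, 256]
```

Under these assumptions, four upsampling steps in the decoder are exactly what is needed to return from 16×16 to the 256×256 input resolution.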
Decoder Section: Consists of four levels of upsampling. Outputs from the current decoder and corresponding encoder stages are integrated via skip connections, progressively restoring the original resolution. The final layer incorporates an SE Block and 1×1 convolution to generate precise segmentation maps.
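To illustrate what an SE block does, here is a minimal NumPy sketch of the standard squeeze-and-excitation computation on a single feature map: global average pooling per channel, a small two-layer gate, and channel-wise rescaling. The reduction ratio `r`, the random (untrained) weights, and the function name are assumptions for illustration; the paper's exact configuration is not given in this summary.

```python
import numpy as np

def se_block(x, w1, b1, w2, b2):
    """Squeeze-and-excitation on a feature map x of shape (H, W, C)."""
    squeezed = x.mean(axis=(0, 1))                      # squeeze: global average pool -> (C,)
    hidden = np.maximum(0.0, squeezed @ w1 + b1)        # excitation, reduced to C // r units, ReLU
    gates = 1.0 / (1.0 + np.exp(-(hidden @ w2 + b2)))   # sigmoid -> per-channel weights in (0, 1)
    return x * gates                                    # rescale every channel by its gate

rng = np.random.default_rng(0)
C, r = 16, 4                                            # r is a hypothetical reduction ratio
x = rng.standard_normal((8, 8, C))
w1, b1 = rng.standard_normal((C, C // r)) * 0.1, np.zeros(C // r)
w2, b2 = rng.standard_normal((C // r, C)) * 0.1, np.zeros(C)
y = se_block(x, w1, b1, w2, b2)
print(y.shape)  # (8, 8, 16)
```

The output keeps the input shape; only the relative strength of the channels changes, which lets the network emphasize channels that are informative for the segmentation map.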
The model uses the Adam optimizer with an initial learning rate of 0.001 and He initialization for the weights. Dropout layers with a rate of 0.2 provide regularization against overfitting.
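For readers unfamiliar with these two ingredients, the sketch below shows what He initialization and dropout compute, written in plain NumPy rather than TensorFlow. It assumes the normal variant of He initialization (standard deviation sqrt(2 / fan_in), without the truncation some frameworks apply) and the usual "inverted" dropout that rescales kept activations; the function names are hypothetical.

```python
import numpy as np

def he_normal(shape, rng):
    """He-style initialization for a kernel shaped (..., in_channels, out_channels)."""
    fan_in = int(np.prod(shape[:-1]))          # receptive field size * input channels
    return rng.standard_normal(shape) * np.sqrt(2.0 / fan_in)

def dropout(x, rate, rng, training=True):
    """Inverted dropout: zero a fraction `rate` of units and rescale the rest."""
    if not training:
        return x                               # dropout is disabled at inference time
    keep = rng.random(x.shape) >= rate
    return x * keep / (1.0 - rate)

rng = np.random.default_rng(0)
kernel = he_normal((3, 3, 64, 128), rng)       # hypothetical 3x3 conv, 64 -> 128 channels
activations = dropout(np.ones((4, 4, 128)), rate=0.2, rng=rng)
```

With a rate of 0.2, roughly one activation in five is zeroed during training, and the surviving ones are scaled by 1 / 0.8 so the expected activation is unchanged.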
3. Feature Extraction and Prediction
Three key structural features are extracted from RD-Net’s OD and OC segmentation maps:
- CDR Calculation: Determined by the ratio of the vertical length of the optic cup to the vertical length of the optic disc, commonly regarded as a critical early indicator of glaucoma.
- DDLS Calculation: Defined as the ratio of the minimum neuroretinal rim width to OD diameter. Values below 0.3 indicate glaucoma-related damage.
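- ISNT Rule: In healthy eyes the neuroretinal rim is typically widest inferiorly, followed by the superior, nasal, and temporal sectors (Inferior ≥ Superior ≥ Nasal ≥ Temporal rim width); deviations from this order may indicate abnormalities.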
These features are analyzed to further classify glaucoma risk levels.
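The three features can be approximated directly from binary OD and OC masks. The sketch below is a simplified illustration under several assumptions that are not from the paper: rim widths are measured from the bounding extents of the masks in only four directions (a full DDLS assessment would search all angles for the narrowest rim), the disc diameter is taken as the vertical diameter, the image is oriented with superior at the top, and `nasal_on_right` is a hypothetical flag for eye laterality.

```python
import numpy as np

def extent(mask, axis):
    """First and last index along `axis` where a 2-D binary mask is non-zero."""
    idx = np.flatnonzero(mask.any(axis=1 - axis))
    return int(idx[0]), int(idx[-1])

def structural_features(disc, cup, nasal_on_right=True):
    """Vertical CDR, rim widths, a DDLS-style ratio, and an ISNT check."""
    d_top, d_bot = extent(disc, 0)
    d_left, d_right = extent(disc, 1)
    c_top, c_bot = extent(cup, 0)
    c_left, c_right = extent(cup, 1)

    disc_h = d_bot - d_top + 1                 # vertical disc diameter in pixels
    cup_h = c_bot - c_top + 1                  # vertical cup diameter in pixels
    vcdr = cup_h / disc_h

    rim = {"inferior": d_bot - c_bot, "superior": c_top - d_top}
    right, left = d_right - c_right, c_left - d_left
    rim["nasal"], rim["temporal"] = (right, left) if nasal_on_right else (left, right)

    ddls_ratio = min(rim.values()) / disc_h    # narrowest measured rim / disc diameter
    isnt_ok = rim["inferior"] >= rim["superior"] >= rim["nasal"] >= rim["temporal"]
    return {"vcdr": vcdr, "ddls_ratio": ddls_ratio, "rim": rim, "isnt_ok": isnt_ok}

# Synthetic circular masks, for demonstration only.
yy, xx = np.ogrid[:200, :200]
disc = (yy - 100) ** 2 + (xx - 100) ** 2 <= 50 ** 2
cup = (yy - 102) ** 2 + (xx - 101) ** 2 <= 22 ** 2
print(structural_features(disc, cup))
```

In practice the masks would come from the segmentation network, and the pixel measurements are scale-free because every feature is a ratio or an ordering.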
Datasets and Experimental Setup
The study used four publicly available benchmark datasets:
- Drishti: Contains 101 retinal images with ground truth for OD and OC segmentation.
- RIMONE-DL: Includes 485 images manually annotated by specialists.
- ORIGA: Contains 650 images with ground truth for combined OD and OC segmentation.
- REFUGE: A glaucoma challenge dataset with 400 images in total.
The model was trained and tested using an NVIDIA Tesla T4 GPU, implemented with the TensorFlow framework.
Results and Analysis
Qualitative Analysis
Through experiments on the four datasets, the segmentation results generated by RD-Net exhibited sharp edges and reduced artifacts. Compared to traditional U-Net and U-Net++, the segmentation quality of RD-Net was superior, as shown in the results from the Drishti dataset. Even in complex scenarios (e.g., full retinal images versus cropped regions), RD-Net consistently delivered clear segmentation.
In addition, RD-Net demonstrated consistent high-precision segmentation in joint OD and OC segmentation for the ORIGA and REFUGE datasets.
Quantitative Analysis
Seven standard metrics, including Dice Coefficient (DC), Intersection over Union (IoU), and Accuracy, were used to evaluate model performance. Compared to existing algorithms such as U-Net, ResUNet, and K-means clustering, RD-Net achieved segmentation accuracies of 98.94% for OC and 99.40% for OD across all four datasets. Cross-dataset testing further showed robust performance of RD-Net.
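For reference, the three metrics named above have standard definitions in terms of true/false positives and negatives on a binary mask. The sketch below computes them with NumPy; the function name is hypothetical, and it assumes at least one of the two masks contains foreground pixels (otherwise Dice and IoU are undefined).

```python
import numpy as np

def segmentation_metrics(pred, truth):
    """Dice coefficient, IoU, and pixel accuracy for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = np.sum(pred & truth)        # foreground predicted as foreground
    fp = np.sum(pred & ~truth)       # background predicted as foreground
    fn = np.sum(~pred & truth)       # foreground predicted as background
    tn = np.sum(~pred & ~truth)      # background predicted as background
    return {
        "dice": float(2 * tp / (2 * tp + fp + fn)),
        "iou": float(tp / (tp + fp + fn)),
        "accuracy": float((tp + tn) / pred.size),
    }

print(segmentation_metrics([[1, 1, 0, 0]], [[1, 0, 1, 0]]))
# {'dice': 0.5, 'iou': 0.3333333333333333, 'accuracy': 0.5}
```

Accuracy counts background pixels as correct predictions, so it tends to run higher than Dice or IoU when the optic cup or disc occupies a small part of the image; that is one reason to report all three.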
Ablation Studies and Complexity Analysis
The researchers performed ablation studies by removing modules or changing hyperparameters to validate the effectiveness of residual and SE blocks. For instance, removing the SE block significantly reduced segmentation quality (Dice Coefficient dropped to 0.91 on the Drishti dataset). While RD-Net is slightly more complex in terms of computational requirements, its superior performance justifies the increased complexity.
Glaucoma Assessment
Following segmentation, features such as ISNT rule violations, abnormal DDLS, and CDR values were extracted to facilitate glaucoma prediction. For example, for an image with a vertical CDR of 0.45, ISNT values and corresponding DDLS analysis classified it as non-glaucoma. Another image with vertical CDR of 0.6878 was identified as a suspected glaucoma case.
Significance and Value of the Study
The proposed RD-Net model not only achieves high-accuracy OD and OC segmentation but also enhances glaucoma prediction by leveraging multiple structural indicators. Particularly in resource-constrained healthcare environments, automated segmentation and prediction technologies can significantly reduce costs and improve diagnostic efficiency.
Future research directions include:
1. Developing severity grading systems for glaucoma.
2. Extending the model to other retinal diseases, such as diabetic retinopathy.
By integrating state-of-the-art deep learning modules, RD-Net provides a powerful tool for ophthalmic image analysis and has the potential for broad applications in public health and ophthalmology.