Asthma Prediction via Affinity Graph Enhanced Classifier: A Machine Learning Approach Based on Routine Blood Biomarkers
Background
Asthma is a chronic respiratory disease that affects approximately 235 million people worldwide. According to the World Health Organization (WHO), the main characteristic of asthma is airway inflammation, which causes symptoms such as wheezing, shortness of breath, and chest tightness. Accurate and timely diagnosis is crucial for effective management and treatment. However, traditional diagnosis combines medical history, physical examination, and lung function tests, which is both expensive and time-consuming. For patients with atypical symptoms in particular, this can lead to delayed diagnosis or misdiagnosis. Diagnosing asthma in children is especially challenging, and the time required by traditional methods may make the problem worse.
Machine learning (ML) offers considerable potential for analyzing medical data, identifying patterns, and generating predictions. This study uses an Affinity Graph Enhanced Classifier (AGEC) to improve the accuracy of asthma prediction.
Source of the Paper
This research paper was authored by Dejing Li, Stanley Ebhohimhen Abhadiomhen, Dongmei Zhou, Xiang-Jun Shen, Lei Shi, and Yubao Cui, and published in the “Journal of Translational Medicine,” volume 22, article number 100, in 2024. The affiliated institutions include Wuxi People’s Hospital of Nanjing Medical University, Jiangsu University, and University of Nigeria, among others. The paper was accepted and published on January 6, 2024, and is available under Open Access.
Research Process
Data Collection
The clinical dataset used in the study consists of 152 samples from asthma patients at Shuguang Hospital Affiliated to Shanghai University of Traditional Chinese Medicine. Patients range in age from 20 to 100: 18.4% of samples are aged 20 to 40, 47.4% are aged 50 to 69, and 34.2% are over 70. By gender, 40% of the samples are male and 60% are female.
From each record, 24 indicators were extracted, covering the routine blood differential and red blood cell indices. The candidate predictors used in the classification program include White Blood Cell count (WBC), Neutrophil percentage (NE%), Lymphocyte percentage (LY%), Monocyte percentage (MO%), Eosinophil percentage (EO%), Basophil percentage (BA%), Red Blood Cell count (RBC), Hemoglobin (HGB), Hematocrit (HCT), Mean Corpuscular Volume (MCV), Platelet count (PLT), etc.
Model Construction
Traditional multi-label learning models learn a mapping from the feature space to the label space. The new model reduces the dimensionality of the feature space by introducing a projection matrix P, while capturing intrinsic relationships between samples through the affinity graph W.
The formulas are as follows:
\[
\begin{aligned}
&1.\ \text{Optimization goal: } \min \|y - zw\|_F^2 + \|z\|_F^2 \\
&2.\ \text{Introducing relationship matrix } W:\ \sum_{i,j} \|P(x_i - x_j)\|_F^2 \, W_{ij} \\
&3.\ \text{Combining the projection matrix } P \text{ and the optimization model} \Rightarrow w \\
&4.\ \text{Constructing a new classifier } Z \text{ to obtain the optimized model}
\end{aligned}
\]
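To make the graph term in step 2 concrete, the following is a minimal sketch of how the sum over pairs of ||P(x_i - x_j)||^2 weighted by W_ij could be evaluated. It is not the authors' implementation: in the paper W is learned together with P, whereas here W is simply given. The helper `gaussian_affinity`, its `sigma` parameter, and the function names are hypothetical and used for illustration only.

```python
import math

def gaussian_affinity(X, sigma=1.0):
    """Hypothetical fixed affinity: W_ij = exp(-||x_i - x_j||^2 / (2 sigma^2)).

    Assumption for illustration only; the paper learns W instead of fixing it.
    The diagonal is left at zero so a sample is not its own neighbour.
    """
    n = len(X)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum((a - b) ** 2 for a, b in zip(X[i], X[j]))
                W[i][j] = math.exp(-d2 / (2.0 * sigma ** 2))
    return W

def project(P, x):
    """Apply a k x d projection matrix P (list of rows) to a d-dimensional sample x."""
    return [sum(p * v for p, v in zip(row, x)) for row in P]

def graph_penalty(P, X, W):
    """Evaluate sum_{i,j} ||P(x_i - x_j)||^2 * W_ij.

    Uses linearity: P(x_i - x_j) = P x_i - P x_j, so each sample is projected once.
    """
    Z = [project(P, x) for x in X]
    total = 0.0
    for i in range(len(X)):
        for j in range(len(X)):
            d2 = sum((a - b) ** 2 for a, b in zip(Z[i], Z[j]))
            total += d2 * W[i][j]
    return total
```

The penalty is small when samples with a large affinity W_ij stay close together after projection, which is the sense in which the graph term preserves relationships between samples in the reduced space.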
Model Optimization
The model is solved with the Augmented Lagrange Multiplier (ALM) method: update rules for each variable are derived from the Lagrangian function, the variables are updated in turn to minimize the loss function, and the hyperparameters are then tuned. The minimization yields the projection matrix P and the affinity graph matrix W.
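The general ALM pattern can be illustrated on a toy problem. The sketch below does not use the AGEC objective; it minimizes x^2 + y^2 subject to x + y = 1, a problem chosen here only because each step has a closed form. The function name `alm_toy` and the parameters `mu` and `iters` are assumptions made for the example.

```python
def alm_toy(mu=1.0, iters=60):
    """ALM on a toy problem: minimize x^2 + y^2 subject to x + y = 1.

    This is NOT the AGEC objective; it only shows the alternating
    primal update / multiplier update structure of the method.
    """
    lam = 0.0      # Lagrange multiplier for the constraint x + y - 1 = 0
    x = y = 0.0
    for _ in range(iters):
        # Primal step: minimize the augmented Lagrangian
        #   x^2 + y^2 + lam*(x + y - 1) + (mu/2)*(x + y - 1)^2
        # Setting both partial derivatives to zero gives x = y = t with:
        t = (mu - lam) / (2.0 + 2.0 * mu)
        x = y = t
        # Dual step: move the multiplier along the remaining constraint violation.
        lam += mu * (x + y - 1.0)
    return x, y, lam
```

Calling `alm_toy()` converges to x = y = 0.5 with multiplier -1, the constrained minimizer of the toy problem. In a model with several variables, the primal step is typically replaced by one update per variable, in the same alternating fashion.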
Research Results
Experimental results show that AGEC achieves higher accuracy in asthma prediction than existing methods, including a multi-label learning algorithm (MLFE), Support Vector Machine (SVM), and Exclusive Regularized Machine (ERM). Specifically, the prediction accuracy of the AGEC model is 72.50%, compared with 64.01% for Support Vector Regression (SVR) and 61.02% for improved AdaBoost.
Model performance was also measured with Receiver Operating Characteristic (ROC) curves and Area Under Curve (AUC) values. AGEC’s AUC value is 74.01%, higher than that of the other models. The reported p-values indicate that the differences between models are statistically significant, supporting the effectiveness of AGEC.
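For readers unfamiliar with the AUC metric, the sketch below computes it directly from its rank interpretation: the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. The function name `roc_auc` and the example labels and scores are illustrative assumptions, not data from the study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 0/1 ground truth; scores: classifier outputs (higher = more positive).
    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Made-up example: 3 of the 4 positive/negative pairs are ordered correctly.
example_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])  # 0.75
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so a value such as 74.01% is read on that scale.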
Confusion Matrix
In the confusion matrix for AGEC, the diagonal cells, which hold the correctly classified samples, are shaded darker, while the off-diagonal cells are lighter, indicating fewer misclassifications.
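As a small illustration of how such a matrix is built and how accuracy follows from it, consider the sketch below. The function names and the label vectors are hypothetical; they are not the study's predictions.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """Count matrix cm where cm[t][p] is the number of samples
    with true class t that were predicted as class p."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm

def accuracy(cm):
    """Fraction of samples on the diagonal (correct classifications)."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total

# Made-up labels for a two-class problem.
y_true = [0, 0, 1, 1, 1]
y_pred = [0, 1, 1, 1, 0]
cm = confusion_matrix(y_true, y_pred, n_classes=2)  # [[1, 1], [1, 2]]
acc = accuracy(cm)                                   # 0.6
```

Large diagonal counts relative to the off-diagonal ones correspond to the darker diagonal shading described above.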
Impact of Different Feature Groups
The experiments also compared different feature subsets and found that the model reached its highest accuracy (78.18%) with the first feature group. This indicates that appropriate feature selection is crucial for the performance of classification models.
Conclusion and Significance
The AGEC method proposed in this study, an affinity graph-based machine learning model, improves asthma prediction over the compared methods. By analyzing routine blood biomarkers, it offers a new way to predict asthma more accurately, which can help clinicians identify and manage asthma patients in a timely manner and reduce the risk of exacerbation and hospitalization.
Furthermore, the data-driven nature of AGEC and its scalability to other disease prediction tasks offer a framework for future research. Ultimately, the potential application of AGEC in early asthma detection can lead to more proactive and targeted interventions, optimizing patient care and reducing healthcare costs.
Research Contribution and Funding Support
This research was supported by the Wuxi Taihu Talent Plan Top Talent Project (2020THRC-GD-7), the 2022 Jiangsu Province 333 Project (202221001), and the Wuxi Science and Technology Bureau “Taihu Light” Key Technology Project (Y20212006). All data and code can be provided by the corresponding author. The experimental protocol adhered to the ethical standards of the Declaration of Helsinki and was approved by the ethics committee of Shuguang Hospital Affiliated to Shanghai University of Traditional Chinese Medicine.