Predictive Model for Daily Risk Alerts in Sepsis Patients in the ICU: Visualization and Clinical Analysis of Risk Indicators

Sepsis is a systemic inflammatory response syndrome triggered by infection, often leading to multiple organ failure and high mortality rates. Although modern medical technology has made significant progress in the treatment of sepsis, some patients still die due to the rapid deterioration of their condition. Therefore, accurately predicting the mortality risk of sepsis patients is crucial for clinicians to develop timely and personalized intervention strategies. However, existing clinical scoring systems (such as APACHE-II and SOFA scores), while capable of assessing the overall condition of critically ill patients, are not specifically optimized for sepsis patients. Additionally, traditional machine learning models often overlook the temporal characteristics of disease progression when processing time-series data, resulting in limited predictive performance.

To address these challenges, this study proposes a time-series model based on the Transformer architecture, aiming to capture the dynamic health trajectories of patients during their ICU stay, identify high-risk individuals in real time, and provide actionable insights for personalized interventions. This research not only improves the prediction accuracy of mortality risk in sepsis patients but also offers a new paradigm for ICU prognosis evaluation.

Source of the Paper

This paper was co-authored by Hao Yang, Jiaxi Li, Chi Zhang, Alejandro Pazos Sierra, and Bairong Shen. The authors are affiliated with the Information Center of West China Hospital, Sichuan University, the Department of Computer Science and Information Technologies at the University of A Coruña, Spain, the Clinical Laboratory Medicine Department of Jinniu Maternity and Child Health Hospital in Chengdu, and the Department of Critical Care Medicine and Institutes for Systems Genetics at West China Hospital, Sichuan University. The paper was published on February 8, 2025, in the journal Precision Clinical Medicine, with the DOI 10.1093/pcmedi/pbaf003.

Research Process and Details

1. Data Source and Preprocessing

The research data came from the eICU Collaborative Research Database, which contains clinical data from over 200,000 ICU patients across 208 hospitals in the United States. The study included patients diagnosed with sepsis and excluded patients under 18 years old, those with ICU stays shorter than 24 hours, and those with more than 30% missing data. The final cohort comprised 13,610 patients, of whom 2,114 died during their ICU stay and 11,496 survived.
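The exclusion rules above can be sketched as a simple table filter. This is a minimal illustration, not the authors' code: the column names `age_years`, `icu_hours`, and `missing_fraction` are hypothetical and do not correspond to actual eICU field names.

```python
import pandas as pd

def select_cohort(patients, max_missing=0.30):
    """Apply the three exclusion rules to a one-row-per-stay table.

    Column names are hypothetical placeholders, not eICU fields.
    """
    keep = (
        (patients["age_years"] >= 18)                     # exclude minors
        & (patients["icu_hours"] >= 24)                   # exclude stays under 24 h
        & (patients["missing_fraction"] <= max_missing)   # exclude sparse records
    )
    return patients.loc[keep].reset_index(drop=True)

# Toy table: only the third row satisfies all three criteria.
demo = pd.DataFrame({
    "age_years": [17, 45, 63, 70],
    "icu_hours": [30, 12, 80, 50],
    "missing_fraction": [0.1, 0.1, 0.2, 0.4],
})
cohort = select_cohort(demo)
```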

Data preprocessing included the following steps:
- Data Cleaning: The data was cleaned and organized using Python’s NumPy and Pandas libraries.
- Time-Series Construction: Based on the ICU admission timeline, vital signs and laboratory test results recorded hourly were arranged in chronological order to construct a 24×226 time-series matrix.
- Missing Value Imputation: For time-series data, forward imputation was used; for non-time-series features, the random forest algorithm was employed for imputation.
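As a rough sketch of the time-series construction and forward imputation steps, the snippet below places hourly measurements into a 24×226 grid and carries the last observed value forward. The record layout `(hour, feature_index, value)` and both function names are assumptions made for illustration; the random forest imputation of static features is not shown.

```python
import numpy as np

def build_day_matrix(records, n_hours=24, n_features=226):
    """Place (hour, feature_index, value) records into an hours x features grid."""
    day = np.full((n_hours, n_features), np.nan)
    for hour, feature, value in records:
        day[hour, feature] = value
    return day

def forward_fill(day):
    """Carry the last observed value forward along the hour axis.

    Cells with no earlier observation stay NaN.
    """
    day = np.asarray(day, dtype=float)
    hours = np.arange(day.shape[0])[:, None]
    cols = np.arange(day.shape[1])[None, :]
    # Row index of the most recent observation at or before each hour (0 if none yet).
    last_seen = np.maximum.accumulate(np.where(np.isnan(day), 0, hours), axis=0)
    return day[last_seen, cols]

# Toy records: feature 0 measured at hours 0 and 3, feature 1 at hour 2.
records = [(0, 0, 82.0), (3, 0, 90.0), (2, 1, 1.8)]
day = forward_fill(build_day_matrix(records))
```

Values that are missing before the first measurement of a feature remain NaN here, so a real pipeline would still need a rule for those leading gaps.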

2. Model Architecture and Training

The study proposed a two-stage Transformer architecture designed to capture hourly and daily time-series patterns of patients. The specific steps are as follows:
- Stage 1: Hour-Level Transformer Encoder: Processes 24-hour time-series data for each day, capturing intra-day dependencies through self-attention mechanisms and generating daily representations using average pooling.
- Stage 2: Day-Level Transformer Encoder: Processes the daily representations from the first 5 days, capturing inter-day dependencies through self-attention mechanisms. For patients with ICU stays shorter than 5 days, masking was applied to handle missing data, ensuring input consistency for the model.
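The data flow of the two stages can be sketched in PyTorch as follows. This is an illustrative approximation under stated assumptions, not the published implementation: the hidden size, head count, single-layer encoders, learned positional embeddings, masked mean pooling over days, and the class name `TwoStageEncoder` are all choices made here for the example.

```python
import torch
from torch import nn

class TwoStageEncoder(nn.Module):
    """Hour-level encoder per day, then a day-level encoder over up to 5 days."""

    def __init__(self, n_features=226, n_hours=24, n_days=5, d_model=64, n_heads=4):
        super().__init__()
        self.project = nn.Linear(n_features, d_model)
        # Learned positional embeddings (an assumption of this sketch).
        self.hour_pos = nn.Parameter(torch.zeros(n_hours, d_model))
        self.day_pos = nn.Parameter(torch.zeros(n_days, d_model))
        self.hour_encoder = self._make_encoder(d_model, n_heads)
        self.day_encoder = self._make_encoder(d_model, n_heads)
        self.head = nn.Linear(d_model, 1)

    @staticmethod
    def _make_encoder(d_model, n_heads):
        layer = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=128, batch_first=True
        )
        return nn.TransformerEncoder(layer, num_layers=1, enable_nested_tensor=False)

    def forward(self, x, days_observed):
        # x: (batch, n_days, n_hours, n_features); padded days must hold finite values.
        # days_observed: (batch,) integer count of real days, at least 1.
        b, d, h, f = x.shape
        hours = self.project(x.reshape(b * d, h, f)) + self.hour_pos
        # Stage 1: self-attention within each day, then average pooling over hours.
        day_repr = self.hour_encoder(hours).mean(dim=1).reshape(b, d, -1)
        # Stage 2: self-attention across days; True marks padded (absent) days.
        padded = torch.arange(d, device=x.device)[None, :] >= days_observed[:, None]
        days = self.day_encoder(day_repr + self.day_pos, src_key_padding_mask=padded)
        # Average only over the days that were actually observed.
        valid = (~padded).unsqueeze(-1).to(days.dtype)
        pooled = (days * valid).sum(dim=1) / valid.sum(dim=1)
        return self.head(pooled).squeeze(-1)  # one mortality logit per patient

model = TwoStageEncoder().eval()
batch = torch.zeros(2, 5, 24, 226)
with torch.no_grad():
    logits = model(batch, torch.tensor([5, 2]))  # second patient stayed 2 days
```

Because padded days are masked as attention keys and excluded from the final pooling, their contents do not influence the prediction for a patient with a shorter stay.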

The model was trained using the PyTorch framework on Windows 11. The data were split into training, validation, and test sets in a 7:2:1 ratio. Because the classes are imbalanced (2,114 of 13,610 patients, about 15.5%, died), the study used the focal loss function, which down-weights easily classified samples so that training concentrates on the harder ones.
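For reference, binary focal loss is usually written as FL = -α_t (1 - p_t)^γ log(p_t), where p_t is the predicted probability of the true class. The NumPy sketch below implements that formula. The defaults γ = 2 and α = 0.25 are the commonly cited values from the original focal loss paper and are an assumption here, since the summary does not state which settings the study used.

```python
import numpy as np

def focal_loss(probs, targets, gamma=2.0, alpha=0.25, eps=1e-7):
    """Mean binary focal loss.

    probs: predicted probability of the positive class (death).
    targets: 1 for positive, 0 for negative.
    gamma and alpha are assumed defaults, not values reported by the study.
    """
    probs = np.clip(np.asarray(probs, dtype=float), eps, 1 - eps)
    targets = np.asarray(targets)
    # Probability assigned to the true class, and the matching class weight.
    p_t = np.where(targets == 1, probs, 1 - probs)
    alpha_t = np.where(targets == 1, alpha, 1 - alpha)
    # (1 - p_t) ** gamma shrinks the loss of confidently correct samples.
    return float(np.mean(-alpha_t * (1 - p_t) ** gamma * np.log(p_t)))

easy = focal_loss([0.9], [1])   # confident and correct
hard = focal_loss([0.1], [1])   # confident and wrong
```

With `gamma=0` and `alpha=0.5` the expression reduces to half of the ordinary binary cross-entropy, which is a convenient sanity check.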

3. Model Performance Evaluation

The study evaluated the model’s generalizability through external validation:
- Chinese Sepsis Dataset: The model achieved an accuracy of 81.8% with an AUC value of 0.73.
- MIMIC-IV-3.1 Database: The model achieved an accuracy of 76.56% with an AUC value of 0.84.

Additionally, the study compared the two-stage Transformer with baseline models (including decision trees, XGBoost, and LSTM) and reported that it significantly outperformed them in AUC, accuracy, and F1 score.
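The three metrics named above can be computed from predicted risk scores as in the sketch below. It is a generic illustration on made-up scores, not the study's evaluation code, and the 0.5 decision threshold is an assumption.

```python
import numpy as np

def auc_score(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half. Assumes both classes are present.
    """
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    diff = pos[:, None] - neg[None, :]
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size)

def accuracy_and_f1(y_true, scores, threshold=0.5):
    """Accuracy and F1 after thresholding the scores (threshold is an assumption)."""
    y_true = np.asarray(y_true)
    y_pred = (np.asarray(scores, dtype=float) >= threshold).astype(int)
    tp = int(((y_pred == 1) & (y_true == 1)).sum())
    fp = int(((y_pred == 1) & (y_true == 0)).sum())
    fn = int(((y_pred == 0) & (y_true == 1)).sum())
    accuracy = float((y_pred == y_true).mean())
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1

# Made-up labels and scores for four patients.
labels = [0, 0, 1, 1]
risk = [0.1, 0.4, 0.35, 0.8]
auc = auc_score(labels, risk)
acc, f1 = accuracy_and_f1(labels, risk)
```

AUC is threshold-free, whereas accuracy and F1 depend on the chosen cutoff, which is one reason the three can rank models differently on imbalanced data.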

4. Feature Visualization and Clinical Analysis

The study used the SHAP (SHapley Additive exPlanations) algorithm to generate feature weight heatmaps, revealing the dynamic changes in features associated with mortality. For example, lactate levels, tidal volume, chloride concentration, and blood glucose levels were significantly associated with patient mortality on the first day of admission. As the condition progressed, red cell distribution width (RDW), albumin, alkaline phosphatase, and calcium levels gradually became important predictive indicators.
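To make the idea behind SHAP concrete, the sketch below computes exact Shapley values for a toy three-feature risk function by enumerating every feature coalition. It is not the study's procedure: SHAP implementations approximate these values for large models, and the function `toy_risk`, its coefficients, and the input values are invented for illustration with no clinical meaning.

```python
from itertools import combinations
from math import factorial

def shapley_values(model, x, baseline):
    """Exact Shapley values of `model` at `x` relative to `baseline`.

    Features outside a coalition are replaced by their baseline value.
    Cost grows exponentially with the number of features.
    """
    n = len(x)

    def value(subset):
        mixed = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi

def toy_risk(features):
    """Invented risk function with one interaction term (not a clinical model)."""
    lactate, glucose, chloride = features
    return 0.5 * lactate + 0.1 * glucose + 0.2 * lactate * chloride

patient = [4.0, 1.0, 2.0]
reference = [1.0, 1.0, 1.0]
phi = shapley_values(toy_risk, patient, reference)
```

The attributions sum to the difference between the model output for the patient and for the reference input, and a feature that equals its reference value receives an attribution of zero. Computing such attributions separately for each day is what allows a heatmap to show feature importance shifting over the course of the stay.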

Research Conclusions and Significance

By introducing a Transformer-based time-series model, this study significantly improved the prediction accuracy of mortality risk in ICU sepsis patients. The model not only captures the temporal characteristics of patient condition changes but also provides clinically interpretable biomarkers through feature visualization. These findings offer new tools for ICU prognosis evaluation, helping to optimize triage processes, reduce diagnostic delays, and ultimately improve patient survival outcomes.

Research Highlights

  1. Innovative Model Architecture: The two-stage Transformer model was applied for the first time to time-series data analysis of ICU sepsis patients, effectively capturing the dynamic changes in patient conditions.
  2. High Predictive Performance: The model achieved a maximum AUC value of 0.92, and in external validation it reached AUC values of 0.73 (Chinese sepsis dataset) and 0.84 (MIMIC-IV-3.1).
  3. Clinical Interpretability: The feature heatmaps generated by the SHAP algorithm provide clinicians with intuitive risk indicators, aiding in the development of personalized treatment plans.
  4. Broad Application Prospects: The model is not only applicable to prognosis evaluation in sepsis patients but can also be extended to the prediction and management of other critical illnesses.

Other Valuable Information

The research team plans to further integrate real-time data from hospital information systems and develop online learning techniques, enabling the model to adjust predictions in real time based on clinical feedback. This will provide clinicians with more dynamic and precise decision support.


Through this study, we see the immense potential of artificial intelligence in the field of critical care medicine. In the future, with the accumulation of more data and model optimization, time-series-based prediction models are expected to become standard tools in ICU patient management, making significant contributions to improving healthcare quality and patient survival rates.