DeepSleepNet: A Model for Automatic Sleep Stage Scoring Based on Raw Single-Channel EEG

Background Introduction

Sleep has a significant impact on human health, and monitoring sleep quality is crucial in medical research and practice. Typically, sleep experts score sleep stages by visually inspecting several physiological signals such as the electroencephalogram (EEG), electrooculogram (EOG), electromyogram (EMG), and electrocardiogram (ECG). These recordings, collectively known as polysomnography (PSG), are divided into short epochs that are classified to determine an individual's sleep stage. This manual approach is time-consuming and labor-intensive, as experts must inspect recordings from multiple sensors collected over several nights.

Automated sleep stage scoring methods based on multiple signals (such as EEG, EOG, and EMG) or on single-channel EEG have been extensively researched. However, most existing methods rely on hand-engineered features, typically designed around the characteristics of a particular dataset, and therefore may not generalize to larger, more heterogeneous populations. Moreover, few methods exploit the temporal information that experts use to identify sleep stage transitions. Recently, deep learning has been applied to automated sleep stage scoring, but these approaches often neglect the temporal context that sleep experts rely on when scoring, which limits their performance.

Source Introduction

This paper, titled “DeepSleepNet: A Model for Automatic Sleep Stage Scoring Based on Raw Single-Channel EEG”, is authored by Akara Supratak, Hao Dong, Chao Wu, and Yike Guo, and was published in the November 2017 issue of IEEE Transactions on Neural Systems and Rehabilitation Engineering. This research was completed by a team from the Department of Computing at Imperial College London, United Kingdom.

DeepSleepNet Research Details

Research Process

The DeepSleepNet model consists of two parts: feature learning and sequential residual learning.

Feature Learning Phase

  1. Using convolutional neural networks (CNN) with two different filter sizes to extract time-invariant features:
    • Smaller filters are more suitable for capturing the temporal information of the EEG signal, while larger filters are suitable for capturing frequency information.
    • Each CNN consists of four convolutional layers and two max pooling layers, performing one-dimensional convolution, batch normalization, and ReLU activation.
   h_i^s = CNN_θs(x_i)
   h_i^l = CNN_θl(x_i)
   a_i = h_i^s || h_i^l

Here, CNN_θ(x_i) denotes a function, parameterized by θ, that transforms the i-th 30-second EEG epoch x_i into a feature vector; θ_s and θ_l are the parameters of the small-filter and large-filter CNNs, and || denotes concatenation. The combined feature a_i is passed on to the sequential residual learning part.
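
To make the dual-CNN idea concrete, here is a minimal PyTorch sketch (the paper's original implementation used TensorFlow). The layer counts follow the paper's description of four convolutional layers and two max-pooling layers per branch, and the first-layer filter and stride sizes are keyed to the sampling rate Fs as described in the paper; the channel counts, pooling sizes, and padding are illustrative assumptions rather than the authors' exact configuration.

```python
import torch
import torch.nn as nn

class CNNBranch(nn.Module):
    """One CNN branch: four convolutional layers and two max-pooling layers.
    First-layer filter/stride follow the paper (small: ~Fs/2, stride ~Fs/16;
    large: ~Fs*4, stride ~Fs/2); the remaining sizes are illustrative assumptions."""
    def __init__(self, fs: int, small: bool):
        super().__init__()
        k1, s1 = (fs // 2, fs // 16) if small else (fs * 4, fs // 2)

        def conv(cin, cout, k, s):
            return nn.Sequential(
                nn.Conv1d(cin, cout, kernel_size=k, stride=s, padding=k // 2),
                nn.BatchNorm1d(cout),
                nn.ReLU(),
            )

        self.net = nn.Sequential(
            conv(1, 64, k1, s1),                    # conv 1 (small or large filter)
            nn.MaxPool1d(kernel_size=8, stride=8),  # max-pool 1
            conv(64, 128, 8, 1),                    # conv 2
            conv(128, 128, 8, 1),                   # conv 3
            conv(128, 128, 8, 1),                   # conv 4
            nn.MaxPool1d(kernel_size=4, stride=4),  # max-pool 2
        )

    def forward(self, x):                  # x: (batch, 1, 30 * Fs) raw EEG epoch
        return self.net(x).flatten(1)      # h_i^s or h_i^l

class FeatureLearner(nn.Module):
    """a_i = h_i^s || h_i^l : concatenation of the two branch outputs."""
    def __init__(self, fs: int = 100):
        super().__init__()
        self.small = CNNBranch(fs, small=True)
        self.large = CNNBranch(fs, small=False)

    def forward(self, x):
        return torch.cat([self.small(x), self.large(x)], dim=1)

# Example: two 30-second epochs of single-channel EEG sampled at 100 Hz
x = torch.randn(2, 1, 30 * 100)
a = FeatureLearner(fs=100)(x)
print(a.shape)  # (2, feature_dim)
```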

Sequential Residual Learning Phase

  1. Using bidirectional long short-term memory networks (Bi-LSTM) to learn temporal information:
    • Two-layer Bi-LSTM: Learns sleep stage transition rules, such as those described in the AASM manual.
    • Shortcut connection: Reformulates the computation as a residual function, allowing the model to add the features extracted by the CNNs to the temporal information learned from the input sequence (see the sketch after this list).
   h_t^f, c_t^f = LSTM_θf(h_{t-1}^f, c_{t-1}^f, a_t)
   h_t^b, c_t^b = LSTM_θb(h_{t+1}^b, c_{t+1}^b, a_t)
   o_t = (h_t^f || h_t^b) + FC_θ(a_t)
  2. Model Parameter Setting:
    • The parameters of the CNNs and LSTMs are chosen to capture both the temporal and the frequency content of the EEG: the first-layer filter and stride sizes are set relative to the sampling rate, with the small filter capturing temporal patterns and the large filter capturing frequency components.
    • Regularization techniques such as dropout and L2 weight decay are used to prevent overfitting.
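
Continuing the PyTorch sketch above, the sequence part can be expressed roughly as follows: a two-layer bidirectional LSTM processes the per-epoch feature vectors a_t, and a fully-connected shortcut path projects a_t to the same size so the two can be added, i.e. o_t = (h_t^f || h_t^b) + FC_θ(a_t). The hidden size, dropout rate, and the final classifier layer below are illustrative assumptions, not the authors' exact configuration.

```python
import torch
import torch.nn as nn

class SequenceResidualLearner(nn.Module):
    """Two-layer bidirectional LSTM plus a fully-connected shortcut path:
    o_t = (h_t^f || h_t^b) + FC(a_t). Hidden size, dropout, and the final
    classifier are illustrative assumptions."""
    def __init__(self, feat_dim: int, hidden: int = 512, n_classes: int = 5):
        super().__init__()
        self.bilstm = nn.LSTM(
            input_size=feat_dim, hidden_size=hidden,
            num_layers=2, batch_first=True, bidirectional=True, dropout=0.5,
        )
        # Shortcut connection: project a_t to the BiLSTM output size (2 * hidden)
        self.shortcut = nn.Linear(feat_dim, 2 * hidden)
        self.classifier = nn.Linear(2 * hidden, n_classes)

    def forward(self, a):            # a: (batch, seq_len, feat_dim) CNN features per epoch
        h, _ = self.bilstm(a)        # concatenated forward/backward states h_t^f || h_t^b
        o = h + self.shortcut(a)     # residual addition of the shortcut path
        return self.classifier(o)    # per-epoch sleep-stage logits (W, N1, N2, N3, REM)

# Example: a sequence of 25 epochs, each with a 2304-dimensional feature vector
# (the output size of the FeatureLearner sketch above for Fs = 100)
a = torch.randn(2, 25, 2304)
logits = SequenceResidualLearner(feat_dim=2304)(a)
print(logits.shape)  # (2, 25, 5)
```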

Training Algorithm

To train the model effectively, a two-step training algorithm is proposed: 1. Pre-training: the feature learning part (the two CNNs) is pre-trained on a class-balanced training set, obtained by oversampling, so that it does not overfit to the majority sleep stages. 2. Fine-tuning: the whole model is then fine-tuned on sequential training data to encode temporal information, while the pre-trained CNN parameters are adjusted with a smaller learning rate.

Algorithm 1: Two-step Training
1. Pre-training step
   - Oversample the training set so that all sleep stages are equally represented
   - Pre-train the two CNNs (with a temporary softmax output) on the class-balanced data
2. Fine-tuning step
   - Replace the feature-learning part of the whole model with the pre-trained CNNs
   - Train the entire model on sequential data, using a lower learning rate for the CNN part
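
A minimal sketch of the two steps, assuming the FeatureLearner and SequenceResidualLearner modules from the sketches above; the duplication-based oversampling and the lower learning rate for the pre-trained CNN part follow the paper's description, while the concrete learning-rate values are illustrative placeholders.

```python
import numpy as np
import torch

def oversample_to_balance(X: np.ndarray, y: np.ndarray):
    """Duplicate minority-class epochs so that every sleep stage has as many
    samples as the most frequent one (class balancing for the pre-training step)."""
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    idx = np.concatenate([
        np.random.choice(np.where(y == c)[0], size=target, replace=True)
        for c in classes
    ])
    np.random.shuffle(idx)
    return X[idx], y[idx]

def make_finetune_optimizer(feature_learner, sequence_learner):
    """Fine-tuning uses a lower learning rate for the pre-trained CNN parameters
    than for the newly initialized sequence part; the values below are illustrative."""
    return torch.optim.Adam([
        {"params": feature_learner.parameters(), "lr": 1e-6},
        {"params": sequence_learner.parameters(), "lr": 1e-4},
    ])
```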

Experiments and Results

  1. Datasets:

    • MASS dataset: Includes PSG recordings of 62 healthy subjects, manually labeled into five sleep stages (W, N1, N2, N3, REM).
    • Sleep-EDF dataset: Includes 20 subjects, with PSG recordings manually scored into eight categories under the R&K standard; stages 3 and 4 are merged into a single N3 stage in line with the AASM standard.
  2. Performance Metrics:

    • Model performance is evaluated using k-fold cross-validation, reporting overall accuracy (ACC), macro F1-score (MF1), and Cohen's κ coefficient (see the metrics sketch after this list).
  3. Results Analysis:

    • DeepSleepNet achieved overall accuracy and macro F1-score similar to state-of-the-art manual feature extraction methods on different single-channel EEG datasets.
    • The temporal information learned by the bidirectional LSTM significantly improved classification performance, validating the effectiveness of sequential residual learning.
  4. Comparison with Existing Methods:

    • Across both datasets (MASS and Sleep-EDF), the performance metrics of DeepSleepNet were superior or comparable to existing manual feature extraction and deep learning methods on single-channel EEG.
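
For reference, these three metrics can be computed with scikit-learn as in the short sketch below; the label arrays are placeholders, not results from the paper.

```python
from sklearn.metrics import accuracy_score, f1_score, cohen_kappa_score

# Placeholder hypnogram labels (W, N1, N2, N3, REM encoded as 0..4), not paper results
y_true = [0, 1, 2, 2, 3, 4, 2, 1]
y_pred = [0, 2, 2, 2, 3, 4, 1, 1]

acc = accuracy_score(y_true, y_pred)              # overall accuracy (ACC)
mf1 = f1_score(y_true, y_pred, average="macro")   # macro-averaged F1-score (MF1)
kappa = cohen_kappa_score(y_true, y_pred)         # Cohen's κ coefficient
print(f"ACC={acc:.3f}  MF1={mf1:.3f}  kappa={kappa:.3f}")
```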

Research Significance and Value

By combining CNNs and a Bi-LSTM, the DeepSleepNet model automatically learns features for sleep stage scoring directly from raw single-channel EEG, without manual feature extraction. The architecture can be applied to datasets with different sampling rates and scoring standards without changing its design, showing good generalization. The results indicate that DeepSleepNet could make remote sleep monitoring more efficient by reducing the reliance on manual expert annotation while maintaining scoring accuracy.

Highlights and Innovation

  1. Innovative Model Architecture: Combining CNN and Bi-LSTM for time-invariant feature extraction and temporal information learning.
  2. Effective Training Algorithm: Two-step training method effectively addresses class imbalance in large datasets.
  3. Superior Generalization Capability: Consistent high performance across different datasets and channels.

Conclusion and Future Work

The DeepSleepNet model demonstrates a new deep learning-based automated sleep stage scoring method without the need for manual feature extraction, suitable for single-channel EEG data. Future plans include applying this model to single-channel EEG data collected from wearable devices to achieve remote sleep monitoring and assessment in home environments or clinical settings.