EISATC-Fusion: Inception Self-Attention Temporal Convolutional Network Fusion for Motor Imagery EEG Decoding

Research Background

Brain-Computer Interface (BCI) technology enables direct communication between the brain and external devices. It is widely used in fields such as human-computer interaction, motor rehabilitation, and healthcare. Common BCI paradigms include steady-state visual evoked potentials (SSVEP), P300, and motor imagery (MI). Among these, MI-BCI has garnered significant attention due to its broad application prospects.

MI-BCI typically uses electroencephalography (EEG) signals to detect motor imagery, allowing users to control devices such as electric wheelchairs, cursors, and upper-limb robots through imagined movement. However, the instability and low signal-to-noise ratio (SNR) of brain activity, together with inter-individual signal differences and inter-channel EEG correlations, complicate the analysis and classification of brain signals. Current MI EEG decoding relies mainly on traditional machine learning and deep learning techniques, but because of the variability and individual differences in EEG signals, decoding accuracy remains limited, hampering the application of MI-BCI.

Paper Source

This paper was written by researchers Guangjin Liang, Dianguo Cao, Jinqiang Wang, Zhongcai Zhang, and Yuqiang Wu, all affiliated with the School of Engineering at Qufu Normal University. The paper was published in IEEE Transactions on Neural Systems and Rehabilitation Engineering, Vol. 32, 2024.

Research Process

This study proposes EISATC-Fusion, a high-performance, lightweight, end-to-end MI EEG decoding model that integrates Inception blocks, multi-head self-attention (MSA), a temporal convolutional network (TCN), and both feature fusion and decision fusion. The specific research process and methods are as follows:

Data Preprocessing

  • Input Representation and Preprocessing:
    • The input consists of c channels and t sampling points, with no band-pass filtering or artifact removal required.
    • Z-score normalization is applied to reduce the non-stationarity of EEG signals: $X_i' = \frac{X_i - \mu}{\sqrt{\sigma^2}}$, where $\mu$ and $\sigma^2$ are the signal's mean and variance (a minimal sketch follows this list).
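
A minimal sketch of this normalization, assuming per-channel statistics over a single trial; the paper does not pin down the normalization axis here, so `zscore_normalize`, its axis choice, and the epsilon guard are all illustrative:

```python
import numpy as np

def zscore_normalize(x: np.ndarray) -> np.ndarray:
    """Z-score normalize an EEG trial of shape (channels, time).

    Assumption: statistics are computed per channel over the trial;
    whether the paper normalizes per channel, per trial, or with
    training-set statistics is not specified in this summary.
    """
    mu = x.mean(axis=-1, keepdims=True)    # per-channel mean
    sigma = x.std(axis=-1, keepdims=True)  # per-channel std, i.e. sqrt(sigma^2)
    return (x - mu) / (sigma + 1e-8)       # epsilon guards against flat channels

# Usage: a single trial with c channels and t sampling points
trial = np.random.randn(22, 1000)          # e.g., 22 channels, 1000 samples
normalized = zscore_normalize(trial)
```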

Model Structure

  • EISATC-Fusion Model Structure:
    • EDSI Module: Extracts temporal and spatial features with standard and depthwise convolutions, then extracts multi-scale temporal features through a depthwise-separable-convolution Inception block.
    • CNNCoS Multi-Head Self-Attention Module: Builds multi-head self-attention on CNN operations to address attention collapse and adds cosine attention to improve model interpretability.
    • TDScn Module: Reduces model parameters by replacing the TCN's dilated convolutions with dilated depthwise convolutions.
    • Fusion Module: Combines feature fusion and decision fusion to fully exploit the model's output features and improve robustness.
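
To make the data flow between the four modules concrete, here is a structural sketch in PyTorch. All layer sizes, kernel lengths, pooling factors, and head counts are illustrative assumptions rather than the paper's hyperparameters; the standard `nn.MultiheadAttention` stands in for the CNNCoS module, whose cosine-attention variant is sketched after the Feature Extraction section below.

```python
import torch
import torch.nn as nn

class EISATCFusionSketch(nn.Module):
    """Structural sketch of the four-module pipeline described above.

    Every size below is an illustrative assumption, not the paper's value.
    """

    def __init__(self, n_channels=22, n_classes=4, f1=8, heads=2):
        super().__init__()
        # EDSI stand-in: temporal conv -> depthwise spatial conv.
        # (The multi-scale Inception branches are omitted for brevity.)
        self.edsi = nn.Sequential(
            nn.Conv2d(1, f1, (1, 64), padding=(0, 32), bias=False),
            nn.BatchNorm2d(f1),
            nn.Conv2d(f1, 2 * f1, (n_channels, 1), groups=f1, bias=False),
            nn.BatchNorm2d(2 * f1),
            nn.ELU(),
            nn.AvgPool2d((1, 8)),
            nn.Dropout(0.5),
        )
        # CNNCoS stand-in: a standard multi-head self-attention layer.
        self.msa = nn.MultiheadAttention(2 * f1, heads, batch_first=True)
        # TDScn stand-in: one dilated depthwise temporal convolution
        # (non-causal padding here, purely for shape simplicity).
        self.tcn = nn.Sequential(
            nn.Conv1d(2 * f1, 2 * f1, 4, dilation=2, padding=3,
                      groups=2 * f1, bias=False),
            nn.BatchNorm1d(2 * f1),
            nn.ELU(),
        )
        # Fusion: feature fusion concatenates MSA and TCN features;
        # decision fusion averages two classifier outputs.
        self.head_msa = nn.Linear(2 * f1, n_classes)
        self.head_fused = nn.Linear(4 * f1, n_classes)

    def forward(self, x):                         # x: (batch, 1, channels, time)
        feat = self.edsi(x).squeeze(2)            # (batch, 2*f1, t')
        seq = feat.permute(0, 2, 1)               # (batch, t', 2*f1)
        att, _ = self.msa(seq, seq, seq)          # self-attention over time
        tcn = self.tcn(att.permute(0, 2, 1))      # (batch, 2*f1, t')
        a, t = att.mean(dim=1), tcn.mean(dim=-1)  # pooled feature vectors
        fused = torch.cat([a, t], dim=-1)         # feature fusion
        return (self.head_msa(a) + self.head_fused(fused)) / 2  # decision fusion

model = EISATCFusionSketch()
print(model(torch.randn(8, 1, 22, 1000)).shape)   # torch.Size([8, 4])
```

Swapping the stand-ins for the paper's actual blocks would preserve this interface: each module consumes and produces a (features, time) map until the fusion head collapses it to class logits.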

Feature Extraction

  • EDSI Module:
    • The core consists of three convolutional layers: a temporal convolution, a channel (spatial) convolution, and an Inception block.
    • The Inception paths use different convolution kernel sizes, and a max-pooling path helps fuse the input information.
    • Batch normalization and exponential linear unit (ELU) activation follow each convolution, and dropout layers follow the pooling layers.
  • CNNCoS Multi-Head Self-Attention Module:
    • Implements the attention mechanism through three components: query, key, and value.
    • Depthwise convolutions compute the q, k, and v vectors, and a cosine attention mechanism computes the attention scores, improving on the original attention weights (see the sketch after this list).
  • TDScn Module:
    • The TCN models temporal dependencies efficiently without explicitly maintaining a sequential hidden state.
    • Replacing dilated convolution with dilated depthwise convolution reduces the parameter count.
  • Fusion Module:
    • Feature fusion combines the outputs of different model layers to extract hidden information from the input data.
    • Decision fusion reduces uncertainty and error by combining the outputs of multiple classifiers, strengthening the model's capacity to integrate information.
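
As referenced above, here is a minimal sketch of one cosine-attention head, assuming depthwise 1-D convolutions for the q, k, and v projections and a learnable softmax temperature; `CosineAttentionHead`, the kernel size, and `tau` are illustrative choices, not the paper's exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CosineAttentionHead(nn.Module):
    """One self-attention head that scores with cosine similarity.

    Depthwise 1-D convolutions produce q, k, and v from the input
    sequence; attention weights come from the cosine similarity of
    q and k rather than a scaled dot product.
    """

    def __init__(self, dim: int, kernel_size: int = 3):
        super().__init__()
        pad = kernel_size // 2
        # groups=dim makes each projection a depthwise convolution
        self.q = nn.Conv1d(dim, dim, kernel_size, padding=pad, groups=dim, bias=False)
        self.k = nn.Conv1d(dim, dim, kernel_size, padding=pad, groups=dim, bias=False)
        self.v = nn.Conv1d(dim, dim, kernel_size, padding=pad, groups=dim, bias=False)
        self.tau = nn.Parameter(torch.tensor(1.0))  # softmax temperature

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, dim, time)
        q = F.normalize(self.q(x), dim=1)  # unit norm along the feature axis
        k = F.normalize(self.k(x), dim=1)
        v = self.v(x)
        # cosine similarity between every pair of time steps: (batch, time, time)
        scores = torch.einsum('bdi,bdj->bij', q, k) * self.tau
        weights = scores.softmax(dim=-1)
        # weighted sum of values: (batch, dim, time)
        return torch.einsum('bij,bdj->bdi', weights, v)

head = CosineAttentionHead(dim=16)
y = head(torch.randn(8, 16, 125))  # -> (8, 16, 125)
```

Because q and k are unit-normalized, the scores are bounded cosine similarities, which is one way to keep multiple heads from collapsing onto near-identical attention weights.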

Main Results

  • Within-Subject Decoding Experiments:
    • EISATC-Fusion achieved the highest average decoding accuracy on both the BCI Competition IV-2a and IV-2b datasets.
    • It markedly outperformed models built on CNN, MSA, and multi-scale structures while using significantly fewer parameters.
  • Ablation Experiments:
    • Ablating the individual modules of EISATC-Fusion showed that each module contributed to improved decoding performance.
    • The fusion module in particular had the most significant impact on model performance.
  • Comparison of Different Training Strategies:
    • The improved two-stage training strategy significantly enhanced model performance, verifying the strategy's generality (a hedged sketch of a generic two-stage schedule follows this list).
  • Between-Subject Decoding Experiments:
    • EISATC-Fusion also excelled in between-subject experiments, significantly improving cross-subject decoding performance.
  • Transfer Learning Experiments:
    • Cross-subject transfer learning experiments showed that EISATC-Fusion generalizes better to new subjects.
    • Model performance improved steadily across experiments with different amounts of data and different learning rates.
  • Interpretability Experiments:
    • Feature visualization and convolutional weight visualization validated the model's interpretability.
    • Cosine attention made the specific physical meaning of each attention head clear, enhancing model transparency.
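
The summary above does not spell out the improved two-stage strategy, so the sketch below shows one common reading of two-stage training in the MI-EEG literature, purely as an assumption: stage one trains with early stopping on a validation split, and stage two reloads the best weights and continues for a fixed budget (some variants fold the validation split back into training here).

```python
import copy
import torch

def two_stage_train(model, optimizer, loss_fn, train_loader, val_loader,
                    max_epochs=500, patience=100, stage2_epochs=100):
    """Generic two-stage schedule (an assumption, not the paper's recipe)."""

    def run_epoch(loader, training):
        model.train(training)
        total, n = 0.0, 0
        with torch.set_grad_enabled(training):
            for xb, yb in loader:
                loss = loss_fn(model(xb), yb)
                if training:
                    optimizer.zero_grad()
                    loss.backward()
                    optimizer.step()
                total, n = total + loss.item() * len(xb), n + len(xb)
        return total / n

    # Stage 1: train with early stopping on the validation loss.
    best_loss, best_state, wait = float('inf'), None, 0
    for _ in range(max_epochs):
        run_epoch(train_loader, training=True)
        val_loss = run_epoch(val_loader, training=False)
        if val_loss < best_loss:
            best_loss = val_loss
            best_state = copy.deepcopy(model.state_dict())
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                break

    # Stage 2: reload the best stage-1 weights and continue for a fixed budget.
    model.load_state_dict(best_state)
    for _ in range(stage2_epochs):
        run_epoch(train_loader, training=True)
    return model
```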

Research Conclusion

The EISATC-Fusion model proposed in this paper achieves high-performance, lightweight MI EEG decoding through the collaborative operation of multiple modules. The improved training strategy further enhances model performance, and the model performs strongly in cross-subject transfer learning. The study demonstrates the model's interpretability through visualization methods, providing strong support for future practical applications and further optimization. However, online experiments and further model lightweighting were not conducted in this study; future work will optimize the model's parameters and validate it through online experiments.