An Intersubject Brain-Computer Interface Based on Domain-Adversarial Training of Convolutional Neural Network for Online Attention Decoding

Academic Background

Attention decoding plays a crucial role in daily life, and its implementation based on electroencephalogram (EEG) signals has garnered extensive attention. However, because of significant inter-individual differences in EEG signals, training a separate model for every individual is impractical. This paper therefore proposes an end-to-end brain-computer interface (BCI) framework to address this challenge, combining a one-dimensional convolutional neural network (1D CNN) for extracting temporal and spatial features with a domain-adversarial training strategy.

Figure: The convolutional neural network model designed in this study.

Traditional attention decoding methods typically rely on hand-crafted feature extraction followed by pattern classification techniques such as linear discriminant analysis (LDA) and support vector machines (SVM). However, these methods show clear limitations on cross-subject data. Deep learning methods, while strong in classification performance, still struggle with the significant individual differences in EEG signals.

Paper and Its Source

This paper was written by Di Chen, Haiyun Huang, Zijing Guan, Jiahui Pan, and Yuanqing Li, affiliated with the School of Software at South China University of Technology and with South China Normal University; the study was approved by the Ethics Committee of the Affiliated Brain Hospital of Guangzhou Medical University. It was published in IEEE Transactions on Biomedical Engineering, DOI: 10.1109/TBME.2024.3404131.

Research Process and Methods

Research Process

The framework proceeds through multiple stages, starting from the representation of the raw EEG data and moving through feature extraction, task label prediction, and domain classification, ultimately producing the decoding result. The methods and tools involved in each stage are described as follows:

  1. EEG Representation: The raw EEG data serve as input, with each sample represented as a matrix of channels by time sampling points.
  2. Feature Extractor: Temporal, spatial, and combined features are extracted by a temporal convolution block, a spatial convolution block, and a separable convolution block, respectively.
  3. Task Label Predictor: The feature vector is fed into a fully connected layer to predict the task label.
  4. Domain Classifier: A gradient reversal layer (GRL) is used during training to align features across domains by reversing gradients during backpropagation (see the sketch after this list).
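
The core of the domain-adversarial component is the gradient reversal layer: it acts as the identity in the forward pass but flips (and scales) the gradient in the backward pass, so the feature extractor is pushed toward domain-invariant features while the domain classifier tries to tell domains apart. The PyTorch sketch below shows one common way to wire such a layer between a feature extractor, a task label predictor, and a domain classifier; the class names, module interfaces, and λ handling are illustrative assumptions, not the authors' implementation.

```python
import torch.nn as nn
from torch.autograd import Function

class GradientReversal(Function):
    """Identity in the forward pass; multiplies the gradient by -lambda in the backward pass."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Reversed, scaled gradient for the features; no gradient for lambda itself.
        return -ctx.lambd * grad_output, None

class DomainAdversarialNet(nn.Module):
    """Hypothetical wiring of the three components described above (g_f, g_y, g_d)."""
    def __init__(self, feature_extractor, label_predictor, domain_classifier, lambd=1.0):
        super().__init__()
        self.feature_extractor = feature_extractor   # g_f
        self.label_predictor = label_predictor       # g_y
        self.domain_classifier = domain_classifier   # g_d
        self.lambd = lambd

    def forward(self, x):
        features = self.feature_extractor(x)
        task_logits = self.label_predictor(features)
        # The GRL sits only on the path leading to the domain classifier.
        domain_logits = self.domain_classifier(GradientReversal.apply(features, self.lambd))
        return task_logits, domain_logits
```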

Feature Extraction Steps

  1. Temporal Convolution Block: 1D convolution and batch normalization are applied to the EEG signal to extract time-domain features.
  2. Spatial Convolution Block: A convolution spanning all channels extracts spatial features, followed by batch normalization and average pooling.
  3. Separable Convolution Block: A depthwise convolution and a 1D convolution are combined to extract the most relevant temporal features, which are then compressed by an average pooling layer and a flatten layer (a combined sketch of the three blocks follows this list).
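
Taken together, the feature extractor resembles an EEGNet-style stack of a temporal convolution, a channel-spanning spatial convolution, and a separable convolution. The following PyTorch sketch illustrates this structure under assumed hyperparameters; filter counts, kernel lengths, and pooling sizes are placeholders, not the paper's exact values.

```python
import torch.nn as nn

class TSFeatureExtractor(nn.Module):
    """Illustrative three-block feature extractor; all hyperparameters are assumptions."""
    def __init__(self, n_channels=32, f1=8, f2=16, kern_len=64):
        super().__init__()
        # Temporal convolution block: 1D convolution along time + batch normalization.
        self.temporal = nn.Sequential(
            nn.Conv2d(1, f1, (1, kern_len), padding=(0, kern_len // 2), bias=False),
            nn.BatchNorm2d(f1),
        )
        # Spatial convolution block: convolution across all channels, then BN and average pooling.
        self.spatial = nn.Sequential(
            nn.Conv2d(f1, f1, (n_channels, 1), groups=f1, bias=False),
            nn.BatchNorm2d(f1),
            nn.ELU(),
            nn.AvgPool2d((1, 4)),
        )
        # Separable convolution block: depthwise + pointwise convolution, pooling, flattening.
        self.separable = nn.Sequential(
            nn.Conv2d(f1, f1, (1, 16), groups=f1, padding=(0, 8), bias=False),
            nn.Conv2d(f1, f2, (1, 1), bias=False),
            nn.BatchNorm2d(f2),
            nn.ELU(),
            nn.AvgPool2d((1, 8)),
            nn.Flatten(),
        )

    def forward(self, x):
        # x has shape (batch, 1, channels, time): each sample is a channels-by-time matrix.
        return self.separable(self.spatial(self.temporal(x)))
```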

Optimization Function

The article proposes to optimize model parameters using a combination of task loss and domain loss with the following formulas:

[ L_i^y(\theta_f, \theta_y) = L_y(g_y(g_f(x_i; \theta_f); \theta_y), y_i) ]

[ L_i^d(\theta_f, \theta_d) = L_d(g_d(g_f(x_i; \theta_f); \theta_d), d_i) ]

[ E(\theta_f, \theta_y, \theta_d) = \frac{1}{n} \sum_{i=1}^{n} L_i^y(\theta_f, \theta_y) - \lambda \left( \frac{1}{n} \sum_{i=1}^{n} L_i^d(\theta_f, \theta_d) + \frac{1}{n'} \sum_{i=n+1}^{n+n'} L_i^d(\theta_f, \theta_d) \right) ]

The optimization objective is to find a saddle point of $E$: the parameters $\theta_f$ and $\theta_y$ are sought to minimize $E$, while $\theta_d$ is sought to maximize it, and the entire model is trained end to end with gradient descent.
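
In practice, because the GRL already flips the domain-loss gradient flowing back into the feature extractor, a single gradient descent step on the sum of the task loss and the domain loss moves the parameters toward this saddle point. Below is a minimal training-step sketch assuming the DomainAdversarialNet wiring from the earlier sketch; the batch conventions and domain labeling (source subjects = 0, new subject = 1) are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def training_step(model, optimizer, labeled_batch, unlabeled_batch):
    """One adversarial update: labeled source-domain data plus unlabeled target-domain data."""
    (x_src, y_src), x_tgt = labeled_batch, unlabeled_batch

    task_logits_src, domain_logits_src = model(x_src)
    _, domain_logits_tgt = model(x_tgt)  # target task labels are unknown and unused

    # Domain labels: 0 = source subjects, 1 = new (target) subject.
    d_src = torch.zeros(x_src.size(0), dtype=torch.long)
    d_tgt = torch.ones(x_tgt.size(0), dtype=torch.long)

    task_loss = F.cross_entropy(task_logits_src, y_src)
    domain_loss = F.cross_entropy(domain_logits_src, d_src) + F.cross_entropy(domain_logits_tgt, d_tgt)

    # The GRL reverses the domain-loss gradient w.r.t. the feature extractor, so minimizing
    # this sum lowers the losses of the label predictor and domain classifier while raising
    # the domain loss for the feature extractor, i.e. one step toward the saddle point of E.
    loss = task_loss + domain_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return task_loss.item(), domain_loss.item()
```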

Research Results

Offline Experiment

In the offline experiments, cross-validation was conducted on data from 85 subjects. DA-TSNet achieved an accuracy of 89.40% ± 9.96%, significantly higher than the baseline methods (e.g., PSD-SVM at 75.24%, TFN-SVM at 80.83%, and EEGNet at 84.09%). Confusion-matrix and statistical analyses further corroborated the superior performance of DA-TSNet on the attention recognition task.

Simulated Online Experiment

In the simulated online experiments, data from new subjects were segmented, and the model was trained with different numbers of initial trials (20, 40, 60, 80, and 100). As more trial segments became available, DA-TSNet outperformed the other methods in accuracy, for example reaching 88.07% ± 11.22% with 60 initial trial segments.

Real Online Experiment

In the real online experiments, 22 subjects were tested using a 32-channel Neuroscan amplifier. DA-TSNet achieved accuracies of 86.44% ± 13.28% and 89.02% ± 9.58% in the two experiments, significantly higher than TSNet at 75.15% ± 13.04%, demonstrating the practical value of DA-TSNet for online attention decoding.

Conclusion and Future Prospects

The proposed DA-TSNet framework effectively addresses significant inter-individual differences in EEG signals and substantially improves the accuracy and efficiency of attention decoding. Its performance and practicality have been validated in both offline and online experiments, showing strong adaptability and stability in cross-subject attention decoding tasks. Future research will further optimize the structure of the domain classifier, reduce the model size, and explore the impact of time intervals in long-term experiments. In summary, DA-TSNet offers an innovative and effective method for attention decoding and its online applications.