Multi-scale Hyperbolic Contrastive Learning for Cross-subject EEG Emotion Recognition
Academic Background
Electroencephalography (EEG) is a physiological signal that plays an important role in affective computing. Compared with non-physiological cues such as facial expressions or voice, EEG offers higher temporal resolution and objectivity, and therefore reflects human emotional states more reliably. However, EEG signals exhibit significant individual differences, which makes cross-subject emotion recognition challenging: signals from different individuals are shaped by factors such as age, mental state, and cognitive characteristics, so models pre-trained on some subjects generalize poorly to new ones.
To address this challenge, researchers have proposed various methods, including time-frequency domain feature analysis, deep learning models, and transfer learning. However, these methods often struggle to reduce inter-subject variability while retaining the discriminative features of emotions. To solve this problem, this paper proposes a novel method called Multi-Scale Hyperbolic Contrastive Learning (MSHCL), which learns more generalizable cross-subject EEG emotion representations by combining event (stimulus) relatedness with hyperbolic space embeddings.
Paper Source
This paper was co-authored by Jiang Chang, Zhixin Zhang, Yuhua Qian (IEEE Member), and Pan Lin (IEEE Member), affiliated with the Institute of Big Data Science and Industry at Shanxi University and the Center for Mind & Brain Sciences at Hunan Normal University. It was accepted by IEEE Transactions on Affective Computing, with official publication scheduled for 2025. The source code is publicly available at https://github.com/jiangchang-brain/mshcl.
Research Process and Results
1. Research Objectives and Method Design
The goal of this study is to extract emotion-invariant features from cross-subject EEG signals through multi-scale contrastive learning and hyperbolic space embedding techniques, thereby improving the accuracy and generalization ability of emotion recognition. To achieve this, the authors propose the MSHCL framework, which applies contrastive losses at two scales—emotion and stimulus—and captures the hierarchical structure of EEG signals through hyperbolic space.
2. Data Preprocessing and Experimental Setup
The study used three publicly available EEG emotion recognition datasets: SEED, MPED, and FACED. These datasets cover different emotion categories and experimental paradigms, providing rich test scenarios for cross-subject emotion recognition. Data preprocessing included downsampling to 200 Hz and normalization along the channel dimension.
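As a concrete illustration, the preprocessing described above can be sketched as follows. This is a minimal sketch: the original sampling rate (1000 Hz here) and the array layout are assumptions for illustration, not details taken from the paper.

```python
# Minimal preprocessing sketch: downsample to 200 Hz and z-score each channel.
# The original sampling rate (1000 Hz) and array layout are assumptions.
import numpy as np
from scipy.signal import resample_poly

def preprocess(eeg: np.ndarray, fs_orig: int = 1000, fs_target: int = 200) -> np.ndarray:
    """eeg: (n_channels, n_samples) raw recording."""
    # Polyphase resampling from fs_orig to fs_target (includes anti-alias filtering).
    eeg = resample_poly(eeg, up=fs_target, down=fs_orig, axis=-1)
    # Normalize along the channel dimension: zero mean, unit variance per channel.
    mean = eeg.mean(axis=-1, keepdims=True)
    std = eeg.std(axis=-1, keepdims=True) + 1e-8
    return (eeg - mean) / std

x = np.random.randn(62, 10_000)   # e.g. a 62-channel SEED-style recording
print(preprocess(x).shape)        # (62, 2000)
```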
3. Multi-Scale Contrastive Learning Framework
The core of the MSHCL framework lies in multi-scale contrastive learning. Specifically, the authors designed two types of contrastive losses (sketched in code below):
- Emotion-Scale Contrastive Loss: Retains the discriminative features of emotions by contrasting EEG signals that share the same emotion label.
- Stimulus-Scale Contrastive Loss: Reduces inter-subject variability by contrasting EEG signals recorded under the same stimulus.
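The following is a minimal sketch of the two losses, written as a standard supervised contrastive (SupCon-style) objective applied once to emotion labels and once to stimulus labels. The temperature, the exact loss form, and the way the two scales are combined are assumptions, not the paper's exact formulation.

```python
# Sketch: one SupCon-style loss applied at two label scales (emotion, stimulus).
# The combination weight `lam` plays the role of the paper's multi-scale weight.
import torch
import torch.nn.functional as F

def supcon_loss(z: torch.Tensor, labels: torch.Tensor, tau: float = 0.1) -> torch.Tensor:
    """z: (N, d) embeddings; labels: (N,) integer labels."""
    z = F.normalize(z, dim=-1)
    sim = z @ z.T / tau                                   # (N, N) similarity logits
    mask_self = torch.eye(len(z), dtype=torch.bool, device=z.device)
    pos = (labels[:, None] == labels[None, :]) & ~mask_self
    # Log-softmax over all non-self samples in the denominator.
    log_prob = sim - torch.logsumexp(sim.masked_fill(mask_self, -1e9), dim=1, keepdim=True)
    # Average log-probability over each anchor's positives; skip anchors with none.
    mean_log_prob_pos = (log_prob * pos).sum(1) / pos.sum(1).clamp(min=1)
    return -mean_log_prob_pos[pos.any(1)].mean()

def multiscale_loss(z, emo_labels, stim_labels, lam=0.5):
    # Emotion scale keeps class-discriminative structure; stimulus scale
    # pulls together responses to the same clip across subjects.
    return supcon_loss(z, emo_labels) + lam * supcon_loss(z, stim_labels)

z = torch.randn(16, 64)
emo, stim = torch.randint(0, 3, (16,)), torch.randint(0, 8, (16,))
print(multiscale_loss(z, emo, stim).item())
```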
Additionally, the authors embedded EEG signals into hyperbolic space, leveraging the exponentially growing distance properties of hyperbolic geometry to better capture the hierarchical structure of EEG signals. The curvature parameter (c) of the hyperbolic space and the multi-scale weight parameter (λ) were tuned experimentally.
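To make the geometry concrete, here is a minimal Poincaré-ball sketch using the standard exponential map at the origin and the Möbius-addition distance formula, showing how the curvature c enters the computation. The paper's actual projection layer may differ, so treat this as an illustration rather than the authors' implementation.

```python
# Sketch: map Euclidean features into the Poincaré ball and measure
# hyperbolic distance. Formulas are the standard Poincaré-ball ones.
import torch

def expmap0(v: torch.Tensor, c: float = 1.0, eps: float = 1e-6) -> torch.Tensor:
    """Exponential map at the origin of the Poincaré ball with curvature c."""
    sqrt_c = c ** 0.5
    norm = v.norm(dim=-1, keepdim=True).clamp(min=eps)
    return torch.tanh(sqrt_c * norm) * v / (sqrt_c * norm)

def mobius_add(x, y, c=1.0):
    # Möbius addition: the hyperbolic analogue of vector addition.
    x2 = (x * x).sum(-1, keepdim=True)
    y2 = (y * y).sum(-1, keepdim=True)
    xy = (x * y).sum(-1, keepdim=True)
    num = (1 + 2 * c * xy + c * y2) * x + (1 - c * x2) * y
    den = 1 + 2 * c * xy + c ** 2 * x2 * y2
    return num / den.clamp(min=1e-6)

def poincare_dist(x, y, c=1.0):
    sqrt_c = c ** 0.5
    arg = (sqrt_c * mobius_add(-x, y, c).norm(dim=-1)).clamp(max=1 - 1e-6)
    return (2.0 / sqrt_c) * torch.atanh(arg)

z = expmap0(torch.randn(8, 64) * 0.1)   # embed a batch of features in the ball
print(poincare_dist(z[:1], z).shape)    # (8,) distances from the first sample
```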
4. Experiments and Results
The study evaluated model performance using Leave-One-Subject-Out (LOSO) cross-validation (sketched below) and 10-fold cross-validation. The experimental results show that MSHCL outperformed existing methods across all three datasets:
- SEED (three-class task): 89.3% accuracy.
- MPED (seven-class task): 38.8% accuracy.
- FACED (binary and nine-class tasks): 77.0% and 45.7% accuracy, respectively.
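For reference, a minimal LOSO evaluation loop looks like the sketch below. The classifier is a simple stand-in (logistic regression on precomputed features), not the MSHCL model, and all data are synthetic; the point is only that the held-out subject's data never appears in training.

```python
# Sketch of Leave-One-Subject-Out (LOSO) evaluation with a stand-in classifier.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

def loso_accuracy(X, y, subjects):
    """X: (N, d) features; y: (N,) labels; subjects: (N,) subject ids."""
    accs = []
    for s in np.unique(subjects):
        train, test = subjects != s, subjects == s
        clf = LogisticRegression(max_iter=1000).fit(X[train], y[train])
        accs.append(accuracy_score(y[test], clf.predict(X[test])))
    return np.mean(accs), np.std(accs)   # mean/std across held-out subjects

X = np.random.randn(150, 32)
y = np.random.randint(0, 3, 150)          # e.g. SEED's three emotion classes
subj = np.repeat(np.arange(15), 10)       # 15 subjects, 10 samples each
print(loso_accuracy(X, y, subj))
```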
5. Result Analysis and Visualization
To further validate the effectiveness of the model, the authors visualized the extracted features using t-SNE (t-Distributed Stochastic Neighbor Embedding). The results show that MSHCL can better separate different emotion categories while reducing inter-subject variability.
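The visualization step itself is straightforward; a minimal sketch with scikit-learn's t-SNE is shown below, using random stand-in embeddings in place of the model's features.

```python
# Sketch: project learned embeddings to 2-D with t-SNE and color by emotion.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

emb = np.random.randn(300, 64)    # stand-in for MSHCL embeddings
emo = np.random.randint(0, 3, 300)  # emotion labels

xy = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(emb)
plt.scatter(xy[:, 0], xy[:, 1], c=emo, cmap="viridis", s=8)
plt.title("t-SNE of embeddings, colored by emotion")
plt.show()
```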
6. Ablation Experiment
To evaluate the contribution of each component of MSHCL, the authors conducted ablation experiments. The results indicate:
- Removing multi-scale contrastive learning significantly reduced model performance, especially on complex emotion classification tasks.
- Removing the hyperbolic embedding also degraded performance, highlighting the importance of hyperbolic space in capturing the hierarchical structure of EEG signals.
Conclusions and Significance
This study proposes an innovative method for cross-subject EEG emotion recognition, effectively reducing inter-subject variability while retaining the discriminative features of emotions through multi-scale contrastive learning and hyperbolic space embedding techniques. The experimental results demonstrate that MSHCL outperformed existing methods across multiple datasets, providing a new solution for cross-subject emotion recognition.
Research Highlights
- Multi-Scale Contrastive Learning: For the first time, contrastive losses were applied at both emotion and stimulus scales, significantly improving the model’s generalization ability.
- Hyperbolic Space Embedding: For the first time, hyperbolic geometry was applied to EEG emotion classification tasks, better capturing the hierarchical structure of EEG signals.
- High Performance and Stability: Achieved state-of-the-art (SOTA) performance on multiple datasets, with lower variance than competing methods.
Future Prospects
Although MSHCL performed well in cross-subject EEG emotion recognition, there are still some limitations that require further research. For example, the interpretability of hyperbolic embeddings needs improvement, and future work could introduce techniques like attention mechanisms to better understand the hierarchical relationships in EEG signals. Additionally, the model’s generalization ability needs to be validated on larger and more diverse datasets.