EEG Speech Emotion Conversion Demo
Comparison of Speech Emotion Conversion Models Based on EEG Emotional Features (Audio Sample Demonstration)
Abstract
Existing speech emotion conversion methods mainly rely on speech signal modeling and therefore struggle to accurately reflect the speaker's true emotional state. To address this limitation, this paper proposes an electroencephalography (EEG)-guided speech emotion conversion framework that introduces the emotional information contained in EEG signals into the speech emotion conversion process through cross-modal knowledge distillation and a conditional generation mechanism. First, a baseline speech emotion conversion model (SGEVC) is trained on the ESD dataset to establish stable emotion conversion capability. Then, using the speech emotion encoder of SGEVC as a teacher model on the EAV dataset, an EEG emotion extractor is guided to learn representations consistent with speech emotion features through joint optimization with regression and classification losses. Finally, the trained EEG emotion extraction module is integrated into the SGEVC framework to construct an EEG-conditioned speech emotion conversion model. Experimental results show that the proposed method improves the consistency and accuracy of emotional expression while maintaining speech naturalness and intelligibility. These findings demonstrate the effectiveness of EEG signals as auxiliary information for emotion-aware speech modeling.
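The distillation step above jointly optimizes a regression loss (pulling the EEG "student" embedding toward the speech teacher's emotion embedding) and a classification loss over emotion categories. The sketch below illustrates one plausible form of such a joint objective; the function name, the MSE/cross-entropy choice, and the weighting factor `lam` are assumptions for illustration, not the authors' implementation.

```python
# Hedged sketch of a joint distillation objective: MSE regression to the
# teacher embedding plus softmax cross-entropy classification. The exact
# losses and weighting used in the paper may differ.
import numpy as np

def joint_distillation_loss(student_emb, teacher_emb, logits, labels, lam=1.0):
    """L_total = L_reg + lam * L_cls (illustrative, not the paper's exact loss).

    student_emb, teacher_emb: (batch, dim) embeddings
    logits: (batch, n_classes) emotion classification logits
    labels: (batch,) integer emotion labels
    """
    # Regression loss: mean squared error between student and teacher embeddings
    l_reg = np.mean((student_emb - teacher_emb) ** 2)
    # Classification loss: numerically stable softmax cross-entropy
    z = logits - logits.max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    l_cls = -np.mean(log_probs[np.arange(len(labels)), labels])
    return l_reg + lam * l_cls
```

In this setup only the gradient through `student_emb` and `logits` would update the EEG extractor; the teacher encoder stays frozen.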
Code
https://github.com/roman1115/eeg-sec
Samples
S1(seen)
N2H (Neutral-to-Happy)
N2A (Neutral-to-Angry)
N2S (Neutral-to-Sad)
S2(seen)
N2H (Neutral-to-Happy)
N2A (Neutral-to-Angry)
N2S (Neutral-to-Sad)
U1(unseen)
N2H (Neutral-to-Happy)
N2A (Neutral-to-Angry)
N2S (Neutral-to-Sad)
© 2026 Your Lab / Your Name