Received:February 10, 2022 Revised:March 03, 2022
Abstract: Accurately recognizing the emotional information in speech can greatly improve the efficiency of human-computer interaction. Current speech emotion recognition systems mainly consist of two steps: speech feature extraction and speech feature classification. To improve recognition accuracy, this work uses the spectrogram rather than traditional acoustic features as the model input and adopts an attention-based CGRU network to extract the frequency-domain and time-domain information contained in the spectrogram. Experimental results show that introducing the attention mechanism helps reduce interference from redundant information. Compared with the LSTM-based model, the GRU-based model achieves higher prediction accuracy and converges faster during training; its training time is only 60% of that of the LSTM-based baseline.
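The abstract does not spell out how the attention mechanism is formulated, so the following is only a minimal sketch of one common variant, soft attention pooling over time steps, written in plain Python. It assumes the recurrent layer emits one feature vector per spectrogram frame. The function name `attention_pool`, the scoring vector `w`, and the toy values are hypothetical and are not taken from the paper.

```python
import math


def attention_pool(frames, w):
    """Collapse per-frame feature vectors into one utterance-level vector.

    frames: list of T feature vectors (each a list of D floats), e.g. the
            per-time-step outputs of a recurrent layer (assumed input).
    w:      hypothetical learned scoring vector of length D.
    """
    # One relevance score per time step: dot product of the frame with w.
    scores = [sum(x * wi for x, wi in zip(frame, w)) for frame in frames]

    # Numerically stable softmax turns the scores into weights summing to 1.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted sum over time: frames with small weights contribute little,
    # which is how redundant frames are down-weighted.
    dim = len(frames[0])
    pooled = [
        sum(wt * frame[d] for wt, frame in zip(weights, frames))
        for d in range(dim)
    ]
    return pooled, weights


# Toy example: three frames with two features each (made-up numbers).
frames = [[0.1, 0.0], [2.0, 1.0], [0.2, 0.1]]
w = [1.0, 1.0]
pooled, weights = attention_pool(frames, w)
```

In this toy input the second frame has the largest score, so it receives most of the weight and dominates the pooled vector. In a trained model the scoring vector would be learned jointly with the rest of the network.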
Keywords: speech emotion recognition; attention mechanism; gated recurrent unit (GRU); spectrogram; deep learning
Citation:
WANG Mao-Lin, HAO Gang. Chinese Speech Emotion Recognition Based on Attention-CGRU Network. Computer Systems Applications, 2023, 32(1): 296-301