Computer Systems & Applications, 2022, 31(1): 322-326

Speech Generation Model Based on Conditional Generative Adversarial Network

(Chinese title, translated: Emotional Speech Generation Model Based on Conditional Generative Adversarial Network)

CUI Xin-Ming, JIA Ning, ZHOU Jie-Mei-Hui

(School of Computer and Software, Dalian Neusoft Institute of Information, Dalian 116023, China)

Received: March 09, 2021; Revised: April 07, 2021

Abstract: This study proposes an emotional speech generation technique based on a conditional generative adversarial network (GAN). By introducing an emotion condition and learning the emotional information in a speech corpus, the model can autonomously generate new speech that carries a specified emotion. A GAN consists of a discriminator network and a generator. With TensorFlow as the learning framework, a conditional GAN model is trained on a large amount of emotional speech; the speech generator network G and the discriminator network D form a dynamic "game process" that better learns the conditional distribution of the observed emotional speech data. The generated samples are close to the natural speech signals of the original training data, are diverse, and approximate speech data that matches the real emotion. The proposed solution is evaluated on the Interactive Emotional Dyadic Motion Capture (IEMOCAP) corpus and a self-built emotional corpus, and it yields more accurate results than existing emotional speech generation algorithms.

Keywords: conditional generative adversarial network (GAN); conditional GAN model; emotion discrimination; speech generation model; TensorFlow framework
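
The adversarial "game" between G and D described in the abstract can be illustrated with a minimal, framework-free sketch. This is not the authors' TensorFlow implementation: the emotion label set, the feature and noise dimensions, the toy linear networks, and the placeholder "real" sample are all assumptions made for illustration. It only shows where the emotion condition enters both networks and how the standard GAN losses are computed for one real and one generated sample.

```python
import math
import random

# Hypothetical emotion label set; the paper's actual labels are not given here.
EMOTIONS = ["angry", "happy", "neutral", "sad"]
NOISE_DIM, FEAT_DIM = 3, 2  # assumed toy sizes, not taken from the paper


def one_hot(label):
    """Encode an emotion label as the condition vector y."""
    return [1.0 if e == label else 0.0 for e in EMOTIONS]


def sigmoid(value):
    return 1.0 / (1.0 + math.exp(-value))


def generator(z, cond, weights):
    """Toy G(z | y): a linear map of [noise ; condition] to a feature vector."""
    inp = z + cond
    return [sum(w * x for w, x in zip(row, inp)) for row in weights]


def discriminator(x, cond, weights):
    """Toy D(x | y): probability that x is real, given the same condition."""
    inp = x + cond
    return sigmoid(sum(w * v for w, v in zip(weights, inp)))


def cgan_losses(d_real, d_fake, eps=1e-12):
    """Standard GAN losses for one real and one generated sample."""
    d_loss = -(math.log(d_real + eps) + math.log(1.0 - d_fake + eps))
    g_loss = -math.log(d_fake + eps)  # non-saturating generator loss
    return d_loss, g_loss


rng = random.Random(0)
cond = one_hot("happy")
g_weights = [[rng.uniform(-1, 1) for _ in range(NOISE_DIM + len(EMOTIONS))]
             for _ in range(FEAT_DIM)]
d_weights = [rng.uniform(-1, 1) for _ in range(FEAT_DIM + len(EMOTIONS))]

z = [rng.gauss(0, 1) for _ in range(NOISE_DIM)]
x_real = [0.5, -0.2]  # placeholder standing in for real speech features
x_fake = generator(z, cond, g_weights)

d_loss, g_loss = cgan_losses(discriminator(x_real, cond, d_weights),
                             discriminator(x_fake, cond, d_weights))
print(f"D loss: {d_loss:.4f}, G loss: {g_loss:.4f}")
```

In training, D would be updated to lower `d_loss` and G to lower `g_loss` in alternation. Feeding the same condition vector to both networks is what makes the learned distribution conditional on the emotion label.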

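Funding: Inter-University Cooperation Project of the Education Department of Liaoning Province (86896244); Dalian Science and Technology Program (2019RQ120)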

Author Name	Affiliation	E-mail
CUI Xin-Ming	School of Computer and Software, Dalian Neusoft Institute of Information, Dalian 116023, China
JIA Ning	School of Computer and Software, Dalian Neusoft Institute of Information, Dalian 116023, China	jianing@neusoft.edu.cn
ZHOU Jie-Mei-Hui	School of Computer and Software, Dalian Neusoft Institute of Information, Dalian 116023, China


Cite this article:
CUI Xin-Ming, JIA Ning, ZHOU Jie-Mei-Hui. Speech Generation Model Based on Conditional Generative Adversarial Network. Computer Systems & Applications, 2022, 31(1): 322-326