Key Technologies of Speech Intelligibility Based on CycleGAN

    Abstract:

    Speech intelligibility enhancement is a perceptual enhancement technique for clean speech reproduced in noisy environments. Many studies use speaking style conversion (SSC) to improve speech intelligibility, but SSC relies solely on the Lombard effect and therefore performs poorly under strong noise interference. In addition, the SSC method models the conversion of fundamental frequency (F0) with a straightforward linear transform and maps only low-dimensional Mel-frequency cepstral coefficients (MFCCs). As F0 and MFCCs are critical aspects of hierarchical intonation, adequate modeling of these features is essential. Therefore, we use the continuous wavelet transform (CWT) to decompose F0 into ten dimensions that describe speech at different time scales for effective F0 conversion, and we represent MFCCs with 20 dimensions for MFCC conversion. Furthermore, we utilize an iMetricGAN to optimize speech intelligibility metrics in strong noise. Experimental results show that, in both objective and subjective evaluations, the proposed non-parallel speech style conversion method using CWT and iMetricGAN based on CycleGAN (NS-CiC) significantly increases speech intelligibility in strong noise environments.
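
    To make the ten-dimensional F0 representation concrete, the following is a minimal sketch of a CWT decomposition of an F0 contour, not the authors' implementation. The abstract states only that F0 is decomposed into ten dimensions at different time scales; the Mexican hat mother wavelet, the dyadic scale spacing, the base scale of two frames, the linear interpolation over unvoiced frames, and the function names `interpolate_log_f0` and `cwt_decompose` are all assumptions made for illustration.

```python
import numpy as np


def mexican_hat(t):
    """Mexican hat (Ricker) mother wavelet evaluated at points t."""
    norm = 2.0 / (np.sqrt(3.0) * np.pi ** 0.25)
    return norm * (1.0 - t ** 2) * np.exp(-0.5 * t ** 2)


def interpolate_log_f0(f0):
    """Fill unvoiced frames (F0 == 0) by linear interpolation, then take the log.

    Assumption: unvoiced frames are marked with 0 and at least one frame is voiced.
    """
    frames = np.arange(len(f0))
    voiced = f0 > 0
    filled = np.interp(frames, frames[voiced], f0[voiced])
    return np.log(filled)


def cwt_decompose(log_f0, num_scales=10, base_scale=2.0):
    """Decompose a continuous log-F0 contour into `num_scales` components.

    Assumption: scale i spans base_scale * 2**i frames (dyadic spacing).
    Returns an array of shape (num_scales, len(log_f0)).
    """
    # Normalize to zero mean and unit variance before the transform.
    x = (log_f0 - log_f0.mean()) / (log_f0.std() + 1e-8)
    n = len(x)
    components = np.empty((num_scales, n))
    for i in range(num_scales):
        scale = base_scale * 2.0 ** i
        # Truncate the kernel so it is never longer than the signal.
        half = min(int(5 * scale), (n - 1) // 2)
        t = np.arange(-half, half + 1) / scale
        kernel = mexican_hat(t) / np.sqrt(scale)
        components[i] = np.convolve(x, kernel, mode="same")
    return components


# Synthetic F0 contour in Hz with one unvoiced gap (illustrative data only).
frames = np.arange(200)
f0 = 120.0 + 20.0 * np.sin(2.0 * np.pi * frames / 50.0)
f0[60:80] = 0.0

components = cwt_decompose(interpolate_log_f0(f0))
print(components.shape)  # (10, 200)
```

    Each row of `components` tracks F0 variation at one time scale, from frame-level detail at the small scales to phrase-level movement at the large ones, which is the kind of multi-scale feature a conversion model could map instead of a single linear transform of F0.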

Get Citation

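Xiao Jing, Liu Jiaqi, Li Dengshi, Zhao Lanxin, Wang Qianrui. Key Technologies of Speech Intelligibility Based on CycleGAN. Computer Systems & Applications, 2022, 31(6): 1-9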

History
  • Received: September 14, 2021
  • Revised: October 14, 2021
  • Online: May 26, 2022