Speech Enhancement Based on Deep Complex Axial Self-attention Convolutional Recurrent Network

doi:10.15888/j.cnki.csa.009458

Volume 33, Issue 4, 2024, pp. 60-68. DOI: 10.15888/j.cnki.csa.009458

Author:
CAO Jie (School of Computer and Communication, Lanzhou University of Technology, Lanzhou 730050, China; School of Information Engineering, Lanzhou City University, Lanzhou 730020, China)
WANG Qiao (School of Computer and Communication, Lanzhou University of Technology, Lanzhou 730050, China)
LIANG Hao-Peng (School of Computer and Communication, Lanzhou University of Technology, Lanzhou 730050, China)
WANG Chen-Zhang (School of Computer and Communication, Lanzhou University of Technology, Lanzhou 730050, China)
LI Xiao-Xu (School of Computer and Communication, Lanzhou University of Technology, Lanzhou 730050, China)
YU Hong (School of Information and Electrical Engineering, Ludong University, Yantai 264025, China)

                    
Abstract:

Inaccurate phase estimation in single-channel speech enhancement degrades the quality of the enhanced speech. To address this, this study proposes a speech enhancement method based on a deep complex axial self-attention convolutional recurrent network (DCACRN), which enhances the amplitude and phase information of speech simultaneously in the complex domain. First, an encoder based on a complex convolutional network extracts complex features from the input speech signal, and a convolutional skip-connection module maps the features into a high-dimensional space for feature fusion, which strengthens information interaction and gradient flow. Then, an encoder-decoder structure based on the axial self-attention mechanism is designed to improve the model's temporal modeling and feature extraction abilities. Finally, the decoder reconstructs the speech signal, and a hybrid loss function is used to optimize the network and improve the quality of the enhanced speech. Experiments are conducted on the public Valentini and DNS Challenge datasets, and the results show that the proposed method improves both the perceptual evaluation of speech quality (PESQ) and short-time objective intelligibility (STOI) metrics compared with other models. On the non-reverberant dataset, PESQ is improved by 12.8% over DCTCRN and by 3.9% over DCCRN, which validates the effectiveness of the proposed model in speech enhancement tasks.
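
As a minimal illustration of what enhancing amplitude and phase simultaneously in the complex domain can mean, the sketch below applies a complex-valued mask to a single time-frequency bin. Complex masking is an assumption borrowed from DCCRN-style models; the abstract does not state how DCACRN forms its output. The function name `apply_complex_mask` and all numeric values are hypothetical.

```python
import cmath


def apply_complex_mask(noisy_bin: complex, mask: complex) -> complex:
    """Multiply one noisy STFT bin by a complex mask value.

    Complex multiplication scales the magnitude by |mask| and shifts
    the phase by arg(mask), so both are modified in a single step.
    """
    return noisy_bin * mask


# Hypothetical values, chosen only for illustration.
noisy = cmath.rect(2.0, 0.5)    # magnitude 2.0, phase 0.5 rad
mask = cmath.rect(0.5, -0.2)    # halve the magnitude, rotate by -0.2 rad

enhanced = apply_complex_mask(noisy, mask)
mag = abs(enhanced)             # 2.0 * 0.5
phase = cmath.phase(enhanced)   # 0.5 + (-0.2)
print(mag, phase)
```

A real-valued mask would change only the magnitude and leave the noisy phase untouched, which is the limitation the abstract attributes to inaccurate phase estimation.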

Key words: single-channel speech enhancement; complex convolutional recurrent network; convolutional skip connection; axial self-attention mechanism

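Citation:

CAO Jie, WANG Qiao, LIANG Hao-Peng, WANG Chen-Zhang, LI Xiao-Xu, YU Hong. Speech Enhancement Based on Deep Complex Axial Self-attention Convolutional Recurrent Network. Computer Systems & Applications, 2024, 33(4): 60-68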


History

Received: October 7, 2023
Revised: November 9, 2023
Online: January 18, 2024
