Improved Time-Delay Estimation Algorithm for Mutual Power Spectrum
Computer Systems & Applications (计算机系统应用), 2018, Vol. 27, Issue (8): 247-253

WAN Meng-Shi, WU Xiao-Pei, ZHANG Chao
College of Computer Science and Technology, Anhui University, Hefei 230601, China
Abstract: Sound source localization with microphone arrays has long been an active topic in array signal processing. Time-delay estimation methods, represented by the cross-power spectrum phase (CSP) algorithm, are widely used because they are simple in principle, cheap to compute, and easy to implement. Although the CSP algorithm estimates delays well at high SNR, its accuracy drops sharply when the SNR is low and the acoustic scene is complex. To address this problem, this study improves the CSP algorithm. The delay estimates produced by CSP are screened, unreasonable delay values are discarded, the algorithm parameters are updated, and the delay is estimated again to obtain a reasonable value; the delay and position of the sound source are then obtained from multiple signal frames. To verify the proposed algorithm, experiments were carried out both in Matlab simulation and in a real environment. The results show that the improved CSP algorithm estimates time delays more accurately than the original algorithm.
Key words: microphone array; sound source localization; time-delay estimation; cross-power spectrum

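In 1994, Omologo [9] proposed the cross-power spectrum phase (CSP) algorithm. The method suppresses low-to-moderate reverberation well and has attracted wide attention; several studies have improved on it, giving rise to many CSP-based time-delay estimation algorithms and application scenarios [10–12]. Among them, reference [13] proposed an improved autocorrelation method that clips the power spectrum before applying the inverse Fourier transform, and used correlation-peak tracking to extract the time history of multipath delays automatically, which is promising for unmanned systems. Reference [14] applied the time-refined inverse Fourier transform to the cross-power spectrum correlation algorithm: the correlation function of the sub-band-shifted cross-power spectrum is computed with correlation-peak refinement, and the peak function of each band is sharpened with a Gaussian function, so that the bearings of multiple sources occupying non-overlapping frequency bands can be estimated quickly and accurately. This solves the problem of fast, accurate multi-target bearing estimation for receiving arrays with a small aperture and few elements, and suits small-aperture detection nodes and platforms with low-power constraints. Reference [15] used joint CSP-SCOT (cross-power spectrum and smoothed coherence transform) weighting for time-delay estimation and performed a spatial search for the source. Simulation results show that, under the same reverberation or noise conditions, its localization outperforms the CSP and SCOT algorithms and that it is suitable for small microphone arrays.

1 Cross-Power Spectrum Phase Algorithm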

 ${x_i}(t) = {\alpha _i}s(t - {\tau _i}) + {v_i}(t) = {h_i}(t) * s(t) + {n_i}(t)$ (1)

 ${R_{{x_i}{x_j}}}(\tau ) = E[{x_i}(t){x_j}(t - \tau )]$ (2)

 ${R_{{x_i}{x_j}}}(\tau ) = {\alpha _i}{\alpha _j}{R_{{s_i}{s_j}}}(\tau - {\tau _{ij}}) + {R_{{v_i}{v_j}}}(\tau )$ (3)
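
As a minimal sketch of Eq. (2), the expectation can be replaced by a finite sum over samples, and the delay read off the lag that maximizes it. The function name `xcorr_delay`, the noise-free toy signal, and the sign convention (a positive lag means $x_i$ lags $x_j$, i.e. $\tau_{ij} = \tau_i - \tau_j$) are assumptions made for illustration and are not taken from the paper.

```python
import random


def xcorr_delay(xi, xj, max_lag):
    """Return the integer lag in [-max_lag, max_lag] that maximizes
    R(lag) = sum_t xi[t] * xj[t - lag], a sample estimate of Eq. (2)."""
    n = len(xi)
    best_lag, best_val = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        # Only sum over indices where both xi[t] and xj[t - lag] exist.
        lo, hi = max(0, lag), min(n, n + lag)
        val = sum(xi[t] * xj[t - lag] for t in range(lo, hi))
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag


# Toy check: xi is xj delayed by 3 samples (zero-padded shift), no noise.
random.seed(0)
xj = [random.gauss(0.0, 1.0) for _ in range(256)]
xi = [0.0] * 3 + xj[:-3]
estimated = xcorr_delay(xi, xj, max_lag=10)
```

With noise or reverberation added, the same search may lock onto a spurious peak, which is the failure mode that motivates the spectral weighting used by CSP.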

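The CSP algorithm was proposed to deal with the situation described above. Taking the Fourier transform of Eq. (3) gives the cross-power spectrum of $x_i(t)$ and $x_j(t)$: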

 ${P_{{X_i}{X_j}}}(\omega ) = {\alpha _i}{\alpha _j}{P_{{S_i}{S_j}}}(\omega ){e^{ - j\omega {\tau _{ij}}}} + {P_{{V_i}{V_j}}}(\omega )$ (4)
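
The step from Eq. (3) to Eq. (4) rests on the time-shift property of the Fourier transform. Written out, assuming each $P(\omega)$ denotes the Fourier transform of the corresponding correlation function $R(\tau)$, and substituting $u = \tau - {\tau _{ij}}$:

 $\int_{ - \infty }^\infty {{R_{{s_i}{s_j}}}(\tau - {\tau _{ij}}){e^{ - j\omega \tau }}d\tau } = {e^{ - j\omega {\tau _{ij}}}}\int_{ - \infty }^\infty {{R_{{s_i}{s_j}}}(u){e^{ - j\omega u}}du} = {P_{{S_i}{S_j}}}(\omega ){e^{ - j\omega {\tau _{ij}}}}$

The noise term ${R_{{v_i}{v_j}}}(\tau )$ transforms directly into ${P_{{V_i}{V_j}}}(\omega )$. When that term is negligible, the delay appears only in the phase $- \omega {\tau _{ij}}$, which is linear in $\omega$.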

 Fig. 1 The peak of the cross-correlation function is buried by spurious peaks

 ${R_{{\rm{csp}}}}(\tau ) = \int_{ - \infty }^\infty {\frac{1}{{\left| {{P_{{X_i}{X_j}}}(\omega )} \right|}}{P_{{X_i}{X_j}}}(\omega ){e^{i\omega \tau }}d\omega }$ (5)

 ${R_{m\_{\rm{csp}}}}(\tau ) = \int_{ - \infty }^\infty {\frac{1}{{{{\left| {{P_{{X_i}{X_j}}}(\omega )} \right|}^r}}}{P_{{X_i}{X_j}}}(\omega ){e^{i\omega \tau }}d\omega }$ (6)
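
A discrete-time sketch of Eqs. (5) and (6) is given below. It is an illustration under stated assumptions rather than the authors' implementation: the integrals are replaced by a DFT over one frame, a naive $O(N^2)$ DFT stands in for an FFT so that only the standard library is needed, the toy delay is circular, and the names `dft`, `csp_delay`, and `eps` are hypothetical.

```python
import cmath
import random


def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform (stdlib only, for small N)."""
    n = len(x)
    sign = 1.0 if inverse else -1.0
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def csp_delay(xi, xj, r=1.0, eps=1e-12):
    """Estimate the delay (in samples) of xi relative to xj.

    r = 1 gives the phase-only weighting of Eq. (5);
    other r values give the modified weighting of Eq. (6).
    """
    n = len(xi)
    Xi, Xj = dft(xi), dft(xj)
    # Discrete cross-power spectrum of the frame pair.
    P = [a * b.conjugate() for a, b in zip(Xi, Xj)]
    # Divide by |P|^r; eps avoids division by zero in empty bins.
    weighted = [p / (abs(p) ** r + eps) for p in P]
    R = [v.real for v in dft(weighted, inverse=True)]
    peak = max(range(n), key=lambda m: R[m])
    # Indices above n/2 correspond to negative (circular) lags.
    return peak if peak <= n // 2 else peak - n


random.seed(1)
xj = [random.gauss(0.0, 1.0) for _ in range(64)]
xi = xj[-4:] + xj[:-4]          # circular delay of 4 samples
lag_phat = csp_delay(xi, xj, r=1.0)
lag_soft = csp_delay(xi, xj, r=0.75)
```

For real frames an FFT and zero-padding would normally replace the naive transform and the circular-shift assumption.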
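2 Improved CSP Algorithm

The CSP algorithm is computationally cheap and tracks well, which makes it suitable for real-time systems. It has been shown to perform well under moderate noise and reverberation [18]. In our experiments, however, its accuracy dropped sharply and its error increased markedly in complex conditions such as low SNR and strong reverberation. The goal of this paper is to reduce the error rate of the CSP algorithm in complex environments, so that it remains suitable for real-time systems while staying accurate.

 Fig. 2 Flowchart of the improved CSP algorithm

2.1 Preprocessing

2.2 Computing the Valid Delay Interval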

 $\cos \theta = \frac{{{l_{e{m_2}}}}}{{{d_{12}}}}$ (7)

 Fig. 4 Signals from the sound source to the two microphones

 $\cos \theta \approx \frac{{{l_{{m_1}'{m_2}}}}}{{{d_{12}}}} = \frac{{\tau \cdot c}}{{{d_{12}}}}$ (8)
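
Eq. (8) can be inverted to turn a delay into a far-field arrival angle. The sketch below assumes $c = 343$ m/s for the speed of sound, a hypothetical 0.2 m microphone spacing, and a hypothetical function name `doa_from_delay`; none of these values come from the paper.

```python
import math


def doa_from_delay(tau, d12, c=343.0):
    """Far-field direction of arrival (radians) from Eq. (8):
    cos(theta) ~= tau * c / d12."""
    cos_theta = tau * c / d12
    # Noise can push the ratio slightly outside [-1, 1]; clamp before acos.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.acos(cos_theta)


# Hypothetical numbers: 0.2 m spacing, delay equal to half the maximum d12 / c.
theta = doa_from_delay(tau=0.1 / 343.0, d12=0.2)
theta_deg = math.degrees(theta)   # approximately 60 degrees
```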

 $N = \tau \cdot Fs$ (9)

 $- \frac{{{d_{12}} \cdot Fs}}{c} \le N \le \frac{{{d_{12}} \cdot Fs}}{c}$ (10)

 Fig. 5 Near-field error

 $- \frac{{3{d_{12}} \cdot Fs}}{{2c}} \le N \le \frac{{3{d_{12}} \cdot Fs}}{{2c}}$ (11)
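
The bounds in Eqs. (10) and (11) can be used as a simple plausibility check on an estimated delay. The sketch reads $d_{12}$ as the microphone spacing in meters, $Fs$ as the sampling rate in Hz, and $c$ as the speed of sound; the value 343 m/s, the 0.2 m / 16 kHz setup, and the function names are assumptions for illustration.

```python
def lag_bound(d12, fs, c=343.0, near_field=False):
    """Largest physically plausible |N| in samples.

    Far field, Eq. (10):          |N| <= d12 * fs / c
    Near-field margin, Eq. (11):  |N| <= 3 * d12 * fs / (2 * c)
    """
    bound = d12 * fs / c
    return 1.5 * bound if near_field else bound


def is_plausible(n_samples, d12, fs, c=343.0, near_field=True):
    """True if an estimated delay of n_samples lies inside the valid interval."""
    return abs(n_samples) <= lag_bound(d12, fs, c, near_field)


# Hypothetical setup: microphones 0.2 m apart, sampled at 16 kHz.
far = lag_bound(0.2, 16000)                    # about 9.33 samples
near = lag_bound(0.2, 16000, near_field=True)  # about 13.99 samples
```

An estimate that fails `is_plausible` would be a candidate for the re-estimation step described in the abstract.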

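2.3 Updating the Weighting Factor

The choice of the weighting factor r (0.5 ≤ r ≤ 1) is critical in the CSP algorithm. If r is too large, noise is filtered poorly; if r is too small, the spectrum is over-weighted and the peak is detected incorrectly. In general, r is chosen empirically. To determine an effective r for the improved algorithm, 100 groups of speech signals with different SNRs were analyzed. For each group, the delay was computed with r = 0.65, 0.70, 0.75, 0.80, 0.85, 0.90, 0.95, and 1, compared with the true delay, and the r whose estimate was closest to the true value was recorded. The r that achieved the minimum error most often is taken as the best weighting factor; the statistics are shown in Table 1.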
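
The selection procedure described above amounts to a vote over the candidate values of r. A minimal sketch follows; the `estimate` callable is a stand-in for whatever CSP implementation is being tuned, and the tie-breaking rule (the first candidate wins) is an assumption, since the text does not say how ties were handled.

```python
from collections import Counter

R_CANDIDATES = [0.65, 0.70, 0.75, 0.80, 0.85, 0.90, 0.95, 1.0]


def best_weighting_factor(cases, estimate, candidates=R_CANDIDATES):
    """Pick the r that most often gives the smallest delay error.

    cases:    iterable of (xi, xj, true_delay) tuples
    estimate: callable (xi, xj, r) -> estimated delay; a stand-in for
              whatever CSP implementation is being tuned
    """
    wins = Counter()
    for xi, xj, true_delay in cases:
        # min() keeps the first candidate on ties, so lower r wins a tie.
        winner = min(candidates,
                     key=lambda r: abs(estimate(xi, xj, r) - true_delay))
        wins[winner] += 1
    best_r, _ = wins.most_common(1)[0]
    return best_r, wins
```

The returned `wins` counter holds the per-candidate tallies, which is the kind of summary a table of results would report.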

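2.4 Multi-Frame Signal Weighting

The value of M must be chosen with care. If M is too small, too few frames are available for the weighting to take effect; if M is too large, the algorithm runs less efficiently. To find a reasonable M, 100 groups of speech signals were evaluated, and M = 5 gave the best results. The delay τ of the speech signal is therefore: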

 $\tau = ({\tau _1} + {\tau _2} + {\tau _3} + {\tau _4} + {\tau _5})/5$ (12)
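
Eq. (12) is an equal-weight mean over M = 5 per-frame delays. A small sketch, in which the function name and the choice to use the most recent `m` estimates are assumptions:

```python
def multi_frame_delay(frame_delays, m=5):
    """Average the delays of the most recent m frames, as in Eq. (12)."""
    if len(frame_delays) < m:
        raise ValueError(f"need at least {m} frame estimates")
    recent = frame_delays[-m:]
    return sum(recent) / m


# Hypothetical per-frame estimates in samples.
tau = multi_frame_delay([4, 4, 5, 4, 3])   # (4 + 4 + 5 + 4 + 3) / 5 = 4.0
```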

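3 Experimental Analysis

3.1 MATLAB Simulation

3.2 Real-World Experiment

3.2.1 Signal Acquisition

 Fig. 6 Signal acquisition equipment

 Fig. 7 Signal acquisition model

3.2.2 Microphone Calibration

 Fig. 8 Microphone array error calibration

3.2.3 Experimental Results

4 Conclusion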

[1] 陶巍, 刘建平, 张一闻. 基于麦克风阵列的声源定位系统. 计算机应用, 2012, 32(5): 1457-1459.
[2] Magassouba A, Bertin N, Chaumette F. First applications of sound-based control on a mobile robot equipped with two microphones. Proceedings of 2016 IEEE International Conference on Robotics and Automation. Stockholm, Sweden. 2016. 2557–2562.
[3] Jang Y, Kim J, Kim J. The development of the vehicle sound source localization system. Proceedings of 2015 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference. Hong Kong, China. 2016. 1241–1244.
[4] Hirano Y, Iwai T, Kominami D, et al. Implementation of a sound-source localization method for calling frog in an outdoor environment using a wireless sensor network. Proceedings of 2016 International Conference on Wireless Communications, Signal Processing and Networking. Chennai, India. 2016. 2458–2462.
[5] Knapp C, Carter G. The generalized correlation method for estimation of time delay. IEEE Transactions on Acoustics, Speech, and Signal Processing, 1976, 24(4): 320-327. DOI:10.1109/TASSP.1976.1162830
[6] Stéphenne A, Champagne B. A new cepstral prefiltering technique for estimating time delay under reverberant conditions. Signal Processing, 1997, 59(3): 253-266. DOI:10.1016/S0165-1684(97)00051-0
[7] Benesty J. Adaptive eigenvalue decomposition algorithm for passive acoustic source localization. The Journal of the Acoustical Society of America, 2000, 107(1): 384-391. DOI:10.1121/1.428310
[8] Dvorkind TG, Gannot S. Time difference of arrival estimation of speech source in a noisy and reverberant environment. Signal Processing, 2005, 85(1): 177-204. DOI:10.1016/j.sigpro.2004.09.014
[9] Omologo M, Svaizer P. Acoustic event localization using a crosspower-spectrum phase based technique. Proceedings of 1994 IEEE International Conference on Acoustics, Speech, and Signal Processing. Adelaide, SA, Australia. 1994. II/273–II/276.
[10] Kou WZ, Duan WJ, Li MY. An improved time delay estimation method based on cross-power spectrum phase. Proceedings of 2012 IEEE International Conference on Signal Processing, Communication and Computing. Hong Kong, China. 2012. 686–690.
[11] Zhou WZ, Ling Y, Zhang YQ, et al. Time difference calculation based on signal starting point detection. Proceedings of the 2015 7th International Conference on Modelling, Identification and Control. Sousse, Tunisia. 2015. 1–5.
[12] Mao HD, Zhang LH. An improved accumulated cross-power spectrum phase method for time delay estimation. Proceedings of 2015 IEEE Advanced Information Technology, Electronic and Automation Control Conference. Chongqing, China. 2015. 563–566.
[13] 陈韶华, 汪小亚. 一种改进的自相关多径时延估计及其自动提取. 中国声学学会第十一届青年学术会议会议论文集. 西安. 2015.
[14] 刘超, 黄迪. 基于子带平移的精确时延快速估计. 舰船电子工程, 2016, 36(6): 127-130. DOI:10.3969/j.issn.1672-9730.2016.06.034
[15] 杨艺敏, 刘涛. 改进的GCC算法在声源定位中的研究. 电子世界, 2017(10): 180, 186.
[16] 崔玮玮. 基于麦克风阵列的声源定位与语音增强方法研究[博士学位论文]. 北京: 清华大学, 2009.
[17] Rabinkin DV, Renomeron RJ, Dahl AJ, et al. DSP implementation of source location using microphone arrays. Journal of the Acoustical Society of America, 1996, 99(4): 88-99.
[18] 陆晓燕. 基于麦克风阵列实现声源定位[硕士学位论文]. 大连: 大连理工大学, 2003.
[19] Fu ZH, Li JW. GPU-Based Image Method For Room Impulse Response Calculation. Hingham, MA, USA: Kluwer Academic Publishers, 2016.