Computer Systems & Applications, 2019, 28(11): 147-152
Discrimination Method of Terrorism Audio Based on Transfer Learning
(College of Electronics and Information Engineering, Sichuan University, Chengdu 610065, China)
Received: April 11, 2019    Revised: May 08, 2019
Abstract: Terrorism audio clips were extracted from the Internet and from films to build a terrorism audio dataset. Because sources of such audio are limited, while convolutional neural networks need large amounts of training data, transfer learning is introduced into terrorism audio discrimination. The network is first pre-trained on the public TUT acoustic scenes dataset; the model weights are then retained and the network is transferred to the terrorism audio dataset for further training. Finally, layers are added to the fine-tuned network in a structure similar to a residual network, so that it can exploit more of the audio information. Experimental results show that the average discrimination rate with transfer learning is 3.97% higher than without it, which effectively addresses the training problem caused by the small audio dataset in terrorism audio discrimination. The improved transfer learning network raises the average discrimination rate by a further 1.01%, reaching a final discrimination rate of 96.97%.
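The weight-transfer step and the residual-style extension described in the abstract can be illustrated with a minimal sketch. This is not the authors' network: the paper uses a convolutional network, whereas the sketch below swaps in tiny dense layers on plain Python lists and performs no training. The layer sizes, the 15 source classes, the 2 target classes, and all function names are hypothetical choices made only for illustration.

```python
import copy
import random

def init_layer(n_in, n_out, rng):
    """Return a small random weight matrix (list of rows) for a dense layer."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]

def dense_relu(x, w):
    """Dense layer followed by ReLU, on plain Python lists."""
    return [max(0.0, sum(wi * xi for wi, xi in zip(row, x))) for row in w]

def residual_block(x, w):
    """Residual-style block: output = input + F(input); w must be square."""
    return [xi + fi for xi, fi in zip(x, dense_relu(x, w))]

def transfer(pretrained, n_target_classes, rng):
    """Copy the pretrained feature layer, then attach new layers for the target task."""
    feat_dim = len(pretrained["features"])
    return {
        "features": copy.deepcopy(pretrained["features"]),           # retained weights
        "res_block": init_layer(feat_dim, feat_dim, rng),            # newly added block
        "classifier": init_layer(feat_dim, n_target_classes, rng),   # new output layer
    }

def forward(model, x):
    """Run features, the optional residual block, then a linear classifier."""
    h = dense_relu(x, model["features"])
    if "res_block" in model:
        h = residual_block(h, model["res_block"])
    return [sum(wi * hi for wi, hi in zip(row, h)) for row in model["classifier"]]

rng = random.Random(0)
# Hypothetical source model: 8 input features, 4 hidden units, 15 source classes.
source_model = {"features": init_layer(8, 4, rng), "classifier": init_layer(4, 15, rng)}
# Hypothetical target task: 2 classes (terrorism audio vs. other audio).
target_model = transfer(source_model, n_target_classes=2, rng=rng)
scores = forward(target_model, [0.5] * 8)
```

In a real pipeline the copied layers would then be fine-tuned on the target dataset together with the new layers; the skip connection in `residual_block` lets the added layers start close to an identity mapping, so the transferred features pass through unchanged when the new weights are small.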
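Funding: National Natural Science Foundation of China (61871278); Chengdu Industrial Cluster Collaborative Innovation Project (2016-XT00-00015-GX); Science and Technology Program of Sichuan Province (2018HH0143); Scientific Research Project of the Education Department of Sichuan Province (18ZB0355)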
Citation:
HU Xin-Xu, ZHOU Xin, HE Xiao-Hai, XIONG Shu-Hua, WANG Zheng-Yong. Discrimination Method of Terrorism Audio Based on Transfer Learning. Computer Systems & Applications, 2019, 28(11): 147-152