Computer Systems Applications (English version): 2023, 32(10): 85-95
Music Genre Classification Based on Spectrogram Enhancement and CNNBLS
(School of Software, Liaoning Technical University, Huludao 125105, China)
Received: March 30, 2023    Revised: May 11, 2023
Abstract: To address the weak feature-mining ability of spectrograms for music and the complexity and long training times of deep learning classifiers, a music genre classification model based on spectrogram enhancement and a convolutional neural network-based broad learning system (CNNBLS) is designed. The model first enhances the Mel spectrogram using the SpecAugment method of randomly masking some frequency channels, then feeds the sliced Mel spectrograms to CNNBLS as input. The exponential linear unit (ELU) activation function is also integrated into the convolutional layer of CNNBLS to improve classification accuracy. Compared with other machine learning network frameworks, CNNBLS achieves high classification accuracy with little training time, and it can also learn quickly from incremental data. Experimental results show that the non-incremental CNNBLS model reaches a classification accuracy of 90.06% when trained on 400 music tracks, while the incremental model, Incremental-CNNBLS, reaches 91.53% after 400 more training tracks are added.
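The preprocessing steps named in the abstract (frequency-channel masking, slicing of the Mel spectrogram, and the ELU activation) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `frequency_mask`, `slice_frames`, and `elu` are hypothetical, and the mask width, the zero fill value, the slice length, and the non-overlapping slicing are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import numpy as np


def frequency_mask(mel, max_width=8, num_masks=1, rng=None):
    """Return a copy of `mel` (n_mels x n_frames) with random bands of
    Mel channels set to zero, in the style of SpecAugment frequency masking.
    `max_width`, `num_masks`, and the zero fill value are assumptions."""
    rng = np.random.default_rng() if rng is None else rng
    out = mel.copy()
    n_mels = out.shape[0]
    max_width = min(max_width, n_mels)
    for _ in range(num_masks):
        # Band width is drawn uniformly from [0, max_width].
        width = int(rng.integers(0, max_width + 1))
        # First masked channel, chosen so the band stays inside the array.
        start = int(rng.integers(0, n_mels - width + 1))
        out[start:start + width, :] = 0.0
    return out


def slice_frames(mel, frames_per_slice):
    """Cut a spectrogram into equal-width, non-overlapping slices along
    the time axis. Trailing frames that do not fill a slice are dropped."""
    n_slices = mel.shape[1] // frames_per_slice
    return [
        mel[:, i * frames_per_slice:(i + 1) * frames_per_slice]
        for i in range(n_slices)
    ]


def elu(x, alpha=1.0):
    """Exponential linear unit: x for x > 0, alpha * (exp(x) - 1) otherwise."""
    x = np.asarray(x, dtype=float)
    return np.where(x > 0, x, alpha * np.expm1(np.minimum(x, 0.0)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Random array standing in for a real Mel spectrogram (128 channels).
    mel = rng.random((128, 1000))
    augmented = frequency_mask(mel, max_width=8, num_masks=1, rng=rng)
    slices = slice_frames(augmented, frames_per_slice=128)
    print(len(slices), slices[0].shape)
```

In practice the input array would come from an audio feature extractor rather than random numbers, and the resulting slices would be passed to the classifier as training samples.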
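Funding: General Program of the National Natural Science Foundation of China (42271409); Basic Scientific Research Project of Higher Education Institutions of Liaoning Province (LIKMZ20220699)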
Cite this article:
LIU Wan-Jun, LI Yu-Meng, QU Hai-Cheng. Music Genre Classification Based on Spectrogram Enhancement and CNNBLS. COMPUTER SYSTEMS APPLICATIONS, 2023, 32(10): 85-95