Music Mood Classification Method Based on Deep Belief Network and Multi-Feature Fusion
    Abstract:

    In this paper we explore two key components of music emotion classification: feature selection and classifier design. For feature selection, a single feature cannot fully represent music emotion in traditional algorithms; the multi-feature fusion proposed in this paper addresses this by combining acoustic and prosodic features into a joint representation of music emotion. For the classifier, deep belief networks, which have performed well in audio retrieval, are adopted to train and classify music emotions. Experimental results show that the proposed algorithm outperforms both single-feature classification and SVM classification on music emotion classification.
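    The pipeline described in the abstract can be sketched in code. The following is an illustrative sketch only, not the authors' implementation: it fuses two hypothetical feature groups by concatenation and stacks scikit-learn `BernoulliRBM` layers (a common stand-in for deep-belief-network pretraining) in front of a logistic-regression classifier. The feature values, dimensions, and the four mood classes are placeholder assumptions; real systems would extract descriptors such as MFCCs, pitch, energy, and tempo from audio.

    ```python
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.neural_network import BernoulliRBM
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import MinMaxScaler

    rng = np.random.default_rng(0)
    n_clips = 200

    # Placeholder feature matrices; in practice these come from audio analysis.
    acoustic = rng.random((n_clips, 13))   # e.g. per-clip MFCC means
    prosodic = rng.random((n_clips, 4))    # e.g. pitch / energy / tempo statistics

    # Multi-feature fusion: concatenate the two feature groups per clip.
    X = np.hstack([acoustic, prosodic])
    y = rng.integers(0, 4, n_clips)        # 4 hypothetical mood classes

    # Two stacked RBMs approximate DBN-style layer-wise feature learning,
    # followed by a supervised classifier on the top-layer representation.
    model = Pipeline([
        ("scale", MinMaxScaler()),  # BernoulliRBM expects inputs in [0, 1]
        ("rbm1", BernoulliRBM(n_components=64, learning_rate=0.05,
                              n_iter=20, random_state=0)),
        ("rbm2", BernoulliRBM(n_components=32, learning_rate=0.05,
                              n_iter=20, random_state=0)),
        ("clf", LogisticRegression(max_iter=1000)),
    ])
    model.fit(X, y)
    preds = model.predict(X)
    ```

    The concatenation step is the simplest form of feature-level fusion; the paper's comparison against single-feature baselines amounts to training the same classifier on `acoustic` or `prosodic` alone.
    
    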

Get Citation

Gong A, Ding MB, Dou F. Music mood classification method based on DBN and multi-feature fusion. 计算机系统应用 (Computer Systems & Applications), 2017, 26(9): 158-164. (in Chinese)

History
  • Received: December 28, 2016
  • Online: October 31, 2017
Copyright: Institute of Software, Chinese Academy of Sciences