Using Word Clustering to Improve Recurrent Neural Network Language Model
Abstract:

Previous studies have shown that adding part-of-speech tag information to the input layer of a neural language model can significantly improve its performance. However, part-of-speech tagging requires hand-annotated data to train the tagger, which is costly, and the extra tagger also makes the model more complicated. To address this problem, this article proposes feeding the results of Brown clustering, instead of part-of-speech tags, into the input layer of the recurrent neural network language model. On the Penn Treebank corpus, the relative improvement over the original recurrent neural network language model reaches 8%-9%.
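To make the idea concrete, the sketch below shows one way an RNN language model's input layer can be augmented with Brown-cluster features: the one-hot word vector is concatenated with a one-hot vector for the word's cluster before the recurrent step. This is a minimal illustration under assumed settings, not the paper's implementation; the layer sizes, the word_to_cluster mapping, and all variable names are hypothetical, and in practice the cluster assignments would come from an unsupervised Brown-clustering run over the training corpus.

```python
import numpy as np

# Minimal Elman-style RNN language model whose input concatenates a one-hot
# word vector with a one-hot Brown-cluster vector (hypothetical sizes).
vocab_size = 10000   # |V|: vocabulary size
n_clusters = 100     # number of Brown clusters
hidden_size = 200    # recurrent hidden layer size

rng = np.random.default_rng(0)
W_in = rng.normal(scale=0.1, size=(hidden_size, vocab_size + n_clusters))
W_rec = rng.normal(scale=0.1, size=(hidden_size, hidden_size))
W_out = rng.normal(scale=0.1, size=(vocab_size, hidden_size))

def one_hot(index, size):
    v = np.zeros(size)
    v[index] = 1.0
    return v

def forward_step(word_id, cluster_id, h_prev):
    """One forward step: input = [one-hot word ; one-hot Brown cluster]."""
    x = np.concatenate([one_hot(word_id, vocab_size),
                        one_hot(cluster_id, n_clusters)])
    h = np.tanh(W_in @ x + W_rec @ h_prev)   # recurrent hidden state
    logits = W_out @ h
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()                     # softmax over the next word
    return probs, h

# Placeholder cluster mapping; a real one would be produced by Brown clustering.
word_to_cluster = {w: w % n_clusters for w in range(vocab_size)}
h = np.zeros(hidden_size)
probs, h = forward_step(42, word_to_cluster[42], h)
```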

Get Citation

Liu Z, Chen XP. Using word clustering to improve recurrent neural network language model. Computer Systems & Applications, 2014, 23(5): 101-106.

History
  • Received: September 12, 2013
  • Revised: November 11, 2013
  • Online: May 29, 2014