Survey of Speech Recognition and End-to-End Techniques

    Abstract:

    This paper briefly reviews the history and applications of speech recognition, traditional speech recognition techniques, and current research progress. Traditional speech recognition relies on statistical methods and spectral features to train a Gaussian Mixture Model-Hidden Markov Model (GMM-HMM) hybrid model. Current speech recognition models are mainly based on deep learning: Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) extract features effectively and are widely used to build acoustic models. More recent research turns to end-to-end techniques, chiefly Connectionist Temporal Classification (CTC) and attention, to avoid the propagation of errors between separately trained models. The latest models and methods emphasize attention and attempt to integrate it with CTC to achieve better results. Finally, drawing on the authors' own understanding, the paper summarizes the open problems and likely future directions in speech recognition.

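Yu Kun, Zhang Shaoyang, Hou Jiazheng, Zhang Shaobo. Survey of Speech Recognition and End-to-End Techniques. Computer Systems & Applications, 2021, 30(3): 14-23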

History
  • Received: July 27, 2020
  • Revised: August 25, 2020
  • Online: March 06, 2021