Survey on the N-gram Model
    Abstract:

    The N-gram model is one of the most commonly used language models in natural language processing and is widely applied in tasks such as speech recognition, handwriting recognition, spelling correction, machine translation, and search engines. However, the N-gram model often suffers from zero-probability problems during training and application, preventing it from producing a good language model. Smoothing methods such as Laplace smoothing, Katz back-off, and Kneser-Ney smoothing were developed to address this. After introducing the basic principles of these smoothing methods, we use perplexity as a metric to compare language models trained with each of them.
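    The ideas in the abstract can be illustrated with a minimal sketch: a bigram model with Laplace (add-one) smoothing, evaluated by perplexity on a toy corpus. The corpus, the `<s>`/`</s>` boundary markers, and the helper names are illustrative assumptions, not from the surveyed paper.

    ```python
    import math
    from collections import Counter

    # Toy corpus with assumed sentence-boundary markers.
    corpus = [
        ["<s>", "the", "cat", "sat", "</s>"],
        ["<s>", "the", "dog", "sat", "</s>"],
    ]

    unigram = Counter(w for sent in corpus for w in sent)
    bigram = Counter(p for sent in corpus for p in zip(sent, sent[1:]))
    V = len(unigram)  # vocabulary size

    def p_laplace(w_prev, w):
        # Add-one (Laplace) smoothing: every bigram count is incremented
        # by 1, so unseen bigrams get a small non-zero probability and
        # the zero-probability problem disappears.
        return (bigram[(w_prev, w)] + 1) / (unigram[w_prev] + V)

    def perplexity(sent):
        # Perplexity = exp of the average negative log-probability
        # per bigram; lower is better.
        logp = sum(math.log(p_laplace(a, b)) for a, b in zip(sent, sent[1:]))
        return math.exp(-logp / (len(sent) - 1))

    seen = ["<s>", "the", "cat", "sat", "</s>"]
    unseen = ["<s>", "the", "cow", "sat", "</s>"]
    print(perplexity(seen))    # sentence of seen bigrams: lower perplexity
    print(perplexity(unseen))  # unseen bigrams still score finite, thanks to smoothing
    ```

    Without smoothing, the unseen bigram (the, cow) would have probability zero and the second sentence's perplexity would be infinite; Katz back-off and Kneser-Ney refine this simple add-one scheme by redistributing probability mass more carefully.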

Get Citation

Yin C, Wu M. Survey on N-gram model. Computer Systems & Applications, 2018, 27(10): 33-38. (in Chinese)

History
  • Received: January 29, 2018
  • Revised: February 27, 2018
  • Online: September 29, 2018
Copyright: Institute of Software, Chinese Academy of Sciences Beijing ICP No. 05046678-3