Computer Systems & Applications, 2018, 27(10): 33-38
Survey on N-gram Model
YIN Chen, WU Min
(School of Software Engineering, University of Science and Technology of China, Hefei 230051, China)
Received: 2018-01-29    Revised: 2018-02-27
Abstract: The N-gram model is one of the most commonly used language models in natural language processing and is widely applied in tasks such as speech recognition, handwriting recognition, spelling correction, machine translation, and search engines. However, the N-gram model often suffers from the zero-probability problem during training and application, which prevents it from producing a good language model. Smoothing methods such as Laplace smoothing, Katz back-off, and Kneser-Ney smoothing have therefore been proposed. After introducing the basic principles of these smoothing methods, we use perplexity as a metric to compare the language models trained with each of them.
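The zero-probability problem and the Laplace fix described in the abstract can be sketched in a few lines. The corpus and vocabulary below are hypothetical toy data, not drawn from the paper; the example shows how add-one smoothing gives unseen bigrams a nonzero probability and how perplexity is then computed over a test sequence.

```python
import math
from collections import Counter

# Hypothetical toy corpus; a real evaluation would use a held-out test set.
train = "the cat sat on the mat the cat ate".split()
test = "the cat sat".split()

V = len(set(train))                       # vocabulary size
unigrams = Counter(train)                 # counts of w1
bigrams = Counter(zip(train, train[1:]))  # counts of (w1, w2)

def laplace_prob(w1, w2):
    # Add-one (Laplace) smoothing: every bigram count is incremented by 1,
    # so a bigram never seen in training still gets probability > 0.
    return (bigrams[(w1, w2)] + 1) / (unigrams[w1] + V)

# Perplexity of the test sequence under the smoothed bigram model:
# PP = exp(-(1/N) * sum_i log p(w_i | w_{i-1}))
log_prob = sum(math.log(laplace_prob(a, b)) for a, b in zip(test, test[1:]))
perplexity = math.exp(-log_prob / (len(test) - 1))
print(round(perplexity, 3))  # → 3.464
```

Without smoothing, any unseen bigram in the test text would make the product of probabilities zero and the perplexity undefined; this is exactly the failure mode the surveyed smoothing methods address.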
Citation:
YIN Chen,WU Min.Survey on N-gram Model.COMPUTER SYSTEMS APPLICATIONS,2018,27(10):33-38