Received: March 29, 2011; Revised: May 04, 2011
Abstract: Based on the characteristics of Latin-Uighur, this paper analyzes the common types of spelling errors in Latin-Uighur text and proposes a method that combines minimum edit distance, word segmentation based on a directed graph model, and a trigram language model. Using this method, we implemented a context-based automatic spelling correction system for Latin-Uighur, which greatly improves correction accuracy. In tests on the Uighur corpus provided by Xinjiang University, the system reaches a correction accuracy of 90.1%.
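As a rough illustration of how two of the components named in the abstract can work together, the following minimal sketch generates correction candidates by minimum edit distance and ranks them with a trigram language model. It is not the authors' system: the function and class names, the add-one smoothing, the distance threshold of 1, and the toy English placeholder tokens (standing in for Latin-Uighur words) are all assumptions made for the example, and the directed-graph segmentation step is omitted.

```python
from collections import Counter

def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance (insert, delete, substitute each cost 1)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

def candidates(word, lexicon, max_dist=1):
    """Lexicon entries within max_dist edits of word (threshold is an assumption)."""
    return [w for w in lexicon if edit_distance(word, w) <= max_dist]

class TrigramModel:
    """Trigram counts with add-one smoothing (smoothing choice is an assumption)."""
    def __init__(self, sentences):
        self.tri, self.bi, self.vocab = Counter(), Counter(), set()
        for sent in sentences:
            toks = ["<s>", "<s>"] + list(sent) + ["</s>"]
            self.vocab.update(toks)
            for i in range(2, len(toks)):
                self.tri[tuple(toks[i - 2:i + 1])] += 1
                self.bi[tuple(toks[i - 2:i])] += 1

    def prob(self, w1, w2, w3):
        return (self.tri[(w1, w2, w3)] + 1) / (self.bi[(w1, w2)] + len(self.vocab))

def correct(prev2, prev1, word, lexicon, lm):
    """Replace an out-of-lexicon word with the candidate the context prefers."""
    if word in lexicon:
        return word
    cands = candidates(word, lexicon)
    if not cands:
        return word
    return max(cands, key=lambda c: lm.prob(prev2, prev1, c))

# Toy placeholder data, not taken from the paper's corpus.
corpus = [["the", "cat", "sat"], ["the", "cat", "sat"], ["a", "cot", "broke"]]
lexicon = {w for sent in corpus for w in sent}
lm = TrigramModel(corpus)
fixed = correct("<s>", "the", "cxt", lexicon, lm)
```

Here both `cat` and `cot` are one edit away from the misspelling `cxt`, so edit distance alone cannot choose between them; the trigram context `<s> the` is what tips the decision. This sketch only handles non-word errors, whereas a fuller context-based checker would also score in-lexicon words against their neighbors.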
Keywords: Latin-Uighur; minimum edit distance; directed graph model; lexical segmentation; language model; context; spelling check
Funding: National Natural Science Foundation of China (60736014)
Cite this article:
何晋一,陈红英,姜文斌,张海波,刘群.基于上下文的拉丁维文拼写校对的研究.计算机系统应用,2011,20(12):60-63
HE Jin-Yi, CHEN Hong-Ying, JIANG Wen-Bin, ZHANG Hai-Bo, LIU Qun. Latin-Uighur Spelling Check Based on Context. COMPUTER SYSTEMS APPLICATIONS, 2011, 20(12): 60-63