Received: March 10, 2010; Revised: April 12, 2010
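Abstract (translated from the Chinese): Character-by-character matching segmentation splits Chinese sentences into words by matching them against a Chinese dictionary. On its own, this method cannot resolve the problems caused by overlapping ambiguity and combinatorial ambiguity. Building on dictionary-based segmentation and treating the task as sequence labeling, this paper uses a CRFs labeling model to provide auxiliary decisions during character-by-character matching, and handles ambiguity in this way. Experiments and analysis indicate that, compared with traditional CRFs-model segmentation and dictionary segmentation, the method is better suited to systems that place requirements on both segmentation speed and accuracy.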
Abstract: Dictionary-based Chinese word segmentation by literal matching cannot resolve overlapping ambiguity and combinatorial ambiguity. Building on dictionary segmentation, this paper proposes a dictionary-based Chinese word segmentation method combined with CRFs. Experiments indicate that this method performs better than CRFs segmentation and traditional dictionary segmentation.
Funding: National 863 Program of China (2007AA12Z306)
Author Name | Affiliation |
Zhang Shuoguo (张硕果) | College of Computer Science, Chongqing University, Chongqing 400044 |
Wang Chengliang (汪成亮) |
Cite this article:
张硕果,汪成亮.结合CRFs的词典分词法.计算机系统应用,2010,19(11):115-118
Zhang Shuoguo, Wang Chengliang. Dictionary Chinese Word Segmentation Method Combined with CRFs. COMPUTER SYSTEMS APPLICATIONS, 2010, 19(11): 115-118