    Abstract:

    Chinese word segmentation is the basis of machine translation, text classification, search engines, and information retrieval. However, new words emerging on the Internet have seriously degraded word segmentation performance. To improve the recognition rate of new words, this paper builds a suffix array over the text and calculates the lengths of the common prefixes shared by its suffixes. Candidate words are then filtered by a threshold. Experimental results show that the method has advantages for new word recognition.
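
The approach outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the candidate length bounds (`min_len`, `max_len`), and the choice to apply the threshold to the occurrence count (`min_count`) are assumptions made for illustration, and the suffix array is built by naive sorting rather than by an efficient construction algorithm.

```python
def build_suffix_array(text):
    """Return suffix start positions sorted by suffix (naive, for illustration)."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def common_prefix_length(text, i, j):
    """Length of the common prefix of the suffixes starting at i and j."""
    k = 0
    while i + k < len(text) and j + k < len(text) and text[i + k] == text[j + k]:
        k += 1
    return k


def find_candidates(text, min_len=2, max_len=4, min_count=2):
    """Count repeated substrings via the suffix array, then filter by a threshold.

    Assumption: the threshold is applied to the number of occurrences.
    Overlapping occurrences are counted separately.
    """
    sa = build_suffix_array(text)
    # lcp[k] is the common prefix length of the suffixes at sa[k] and sa[k + 1].
    lcp = [common_prefix_length(text, sa[k], sa[k + 1]) for k in range(len(sa) - 1)]

    counts = {}
    for length in range(min_len, max_len + 1):
        run = 0  # consecutive adjacent suffix pairs sharing at least `length` characters
        for k, shared in enumerate(lcp):
            if shared >= length:
                run += 1
                start = sa[k]
                # A run of `run` pairs means the substring occurs run + 1 times.
                counts[text[start:start + length]] = run + 1
            else:
                run = 0

    # Keep only candidates that meet the (assumed) frequency threshold.
    return {word: n for word, n in counts.items() if n >= min_count}


if __name__ == "__main__":
    print(find_candidates("微博很火微博控刷微博"))
```

Because suffixes that share a prefix are adjacent in the sorted order, one pass over the common-prefix lengths is enough to count every repeated string of a given length. In the sample text, "微博" is the only string of two or more characters that repeats, so it is the only candidate returned.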

Get Citation

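Ren Xueli (任雪利), Dai Yubiao (代余彪). Word Segmentation Technology Based on Suffix Arrays. Computer Systems & Applications (计算机系统应用), 2010, 19(8): 229-230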

History
  • Received: December 4, 2009
  • Revised: January 18, 2010
Copyright: Institute of Software, Chinese Academy of Sciences Beijing ICP No. 05046678-3
Address: 4# South Fourth Street, Zhongguancun, Haidian, Beijing, Postal Code: 100190
Phone: 010-62661041 Fax: Email: csa (a) iscas.ac.cn
Technical Support: Beijing Qinyun Technology Development Co., Ltd.

Beijing Public Network Security No. 11040202500063