Microblog New Word Recognition Combining Skip-Gram Model and Word Vector Projection
    Abstract:

    With the popularity of microblogs and other social networks, new words emerge in a steady stream, and Chinese word segmentation systems often split these new words into individual characters. New word discovery has therefore become a hot topic in Chinese natural language processing. Existing new word recognition methods rely on statistics from large-scale corpora and recognize low-frequency new words poorly. This paper presents an extension of the skip-gram model and a word vector projection method. Combining the two methods effectively eases the data sparseness problem in natural language processing, identifies low-frequency new words, and improves the precision and recall of Chinese word segmentation systems.

Get Citation

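Yu Jie (于洁). Microblog New Word Recognition Combining Skip-Gram Model and Word Vector Projection. Computer Systems & Applications (计算机系统应用), 2016, 25(7): 130-136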

History
  • Received: November 17, 2015
  • Revised: December 21, 2015
  • Online: July 21, 2016