Chinese Text Information Extraction Based on NLTK

doi:10.15888/j.cnki.csa.006700

Volume 28, Issue 1, 2019, pp. 275-278

Abstract:

NLTK is a Python module for processing natural language text, but it has limitations when processing Chinese text. To extract information from Chinese text with NLTK, this study combines a group of methods: common-context word extraction, bigram extraction, probability statistics, and discourse analysis. The result is an NLTK-based content extraction framework suitable for Chinese text, together with a method for implementing it. In an empirical study, the framework identifies content in the corpus that reflects the characteristics of the text, and the extraction results are strongly correlated with the topic of the text.
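The counting behind three of the methods named in the abstract (bigram extraction, probability statistics, and common-context word extraction) can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the toy corpus, the function names, and the assumption that the text has already been segmented into space-separated words are all hypothetical. NLTK offers `nltk.bigrams` and `nltk.FreqDist` for the first two steps; the sketch uses only the standard library so that it runs without any installation.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus, assumed to be already segmented into words
# separated by spaces (word segmentation itself is outside this sketch).
SEGMENTED = "小王 喜欢 苹果 。 小李 喜欢 香蕉 。 小王 喜欢 香蕉 。"


def tokenize(segmented_text):
    """Split pre-segmented text into a list of word tokens."""
    return segmented_text.split()


def bigram_counts(tokens):
    """Count every pair of adjacent tokens."""
    return Counter(zip(tokens, tokens[1:]))


def relative_frequencies(tokens):
    """Estimate each word's probability as count / total tokens."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def common_contexts(tokens, word_a, word_b):
    """Return the (left, right) neighbor pairs shared by two words."""
    contexts = defaultdict(set)
    for left, word, right in zip(tokens, tokens[1:], tokens[2:]):
        contexts[word].add((left, right))
    return contexts[word_a] & contexts[word_b]


if __name__ == "__main__":
    tokens = tokenize(SEGMENTED)
    print(bigram_counts(tokens).most_common(3))
    print(relative_frequencies(tokens)["喜欢"])
    print(common_contexts(tokens, "苹果", "香蕉"))
```

In this toy corpus, 苹果 and 香蕉 both occur between 喜欢 and 。, so they share one context; words that share many contexts across a real corpus tend to play similar roles in the text.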

Citation:

Li Chen, Liu Weiguo. Chinese Text Information Extraction Based on NLTK. Computer Systems & Applications, 2019, 28(1): 275-278

History

Received: May 28, 2018
Revised: June 19, 2018
Online: December 27, 2018
