COMPUTER SYSTEMS APPLICATIONS, 2010, 19(8): 70-73

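Enterprise Homepage Information Extraction Based on Regular Expression

JIN Xiao-Chuan¹, LIU Wang-Jun¹, ZHAO Lei²

(1. School of Software, Liaoning Technical University, Huludao, Liaoning 125105; 2. Department of Basic Computer and Mathematics Teaching, Shenyang Normal University, Shenyang, Liaoning 110034)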

Received: November 13, 2009; Revised: December 20, 2009

Abstract: This paper analyzes the structural characteristics of the sentences that describe basic enterprise information on enterprise homepages. It proposes a regular-expression-based method for extracting information from enterprise homepages and presents a prototype system that extracts a set of enterprise information items. Experimental results show that the system extracts enterprise-related information from enterprise homepages effectively, with high recall and precision.

Keywords: enterprise homepage; regular expression; information extraction
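To make the idea in the abstract concrete, the following is a minimal sketch of regular-expression-based extraction from a homepage, together with the precision and recall measures the abstract refers to. It is not the authors' prototype: the field names, the label words, the patterns, and the helper names (`PATTERNS`, `strip_tags`, `extract`, `precision_recall`) are all assumptions chosen for illustration, since the abstract does not list the actual expressions or information items.

```python
import re

# Hypothetical patterns (assumptions for illustration only): each one matches
# a label such as "Tel" or "电话", a colon, and then captures the value.
PATTERNS = {
    "phone": re.compile(r"(?:电话|Tel)\s*[:：]\s*([0-9][0-9\-]{6,})"),
    "email": re.compile(
        r"(?:邮箱|E-?mail)\s*[:：]\s*([\w.+-]+@[\w-]+(?:\.[\w-]+)+)",
        re.IGNORECASE,
    ),
    "postcode": re.compile(r"(?:邮编|Postcode)\s*[:：]\s*(\d{6})"),
}


def strip_tags(html):
    """Crudely replace HTML tags with spaces so patterns see plain text."""
    return re.sub(r"<[^>]+>", " ", html)


def extract(html):
    """Return the first match of each hypothetical pattern, keyed by field."""
    text = strip_tags(html)
    result = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        if match:
            result[name] = match.group(1)
    return result


def precision_recall(extracted, gold):
    """Precision = correct / extracted items; recall = correct / gold items."""
    correct = sum(1 for key, value in extracted.items() if gold.get(key) == value)
    precision = correct / len(extracted) if extracted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


if __name__ == "__main__":
    # Made-up sample page and gold annotations, not data from the paper.
    page = "<p>Tel: 0429-1234567</p><p>E-mail: info@example.com</p><p>Postcode: 125105</p>"
    gold = {
        "phone": "0429-1234567",
        "email": "info@example.com",
        "postcode": "125105",
        "fax": "0429-7654321",
    }
    items = extract(page)
    print(items)
    print(precision_recall(items, gold))
```

In this sketch every extracted item is correct but the fax number is missed, so precision is perfect while recall is lower. A real system would likely need many label variants per information item and a more robust way of removing markup than a single tag-stripping expression.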

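Author Name	Affiliation
JIN Xiao-Chuan	School of Software, Liaoning Technical University, Huludao, Liaoning 125105
LIU Wang-Jun	School of Software, Liaoning Technical University, Huludao, Liaoning 125105
ZHAO Lei	Department of Basic Computer and Mathematics Teaching, Shenyang Normal University, Shenyang, Liaoning 110034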

Citation:
JIN Xiao-Chuan, LIU Wang-Jun, ZHAO Lei. Enterprise Homepage Information Extraction Based on Regular Expression. COMPUTER SYSTEMS APPLICATIONS, 2010, 19(8): 70-73