Multidimensional-Semantics-Based Web Medicine Information Extraction


Author: GU Yi-Ling
Software School, Fudan University, Shanghai 201203, China

Abstract:

This article proposes a multidimensional-semantics-based Web information extraction method for extracting medicine information from the Web. The method overcomes the heterogeneity of Web pages from different sources and finds their common characteristics by building a semantic dictionary that describes the knowledge of medicine information on the Web. It uses a structural-semantic-entropy-based approach to detect data-rich sections of Web pages, extracts the information of interest from them, and finally verifies and supplements the extracted information by generating extraction rules in XPath. The method obtains information from heterogeneous sources automatically and effectively. Experiments show that it achieves high precision and recall, and can therefore provide sufficient information for the government to strengthen supervision of the online medicine market.

Key words: Web information extraction; multidimensional semantic dictionary; Web medicine information; structural-semantic entropy; XPath

Get Citation

GU Yi-Ling. Multidimensional-Semantics-Based Web Medicine Information Extraction. Computer Systems & Applications, 2011, 20(11): 50-54,19


History

Received: March 10, 2011
Revised: April 18, 2011
