Multi-Source Heterogeneous Data Integration Method Based on HGAV
Authors: 郏奎奎, 刘海滨
Funding: National Natural Science Foundation of China (U150120175)
    Abstract:

    To address the heterogeneity of massive multi-source data in information systems and the difficulty of sharing it, this paper proposes a virtual integration framework for multi-source heterogeneous data. The GAV (Global-As-View) schema mapping method used in data integration systems answers queries inefficiently when information is unevenly distributed across data sources. Building on GAV, the paper proposes a schema mapping algorithm based on HGAV (Hierarchical-Global-As-View), which introduces intermediate data source schemas to form a layered global view. This layering shrinks the mapping space and simplifies the mapping set, which makes queries easier to rewrite and optimize. The algorithm is validated on five major categories of data from the Ningdong Intelligent Environment Protection project. The experimental results show that, compared with the GAV schema mapping algorithm, it improves data integration efficiency and shortens query time.
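
The abstract describes HGAV only at a high level, so the following is a minimal sketch of the general idea of layered view unfolding, not the paper's algorithm. All relation names, source names, and the grouping of sources under intermediate schemas (`mid_north`, `mid_south`) are hypothetical. It contrasts a flat GAV mapping, where a global relation is defined directly over every source, with a two-layer mapping that passes through intermediate schemas.

```python
# Illustrative sketch only: every schema and source name here is an assumption.

# Flat GAV: each global relation is defined directly over source relations.
flat_gav = {
    "AirQuality": ["src_a.air", "src_b.air", "src_c.air", "src_d.air"],
}

# Layered mapping: the global relation is defined over intermediate schemas,
# and each intermediate schema is defined over a group of sources.
global_to_mid = {
    "AirQuality": ["mid_north.air", "mid_south.air"],
}
mid_to_source = {
    "mid_north.air": ["src_a.air", "src_b.air"],
    "mid_south.air": ["src_c.air", "src_d.air"],
}


def unfold(relation, *layers):
    """Expand a relation name through each mapping layer in order."""
    current = [relation]
    for layer in layers:
        current = [target for name in current for target in layer[name]]
    return current


# Both mappings reach the same source relations for the global relation.
flat_sources = unfold("AirQuality", flat_gav)
layered_sources = unfold("AirQuality", global_to_mid, mid_to_source)

# A query that only concerns one intermediate schema expands only that branch.
north_sources = unfold("mid_north.air", mid_to_source)
```

In this toy setup the layered mapping yields the same sources as the flat one for a global query, while a query scoped to a single intermediate schema touches only the sources grouped under it. How the paper actually builds and prunes its layers is not stated in the abstract.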

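Cite this article

郏奎奎, 刘海滨. Multi-Source Heterogeneous Data Integration Method Based on HGAV. 计算机系统应用 (Computer Systems & Applications), 2018, 27(3): 27-35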

History
  • Received: 2017-06-05
  • Last revised: 2017-06-17
  • Published online: 2018-01-25