Computer Systems & Applications, 2017, 26(9): 69-74
Big RDF Graph Query System Based on Spark and Redis
(School of Computer & Communication Engineering, China University of Petroleum (East China), Qingdao 266580, China)
Received: December 13, 2016
Abstract: With the continued development of Semantic Web technology, the volume of RDF data is growing rapidly, and single-node RDF query systems can no longer meet practical needs. Building distributed RDF query systems has therefore become a research focus in both academia and industry. Existing RDF query systems are mainly based on either Hadoop or general-purpose distributed technology: the former incurs high disk I/O, while the latter scales poorly. Moreover, both kinds of system are inefficient on basic graph pattern queries. To address these problems, we design a distributed system architecture based on Spark and Redis, improve the query plan generation algorithm, and implement a prototype system, RDF-SR. The system uses Spark to reduce disk I/O, uses Redis to speed up data mapping, and uses the improved algorithm to reduce the number of data shuffles. Experiments show that, compared with other existing systems, RDF-SR maintains high scalability while achieving better performance on basic graph pattern queries.
Keywords: Semantic Web  big RDF graph  Spark  Redis
Citation:
YANG Jie, WANG Mu-Han, XU Jiu-Yun. Big RDF Graph Query System Based on Spark and Redis. Computer Systems & Applications, 2017, 26(9): 69-74