Computer Systems Applications, 2016, 25(6): 231-237
Research on Large Scale Data Loading Based on HBase
(School of Computer Science and Engineering, Guilin University of Electronic Technology, Guilin 541004, China)
Received: October 19, 2015    Revised: November 25, 2015
Abstract: The distributed database HBase has considerable advantages over traditional relational databases for large-scale data loading, but it still leaves substantial room for optimization. We build an HBase environment on the Hadoop distributed platform and optimize a custom data loading algorithm. First, we analyze HBase's underlying data storage and show experimentally that HBase's built-in data loading methods fall short in efficiency and flexibility. We then propose a custom parallel data loading algorithm and tune it for the cluster. The experimental results show that the optimized custom parallel data loading method makes full use of cluster performance and achieves good loading efficiency and data manipulation capability.
Keywords: HBase; Hadoop; MapReduce; data loading; performance optimization
Citation:
HE Zheng-Hong, ZHOU Ya, WEN Di-Yao, WU Qing-Xia. Research on Large Scale Data Loading Based on HBase. Computer Systems Applications, 2016, 25(6): 231-237