Research on Large Scale Data Loading Based on HBase

doi:10.15888/j.cnki.csa.005194

2016, Vol. 25, No. 6, pp. 231-237

Authors:

HE Zheng-Hong (贺正红), ZHOU Ya (周娅), WEN Di-Yao (文缔尧), WU Qing-Xia (吴清霞)

Affiliation: School of Computer Science and Engineering, Guilin University of Electronic Technology, Guilin 541004, China

Abstract:

The distributed database HBase has clear advantages over traditional relational databases for large-scale data loading, but it still leaves considerable room for optimization. We build an HBase environment on the Hadoop distributed platform and optimize a custom data loading algorithm. First, we analyze HBase's underlying data storage and show experimentally that HBase's built-in data loading methods fall short in efficiency and flexibility. We then propose a custom parallel data loading algorithm and tune it for the cluster. The experimental results show that the optimized custom parallel data loading method makes full use of the cluster's performance and achieves good loading efficiency and data manipulation capability.

Keywords: HBase; Hadoop; MapReduce; data loading; performance optimization

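Cite this article:

HE Zheng-Hong, ZHOU Ya, WEN Di-Yao, WU Qing-Xia. Research on Large Scale Data Loading Based on HBase. 计算机系统应用 (Computer Systems & Applications), 2016, 25(6): 231-237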

History:

Received: 2015-10-19
Last revised: 2015-11-25
Published online: 2016-06-14
