Computer Systems Applications (English edition): 2015, 24(11): 157-161
Methods of Dealing With Massive Small Files in Hadoop
(School of Computer and Communication, Hunan University of Technology, Zhuzhou 412007, China)
Received: March 08, 2015    Revised: May 18, 2015
Keywords: Hadoop; HDFS; small files; HDFS I/O performance
Abstract: HDFS provides the underlying storage for Hadoop, but it handles massive numbers of small files inefficiently, which seriously degrades system performance. To address this problem, we design a scheme for merging, indexing, and extracting small files, and compare it with the original HDFS and with the HAR file-archiving scheme. A series of experiments shows that the proposed scheme effectively reduces Namenode memory usage and improves the I/O performance of HDFS.
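The merge, index, and extract idea named in the abstract can be illustrated with a minimal in-memory sketch. This is not the authors' implementation and uses no Hadoop API: the function names `merge_small_files` and `extract_file`, and the `(offset, length)` index layout, are assumptions chosen for illustration only. The point it shows is that many small files can be stored as one large file plus a lookup table, so a store that keeps per-file metadata (as the Namenode does) would track one object instead of many.

```python
from typing import Dict, Tuple

# Hypothetical index layout: file name -> (offset, length) inside the merged blob.
Index = Dict[str, Tuple[int, int]]


def merge_small_files(files: Dict[str, bytes]) -> Tuple[bytes, Index]:
    """Concatenate small files into one blob and record where each one starts."""
    merged = bytearray()
    index: Index = {}
    for name, data in files.items():
        # The current blob length is the offset at which this file begins.
        index[name] = (len(merged), len(data))
        merged.extend(data)
    return bytes(merged), index


def extract_file(merged: bytes, index: Index, name: str) -> bytes:
    """Look up a file in the index and slice its bytes out of the merged blob."""
    offset, length = index[name]
    return merged[offset:offset + length]
```

In a real deployment the merged blob would be written to HDFS as a single large file and the index persisted alongside it; how the paper's scheme does either of these is not stated in the abstract.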
Funding: 2013 Science and Technology Support Program of the Ministry of Science and Technology (2013BAJ10B14-5)
Citation:
LI Xu, LI Chang-Yun, ZHANG Qing-Qing, HU Shu-Xin, ZHOU Ling-Fang. Methods of Dealing With Massive Small Files in Hadoop. Computer Systems Applications, 2015, 24(11): 157-161