MapReduce Performance Evaluation Model for Hadoop2.x
Author: Wu Yue
    Abstract:

    MapReduce-based systems are increasingly used for large-scale data analysis, and Apache Hadoop is one of the most common open-source implementations of this paradigm. Minimizing execution time is vital for MapReduce, as for all data-processing applications, and accurate estimation of execution time is essential for optimization. In this study, the author develops a MapReduce performance model for Hadoop2.x that can precisely estimate the execution time of a MapReduce workload. The model combines a precedence tree model, which captures dependencies between the tasks of a single MapReduce job, with a queueing network model, which captures the intra-job synchronization constraints. Such an analytical performance model is attractive because it can provide reasonably accurate job response times at significantly lower cost than simulation experiments on real data-analysis systems. Furthermore, a clear understanding of job response time under different circumstances is key to decision-making in MapReduce workload management and resource capacity planning.
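
The precedence tree idea can be illustrated with a minimal sketch. It is not the author's model. It assumes that leaves are tasks with a known mean response time, that serial nodes add their children's times, and that parallel nodes wait for their slowest child. The names `Task`, `Serial`, `Parallel`, and `estimate` are hypothetical, and so are the task times, which are hard-coded here. In a model like the one the abstract describes, per-task times would come from the queueing network, and a plain maximum is only a rough stand-in for the synchronization delay of parallel tasks.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Task:
    """Leaf node: one task with an assumed mean response time (seconds)."""
    name: str
    time: float


@dataclass
class Serial:
    """Children run one after another."""
    children: List["Node"]


@dataclass
class Parallel:
    """Children run concurrently; the node finishes with its slowest child."""
    children: List["Node"]


Node = Union[Task, Serial, Parallel]


def estimate(node: Node) -> float:
    """Estimate response time by folding the tree bottom-up."""
    if isinstance(node, Task):
        return node.time
    times = [estimate(child) for child in node.children]
    if isinstance(node, Serial):
        return sum(times)   # serial composition: times add up
    return max(times)       # parallel composition (simplifying assumption)


# Hypothetical job: two map tasks, a shuffle step, then two reduce tasks.
job = Serial([
    Parallel([Task("map-1", 12.0), Task("map-2", 15.0)]),
    Task("shuffle", 4.0),
    Parallel([Task("reduce-1", 8.0), Task("reduce-2", 6.0)]),
])

print(estimate(job))  # 15.0 + 4.0 + 8.0 = 27.0
```

With these made-up numbers the job estimate is governed by the slowest map task, the shuffle step, and the slowest reduce task.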

Get Citation

Wu Yue. MapReduce Performance Evaluation Model for Hadoop2.x. Computer Systems & Applications, 2021, 30(2): 219-225

History
  • Received: June 29, 2020
  • Revised: July 27, 2020
  • Online: January 29, 2021
Copyright: Institute of Software, Chinese Academy of Sciences