Computer Systems Applications: 2015, 24(3): 44-49
MemLoop: A Programming Framework Using In-Memory Cache for Iterative Application
(1. National Engineering Research Center of Fundamental Software, Institute of Software, Chinese Academy of Sciences, Beijing 100190, China; 2. University of Chinese Academy of Sciences, Beijing 100190, China)
Received: July 4, 2014    Revised: August 11, 2014
Abstract: Iterative computation is an important class of big data analysis applications. When an iterative computation is implemented on the distributed computing framework MapReduce, it is split into multiple jobs that run sequentially in the order defined by their dependencies. The program therefore interacts with the distributed file system (DFS) many times, which lengthens its execution time. Caching the data involved in these interactions reduces the time spent on DFS access and thus improves overall program performance. Because a large amount of memory in a cluster is idle most of the time, this paper proposes MemLoop, a programming framework for iterative applications that uses an in-memory cache. MemLoop implements cache management in three modules: the job submission API, the task scheduling algorithm, and the cache manager. Together they use free memory on cluster nodes to cache two categories of data: inter-iteration resident data and intra-iteration dependent data. We compare MemLoop with existing related frameworks; the experimental results show that it improves the performance of iterative programs.
Funding: National Natural Science Foundation of China (61100067)
Citation:
LIAN Wen-Bo, WANG Mei-Ling, TAO Qiu-Ming, ZHAO Chen. MemLoop: A Programming Framework Using In-Memory Cache for Iterative Application. Computer Systems Applications, 2015, 24(3): 44-49