支持分页显存的高性能哈希表索引系统
CSTR:
作者:
作者单位:

作者简介:

通讯作者:

中图分类号:

基金项目:


High-performance Hash Table Indexing System Supporting Paging Memory
Author:
Affiliation:

Fund Project:

  • 摘要
  • |
  • 图/表
  • |
  • 访问统计
  • |
  • 参考文献
  • |
  • 相似文献
  • |
  • 引证文献
  • |
  • 资源附件
  • |
  • 文章评论
    摘要:

    哈希表以访问效率时间复杂度O(1)著称, 作为一类可提供大规模数据高效访问的算法和数据结构为各类大数据应用所采用, 例如, 适用于各类新兴高性能(HPC)领域、数据库领域的工作负载和场景. 随着高性能协处理器GPU硬件性能的日益提升, 面向高性能GPU环境的哈希表并行优化已逐渐吸引了大量研究工作. 当前的各类GPU哈希表优化方法和解决方案集中于利用GPU的大规模线程环境和高内存带宽来提升哈希表的事务高并发性处理和键值对数据快速访问. 然而, 由于现有GPU哈希表结构的研究工作普遍忽略了GPU资源有效管理, 并没有以如何充分利用GPU线程资源和显存资源. 同时, 由于GPU显存空间的大小限制, 用于存储哈希表结构数据的空间有限, 无法应对更大规模的哈希表结构. 因此, 面向GPU环境下的哈希表方法的可扩展性和性能仍存在着技术挑战. 本文提出并设计了一种面向GPU环境的可处理大规模并发事务的哈希表技术, 命名为Starfish. Starfish提出了新的基于异步GPU流的“交换层” (swap layer)技术, 用以支持GPU显存外的动态哈希表, 同时也保障了GPU哈希表的索引方法性能. 为了解决GPU大规模线程的访问带来的哈希冲突开销, Starfish设计了一类紧凑型数据结构, 并研究了一种可分页显存的分配方法, 不仅为GPU哈希表技术提供了静态哈希方法的高性能, 而且也支持动态哈希的高可扩展性. 性能评估实验表明, Starfish显著优于其他GPU哈希表技术, 包括cudpp-Hash, SlabHash.

    Abstract:

    Hash tables are well known for their access efficiency and time complexity O(1). As an algorithm and data structure available for efficient access to large-scale data, hash tables have been widely adopted by big data applications. For example, they are applicable to various kinds of workloads and scenarios in the emerging high-performance computing (HPC) domain and the database domain. As the hardware performance of the high-performance coprocessor graphics processing unit (GPU) improves continuously, parallel hash table optimization for high-performance GPUs has attracted a lot of researchers. According to our previous research survey, most of the current methods and solutions for GPU-based hash table optimization focus on the large-scale thread scheduling and high memory bandwidth of GPUs to improve the high-concurrency processing of hash table transactions and fast key-value access to data. However, since existing research on GPU hash table structures generally ignores the effective management of GPU resources, no methods are available for making full use of GPU thread resources and memory resources. Moreover, due to the limitation of the GPU memory size, the space for storing data on hash table structures is limited and therefore unable to handle hash table structures of larger scales. Thus, technical challenges remain for the scalability and performance optimization of the hash table designs for GPUs. This study proposes a hash table technology that can process massive concurrent transactions for GPUs and names it Starfish. Starfish includes a novel “swap layer” technology based on asynchronous GPU streams to support the dynamic hash table outside the GPU memory and also guarantee the high indexing performance of GPU hash tables. To solve the massive hash transaction conflict overhead resulting from access of large-scale GPU threads, we design a class of compact data structures and a pageable memory distribution method, which not only provides the high performance of a static hash method for the GPU-based hash table technology but also supports the high scalability of dynamic hash. Our experimental performance evaluation shows that Starfish is significantly superior to other GPU-based hash table technologies including cudpp-Hash and SlabHash.

    参考文献
    相似文献
    引证文献
引用本文

熊轶翔,蒋筱斌,张珩,武延军.支持分页显存的高性能哈希表索引系统.计算机系统应用,2022,31(9):82-90

复制
分享
文章指标
  • 点击次数:
  • 下载次数:
  • HTML阅读次数:
  • 引用次数:
历史
  • 收稿日期:2021-12-04
  • 最后修改日期:2022-01-04
  • 录用日期:
  • 在线发布日期: 2022-05-30
  • 出版日期:
文章二维码
您是第位访问者
版权所有:中国科学院软件研究所 京ICP备05046678号-3
地址:北京海淀区中关村南四街4号 中科院软件园区 7号楼305房间,邮政编码:100190
电话:010-62661041 传真: Email:csa (a) iscas.ac.cn
技术支持:北京勤云科技发展有限公司

京公网安备 11040202500063号