Received: January 23, 2024; Revised: February 26, 2024
Abstract: Reed-Solomon (RS) codes are the most widely adopted erasure codes in distributed storage systems. For an RS(k, m) stripe, a common configuration stores exactly one fragment of the stripe on each node. When a node fails, recovering its fragment therefore requires reading fragments from multiple other nodes and decoding them to rebuild the lost data, which can easily congest the network. When a large amount of data must be recovered, the system stays in a fragile state for a long time, during which fault tolerance and throughput often drop and read/write latency rises. LRCRaft is an improved Raft consensus protocol based on local reconstruction codes (LRC). By introducing LRC, dynamic log replenishment, state machine purge, and fragment version consistency into Raft, LRCRaft reduces read/write latency and shortens node failure recovery time. Experimental results show that, compared with Raft, LRCRaft reduces the time needed to recover the data of a single failed node by 49.25%–74.97% across different recovery modes.
Keywords: distributed storage; Raft consensus protocol; erasure coding; local reconstruction code (LRC); node data recovery
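The recovery-cost argument in the abstract can be illustrated with a minimal sketch. It is not the LRCRaft implementation: it assumes a generic LRC layout in which k data fragments are split into l equal local groups, each protected by one XOR local parity, and all function names are hypothetical. Under that assumption, repairing one lost data fragment reads k/l fragments from its local group, whereas RS(k, m) repair reads k fragments.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length fragments byte by byte."""
    return bytes(x ^ y for x, y in zip(a, b))


def local_parity(group: list[bytes]) -> bytes:
    """Assumed local parity: XOR of all data fragments in one local group."""
    return reduce(xor_bytes, group)


def repair_from_local_group(survivors: list[bytes], parity: bytes) -> bytes:
    """Rebuild the single missing fragment from the group's survivors and parity."""
    return reduce(xor_bytes, survivors, parity)


def rs_repair_reads(k: int) -> int:
    """Fragments read to repair one fragment of an RS(k, m) stripe."""
    return k


def lrc_repair_reads(k: int, l: int) -> int:
    """Fragments read to repair one data fragment locally: (k/l - 1) data + 1 parity."""
    if k % l != 0:
        raise ValueError("k must be divisible by the number of local groups")
    return k // l


# One local group of three data fragments; the middle one is lost.
group = [b"ab", b"cd", b"ef"]
parity = local_parity(group)
rebuilt = repair_from_local_group([group[0], group[2]], parity)
```

With 6 data fragments in 2 local groups, the local repair above touches 3 fragments instead of the 6 that an RS stripe with the same k would need, which is the kind of reduction in cross-node reads the abstract refers to.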
Funding: Natural Science Foundation of Hebei Province (F2022105033)
Citation:
YUAN Jia-Zheng, HU Xiao-Peng. LRCRaft: Consensus Protocol with Rapid Node Data Recovery Support. COMPUTER SYSTEMS APPLICATIONS, 2024, 33(7): 188-200