Abstract:Distributed storage systems achieve high-reliability and low-overhead data storage by erasure code. To provide different reliability and access performance, storage systems need to perform redundancy transitions on erasure code data by changing coding parameters. The stripe merging mechanism provides a way for redundancy transitioning in storage systems. However, the stripe merging process based on traditional erasure code can result in a large amount of data block redistribution and checksum block re-computation I/O overhead. Worst still, the I/Os will be amplified in multiple merging operations. In response to these issues, this study proposes new Tree Reed-Solomon (TRS) codes that eliminate data block redistribution I/Os by decentralizing data blocks, and save checksum block re-computation I/Os by designing coding matrices. TRS codes further design storage units to organize the stripes taking part in merging into a tree, enabling multiple merging operations to be efficiently completed from bottom to top based on tree structure. To test the performance of TRS codes, this study designs and implements a distributed storage prototype. Experiments have shown that compared to other erasure codes, TRS codes can greatly reduce stripe merging operation time.