Abstract: Through its multiple-backup (replication) strategy, HDFS can easily restore data that has been damaged or lost. However, the amount of data stored in the system grows continuously. Once the data scale becomes very large, this strategy requires several times the original storage space to hold the backup copies. This article proposes replacing the multiple-backup strategy with erasure codes, which can greatly improve storage efficiency and reduce the extra storage overhead.
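The storage argument can be illustrated with a little arithmetic. The sketch below is only an illustration, not the scheme proposed in this article: it assumes 3 copies per block for the multiple-backup strategy (the HDFS default replication factor) and a hypothetical erasure code with k = 6 data blocks and m = 3 parity blocks. The function names and parameters are chosen here for the example.

```python
def replication_overhead(copies: int) -> float:
    """Raw bytes stored per byte of user data when every block is kept `copies` times."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return float(copies)


def erasure_overhead(k: int, m: int) -> float:
    """Raw bytes stored per byte of user data when k data blocks are
    encoded together with m parity blocks (assumed parameters)."""
    if k < 1 or m < 0:
        raise ValueError("k must be >= 1 and m must be >= 0")
    return (k + m) / k


if __name__ == "__main__":
    # Assumed example values: 3 copies versus a (6 data + 3 parity) code.
    print("multiple backup:", replication_overhead(3))
    print("erasure code:   ", erasure_overhead(6, 3))
```

Under these assumed parameters the multiple-backup strategy stores three bytes for every byte of user data, while the erasure code stores one and a half, and an ideal (maximum distance separable) code of this shape can still recover from the loss of any m = 3 blocks in a group.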