Abstract: In cloud storage centers, file replicas may be lost when nodes fail, which degrades both system reliability and the efficiency of concurrent file access. The default replica copy algorithm in Hadoop has several deficiencies: data transfer is concentrated on a few DataNodes, load is imbalanced, and disk I/O throughput is low. To address these issues, this paper proposes a popularity-based rapid replica copy algorithm for Hadoop. It handles the most popular blocks first and selects source and destination DataNodes appropriately. Simulation results show that the proposed algorithm improves disk I/O throughput and load balance, and significantly reduces average service response time.