Abstract: This paper studies the HDFS distributed file system in depth. HDFS reads and writes large files very efficiently through streaming, but its efficiency in reading and writing massive numbers of small files is relatively low. To address this problem, this paper presents a small-file consolidation strategy based on a relational database. First, a user file is created for each user. When a user uploads a small file, the file's metadata is uploaded to the relational database and the file's contents are written to that user's file. The user then reads small files in streaming mode according to the metadata. When a user reads a file smaller than the file block size, the DataNodes apply a load-balancing strategy and the DataNode storing the data transfers it directly, which reduces the pressure on the main server and improves file transfer efficiency. The experimental results show that this scheme overcomes the shortcoming of HDFS in reading and writing small files and improves HDFS read and write performance on massive small files. The scheme can be applied to cloud storage systems holding massive small files, and it reduces NameNode memory consumption while improving file read and write efficiency.
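The consolidation idea summarized in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: a local file stands in for the per-user file on HDFS, SQLite stands in for the relational database, and the class name `SmallFileStore`, the table `file_meta`, and its columns are hypothetical names chosen for illustration. DataNode load balancing is not modeled.

```python
import os
import sqlite3


class SmallFileStore:
    """Toy model: one merged file per user plus a metadata table (assumed schema)."""

    def __init__(self, root, db_path=":memory:"):
        self.root = root  # directory holding the per-user merged files
        self.db = sqlite3.connect(db_path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS file_meta ("
            "user TEXT, name TEXT, offset INTEGER, length INTEGER, "
            "PRIMARY KEY (user, name))"
        )

    def _user_file(self, user):
        return os.path.join(self.root, f"{user}.merged")

    def upload(self, user, name, data):
        """Append the small file to the user's file and record where it landed."""
        with open(self._user_file(user), "ab") as f:
            f.seek(0, os.SEEK_END)
            offset = f.tell()  # current end of the merged file
            f.write(data)
        # Re-uploading a name points the metadata at the newest copy;
        # the old bytes stay in the merged file in this sketch.
        self.db.execute(
            "INSERT OR REPLACE INTO file_meta VALUES (?, ?, ?, ?)",
            (user, name, offset, len(data)),
        )
        self.db.commit()

    def read(self, user, name):
        """Look up offset and length, then read just that byte range."""
        row = self.db.execute(
            "SELECT offset, length FROM file_meta WHERE user = ? AND name = ?",
            (user, name),
        ).fetchone()
        if row is None:
            raise FileNotFoundError(name)
        offset, length = row
        with open(self._user_file(user), "rb") as f:
            f.seek(offset)
            return f.read(length)
```

In this sketch the file system tracks one large file per user rather than one entry per small file, which is the property the abstract relies on to reduce NameNode memory consumption. Each read costs one metadata lookup followed by one ranged read.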