Abstract:With the development of scientific research, there is a massive increase in scientific research data. PB-level scientific research data requires efficient and stable storage systems. The traditional data storage scheme has problems such as poor resource utilization, low cluster expansion performance, and unfriendly user interface operation, which seriously limit the effective use of data. Relying on the Big Data Project of the Chinese Academy of Sciences, we design and implement an efficient storage system i-Harbor. Its core architecture is based on object storage system, using open-sourced Ceph distributed system and MongoDB database as the storage carrier of object data and metadata. The data interface is designed on the basis of HTTP API and FTP. To improve the platform disaster tolerance and security, we use Multiple Copies and Erasure Coding technology to eliminate single node of failure. Meanwhile we locate the real-time platform parameters and faults by Zabbix cluster monitoring system. Based on the distribution characteristics, the cluster can add storage nodes at will, which improves the platform’s scalability.