Received: November 20, 2019; Revised: December 16, 2019
Abstract: As scientific research advances, the volume of research data has grown massively, and PB-scale research data requires an efficient and stable storage system. Traditional data storage schemes suffer from poor resource utilization, limited cluster scalability, and unfriendly user interfaces, which seriously limit the effective use of data in research scenarios. Supported by the Earth science big data project of the Chinese Academy of Sciences, this paper designs and implements i-Harbor, an efficient storage system. The system is built around an object storage architecture and uses the open-source Ceph distributed storage system and the MongoDB database to store object data and metadata, respectively. It exposes general-purpose data interfaces over HTTP and FTP. Replication (multiple copies) and erasure coding are used to eliminate single points of failure, and the Zabbix cluster monitoring system tracks platform parameters and locates faults in real time, improving the platform's disaster tolerance and security. In addition, because the underlying architecture is distributed, storage nodes can be added to the cluster as needed, which improves the platform's scalability.
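The trade-off between the two redundancy techniques named in the abstract can be illustrated with a little arithmetic. The sketch below is not taken from the paper: the 3-copy and 4+2 settings are illustrative assumptions, and the helper names (`replication`, `erasure_coding`, `raw_capacity_needed`) are hypothetical. It only applies the standard relations that n-way replication stores n raw bytes per usable byte and survives n - 1 lost copies, while a k+m erasure code stores (k + m) / k raw bytes per usable byte and survives m lost chunks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Redundancy:
    """Cost and fault tolerance of one redundancy scheme (illustrative)."""
    name: str
    raw_per_usable: float    # raw bytes stored per byte of user data
    failures_tolerated: int  # copies/chunks that may be lost without data loss


def replication(copies: int) -> Redundancy:
    """n-way replication: every object is stored `copies` times."""
    if copies < 1:
        raise ValueError("copies must be >= 1")
    return Redundancy(f"{copies}x replication", float(copies), copies - 1)


def erasure_coding(k: int, m: int) -> Redundancy:
    """k+m erasure coding: k data chunks plus m coding chunks per object."""
    if k < 1 or m < 0:
        raise ValueError("k must be >= 1 and m must be >= 0")
    return Redundancy(f"EC {k}+{m}", (k + m) / k, m)


def raw_capacity_needed(usable_pb: float, scheme: Redundancy) -> float:
    """Raw capacity (PB) required to hold `usable_pb` of user data."""
    return usable_pb * scheme.raw_per_usable


if __name__ == "__main__":
    # Assumed example settings, not the configuration reported in the paper.
    for scheme in (replication(3), erasure_coding(4, 2)):
        raw = raw_capacity_needed(1.0, scheme)
        print(f"{scheme.name}: {raw:.2f} PB raw per 1 PB usable, "
              f"tolerates {scheme.failures_tolerated} failures")
```

Under these assumed settings both schemes survive two simultaneous losses, but erasure coding needs half the raw capacity of triple replication, at the cost of extra computation when encoding and recovering data.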
Keywords: object storage; distributed storage; Ceph
Funding: Strategic Priority Research Program (Class A) of the Chinese Academy of Sciences (XDA19000000)
Citation:
WANG Jin-Tao, ZHANG Hai-Ming. Distributed Object Storage System for Scientific Research. Computer Systems Applications, 2020, 29(7): 82-88