Abstract
With the advent of the big data era, data storage faces severe challenges. The traditional Hadoop Distributed File System (HDFS) suffers from high storage redundancy and insufficient load balancing. To address these problems, we propose a Cauchy-code-based dynamic decentralized storage (CDDS) strategy. For each data block in the system, the strategy generates a storage scheme according to the block's heat level while preserving data availability. Cold data and hot data are both stored with Cauchy-based erasure coding, as a single copy and as multiple copies respectively, which preserves data reliability while maintaining the system's I/O capability. Test results show that CDDS reduces the required storage space to 75% of the original and improves the system's reliability and load balancing capability.
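To make the erasure-coding idea concrete, the following is a minimal sketch of a systematic Cauchy erasure code. It is not the CDDS implementation: the prime field GF(257), the parameters k = 4 and m = 2, and all function names are assumptions chosen for illustration. Production systems typically work over GF(2^w) so that parity symbols fit in bytes.

```python
# Illustrative sketch only; field, parameters and names are assumptions,
# not taken from the paper.
P = 257  # prime modulus, so every byte value 0..255 is a field element


def inv(a):
    """Multiplicative inverse mod P (Fermat's little theorem)."""
    return pow(a, P - 2, P)


def cauchy_matrix(m, k):
    """m x k Cauchy matrix with entries 1 / (x_i - y_j), x_i = i, y_j = m + j.

    x_i - y_j is never 0 mod P as long as m + k <= P.
    """
    return [[inv((i - (m + j)) % P) for j in range(k)] for i in range(m)]


def encode(data, m):
    """Return the k data symbols followed by m parity symbols."""
    C = cauchy_matrix(m, len(data))
    parity = [sum(c * d for c, d in zip(row, data)) % P for row in C]
    return list(data) + parity


def solve(A, b):
    """Solve A x = b mod P by Gauss-Jordan elimination (A must be invertible)."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        scale = inv(M[col][col])
        M[col] = [v * scale % P for v in M[col]]
        for r in range(n):
            if r != col and M[r][col]:
                f = M[r][col]
                M[r] = [(vr - f * vc) % P for vr, vc in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]


def decode(survivors, k, m):
    """Recover the k data symbols from any k surviving symbols.

    survivors maps symbol index (0..k-1 data, k..k+m-1 parity) to its value.
    """
    C = cauchy_matrix(m, k)
    idx = sorted(survivors)[:k]
    # Data rows come from the identity matrix, parity rows from the Cauchy matrix.
    A = [[int(j == i) for j in range(k)] if i < k else C[i - k] for i in idx]
    return solve(A, [survivors[i] for i in idx])


data = [10, 20, 30, 40]
coded = encode(data, m=2)
# Lose symbols 1 and 3, then rebuild the data from the four survivors.
survivors = {i: coded[i] for i in (0, 2, 4, 5)}
recovered = decode(survivors, k=4, m=2)
```

Because every square submatrix of a Cauchy matrix is invertible, any k of the k + m symbols suffice to rebuild the data. The storage cost is (k + m) / k times the data size, compared with 3x for three-way replication.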
Authors
谢果君
沈记全
杨焕焕
XIE Guo-jun; SHEN Ji-quan; YANG Huan-huan (School of Computer Science and Technology, Henan Polytechnic University, Jiaozuo 454000, China)
Source
《计算机工程与科学》
CSCD
Peking University Core Journals (北大核心)
2019, No. 3, pp. 440-445 (6 pages)
Computer Engineering & Science
Funding
Basic and Frontier Research Project of Henan Province (152300410212)
Keywords
data storage
Cauchy code
dynamic replica
load balancing