Abstract
To store massive data efficiently and reliably, distributed storage systems often use erasure codes to reduce storage overhead. The Hitchhiker code is a double-stripe MDS (Maximum Distance Separable) code built on the Piggybacking framework; it is easy to implement, supports arbitrary parameters (k, r), and has a low repair cost. However, the Hitchhiker code currently optimizes the repair bandwidth only for data units, not for parity units. To address this problem, this paper proposes two codes, Hitchhiker-LRC and Hitchhiker-LRC+, which apply the idea of LRC (Locally Repairable Code) to optimize the repair of data units and parity units at the same time. The method computes a local parity over l parity units in the first substripe and stores it on one of the parity units of the first substripe. This requires that the data of that parity unit has already been piggybacked, in the form of local parities, onto the last r-1 parity units of the second substripe, and that a horizontal subtraction has been applied to that parity unit. Theoretical analysis and experiments show that, for 2≤r<k/2, Hitchhiker-LRC and Hitchhiker-LRC+ reduce the repair bandwidth by 1% to 5% and save about 10% of the repair time; for k/2≤r<k, Hitchhiker-LRC+ achieves a lower repair bandwidth than Hitchhiker-LRC when r is large; and there exists a value of l for which the repair bandwidth rate is optimal.
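The local-parity idea that the abstract borrows from LRC can be illustrated with a minimal sketch. This is not the Hitchhiker-LRC construction itself (the piggybacking onto the second substripe and the horizontal subtraction are omitted); it only assumes a plain XOR local parity over a group of l blocks. The function names `xor_blocks`, `local_parity`, and `repair` and the sample bytes are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of a list of equal-length byte blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def local_parity(group):
    """Local parity of a group of l blocks (assumed here to be a plain XOR)."""
    return xor_blocks(group)


def repair(survivors, parity):
    """Rebuild the single missing block from the l-1 survivors and the local parity."""
    return xor_blocks(survivors + [parity])


# Hypothetical group of l = 3 parity blocks from one substripe.
group = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
lp = local_parity(group)

# Lose the middle block and rebuild it by reading only l blocks in total.
rebuilt = repair([group[0], group[2]], lp)
assert rebuilt == group[1]
```

In this simplified setting a lost block is rebuilt from l blocks (l-1 survivors plus the local parity) rather than from k blocks as in a plain MDS repair, which is the kind of saving a local parity is meant to provide.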
Authors
HU Jin-ping (胡金平)
LI Gui-yang (李贵洋)
JIANG Xiao-yu (江小玉)
ZHOU Yue (周悦)
HAN Hong-yu (韩鸿宇)
Department of Computer Science, Sichuan Normal University, Chengdu 610101, China
Source
Journal of Chinese Computer Systems (《小型微型计算机系统》)
CSCD
PKU Core (北大核心)
2020, No. 7, pp. 1559-1568 (10 pages)
Funding
Supported by the National Natural Science Foundation of China (61701331).