Abstract
To address the shortcomings of Hadoop's approach of using a fixed number of replicas to improve data availability, a mathematical model of data replication is proposed. The model calculates an optimized (minimum) number of replicas from the failure rate of the data nodes, the data access latency, the network bandwidth of the data nodes, and the expected data availability. The proposed replication optimization method is implemented on Hadoop, and performance tests are conducted. Experimental results show that the model not only improves data availability but also raises the utilization of storage space in the cloud storage system.
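The abstract does not give the model's equations, so the following is only a minimal illustrative sketch of the general idea, not the authors' model. It assumes that replicas fail independently with the same probability `p`, so that the availability with `r` replicas is `1 - p**r`, and it ignores the access-latency and network-bandwidth terms that the paper's model also takes into account. The function name `min_replicas` and its parameters are hypothetical.

```python
def min_replicas(node_failure_prob: float, target_availability: float,
                 max_replicas: int = 64) -> int:
    """Smallest replica count r such that 1 - p**r >= target availability.

    Assumption (not from the paper): node failures are independent and
    identically distributed, and latency and bandwidth are ignored.
    """
    if not 0.0 < node_failure_prob < 1.0:
        raise ValueError("node_failure_prob must be strictly between 0 and 1")
    if not 0.0 < target_availability < 1.0:
        raise ValueError("target_availability must be strictly between 0 and 1")

    # Add replicas one at a time until the availability target is met.
    for r in range(1, max_replicas + 1):
        availability = 1.0 - node_failure_prob ** r
        if availability >= target_availability:
            return r
    raise ValueError("target not reachable within max_replicas")


if __name__ == "__main__":
    # Compare the required replica count across a few failure probabilities.
    for p in (0.01, 0.05, 0.2):
        print(p, min_replicas(p, 0.999))
```

Under these simplifying assumptions, reliable nodes need fewer replicas than a fixed default would allocate, which is the source of the storage savings the abstract reports.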
Source
Computer Engineering and Applications (《计算机工程与应用》)
CSCD
2012, Issue 21, pp. 58-61 (4 pages)
Funding
Natural Science Foundation of Guangdong Province (No. S2011010001754)
Guangdong Province Science and Technology Plan Project (No. 2010B010600032)
Keywords
cloud storage
data replication
optimization
availability