Method for optimization of data replication in Hadoop (一种Hadoop数据复制优化方法)
Cited by: 3
Abstract: Hadoop improves data availability by keeping a fixed number of replicas of each piece of data. To address the shortcomings of this approach, a mathematical model of data replication is established. From the failure rate of the data nodes, the data access latency, the network bandwidth of the data nodes, and the expected data availability, the model calculates an optimized (minimum) number of replicas. The proposed replication optimization method is implemented on Hadoop and evaluated in performance tests. The experimental results show that the model not only improves data availability but also raises the utilization of storage space in the cloud storage system.
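The abstract does not reproduce the paper's model, so the following is only a simplified sketch of the general idea of deriving a replica count from a failure rate and an availability target. It assumes that data nodes fail independently with the same probability and that data is available as long as at least one replica survives; it ignores the access latency and network bandwidth terms that the paper's model also uses. The function name `min_replicas` and its parameters are hypothetical and do not come from the paper or from Hadoop.

```python
def min_replicas(node_failure_prob: float, target_availability: float) -> int:
    """Return the smallest replica count r with 1 - p**r >= target.

    Assumption (not from the paper): nodes fail independently with the
    same probability p, so all r replicas are lost with probability p**r.
    """
    if not 0.0 < node_failure_prob < 1.0:
        raise ValueError("node_failure_prob must be in (0, 1)")
    if not 0.0 < target_availability < 1.0:
        raise ValueError("target_availability must be in (0, 1)")

    replicas = 1
    # Add replicas until the chance that at least one survives meets the target.
    while 1.0 - node_failure_prob ** replicas < target_availability:
        replicas += 1
    return replicas


if __name__ == "__main__":
    print(min_replicas(0.05, 0.999))  # 3
    print(min_replicas(0.20, 0.999))  # 5
    print(min_replicas(0.05, 0.90))   # 1
```

Under these assumptions, a fixed replica count can be either more than a reliable cluster needs (wasting storage) or fewer than an unreliable one needs (missing the availability target), which is the trade-off the abstract describes.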
Source: Computer Engineering and Applications (《计算机工程与应用》), CSCD, 2012, No. 21, pp. 58-61 (4 pages)
Funding: Natural Science Foundation of Guangdong Province (No. S2011010001754); Guangdong Provincial Science and Technology Plan Project (No. 2010B010600032)
Keywords: cloud storage; data replication; optimization; availability

