Research of Distributed ETL Architecture Based on MapReduce
Original title: 基于MapReduce的分布式ETL体系结构研究
Cited by: 9
Abstract To address the shortcomings of the centralized execution mode of traditional extraction-transformation-loading (ETL) tools, this paper proposes MDETL (MapReduce Distributed ETL), a distributed ETL architecture based on MapReduce. The architecture combines MapReduce, a parallel programming model for processing massive data sets concurrently, with the cluster computing methods of distributed ETL, so that ETL workflows execute in a distributed fashion across a cluster. This improves the flexibility and throughput of the overall ETL system, provides good scalability and load balancing, and raises execution efficiency.
Source Computer Science (《计算机科学》), CSCD, Peking University Core Journal (北大核心), 2013, No. 6, pp. 152-154 (3 pages)
Funding Supported by the National Natural Science Foundation of China (70971137)
Keywords ETL, MapReduce, distributed
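
The abstract does not describe MDETL's internal design, so the following is only a minimal single-process sketch of the general idea it names: expressing an ETL flow as map and reduce tasks over partitioned input. The sample rows, the function names (`split_input`, `map_task`, `shuffle`, `reduce_task`, `run_job`) and the choice of a sum aggregation are all assumptions made for illustration, not the authors' implementation.

```python
from collections import defaultdict

# Hypothetical raw source rows in "region, amount" form; one row is malformed.
RAW_ROWS = [
    "north, 10.5",
    "south, 4",
    "north, 2.5",
    "bad-row",
    "south, 6",
]


def split_input(rows, n_splits):
    """Partition the input so that each map task receives its own slice."""
    return [rows[i::n_splits] for i in range(n_splits)]


def map_task(rows):
    """Extract and transform: parse rows, skip malformed ones, emit (key, value)."""
    for row in rows:
        parts = [p.strip() for p in row.split(",")]
        if len(parts) != 2:
            continue  # malformed row: wrong number of fields
        region, amount = parts
        try:
            yield region.upper(), float(amount)
        except ValueError:
            continue  # malformed row: amount is not numeric


def shuffle(mapped_outputs):
    """Group the values emitted by all map tasks by key."""
    groups = defaultdict(list)
    for pairs in mapped_outputs:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_task(key, values):
    """Aggregate one key's values (here: a simple sum)."""
    return key, sum(values)


def run_job(rows, n_splits=2):
    """Run the map, shuffle, and reduce phases, then 'load' into a dict."""
    splits = split_input(rows, n_splits)
    # On a real cluster each split would be processed on a separate node;
    # here the map tasks simply run one after another in this process.
    mapped = [map_task(split) for split in splits]
    grouped = shuffle(mapped)
    # The dict stands in for the target warehouse table.
    return dict(reduce_task(k, vs) for k, vs in sorted(grouped.items()))


result = run_job(RAW_ROWS)
print(result)
```

Because each split is processed independently and results are merged only in the shuffle step, changing `n_splits` does not change the loaded totals, which is the property that lets such a flow be spread over more nodes.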
