DMGrid: A Data Mining System Based on Grid Computing (in English)
Abstract: Data mining tasks must process large-scale datasets, so they often take a long time to run. Grid computing aims to integrate distributed, heterogeneous, and idle computers into a single high-performance computing system, so the computing power of a grid can be used to reduce data processing time. This paper proposes and implements DMGrid, a data mining system based on grid computing. DMGrid treats efficient parallel computing as a central concern and also supports dynamic configuration of grid computing resources. Unlike many existing data mining grids, DMGrid provides an engine that executes the workflow (algorithm flow) specified in an application, and it also offers monitoring of application execution. Finally, two applications, customer churn analysis and customer value analysis, are designed and run in experiments to demonstrate the feasibility of DMGrid.
Source: 《计算机科学与探索》 (Journal of Frontiers of Computer Science and Technology), CSCD, 2010, Issue 2, pp. 180-190 (11 pages)
Funding: The National Natural Science Foundation of China under Grant No. 60402011; the National Eleventh Five-Year Scientific and Technical Support Plan of China under Grant No. 2006BAH03B05
Keywords: grid computing; data mining; dynamic configuration; data flow (workflow); execution monitoring

