Funding: supported by ZTE Industry-Academia-Research Cooperation Funds
Abstract: This paper proposes an analytical mining tool for big graph data based on MapReduce and the bulk synchronous parallel (BSP) computing model. The tool is named the MapReduce and BSP based Graph-mining tool (MBGM). The core of this mining system consists of four sets of parallel graph-mining algorithms programmed in the BSP parallel model and one set of data extraction-transformation-loading (ETL) algorithms implemented in MapReduce. To invoke these algorithm sets, we designed a workflow engine optimized for cloud computing. Finally, a well-designed data management function enables users to view, delete, and input data in the Hadoop distributed file system (HDFS). Experiments on artificial data show that the graph-mining algorithm components in MBGM are efficient.
Funding: supported by the National Key Basic Research and Development (973) Program of China (No. 2013CB329603) and the National Natural Science Foundation of China (Nos. 61074128, 61375058, and 71231002)
Abstract: The design and implementation of a scalable parallel mining system targeted at big graph analysis have proven to be challenging. In this study, we propose BSP-based Parallel Graph Mining (BPGM), a parallel data mining system for analyzing big graph data that is built on the Bulk Synchronous Parallel (BSP) computing model. This system has four sets of parallel graph mining algorithms programmed in the BSP parallel model and a well-designed workflow engine, optimized for cloud computing, to invoke these algorithms. Experimental results show that the graph mining algorithm components in BPGM are efficient and perform better than the Big Cloud-based Parallel Data Miner and BC-BSP.
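Both abstracts rely on the BSP model without showing what a BSP graph computation looks like. The following is a minimal single-process sketch, not code from MBGM or BPGM: it assumes a simple connected-components task solved by minimum-label propagation, and the function name `bsp_connected_components` and its data layout are hypothetical. Each loop iteration plays the role of one superstep: vertices compute only on their own state and the messages received, and messages sent are delivered only after the barrier.

```python
from collections import defaultdict


def bsp_connected_components(edges):
    """Label every vertex with the smallest vertex id in its component.

    Illustrative only: a sequential simulation of BSP supersteps,
    not the algorithm used by MBGM or BPGM.
    """
    # Build an undirected adjacency list from the edge pairs.
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    # Superstep 0: each vertex takes its own id as label and
    # sends that label to all of its neighbours.
    label = {v: v for v in neighbors}
    inbox = {v: [label[u] for u in neighbors[v]] for v in neighbors}
    supersteps = 1

    # Keep running supersteps while any message is in flight.
    while any(inbox.values()):
        outbox = defaultdict(list)
        # Local computation: a vertex reads only its own label
        # and the messages delivered to it.
        for v, messages in inbox.items():
            smallest = min(messages)
            if smallest < label[v]:
                label[v] = smallest
                # Communication: notify neighbours of the new label.
                for w in neighbors[v]:
                    outbox[w].append(smallest)
        # Barrier synchronisation: messages become visible only
        # in the next superstep.
        inbox = outbox
        supersteps += 1

    return label, supersteps


if __name__ == "__main__":
    labels, steps = bsp_connected_components([(1, 2), (2, 3), (4, 5)])
    print(labels, steps)
```

In a real BSP framework the per-vertex loop would run in parallel across workers and the barrier would be a distributed synchronisation point; the sketch keeps only the superstep structure.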