Funding: Supported by a grant of the Korea Health Technology R&D Project through the Korea Health Industry Development Institute (KHIDI), funded by the Ministry of Health & Welfare, Republic of Korea (Grant Number: HI21C1831), and by the Soonchunhyang University Research Fund.
Abstract: A massive quantity of data is now produced from many distinct sources, and the volume of data created daily on the Internet has exceeded two exabytes. Clustering is an efficient technique for mining big data to extract the useful, hidden patterns it contains. Density-based clustering techniques have gained significant attention because they effectively recognize complex patterns in spatial datasets. Big data clustering is a non-trivial process owing to the increasing quantity of data, a difficulty that can be addressed with the MapReduce tool. With this motivation, this paper presents an efficient MapReduce-based hybrid density-based clustering and classification algorithm for big data analytics (MR-HDBCC). The proposed MR-HDBCC technique is executed on MapReduce to handle big data and involves three distinct processes: pre-processing, clustering, and classification. The model uses the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) technique, which can detect arbitrarily shaped and diverse clusters in noisy data. To improve the performance of DBSCAN, a hybrid model using the cockroach swarm optimization (CSO) algorithm is developed to explore the search space and determine the optimal parameters for density-based clustering. Finally, a bidirectional gated recurrent neural network (BGRNN) is employed for the classification of big data. The MR-HDBCC technique is validated experimentally on a benchmark dataset, and the simulation outcomes demonstrate promising performance in terms of different measures.
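As a rough illustration of the density-based clustering step named in the abstract, the sketch below implements plain DBSCAN in pure Python. It is not the authors' MR-HDBCC implementation: it omits the MapReduce distribution, the CSO parameter search, and the BGRNN classifier. The function names, the sample points, and the values of `eps` and `min_pts` are all assumptions chosen for illustration; `eps` and `min_pts` are the kind of parameters the abstract says CSO is used to tune.

```python
from math import dist

NOISE = -1  # label for points that belong to no cluster


def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one cluster label per point; NOISE marks outliers."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already assigned or marked as noise
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1  # points[i] is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # core point: keep expanding
    return labels


# Hypothetical data: two dense groups and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)
```

With these assumed values, the two groups of four points form separate clusters and the isolated point is labelled as noise. The result depends on `eps` and `min_pts`, which is why the abstract pairs DBSCAN with a parameter search.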
Funding: Supported by the National Natural Science Foundation of China (Grant No. 71961005) and the Guangxi Science and Technology Program (Grant No. 1598007-15).
Abstract: The maturity of big data analysis theory and its tools improves the efficiency and reduces the cost of mining massive data. This paper discusses a method for mining customer demand for products based on big data, and further studies the configuration of product function attributes. First, the Hadoop platform was used to perform word segmentation on product attribute data, and feature-word extraction based on the Apriori algorithm was used to mine customer demand information. The MapReduce model on the big data platform was then applied for efficient parallel data processing, yielding the product attributes with research value together with their weights and attribute levels. After that, the cloud model and the MNL model were employed to construct the product function attribute configuration model, and an improved artificial bee colony algorithm was used to solve the model and obtain its optimal solution. Finally, an example was given to illustrate the feasibility of the proposed method.
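The Apriori-based extraction step can be sketched as follows, with the support counting written in a map/reduce style. This is a single-process illustration under stated assumptions, not the authors' Hadoop pipeline. The function names (`map_phase`, `reduce_phase`, `apriori`), the sample review data, and the support threshold are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def map_phase(transaction, candidates):
    """Emit (itemset, 1) for every candidate contained in the transaction."""
    items = set(transaction)
    return [(c, 1) for c in candidates if c <= items]


def reduce_phase(pairs):
    """Sum the emitted counts per itemset."""
    counts = Counter()
    for itemset, n in pairs:
        counts[itemset] += n
    return counts


def apriori(transactions, min_support):
    """Return {frozenset: support count} for all frequent itemsets."""
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        pairs = [p for t in transactions for p in map_phase(t, candidates)]
        counts = reduce_phase(pairs)
        level = {s: n for s, n in counts.items() if n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: unions of frequent sets that have exactly k items.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset must itself be frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


# Hypothetical feature words extracted from four product reviews.
reviews = [
    ["battery", "screen", "price"],
    ["battery", "screen"],
    ["battery", "price"],
    ["screen", "camera"],
]
frequent = apriori(reviews, min_support=2)
for itemset, support in sorted(frequent.items(), key=lambda kv: -kv[1]):
    print(sorted(itemset), support)
```

On a real cluster, `map_phase` would run per input split and `reduce_phase` per key after the shuffle; here both run in one process to keep the example self-contained.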