In this paper, we conduct research on the structured data mining algorithm and applications on machine learning field. Various fields due to the advancement of informatization and digitization, a lot of multi-source a...In this paper, we conduct research on the structured data mining algorithm and applications on machine learning field. Various fields due to the advancement of informatization and digitization, a lot of multi-source and heterogeneous data distributed storage, in order to achieve the sharing, we must solve from the storage management to the interoperability of a series of mechanism, the method and implementation technology. Unstructured data does not have strict structure, therefore, compared with structured information that is more difficult to standardization, with management more difficult. According to these characteristics, the large capacity of unstructured data or using files separately store, is stored in the database index of similar pointer. Under this background, we propose the new idea on the structured data mining algorithm that is meaningful.展开更多
Particle Swarm Optimization(PSO)has been utilized as a useful tool for solving intricate optimization problems for various applications in different fields.This paper attempts to carry out an update on PSO and gives a...Particle Swarm Optimization(PSO)has been utilized as a useful tool for solving intricate optimization problems for various applications in different fields.This paper attempts to carry out an update on PSO and gives a review of its recent developments and applications,but also provides arguments for its efficacy in resolving optimization problems in comparison with other algorithms.Covering six strategic areas,which include Data Mining,Machine Learning,Engineering Design,Energy Systems,Healthcare,and Robotics,the study demonstrates the versatility and effectiveness of the PSO.Experimental results are,however,used to show the strong and weak parts of PSO,and performance results are included in tables for ease of comparison.The results stress PSO’s efficiency in providing optimal solutions but also show that there are aspects that need to be improved through combination with algorithms or tuning to the parameters of the method.The review of the advantages and limitations of PSO is intended to provide academics and practitioners with a well-rounded view of the methods of employing such a tool most effectively and to encourage optimized designs of PSO in solving theoretical and practical problems in the future.展开更多
Improvement on mining the frequently visited groups of web pages was studied. First, in the data preprocessing phrase, we introduce an extra frame filtering step that reduces the negative influence of frame pages on t...Improvement on mining the frequently visited groups of web pages was studied. First, in the data preprocessing phrase, we introduce an extra frame filtering step that reduces the negative influence of frame pages on the result page groups. Through recognizing the frame pages in the site documents and constructing the frame subframe relation set, the subframe pages that influence the final mining result can be efficiently filtered. Second, we enhance the mining algorithm with the consideration of both the site topology and the content of the web pages. By the introduction of the normalized content link ratio of the web page and the group interlink degree of the page group, the enhanced algorithm concentrates more on the content pages that are less interlinked together. The experiments show that the new approach can effectively reveal more interesting page groups, which would not be found without these enhancements.展开更多
The most common way to analyze economics data is to use statistics software and spreadsheets.The paper presents opportunities of modern Geographical Information System (GIS) for analysis of marketing, statistical, a...The most common way to analyze economics data is to use statistics software and spreadsheets.The paper presents opportunities of modern Geographical Information System (GIS) for analysis of marketing, statistical, and macroeconomic data. It considers existing tools and models and their applications in various sectors. The advantage is that the statistical data could be combined with geographic views, maps and also additional data derived from the GIS. As a result, a programming system is developed, using GIS for analysis of marketing, statistical, macroeconomic data, and risk assessment in real time and prevention. The system has been successfully implemented as web-based software application designed for use with a variety of hardware platforms (mobile devices, laptops, and desktop computers). The software is mainly written in the programming language Python, which offers a better structure and supports for the development of large applications. Optimization of the analysis, visualization of macroeconomic, and statistical data by region for different business research are achieved. The system is designed with Geographical Information System for settlements in their respective countries and regions. Information system integration with external software packages for statistical calculations and analysis is implemented in order to share data analyzing, processing, and forecasting. Technologies and processes for loading data from different sources and tools for data analysis are developed. The successfully developed system allows implementation of qualitative data analysis.展开更多
介绍了一种 Web 挖掘的分类,包括 Web 内容挖掘、Web 结构挖掘和 Web 使用挖掘。讨论了 Web 使用挖掘过程的三个步骤,即数据获取与数据预处理、模式发现和模式分析,详细分析了每一个步骤中所使用的技术。指出了目前 Web 使用挖掘研究存...介绍了一种 Web 挖掘的分类,包括 Web 内容挖掘、Web 结构挖掘和 Web 使用挖掘。讨论了 Web 使用挖掘过程的三个步骤,即数据获取与数据预处理、模式发现和模式分析,详细分析了每一个步骤中所使用的技术。指出了目前 Web 使用挖掘研究存在的不足,给出了 Web 使用挖掘未来的研究方向。展开更多
文摘In this paper, we conduct research on the structured data mining algorithm and applications on machine learning field. Various fields due to the advancement of informatization and digitization, a lot of multi-source and heterogeneous data distributed storage, in order to achieve the sharing, we must solve from the storage management to the interoperability of a series of mechanism, the method and implementation technology. Unstructured data does not have strict structure, therefore, compared with structured information that is more difficult to standardization, with management more difficult. According to these characteristics, the large capacity of unstructured data or using files separately store, is stored in the database index of similar pointer. Under this background, we propose the new idea on the structured data mining algorithm that is meaningful.
文摘Particle Swarm Optimization(PSO)has been utilized as a useful tool for solving intricate optimization problems for various applications in different fields.This paper attempts to carry out an update on PSO and gives a review of its recent developments and applications,but also provides arguments for its efficacy in resolving optimization problems in comparison with other algorithms.Covering six strategic areas,which include Data Mining,Machine Learning,Engineering Design,Energy Systems,Healthcare,and Robotics,the study demonstrates the versatility and effectiveness of the PSO.Experimental results are,however,used to show the strong and weak parts of PSO,and performance results are included in tables for ease of comparison.The results stress PSO’s efficiency in providing optimal solutions but also show that there are aspects that need to be improved through combination with algorithms or tuning to the parameters of the method.The review of the advantages and limitations of PSO is intended to provide academics and practitioners with a well-rounded view of the methods of employing such a tool most effectively and to encourage optimized designs of PSO in solving theoretical and practical problems in the future.
文摘Improvement on mining the frequently visited groups of web pages was studied. First, in the data preprocessing phrase, we introduce an extra frame filtering step that reduces the negative influence of frame pages on the result page groups. Through recognizing the frame pages in the site documents and constructing the frame subframe relation set, the subframe pages that influence the final mining result can be efficiently filtered. Second, we enhance the mining algorithm with the consideration of both the site topology and the content of the web pages. By the introduction of the normalized content link ratio of the web page and the group interlink degree of the page group, the enhanced algorithm concentrates more on the content pages that are less interlinked together. The experiments show that the new approach can effectively reveal more interesting page groups, which would not be found without these enhancements.
文摘The most common way to analyze economics data is to use statistics software and spreadsheets.The paper presents opportunities of modern Geographical Information System (GIS) for analysis of marketing, statistical, and macroeconomic data. It considers existing tools and models and their applications in various sectors. The advantage is that the statistical data could be combined with geographic views, maps and also additional data derived from the GIS. As a result, a programming system is developed, using GIS for analysis of marketing, statistical, macroeconomic data, and risk assessment in real time and prevention. The system has been successfully implemented as web-based software application designed for use with a variety of hardware platforms (mobile devices, laptops, and desktop computers). The software is mainly written in the programming language Python, which offers a better structure and supports for the development of large applications. Optimization of the analysis, visualization of macroeconomic, and statistical data by region for different business research are achieved. The system is designed with Geographical Information System for settlements in their respective countries and regions. Information system integration with external software packages for statistical calculations and analysis is implemented in order to share data analyzing, processing, and forecasting. Technologies and processes for loading data from different sources and tools for data analysis are developed. The successfully developed system allows implementation of qualitative data analysis.
文摘介绍了一种 Web 挖掘的分类,包括 Web 内容挖掘、Web 结构挖掘和 Web 使用挖掘。讨论了 Web 使用挖掘过程的三个步骤,即数据获取与数据预处理、模式发现和模式分析,详细分析了每一个步骤中所使用的技术。指出了目前 Web 使用挖掘研究存在的不足,给出了 Web 使用挖掘未来的研究方向。