Funding: Supported by the Major Scientific and Technological Projects of CNPC under Grant ZD2019-183-006; partially supported by the Shandong Provincial Natural Science Foundation, China, under Grant ZR2020MF006; and partially supported by the Fundamental Research Funds for the Central Universities of China University of Petroleum (East China) under Grants 20CX05017A and 18CX02139A.
Abstract: In recent years, with the development of the social Internet of Things (IoT), all kinds of data have accumulated on the network. These data contain a lot of social information and opinions. However, they are rarely fully analyzed, which is a major obstacle to the intelligent development of the social IoT. In this paper, we propose a sentence similarity analysis model to analyze the similarity in people’s opinions on hot topics in social media and news pages. Most of these data are unstructured or semi-structured sentences, so the accuracy of sentence similarity analysis largely determines the model’s performance. To improve accuracy, we propose a novel method of sentence similarity computation that extracts the syntactic and semantic information of semi-structured and unstructured sentences. We mainly consider the subjects, predicates, and objects of sentence pairs and use the Stanford Parser to classify the dependency relation triples in order to calculate the syntactic and semantic similarity between two sentences. Finally, we verify the performance of the model with the Microsoft Research Paraphrase Corpus (MRPC), which consists of 4076 pairs of training sentences and 1725 pairs of test sentences, most of which were drawn from news sources. Extensive simulations demonstrate that our method outperforms other state-of-the-art methods in terms of the correlation coefficient and the mean deviation.
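The idea of comparing sentences through their subjects, predicates, and objects can be illustrated with a minimal sketch. This is not the authors' method: the triples are written by hand instead of being produced by the Stanford Parser, word-level Jaccard overlap stands in for the paper's syntactic and semantic measures, and the names `Triple`, `jaccard`, `triple_similarity`, and the equal weights are all assumptions made for illustration.

```python
from typing import NamedTuple, Sequence


class Triple(NamedTuple):
    """Hypothetical (subject, predicate, object) summary of one sentence."""
    subject: str
    predicate: str
    obj: str


def jaccard(a: str, b: str) -> float:
    """Word-level Jaccard overlap between two phrases, in [0.0, 1.0]."""
    set_a, set_b = set(a.lower().split()), set(b.lower().split())
    if not set_a and not set_b:
        return 1.0  # two empty slots are treated as identical
    return len(set_a & set_b) / len(set_a | set_b)


def triple_similarity(
    t1: Triple,
    t2: Triple,
    weights: Sequence[float] = (1 / 3, 1 / 3, 1 / 3),
) -> float:
    """Weighted average of the slot-by-slot overlaps of two triples.

    The weights are an assumption; they should sum to 1 so that the
    result stays in [0.0, 1.0].
    """
    return sum(w * jaccard(a, b) for w, a, b in zip(weights, t1, t2))


if __name__ == "__main__":
    s1 = Triple("the company", "bought", "a startup")
    s2 = Triple("the company", "acquired", "a startup")
    # Subject and object match exactly; the predicates share no words.
    print(round(triple_similarity(s1, s2), 2))
```

Because plain word overlap scores "bought" and "acquired" as unrelated, the example pair comes out at roughly two thirds; a semantic measure of the kind the abstract describes would be expected to close that gap.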
Funding: This research work is supported by the Hunan Provincial Education Science 13th Five-Year Plan (Grant No. XJK016BXX001, Zhou, H., http://jyt.hunan.gov.cn/jyt/sjyt/jky/index.html) and the Social Science Foundation of Hunan Province (Grant No. 17YBA049, Zhou, H., https://sk.rednet.cn/channel/7862.html). The work is also supported by the Open Foundation for University Innovation Platform from Hunan Province, China (Grant No. 18K103, Sun, G., http://kxjsc.gov.hnedu.cn/).
Abstract: Supply Chain Finance (SCF) is important for improving the effectiveness of supply chain capital operations and reducing the overall management cost of a supply chain. In recent years, with the deep integration of the supply chain with the Internet, Big Data, Artificial Intelligence, the Internet of Things, Blockchain, etc., the efficiency of supply chain financial services can be greatly improved by building more customized risk pricing models and conducting more rigorous investment decision-making processes. However, with the rapid development of new technologies, the volume of SCF data has increased massively, and new financial fraud behaviors or patterns are becoming more covertly scattered among normal ones. Insufficient capability to handle big data volumes and mitigate financial fraud may lead to huge losses in supply chains. In this article, a distributed approach to big data mining is proposed for financial fraud detection in a supply chain. It implements a distributed deep learning model, a Convolutional Neural Network (CNN), on the big data infrastructure of Apache Spark and Hadoop to process the large dataset in parallel and reduce the processing time significantly. By training and testing on the continually updated SCF dataset, the approach can intelligently and automatically classify the massive data samples and discover fraudulent financing behaviors, so as to detect financial fraud with high precision and recall rates and reduce the losses from fraud in a supply chain.
Abstract: Information disclosure can reduce information asymmetry between health care providers and patients, thus improving both patient safety and medical quality. The National Bureau of Health Insurance (NBHI) in Taiwan currently publishes health-related information online in order to enhance service efficiency and enable the public to monitor the country’s medical system. A data mining technique, classification and regression tree (CART), is used in this work to investigate online public quality information and compare the characteristics of hospitals. The hospital quality indicators and characteristics data are available on the websites of the NBHI (http://www.nhi.gov.tw/AmountInfoWeb/Index.aspx) and the Department of Health (http://www.doh.gov.tw/). The full classification and regression tree presented in this work, grown using the hospitals’ medical quality indicators and characteristic values, classifies all hospitals into seven groups. The rate of stays longer than 30 days, which is the dependent variable in this study, is most influenced by the number of medical staff. This reflects the fact that the fewer medical staff a hospital employs, the smaller the hospital is, and patients who are likely to have longer stays tend to go to medium or large hospitals. Policy makers should work to decrease or eliminate persistent healthcare disparities among different socioeconomic groups and offer more online health-related services to reduce information asymmetry between health care providers and patients.
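To make the CART step concrete, the sketch below shows how a regression tree might choose a single split on one predictor by minimizing the summed squared error of the dependent variable on each side. It is a simplified illustration under stated assumptions, not the study's analysis: the function names `sse` and `best_split` are hypothetical, only one predictor and one split are considered, and the staff counts and long-stay rates are made-up toy values, not NBHI data.

```python
from typing import List, Optional, Tuple


def sse(values: List[float]) -> float:
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(
    feature: List[float], target: List[float]
) -> Tuple[Optional[float], float]:
    """Find the threshold t that minimizes SSE(left) + SSE(right),
    where left holds rows with feature <= t and right holds the rest.

    Returns (None, sse(target)) when no split improves on the unsplit node.
    """
    best_thr, best_score = None, sse(target)
    xs = sorted(set(feature))
    # Candidate thresholds are midpoints between adjacent distinct values.
    for lo, hi in zip(xs, xs[1:]):
        thr = (lo + hi) / 2
        left = [t for f, t in zip(feature, target) if f <= thr]
        right = [t for f, t in zip(feature, target) if f > thr]
        score = sse(left) + sse(right)
        if score < best_score:
            best_thr, best_score = thr, score
    return best_thr, best_score


if __name__ == "__main__":
    # Toy values for illustration only (not taken from the study).
    staff = [20, 35, 50, 200, 350, 500]
    long_stay_rate = [0.01, 0.02, 0.01, 0.06, 0.07, 0.08]
    threshold, score = best_split(staff, long_stay_rate)
    print(threshold, score)
```

A full CART procedure would repeat this search over every predictor, recurse into each resulting group, and prune the tree; a grouping of hospitals like the one described in the abstract corresponds to the leaves of such a tree.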
Abstract: This paper briefly describes the definition and methods of data mining. It describes the characteristics of agricultural data (value delivery, specialization, spatio-temporal bidimensionality) and the status of the application of data mining technology in agriculture.
Funding: Supported by the National Social Science Foundation of China (No. 16BGL183).
Abstract: Many high-quality studies have emerged from public databases, such as Surveillance, Epidemiology, and End Results (SEER), National Health and Nutrition Examination Survey (NHANES), The Cancer Genome Atlas (TCGA), and Medical Information Mart for Intensive Care (MIMIC); however, these data are often characterized by a high degree of dimensional heterogeneity, timeliness, scarcity, irregularity, and other characteristics, with the result that their value is not fully utilized. Data-mining technology has been a frontier field in medical research, as it demonstrates excellent performance in evaluating patient risks and assisting clinical decision-making when building disease-prediction models. Therefore, data mining has unique advantages in clinical big-data research, especially in large-scale medical public databases. This article introduces the main medical public databases and describes the steps, tasks, and models of data mining in simple language. Additionally, we describe data-mining methods along with their practical applications. The goal of this work is to help clinical researchers gain a clear and intuitive understanding of the application of data-mining technology to clinical big data, in order to promote the production of research results that are beneficial to doctors and patients.
Abstract: A new method of establishing a rolling load distribution model for plate rolling was developed using online intelligent information-processing technology. The model combines a knowledge model and a mathematical model, taking knowledge discovery in databases (KDD) and data mining (DM) as the starting point. Online maintenance and optimization of the load model are realized. The effectiveness of this new method was verified by offline simulation and online application.
Abstract: Protecting the privacy of data in the multi-cloud is a crucial task. Privacy-preserving data mining is a technique that protects the privacy of individual data while mining those data. The most significant task entails obtaining data from numerous remote databases. Mining algorithms can obtain sensitive information once the data is in the data warehouse. Many traditional algorithms and techniques promise to provide safe data transfer, storage, and retrieval over the cloud platform. These strategies are primarily concerned with protecting the privacy of user data. This study aims to present data mining with privacy protection (DMPP) using precise elliptic curve cryptography (PECC), which builds upon algebraic elliptic curves in finite fields. This approach enables safe data exchange by utilizing a reliable data consolidation approach entirely reliant on rewritable data concealing techniques. It also outperforms conventional data mining in terms of solid privacy procedures while maintaining the quality of the data. Average approximation error, computational cost, anonymizing time, and data loss are considered as performance measures. According to the experimental findings, the suggested approach is practical and applicable in real-world situations.
Funding: Sponsored by the Earthquake Monitoring Special Project "Precursor Observation Data Mining", Key Laboratory of Crustal Dynamics, Institute of Crustal Dynamics, China Earthquake Administration.
Abstract: Research on and application of big data mining is currently a hot issue. This paper briefly introduces the basic ideas of big data research, analyzes the necessity of applying big data in earthquake precursor observation, and examines certain issues and solutions that arise when applying this technology in the seismic-related domain. In doing so, we hope to promote the innovative use of big data in the analysis of earthquake precursor observation data.
Abstract: Data mining involves extracting information from large data sets, discovering hidden relationships and unknown dependencies, and supporting strategic decision-making tasks. Aligning data mining with business would benefit an organization's management. The study investigated the adoption of data mining technologies in managerial accounting systems, concentrating on the challenges and opportunities. The research showed that, with the adoption of the technology, managerial functions could be improved and current information systems could be upgraded. Since technical progress is reshaping the world of business and accountancy, it is important for accountants and finance professionals to exploit information technologies.
Abstract: The Energy Internet is a deep integration of Internet concepts, information technology, and the energy industry, and Energy Internet Big Data is one of the core technologies that achieve energy-information-economic interconnection and advance the development and evolution of the Energy Internet. This paper describes the concept and characteristics of Energy Internet Big Data and the feasibility of applying it to the integrated energy market. On this basis, for the integrated energy market and the multiple participants of the Energy Internet, typical applications and a technical system based on Energy Internet Big Data in the integrated energy market are put forward, providing a reference for analysis and decision-making in the integrated energy market of the Energy Internet.