OBJECTIVE: To help researchers select appropriate data mining models to provide better evidence for the clinical practice of Traditional Chinese Medicine (TCM) diagnosis and therapy. METHODS: Clinical issues addressed with data mining models were comprehensively summarized in terms of four significant elements of clinical studies: symptoms, symptom patterns, herbs, and efficacy. Existing problems were further generalized to determine the factors relevant to the performance of data mining models, e.g. data type, samples, parameters, and variable labels. Taking these factors together, the features of TCM clinical data were compared with regard to statistical characteristics and informatics properties. Data models were compared at the same time in terms of applicable conditions and suitable scope. RESULTS: The main application problems were data types inconsistent with the data mining models used and small sample sizes, which led to inappropriate or even erroneous results. The features of the models, i.e. advantages, disadvantages, suitable data types, data mining tasks, and the TCM issues addressed, were summarized and compared. CONCLUSION: By considering the special features of different data mining models, clinicians can select suitable data mining models to resolve TCM problems.
The cooling process of iron ore pellets in a circular cooler has a great impact on pellet quality and systematic energy exploitation. However, the many variables and the lack of visualization of this gray system are unfavorable to efficient production. Thus, the cooling process of iron ore pellets was optimized using a mathematical model and data mining techniques. A mathematical model was established and validated with steady-state production data, and the results show that the calculated values coincide very well with the measured values. Based on the proposed model, the effects of important process parameters on gas-pellet temperature profiles within the circular cooler were analyzed to better understand the entire cooling process. Two data mining techniques, Association Rules Induction and Clustering, were also applied to the steady-state production data to obtain expert operating rules and optimized targets. Finally, an optimized control strategy for the circular cooler was proposed and an operation guidance system was developed. The system can visualize the thermal process at steady state and provide operation guidance to optimize the circular cooler.
An experience is presented of using the finite element method (FEM) and data mining (DM) techniques to develop models that can be used to optimize the skin-pass rolling process based on its operating conditions. An FE model based on a real skin-pass process is built and validated. Based on this model, a group of FE models is simulated with different adjustment parameters and different sheet materials; both variables are chosen from pre-set ranges. From all FE model simulations, a database is generated; it is made up of the above-mentioned adjustment parameters, the sheet properties, and the process variables arising from the simulation of the model. Various types of data mining algorithms are used to develop predictive models for each of the process variables. The best predictive models can be used to predict variables that are hard to measure experimentally (internal stresses, internal strains, etc.), which are useful in the optimal design of the process or for application in real-time control systems of an in-plant skin-pass process.
To find an effective way to improve the quality of school management, it is necessary to find valuable information in students' original data and to provide feedback for student management. First, some new and successful educational data mining models were analyzed and compared. These models perform better than traditional models (such as the Knowledge Tracing Model) in efficiency, comprehensiveness, ease of use, stability, and so on. Then, a neural network algorithm was applied to explore the feasibility of applying educational data mining to student management, and the results show that it has enough predictive accuracy and reliability to be put into practice. Finally, the possibility and prospects of applying educational data mining in a teaching management system for university students were assessed.
The intensity of environmental regulation (ERI) affects the level of green mining (GML) in the short term, while its structure determines the long-term mechanism. Based on panel data from 2001 to 2015, a dynamic panel model and the system GMM estimation method were employed to test the influence of heterogeneous environmental regulation on green mining and its transmission mechanism. The results show that there is a 'U'-type nonlinear relationship between ERI and GML. The direct effects of command-control-based (CAC) and market-incentive-based (MBI) environmental regulation on the green development of mining are characterized by inhibition and promotion. Technological innovation and the energy consumption structure exert a 'U'-type indirect moderating effect on GML. Technological innovation promotes the green development of the mining industry only after the inflection point of MBI is passed, while CAC plays a significant guiding role in upgrading the energy consumption structure. In the southeast coastal area, MBI has an inhibition and promotion effect on GML, while the effect of CAC is not significant. Meanwhile, neither type of environmental regulation shows a positive effect in the central and western inland regions.
The high-temperature dielectric properties of quartz fiber-reinforced silicon dioxide ceramic (SiO2/SiO2) composites were studied both theoretically and experimentally. A multi-scale theoretical model was developed based on the theory of dielectrics. The model was able to predict dielectric properties at higher temperatures (> 1200 ℃) by mining experimental data for the correlative coefficients in the model. The results show that the dielectric properties of SiO2/SiO2 calculated with the theoretical model agreed with the experimentally measured values.
Data mining has become an important technique for the exploration and extraction of data in numerous research projects in different fields (technology, information technology, business, the environment, economics, etc.). For the analysis and visualisation of large amounts of time-indexed data (time series) extracted using data mining, free software such as R has emerged internationally as an inexpensive and efficient tool for exploiting and visualising time series. This has allowed the development of models that help to extract the most relevant information from large volumes of data. In this regard, a script has been developed to implement ARIMA models, showing them to be useful and quick mechanisms for the extraction, analysis and visualisation of large data volumes, with the further advantage of being applicable in many branches of knowledge, including economics, demography, physics, mathematics and fisheries. ARIMA models therefore appear as a data mining technique offering reliable, robust and high-quality results that help to validate and sustain the research carried out.
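As a rough illustration of the kind of model this abstract refers to, the sketch below fits the simplest non-trivial member of the ARIMA family, ARIMA(1,1,0) without a constant, by conditional least squares. It is written in plain Python rather than R and is not the script described in the abstract; the function names `fit_arima_110` and `forecast_next` and the toy series are assumptions made purely for illustration.

```python
def difference(series):
    """First-order differencing: d[t] = y[t] - y[t-1] (the 'I' part with d=1)."""
    return [b - a for a, b in zip(series, series[1:])]


def fit_arima_110(series):
    """Estimate phi in d[t] = phi * d[t-1] + e[t] by least squares.

    Hypothetical helper: ARIMA(1,1,0) with no constant term.
    """
    d = difference(series)
    if len(d) < 2:
        raise ValueError("need at least three observations")
    # Least-squares slope through the origin of d[t] on d[t-1].
    num = sum(cur * prev for prev, cur in zip(d, d[1:]))
    den = sum(prev * prev for prev in d[:-1])
    if den == 0:
        raise ValueError("lagged differences are all zero; phi is undefined")
    return num / den


def forecast_next(series, phi):
    """One-step-ahead forecast: last value plus phi times the last difference."""
    last_diff = series[-1] - series[-2]
    return series[-1] + phi * last_diff


# Toy series (assumed data): each difference is twice the previous one.
toy = [1.0, 2.0, 4.0, 8.0, 16.0]
phi = fit_arima_110(toy)
print(phi, forecast_next(toy, phi))
```

A full ARIMA workflow would also choose the orders (p, d, q), include moving-average terms where needed, and check residuals; those steps are left out of this sketch.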
Facing the development of future 5G, emerging technologies such as the Internet of things, big data, cloud computing, and artificial intelligence are driving explosive growth in data traffic. With radical changes in communication theory and implementation technologies, wireless communications and wireless networks have entered a new era. Among these developments, wireless big data (WBD) has tremendous value, and artificial intelligence (AI) offers previously unthinkable possibilities. However, in big data development and artificial intelligence applications, the lack of a sound theoretical foundation and mathematical methods is regarded as a real challenge that needs to be solved. Starting from the basic problem of wireless communication, namely the interrelationship of demand, environment and ability, this paper investigates the concept and data model of WBD, wireless data mining, wireless knowledge and wireless knowledge learning (WKL), and typical practical examples, in order to facilitate and open up more opportunities for WBD research and development. Such research is beneficial for creating new theoretical foundations and emerging technologies for future wireless communications.
For multi-mode radar working in the modern electronic battlefield, different working states of a single radar are prone to being classified as multiple emitters when traditional classification methods are used to process intercepted signals, which has a negative effect on signal classification. A classification method based on spatial data mining is presented to address this challenge. Inspired by the idea of spatial data mining, the method applies a nuclear field to depict the distribution of pulse samples in feature space and uncovers the hidden cluster information by analyzing distribution characteristics. In addition, a membership-degree criterion that quantifies the correlation among all classes is established, which ensures the classification accuracy of signal samples. Numerical experiments show that the presented method can effectively prevent different working states of a multi-mode emitter from being classified as several emitters and achieves higher classification accuracy.
The veracity of land evaluation is closely related to reasonable weights for the land evaluation factors. A new way of determining the weights of land evaluation factors is proposed: qualitative linguistic terms are mapped into finely adjustable cloud drops, uncertain factor conditions are translated into quantitative values by uncertainty reasoning based on the cloud model, and correlation analysis is then integrated. It may overcome the limitations of conventional methods.
In the electron beam selective melting (EBSM) process, the quality of each deposited melt track has an effect on the properties of the manufactured component. However, the formation of the melt track is governed by various physical phenomena and influenced by various process parameters, and the correlation of these parameters is complicated and difficult to establish experimentally. The mesoscopic modeling technique was recently introduced as a means of simulating the electron beam (EB) melting process and revealing the formation mechanisms of specific melt track morphologies. However, the correlation between the process parameters and the melt track features has not yet been quantitatively understood. This paper investigates the morphological features of the melt track from the results of mesoscopic simulation, while introducing key descriptive indexes such as melt track width and height in order to numerically assess the deposition quality. The effects of various processing parameters are also quantitatively investigated, and the correlation between the processing conditions and the melt track features is thereby derived. Finally, a simulation-driven optimization framework consisting of mesoscopic modeling and data mining is proposed, and its potential and limitations are discussed.
Data mining in the educational field can be used to optimize the teaching and learning performance of students. Recently developed machine learning (ML) and deep learning (DL) approaches can be utilized to mine the data effectively. This study proposes an Improved Sailfish Optimizer-based Feature Selection with Optimal Stacked Sparse Autoencoder (ISOFS-OSSAE) for data mining and pattern recognition in the educational sector. The proposed ISOFS-OSSAE model aims to mine educational data and derive decisions based on the feature selection and classification process. The ISOFS-OSSAE model involves the design of the ISOFS technique to choose an optimal subset of features. In addition, swallow swarm optimization (SSO) with the SSAE model is derived to perform the classification process. To showcase the enhanced outcomes of the ISOFS-OSSAE model, a wide range of experiments were conducted on a benchmark dataset from the University of California Irvine (UCI) Machine Learning Repository. The simulation results showed improved classification performance of the ISOFS-OSSAE model over recent state-of-the-art approaches in terms of different performance measures.
The traditional generalization-based knowledge discovery method is introduced. A new multilevel spatial association rule mining method based on the cloud model is presented. The cloud model integrates the vagueness and randomness of linguistic terms in a unified way. With these models, spatial and nonspatial attribute values are well generalized at multiple levels, allowing the discovery of strong spatial association rules. Combining the cloud-model-based method with the Apriori algorithm for mining association rules from a spatial database proves effective and flexible.
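To make the idea of a cloud model handling both vagueness and randomness more concrete, the following Python sketch implements a forward normal cloud generator. It follows the commonly used formulation with three numerical characteristics, expectation Ex, entropy En and hyper-entropy He; these symbols, the function name `forward_normal_cloud`, and the example term "about 25" are assumptions for illustration and are not taken from the abstract.

```python
import math
import random


def forward_normal_cloud(ex, en, he, n, rng=None):
    """Generate n cloud drops (x, membership) for a normal cloud (Ex, En, He).

    Each drop draws its own entropy En' ~ N(En, He^2) (randomness of the
    fuzziness), then a value x ~ N(Ex, En'^2), and assigns x the membership
    exp(-(x - Ex)^2 / (2 * En'^2)) in the linguistic term.
    """
    if en <= 0 or he < 0:
        raise ValueError("en must be positive and he non-negative")
    rng = rng or random.Random()
    drops = []
    while len(drops) < n:
        en_prime = abs(rng.gauss(en, he))
        if en_prime == 0.0:
            continue  # degenerate draw; resample
        x = rng.gauss(ex, en_prime)
        mu = math.exp(-((x - ex) ** 2) / (2.0 * en_prime ** 2))
        drops.append((x, mu))
    return drops


# Hypothetical linguistic term "about 25" with moderate fuzziness.
for x, mu in forward_normal_cloud(25.0, 3.0, 0.3, 5, random.Random(0)):
    print(round(x, 2), round(mu, 3))
```

In a generalization step, an attribute value would typically be assigned to the linguistic term whose cloud gives it the highest membership; that assignment is not shown here.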
Direct soil temperature (ST) measurement is time-consuming and costly; thus, the use of simple and cost-effective machine learning (ML) tools is helpful. In this study, ML approaches, including KStar, instance-based K-nearest learning (IBK), and locally weighted learning (LWL), coupled with the resampling algorithms of bagging (BA) and dagging (DA) (BA-IBK, BA-KStar, BA-LWL, DA-IBK, DA-KStar, and DA-LWL), were developed and tested for multi-step-ahead (3, 6, and 9 d ahead) ST forecasting. In addition, a linear regression (LR) model was used as a benchmark to evaluate the results. A dataset was established with daily ST time series at 5 and 50 cm soil depths in a farmland as the models' output and meteorological data as the models' input, including mean (T_mean), minimum (T_min), and maximum (T_max) air temperatures, evaporation (Eva), sunshine hours (SSH), and solar radiation (SR), which were collected at Isfahan Synoptic Station (Iran) for 13 years (1992–2005). Six different input combination scenarios were selected based on Pearson's correlation coefficients between inputs and outputs and fed into the models. We used 70% of the data to train the models, with the remaining 30% used for model evaluation via multiple visual and quantitative metrics. Our findings showed that T_mean was the most effective input variable for ST forecasting in most of the developed models, while in some cases combinations of variables, including T_mean and T_max, and T_mean, T_max, T_min, Eva, and SSH, proved to be the best input combinations. Among the evaluated models, BA-KStar showed greater compatibility, while in most cases BA-IBK and -LWL provided more accurate results, depending on soil depth. For the 5 cm soil depth, BA-KStar had superior performance (i.e., Nash-Sutcliffe efficiency (NSE) = 0.90, 0.87, and 0.85 for 3, 6, and 9 d ahead forecasting, respectively); for the 50 cm soil depth, DA-KStar outperformed the other models (i.e., NSE = 0.88, 0.89, and 0.89 for 3, 6, and 9 d ahead forecasting, respectively). The results confirmed that all hybrid models had higher prediction capabilities than the LR model.
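The Nash-Sutcliffe efficiency quoted above has a standard definition, NSE = 1 - Σ(obs - sim)² / Σ(obs - mean(obs))², so a value of 1 means a perfect fit and 0 means the model is no better than predicting the observed mean. The Python sketch below computes it and shows a 70/30 split; the abstract does not say how the split was made, so the chronological split used here, like the function names, is an assumption.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - SSE / (variance of observations * n)."""
    if not observed or len(observed) != len(simulated):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    if sst == 0:
        raise ValueError("observations are constant; NSE is undefined")
    return 1.0 - sse / sst


def chronological_split(rows, train_percent=70):
    """Split time-ordered rows into train/test without shuffling (assumed scheme)."""
    cut = len(rows) * train_percent // 100
    return rows[:cut], rows[cut:]


# Toy values (assumed data), not results from the study.
obs = [10.0, 12.0, 14.0, 16.0]
print(nash_sutcliffe(obs, [10.5, 11.5, 14.5, 15.5]))
train, test = chronological_split(list(range(10)))
print(len(train), len(test))
```

Keeping the test period strictly after the training period avoids leaking future values into a forecasting model, which is why a chronological split is sketched here rather than a random one.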
In conjunction with association rules for data mining, the connections between testing indices and strong and weak association rules were determined, and new derivative rules were obtained by further reasoning. Association rules were used to analyze correlation and check consistency between indices. This study shows that the judgment obtained by weak association rules or non-association rules is more accurate and more credible than that obtained by strong association rules. When the testing grades of two indices in the weak association rules are inconsistent, the testing grades of indices are more likely to be erroneous, and the mistakes are often caused by human factors. Clustering data mining technology was used to analyze the reliability of a diagnosis, or to perform health diagnosis directly. Analysis showed that the clustering results are related to the indices selected, and that if the indices selected are more significant, the characteristics of clustering results are also more significant, and the analysis or diagnosis is more credible. The indices and diagnosis analysis function produced by this study provide a necessary theoretical foundation and new ideas for the development of hydraulic metal structure health diagnosis technology.
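The distinction between strong and weak association rules rests on support and confidence. The Python sketch below computes both for a rule between two graded indices and labels the rule by comparing them with thresholds. The index names, grades, threshold values, and the convention that any rule failing a threshold is called "weak" are assumptions for illustration; the abstract does not give its exact definitions.

```python
def support(records, itemset):
    """Fraction of records that contain every item in itemset."""
    itemset = frozenset(itemset)
    hits = sum(1 for record in records if itemset <= set(record))
    return hits / len(records)


def confidence(records, antecedent, consequent):
    """Estimated P(consequent | antecedent) = supp(A and B) / supp(A)."""
    base = support(records, antecedent)
    if base == 0:
        raise ValueError("antecedent never occurs; confidence is undefined")
    return support(records, set(antecedent) | set(consequent)) / base


def classify_rule(records, antecedent, consequent,
                  min_support=0.5, min_confidence=0.8):
    """Label a rule 'strong' if it meets both thresholds, else 'weak' (assumed)."""
    both = set(antecedent) | set(consequent)
    strong = (support(records, both) >= min_support
              and confidence(records, antecedent, consequent) >= min_confidence)
    return "strong" if strong else "weak"


# Hypothetical inspection records: each set holds the grades of two indices.
records = [
    {"corrosion=B", "deformation=B"},
    {"corrosion=B", "deformation=B"},
    {"corrosion=B", "deformation=C"},
    {"corrosion=A", "deformation=A"},
]
print(support(records, {"corrosion=B"}))
print(confidence(records, {"corrosion=B"}, {"deformation=B"}))
print(classify_rule(records, {"corrosion=B"}, {"deformation=B"}))
```

A consistency check in the spirit of the abstract would then flag records whose grades disagree with a rule, such as the third record above, for manual review.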
Funding (TCM data mining model selection study): Supported by the National Natural Science Foundation of China, Research on Pattern Differentiation of AIDS Based on Graph Theory (No. 81202858); the National Key Technology Support Program, Research on Intervention Evaluation of TCM Health Differentiation (No. 2012BAI25B02); the National 863 Program of China, Research and Development in Digital Information System of Traditional Chinese Medicine (No. 2012AA02A609); Self-selected Subjects of the China Academy of Chinese Medical Sciences, Acupuncture Efficacy of Gastrointestinal Dysfunction (No. ZZ05003) and Acupuncture-point Specialty Analysis Based on Image Processing Technology (No. ZZ03090); and the Beijing Key Laboratory of Advanced Information Science and Network Technology, Semantic Recognition of Tongue and Pulse Based on Image Content (No. XDXX1306)
Funding (iron ore pellet cooling study): Sponsored by the National Natural Science Foundation of China (51174253)
Funding (skin-pass rolling study): Sponsored by the Spanish Ministry of Education and Science (DPI2007-61090) and the European Commission Research Programme of the Research Fund for Coal and Steel (RFS-PR-06035)
Funding (educational data mining study): Sponsored by the Ability Enhancement Project of Teaching Staff in Harbin Institute of Technology (Grant No. 06)
Funding (SiO2/SiO2 composite dielectrics study): the National Defense 973 (Grant No. 513180303), National Defense Basic Scientific Research (Grant No. A2220061080), and the National Defense Foundation (Grant No. 5142040205BQ0154)
Funding (multi-mode radar classification study): Supported by the National Natural Science Foundation of China (61371172), the International S&T Cooperation Program of China (2015DFR10220), the Ocean Engineering Project of National Key Laboratory Foundation (1213), and the Fundamental Research Funds for the Central Universities (HEUCF1608)
文摘For the multi-mode radar working in the modern electronicbattlefield, different working states of one single radar areprone to being classified as multiple emitters when adoptingtraditional classification methods to process intercepted signals,which has a negative effect on signal classification. A classificationmethod based on spatial data mining is presented to address theabove challenge. Inspired by the idea of spatial data mining, theclassification method applies nuclear field to depicting the distributioninformation of pulse samples in feature space, and digs out thehidden cluster information by analyzing distribution characteristics.In addition, a membership-degree criterion to quantify the correlationamong all classes is established, which ensures classificationaccuracy of signal samples. Numerical experiments show that thepresented method can effectively prevent different working statesof multi-mode emitter from being classified as several emitters,and achieve higher classification accuracy.
文摘The veracity of land evaluation is tightly related to the reasonable weights of land evaluation fac- tors. By mapping qualitative linguistic words into a fine-changeable cloud drops and translating the uncertain factor conditions into quantitative values with the uncertain illation based on cloud model, and then, inte- grating correlation analysis, a new way of figuring out the weight of land evaluation factors is proposed. It may solve the limitations of the conventional ways.
文摘In the electron beam selective melting(EBSM)process,the quality of each deposited melt track has an effect on the properties of the manufactured component.However,the formation of the melt track is governed by various physical phenomena and influenced by various process parameters,and the correlation of these parameters is complicated and difficult to establish experimentally.The mesoscopic modeling technique was recently introduced as a means of simulating the electron beam(EB)melting process and revealing the formation mechanisms of specific melt track morphologies.However,the correlation between the process parameters and the melt track features has not yet been quantitatively understood.This paper investigates the morphological features of the melt track from the results of mesoscopic simulation,while introducing key descriptive indexes such as melt track width and height in order to numerically assess the deposition quality.The effects of various processing parameters are also quantitatively investigated,and the correlation between the processing conditions and the melt track features is thereby derived.Finally,a simulation-driven optimization framework consisting of mesoscopic modeling and data mining is proposed,and its potential and limitations are discussed.
Abstract: Data mining in the educational field can be used to optimize the teaching and learning performance of students. Recently developed machine learning (ML) and deep learning (DL) approaches can be used to mine the data effectively. This study proposes an Improved Sailfish Optimizer-based Feature Selection with Optimal Stacked Sparse Autoencoder (ISOFS-OSSAE) for data mining and pattern recognition in the educational sector. The proposed ISOFS-OSSAE model aims to mine educational data and derive decisions based on feature selection and classification. The model involves the design of the ISOFS technique to choose an optimal subset of features. In addition, swallow swarm optimization (SSO) is combined with the SSAE model to perform the classification. To showcase the enhanced outcomes of the ISOFS-OSSAE model, a wide range of experiments was conducted on a benchmark dataset from the University of California Irvine (UCI) Machine Learning Repository. The simulation results showed improved classification performance of the ISOFS-OSSAE model over recent state-of-the-art approaches in terms of different performance measures.
Abstract: The traditional generalization-based knowledge discovery method is introduced, and a new multilevel spatial association rule mining method based on the cloud model is presented. The cloud model integrates the vagueness and randomness of linguistic terms in a unified way. With the cloud model, spatial and nonspatial attribute values are well generalized at multiple levels, allowing the discovery of strong spatial association rules. Combining the cloud-model-based method with the Apriori algorithm for mining association rules from a spatial database proves to be effective and flexible.
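The Apriori step mentioned above can be sketched as a level-wise search for frequent itemsets, followed by a confidence check on candidate rules. This is a minimal illustration of the textbook algorithm only, assuming the attribute values have already been generalized into discrete labels; the cloud-model generalization itself is not reproduced, and the transactions, labels and threshold are hypothetical.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain the whole itemset.
        return sum(itemset <= t for t in transactions) / n

    current = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while current:
        level = {c: support(c) for c in current}
        level = {c: s for c, s in level.items() if s >= min_support}
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        current = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def confidence(frequent, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent."""
    a = frozenset(antecedent)
    return frequent[a | frozenset(consequent)] / frequent[a]


# Hypothetical generalized spatial records, one set of labels per object.
records = [
    {"near_water", "wetland"},
    {"near_water", "wetland", "low_slope"},
    {"near_water", "low_slope"},
    {"wetland"},
]
frequent_itemsets = apriori(records, min_support=0.5)
rule_confidence = confidence(frequent_itemsets, {"near_water"}, {"wetland"})
```

A rule is usually called strong when both its support and its confidence exceed user-chosen thresholds; mining the same records at a coarser or finer level of generalization changes which rules pass.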
Abstract: Direct soil temperature (ST) measurement is time-consuming and costly; thus, the use of simple and cost-effective machine learning (ML) tools is helpful. In this study, ML approaches, including KStar, instance-based K-nearest learning (IBK), and locally weighted learning (LWL), coupled with the resampling algorithms bagging (BA) and dagging (DA) (BA-IBK, BA-KStar, BA-LWL, DA-IBK, DA-KStar, and DA-LWL), were developed and tested for multi-step ahead (3, 6, and 9 d ahead) ST forecasting. In addition, a linear regression (LR) model was used as a benchmark to evaluate the results. A dataset was established with daily ST time series at 5 and 50 cm soil depths in a farmland as the models' output and meteorological data as the models' input, including mean (T_(mean)), minimum (T_(min)), and maximum (T_(max)) air temperatures, evaporation (Eva), sunshine hours (SSH), and solar radiation (SR), which were collected at Isfahan Synoptic Station (Iran) for 13 years (1992–2005). Six different input combination scenarios were selected based on Pearson's correlation coefficients between inputs and outputs and fed into the models. We used 70% of the data to train the models, with the remaining 30% used for model evaluation via multiple visual and quantitative metrics. Our findings showed that T_(mean) was the most effective input variable for ST forecasting in most of the developed models, while in some cases combinations of variables, namely T_(mean) and T_(max), and T_(mean), T_(max), T_(min), Eva, and SSH, proved to be the best input combinations. Among the evaluated models, BA-KStar showed greater compatibility, while in most cases BA-IBK and BA-LWL provided more accurate results, depending on soil depth. For the 5 cm soil depth, BA-KStar had superior performance (i.e., Nash-Sutcliffe efficiency (NSE) = 0.90, 0.87, and 0.85 for 3, 6, and 9 d ahead forecasting, respectively); for the 50 cm soil depth, DA-KStar outperformed the other models (i.e., NSE = 0.88, 0.89, and 0.89 for 3, 6, and 9 d ahead forecasting, respectively). The results confirmed that all hybrid models had higher prediction capabilities than the LR model.
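For readers unfamiliar with the Nash-Sutcliffe efficiency reported above, the sketch below computes it with the conventional definition: one minus the ratio of the squared forecast error to the squared deviation of the observations from their mean. It is an illustration only, assuming that standard definition; the function name is hypothetical and no data from the study are reproduced.

```python
from statistics import fmean


def nash_sutcliffe_efficiency(observed, simulated):
    """NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    mean_observed = fmean(observed)
    # Squared error of the forecast against the observations.
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    # Squared deviation of the observations from their own mean.
    spread = sum((o - mean_observed) ** 2 for o in observed)
    if spread == 0:
        raise ValueError("NSE is undefined for a constant observed series")
    return 1.0 - residual / spread
```

A value of 1 indicates a perfect forecast, while 0 means the model is no better than always predicting the observed mean, which is why values of 0.85 to 0.90 indicate a close fit.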
Funding: Supported by the Key Program of the National Natural Science Foundation of China (Grant No. 50539010) and the Special Fund for Public Welfare Industry of the Ministry of Water Resources of China (Grant No. 200801019).
Abstract: Using association rules for data mining, the connections between testing indices and strong and weak association rules were determined, and new derivative rules were obtained by further reasoning. Association rules were used to analyze correlation and check consistency between indices. This study shows that the judgment obtained by weak association rules or non-association rules is more accurate and more credible than that obtained by strong association rules. When the testing grades of two indices in a weak association rule are inconsistent, the testing grades are more likely to be erroneous, and the mistakes are often caused by human factors. Clustering was used to analyze the reliability of a diagnosis, or to perform health diagnosis directly. Analysis showed that the clustering results depend on the indices selected: the more significant the selected indices, the more pronounced the characteristics of the clustering results and the more credible the analysis or diagnosis. The indices and diagnosis analysis function produced by this study provide a necessary theoretical foundation and new ideas for the development of hydraulic metal structure health diagnosis technology.