Funding: Sponsored by the National Natural Science Foundation of China (61290323, 61333007, 61473064); the Fundamental Research Funds for the Central Universities of China (N130108001); the National High Technology Research and Development Program of China (2015AA043802); and the General Project on Scientific Research of the Education Department of Liaoning Province of China (L20150186).
Abstract: Molten iron temperature and the Si, P, and S contents are the most essential molten iron quality (MIQ) indices in blast furnace (BF) ironmaking and require strict monitoring throughout production. However, these MIQ parameters are difficult to measure directly online, and off-line analysis through laboratory sampling involves a large time delay. To address this practical challenge, a data-driven modeling method is presented for MIQ prediction using improved multivariable incremental random vector functional-link networks (M-I-RVFLNs). Compared with conventional random vector functional-link networks (RVFLNs) and online sequential RVFLNs, the M-I-RVFLNs solve the problem of deciding the optimal number of hidden nodes and overcome overfitting. Moreover, the proposed M-I-RVFLNs model shows potential for multivariable prediction of the MIQ and improves the terminal condition for the multiple-input multiple-output (MIMO) dynamic system, which makes it suitable for the BF ironmaking process in practice. Finally, industrial experiments and comparative studies were conducted on BF No. 2 of Liuzhou Iron and Steel Group Co. Ltd. of China using the proposed method, and the results demonstrate that the established model produces better estimation accuracy than other MIQ modeling methods.
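The idea of growing an RVFL-type network node by node, instead of fixing the number of hidden nodes in advance, can be sketched as follows. This is a minimal illustration under stated assumptions, not the M-I-RVFLNs update of the paper: the greedy residual-projection rule, the sigmoid activation, the stopping tolerance `tol`, and all function and variable names are hypothetical choices made for this example.

```python
import math
import random


def incremental_rvfl(X, Y, max_nodes=50, tol=1e-3, seed=0):
    """Grow hidden nodes one at a time until the residual is small.

    X: list of input rows; Y: list of target rows (several outputs allowed).
    Returns the added hidden nodes and the residual norm after each step.
    Assumption: greedy residual projection, not the paper's update rule.
    """
    rng = random.Random(seed)
    n, d, m = len(X), len(X[0]), len(Y[0])
    # The residual starts as the targets (the empty network outputs zero).
    E = [row[:] for row in Y]

    def project_out(h):
        """Fit one output weight per channel for feature h, update residual."""
        hh = sum(v * v for v in h)
        if hh == 0.0:
            return [0.0] * m
        beta = [sum(E[i][k] * h[i] for i in range(n)) / hh for k in range(m)]
        for i in range(n):
            for k in range(m):
                E[i][k] -= beta[k] * h[i]
        return beta

    def residual_norm():
        return math.sqrt(sum(e * e for row in E for e in row))

    # Direct input-to-output links, the feature that distinguishes RVFL.
    direct = [project_out([x[j] for x in X]) for j in range(d)]
    nodes, history = [], [residual_norm()]
    while len(nodes) < max_nodes and history[-1] >= tol:
        # Input weights and bias are drawn at random and never retrained.
        w = [rng.uniform(-1.0, 1.0) for _ in range(d)]
        b = rng.uniform(-1.0, 1.0)
        h = [1.0 / (1.0 + math.exp(-(sum(wi * xi for wi, xi in zip(w, x)) + b)))
             for x in X]
        nodes.append((w, b, project_out(h)))
        history.append(residual_norm())
    return direct, nodes, history


# Toy two-output regression problem (hypothetical data, not furnace data).
inputs = [[i / 10.0] for i in range(20)]
targets = [[math.sin(x[0]), math.cos(x[0])] for x in inputs]
direct, nodes, history = incremental_rvfl(inputs, targets, max_nodes=30)
```

Because each step is an orthogonal projection of the current residual onto one new feature, the residual norm cannot increase as nodes are added, which is what lets the network size be chosen by a stopping rule rather than by trial and error.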
Abstract: The proliferation of information technology has led to an unprecedented surge in data generation, with the data dispersed across a multitude of mobile devices. Given this situation, and because training deep learning models requires substantial computing power, distributed algorithms that support multi-party joint modeling have attracted wide attention. Distributed training relieves the heavy computing and communication load that a centralized model places on a single machine. However, most distributed algorithms currently work in a master-slave mode, often with a central server for coordination, which can cause communication pressure, data leakage, privacy violations, and other issues. To solve these problems, a decentralized, fully distributed algorithm based on a deep random weight neural network is proposed. The algorithm decomposes the original objective function into several sub-problems under consistency constraints, combines decentralized average consensus (DAC) with the alternating direction method of multipliers (ADMM), and achieves joint modeling and training through local computation and communication at each node. Finally, we compare the proposed decentralized algorithm with several centralized deep neural networks with random weights, and the experimental results demonstrate the effectiveness of the proposed algorithm.
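The decentralized average consensus (DAC) building block mentioned in the abstract can be illustrated in isolation. The sketch below covers only the consensus step, not the ADMM decomposition or the authors' full algorithm; the ring topology, the step size, the node values, and all names are assumptions made for this example.

```python
def average_consensus(values, neighbors, step, iterations):
    """Let each node repeatedly move toward the values of its neighbors.

    values: initial local value held by each node.
    neighbors: adjacency list of an undirected, connected graph.
    step: mixing step; a value below 1 / (max node degree) is sufficient
          for convergence to the global average.
    """
    x = list(values)
    for _ in range(iterations):
        # Synchronous update that uses only values held by direct neighbors.
        x = [xi + step * sum(x[j] - xi for j in neighbors[i])
             for i, xi in enumerate(x)]
    return x


# Five nodes on a ring; no node and no central server sees all the data.
ring = [[(i - 1) % 5, (i + 1) % 5] for i in range(5)]
local_values = [1.0, 4.0, 2.0, 8.0, 5.0]
result = average_consensus(local_values, ring, step=1 / 3, iterations=200)
```

Every node ends up holding approximately the global mean of the initial values while exchanging data only with its two neighbors, which is the property that removes the need for a coordinating server.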
Abstract: This paper reviews sixty-eight research articles published between 2000 and 2017, as well as textbooks, that employed four classification algorithms as their main statistical tools: K-Nearest-Neighbor (KNN), Support Vector Machines (SVM), Random Forest (RF), and Neural Network (NN). The aim was to examine and compare these nonparametric classification methods on the following attributes: robustness to training data, sensitivity to changes, data fitting, stability, ability to handle large data sizes, sensitivity to noise, time invested in parameter tuning, and accuracy. The performance, strengths, and shortcomings of each algorithm were examined, and a conclusion was reached on which one performs best. The literature reviewed showed that RF is too sensitive to small changes in the training dataset, is occasionally unstable, and tends to overfit. KNN is easy to implement and understand but becomes significantly slower as the size of the data grows, and the ideal value of K for the KNN classifier is difficult to set. SVM and RF are insensitive to noise or overtraining, which shows their ability to deal with unbalanced data. Larger input datasets lengthen classification times for NN and KNN more than for SVM and RF. Among these nonparametric classification methods, NN has the potential to become a more widely used classification algorithm, but because of its time-consuming parameter tuning, its high computational complexity, the numerous NN architectures to choose from, and the large number of training algorithms, most researchers recommend SVM and RF as easier and widely used methods that repeatedly achieve high accuracy and are often faster to implement.
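The remark that KNN slows down as the data grows follows directly from how a brute-force KNN prediction works, as the minimal sketch below shows. The toy data, the function name, and the default `k=3` are hypothetical and not taken from the reviewed articles.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # One distance per stored sample: prediction cost grows with the
    # size of the training set, because nothing is learned ahead of time.
    ranked = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Two well-separated toy clusters (hypothetical data).
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["a", "a", "a", "b", "b", "b"]
near_origin = knn_predict(points, labels, (0.5, 0.5))
near_far_cluster = knn_predict(points, labels, (5.5, 5.5))
```

The choice of `k` is left to the caller here, which mirrors the tuning difficulty the abstract points out: there is no training step that could select it automatically.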
Funding: Supported by the National Key R&D Program of China (No. 2022YFC2803903), the Key R&D Program of Zhejiang Province (No. 2021C03013), and the Zhejiang Provincial Natural Science Foundation of China (No. LZ20F020003).
Abstract: The ocean plays an important role in maintaining the equilibrium of Earth's ecology and in providing humans access to a wealth of resources. To obtain a high-precision underwater image classification model, we propose a classification model that combines an EfficientnetB0 neural network and a two-hidden-layer random vector functional link network (EfficientnetB0-TRVFL). The features of underwater images were extracted using the EfficientnetB0 neural network pretrained on ImageNet, and a new fully connected layer was trained on the underwater image dataset using transfer learning. Transfer learning ensures the initial performance of the network and helps in developing a high-precision classification model. Subsequently, a TRVFL was proposed to improve the classification performance of the model. The two-hidden-layer network construction exhibited high accuracy when the same number of hidden-layer nodes was used. The parameters of the second hidden layer were obtained using a novel calculation method, which reduced the output error and thereby mitigated the performance instability caused by the random generation of RVFL parameters. Finally, the TRVFL classifier was used to classify the features and obtain the classification results. The proposed EfficientnetB0-TRVFL classification model achieved 87.28%, 74.06%, and 99.59% accuracy on the MLC2008, MLC2009, and Fish-gres datasets, respectively. The model was compared with the best convolutional neural networks and with existing methods through box plots and Kolmogorov-Smirnov tests, respectively. The increases imply improved systematization properties in underwater image classification tasks. The image classification model offers important performance advantages and better stability compared with existing methods.
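For readers unfamiliar with the Kolmogorov-Smirnov comparison mentioned in the abstract, the two-sample KS statistic is the largest gap between the empirical distribution functions of two samples, for example the accuracies of two classifiers over repeated runs. The sketch below computes only that statistic; it is not the authors' evaluation code, and the sample values are hypothetical placeholders rather than results from the paper.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Largest absolute gap between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    # The gap can only change at observed values, so checking those suffices.
    return max(
        abs(bisect_right(a, v) / len(a) - bisect_right(b, v) / len(b))
        for v in a + b
    )


# Hypothetical per-run accuracies for two models (placeholders only).
runs_model_1 = [0.86, 0.87, 0.88, 0.87, 0.89]
runs_model_2 = [0.80, 0.82, 0.81, 0.83, 0.82]
gap = ks_statistic(runs_model_1, runs_model_2)
```

A statistic near 1 means the two sets of runs barely overlap, while a value near 0 means they are distributed almost identically; turning the statistic into a p-value requires the KS distribution, which is omitted here.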
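Abstract: To address the many fault types in oil and gas pipelines and the inability to retain field data effectively over the long term, a pipeline fault classification method based on edge computing and an improved random vector functional-link (RVFL) network is proposed. The method extends the functionality of the supervisory control and data acquisition (SCADA) system so that it can store and access large volumes of data. First, when a pipeline fault occurs, a fuzzy clustering algorithm based on a fuzzy likelihood function clusters the pipeline pressure values recorded over a period before the fault. Then, density features of the pressure values are extracted and used as enhancement nodes of the RVFL network, and the improved RVFL network classifies the fault. With the improved RVFL network deployed in an edge computing module, six fault types were classified with an accuracy of up to 96.7%.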
Abstract: This paper proposes a hybrid Bayesian Network (BN) method for short-term forecasting of crude oil prices. The method is a hybrid, based on both the classification of influencing factors and the regression of the out-of-sample values. For performance comparison, several other hybrid methods were also devised using Markov Chain Monte Carlo (MCMC), Random Forest (RF), Support Vector Machine (SVM), neural networks (NNET), and generalized autoregressive conditional heteroskedasticity (GARCH). The hybrid methodology relies primarily on constructing the crude oil price forecast as the sum of its Intrinsic Mode Functions (IMFs) and its residue, extracted by an Empirical Mode Decomposition (EMD) of the original crude price signal. The Volatility Index (VIX) and the Implied Oil Volatility Index (OVX) were considered among the influencing parameters of the crude price forecast. The final set of influencing parameters was selected as the whole set of significant contributors detected by the Bayesian Network, Quantile Regression with Lasso penalty (QRL), Bayesian Lasso (BLasso), and Bayesian Ridge Regression (BRR) methods. The performance of the proposed hybrid-BN method is reported for three crude price benchmarks: West Texas Intermediate, Brent Crude, and the OPEC Reference Basket.
Funding: Taif University Researchers Supporting Project Number (TURSP-2020/73), Taif University, Taif, Saudi Arabia.
Abstract: Heart failure is now widespread throughout the world. Heart disease affects approximately 48% of the population, and the disease is both expensive and difficult to cure. This paper presents machine learning models to predict heart failure. The fundamental concept is to compare the correctness of various Machine Learning (ML) algorithms and boosting algorithms in order to improve the models' prediction accuracy. Supervised algorithms such as K-Nearest Neighbor (KNN), Support Vector Machine (SVM), Decision Trees (DT), Random Forest (RF), and Logistic Regression (LR) are considered to achieve the best results. Boosting algorithms such as Extreme Gradient Boosting (XGBoost) and CatBoost are also used to improve the prediction using Artificial Neural Networks (ANN). This research also focuses on data visualization to identify patterns, trends, and outliers in a massive data set. Python and scikit-learn are used for ML. TensorFlow and Keras, along with Python, are used for ANN model training. The DT and RF algorithms achieved the highest accuracy among the classifiers, 95%. KNN obtained the second-highest accuracy, 93.33%. XGBoost had a satisfactory accuracy of 91.67%; SVM, CatBoost, and ANN had an accuracy of 90%; and LR had an accuracy of 88.33%.
Funding: Funded by the Natural Science Foundation of China (Grant Nos. 42377164 and 41972280) and the Badong National Observation and Research Station of Geohazards (Grant No. BNORSG-202305).
Abstract: Landslide susceptibility prediction (LSP) is significantly affected by the uncertainty in selecting landslide-related conditioning factors. However, most of the literature only performs comparative studies on a particular conditioning factor selection method rather than studying this uncertainty systematically. To this end, this study systematically explores how various commonly used conditioning factor selection methods influence LSP and, on this basis, proposes a universally applicable principle for the optimal selection of conditioning factors. An'yuan County in southern China is taken as an example, with 431 landslides and 29 types of conditioning factors. Five commonly used factor selection methods, namely correlation analysis (CA), linear regression (LR), principal component analysis (PCA), rough set (RS), and artificial neural network (ANN), are applied to select the optimal factor combinations from the original 29 conditioning factors. The factor selection results are then used as inputs to four types of common machine learning models to construct 20 types of combined models, such as CA-multilayer perceptron and CA-random forest. Additionally, multifactor-based multilayer perceptron and random forest models, which select conditioning factors according to the proposed principle of "accurate data, rich types, clear significance, feasible operation and avoiding duplication", are constructed for comparison. Finally, the LSP uncertainties are evaluated by the accuracy, the susceptibility index distribution, etc. The results show that: (1) multifactor-based models generally have higher LSP performance and lower uncertainties than factor-selection-based models; (2) the choice of machine learning model influences LSP accuracy more than the choice of factor selection method. In conclusion, the commonly used conditioning factor selection methods above are not ideal for improving LSP performance and may complicate the LSP process. In contrast, a satisfactory combination of conditioning factors can be constructed according to the proposed principle.