Abstract: Non-technical losses (NTL) of electric power are a serious problem for electric distribution companies. How well this problem is solved determines the cost, stability, reliability, and quality of the supplied electricity. The widespread use of advanced metering infrastructure (AMI) and Smart Grid allows all participants in the distribution grid to store and track electricity consumption. In this research, a machine learning model is developed that analyzes and predicts the probability of NTL for each consumer of the distribution grid based on daily electricity consumption readings. The model is an ensemble meta-algorithm (stacking) that combines random forest, LightGBM, and a homogeneous ensemble of artificial neural networks. Experiments on the test sample confirm that the proposed meta-algorithm is more accurate than the base classifiers. With a ROC-AUC of 0.88, the model can serve as a methodological basis for a decision support system whose purpose is to form a sample of suspected NTL sources. Such a sample will allow the top management of electric distribution companies to increase the efficiency of inspection raids by making them targeted and accurate, which should contribute to the fight against NTL and the sustainable development of the electric power industry.
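The abstract reports ROC-AUC and describes forming a sample of suspected NTL sources from per-consumer probabilities. A minimal sketch of both steps follows; the function names, consumer IDs, labels, and scores are hypothetical illustrations, not data or code from the paper.

```python
def roc_auc(labels, scores):
    """ROC-AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def suspect_sample(consumer_ids, scores, k):
    """Return the k consumers with the highest predicted NTL probability."""
    ranked = sorted(zip(consumer_ids, scores), key=lambda pair: pair[1], reverse=True)
    return [cid for cid, _ in ranked[:k]]


# Hypothetical toy data: 1 marks a confirmed NTL case, scores are model outputs.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.1]
auc = roc_auc(labels, scores)
suspects = suspect_sample(["c1", "c2", "c3", "c4", "c5"], scores, k=2)
```

The pairwise definition is quadratic in the number of samples, which is acceptable for a sketch; a rank-based formula would be preferable for large test sets.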
Abstract: Drought is the least understood natural disaster because of the complex relationships among its multiple contributory factors. Its beginning and end are hard to gauge, and a drought can last for months or even years. India has faced many droughts in the last few decades. Predicting future droughts is vital for framing drought management plans to sustain natural resources. Data-driven modelling for meteorological time series forecasting is becoming more powerful and flexible with computational intelligence techniques. Machine learning (ML) techniques have demonstrated success in drought prediction and are becoming popular for predicting the weather, especially the minimum temperature using backpropagation algorithms. The favoured ML techniques for weather forecasting include support vector machines (SVM), support vector regression, random forest, decision tree, logistic regression, Naive Bayes, linear regression, gradient boosting tree, k-nearest neighbours (KNN), the adaptive neuro-fuzzy inference system, feed-forward neural networks, Markov chains, Bayesian networks, hidden Markov models, autoregressive moving averages, evolutionary algorithms, deep learning, and many more. This paper presents a recent review of the literature using ML in drought prediction, covering the drought indices, datasets, and performance metrics.
Funding: This work was supported by the "Zhujiang Talent Program" High Talent Project of Guangdong Province (Grant No. 2017GC010614) and the National Natural Science Foundation of China (Grant No. 22078372).
Abstract: Modeling and optimization are crucial to smart chemical process operations. However, a large number of nonlinearities must be considered in a typical chemical process owing to complex unit operations, chemical reactions, and separations. The resulting computational complexity makes it very challenging to apply mechanistic models to industrial-scale problems. This paper therefore presents an efficient hybrid framework that integrates machine learning and particle swarm optimization to overcome these difficulties. An industrial propane dehydrogenation process was used as a case study to demonstrate the validity and efficiency of the method. First, a data set was generated from a mechanistic process simulation validated by industrial data, which provides sufficient and reasonable samples for model training and testing. Second, four well-known machine learning methods, namely K-nearest neighbors, decision tree, support vector machine, and artificial neural network, were compared and used to obtain prediction models of the process operation. All of these methods achieved highly accurate models by tuning model parameters on the basis of high-coverage data and proper features. Finally, optimal process operations were obtained by using the particle swarm optimization approach.
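The final step, searching a trained prediction model for optimal operating conditions with particle swarm optimization, can be sketched as below. This is a plain global-best PSO under stated assumptions: `surrogate_cost` is a hypothetical stand-in for a trained machine learning model, and the inertia and acceleration coefficients are common textbook values, not settings from the paper.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=200,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` inside box `bounds` with global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                    # best position seen by each particle
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]   # best position seen by the swarm

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Keep the particle inside the feasible operating window.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def surrogate_cost(x):
    """Hypothetical surrogate; a trained model's predict() would go here."""
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2


best_x, best_cost = pso_minimize(surrogate_cost, [(-5.0, 5.0), (-5.0, 5.0)])
```

Because the optimizer only calls `objective`, any of the four model types compared in the abstract could be plugged in without changing the search loop.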
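Abstract: To address the problems that the least-squares twin support vector machine is strongly affected by error values, is sensitive to noisy samples, and makes the choice of kernel function and kernel parameters difficult, a Critic feature-weighted multi-kernel least-squares twin support vector machine (Multi-Kernel Least-Squares Twin Support Vector Machine based on Critic weighted, CMKLSTSVM) classification method is proposed. First, CMKLSTSVM uses the Critic method to assign feature weights, which reflect differences in importance among features and reduce the influence of redundant features and noisy samples. Second, a new method for determining the multi-kernel weight coefficients is constructed following a hybrid multiple kernel learning strategy. The method judges the similarity of kernel functions by the hybrid kernel alignment value between each base kernel and the ideal kernel and determines the weight coefficients accordingly, so that multiple kernel functions can be combined reasonably and the mapping ability of the different kernels is exploited to the greatest extent. Finally, the feature weights and kernel weights are unified by weighted summation to construct the multi-kernel structure, which makes the data representation more comprehensive and the model more flexible. Comparative experiments on UCI datasets show that the classification accuracy of CMKLSTSVM is better than that of single-kernel SVM (support vector machine) algorithms, and comparative experiments on hyperspectral images demonstrate the effectiveness of CMKLSTSVM on real classification problems that contain noise.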
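The idea of weighting base kernels by their alignment with the ideal kernel can be sketched with the standard kernel-target alignment. This is an assumption-laden illustration: the abstract does not give the hybrid alignment formula, so the plain Frobenius-inner-product alignment is swapped in, and all function names and data are hypothetical.

```python
import math


def linear_kernel(a, b):
    return sum(x * y for x, y in zip(a, b))


def rbf_kernel(a, b, gamma=1.0):
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))


def gram(kernel, X):
    """Gram matrix of `kernel` over the samples in X."""
    return [[kernel(a, b) for b in X] for a in X]


def frobenius_inner(A, B):
    return sum(a * b for row_a, row_b in zip(A, B) for a, b in zip(row_a, row_b))


def alignment(K, y):
    """Alignment between Gram matrix K and the ideal kernel y y^T (labels in {-1, +1})."""
    ideal = [[yi * yj for yj in y] for yi in y]
    norm = math.sqrt(frobenius_inner(K, K) * frobenius_inner(ideal, ideal))
    return frobenius_inner(K, ideal) / norm


def alignment_weights(kernels, X, y):
    """Normalize non-negative alignment scores into kernel weights that sum to 1."""
    scores = [max(alignment(gram(k, X), y), 0.0) for k in kernels]
    total = sum(scores)
    if total == 0.0:
        return [1.0 / len(kernels)] * len(kernels)  # fall back to uniform weights
    return [s / total for s in scores]


# Hypothetical one-dimensional toy data with two classes.
X = [[0.0], [0.1], [1.0], [1.1]]
y = [-1, -1, 1, 1]
weights = alignment_weights([linear_kernel, rbf_kernel], X, y)
```

A combined kernel would then be the weighted sum of the base Gram matrices, which remains positive semidefinite because the weights are non-negative.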
Funding: the National Natural Science Foundation of China under Grant Nos. 70601029 and 70221001; the Knowledge Innovation Program of the Chinese Academy of Sciences under Grant Nos. 3547600, 3046540, and 3047540; and the Strategy Research Grant of City University of Hong Kong under Grant No. 7001806.
Abstract: Because of the complexity of the economic system and the interactive effects between many kinds of economic variables and foreign trade, it is not easy to predict foreign trade volume. The difficulty is usually attributed to the limitations of many conventional forecasting models. To improve prediction performance, the study proposes a novel kernel-based ensemble learning approach that hybridizes econometric models and artificial intelligence (AI) models to predict China's foreign trade volume. In the proposed approach, an important econometric model, the co-integration-based error correction vector auto-regression (EC-VAR) model, is first used to capture the impacts of the economic variables on Chinese foreign trade from a multivariate linear analysis perspective. Then an artificial neural network (ANN) based EC-VAR model is used to capture the nonlinear effects of the economic variables on foreign trade. Subsequently, to incorporate the effects of irregular events on foreign trade, text mining and experts' judgmental adjustments are integrated into the nonlinear ANN-based EC-VAR model. Finally, the economic variables, the outputs of the linear and nonlinear EC-VAR models, and the judgmental adjustment model are used as input variables of a typical kernel-based support vector regression (SVR) for ensemble prediction. For illustration, the proposed methodology is applied to the problem of predicting China's foreign trade volume. Experimental results reveal that the hybrid econometric-AI ensemble learning approach can significantly improve prediction performance over the other linear and nonlinear models listed in the study.
Abstract: Renewable energy is essential for planet sustainability. Renewable energy output forecasting has a significant impact on decisions related to operating and managing power systems. Accurate prediction of renewable energy output is vital to ensure grid reliability and permanency and to reduce the risk and cost of the energy market and systems. Deep learning's recent success in many applications has attracted researchers to this field, and its promising potential is manifested in the richness of the proposed methods and the increasing number of publications. To facilitate further research and development in this area, this paper reviews deep learning-based solar and wind energy forecasting research published during the last five years, discussing extensively the data and datasets used in the reviewed works, the data pre-processing methods, deterministic and probabilistic methods, and evaluation and comparison methods. The core characteristics of all the reviewed works are summarised in tabular form to enable methodological comparisons. The current challenges in the field and future research directions are given. The trends show that hybrid forecasting models are the most used in this field, followed by Recurrent Neural Network models, including Long Short-Term Memory and Gated Recurrent Unit, and in third place Convolutional Neural Networks. We also find that probabilistic and multistep-ahead forecasting methods are gaining more attention. Moreover, we devise a broad taxonomy of the research using the key insights gained from this extensive review; we believe the taxonomy will be vital in understanding the cutting edge and accelerating innovation in this field.
Abstract: Clustering-based undersampling (CUS) and the distance-based near-miss method are widely used in current imbalanced learning algorithms, but both have certain drawbacks. In particular, CUS does not consider the influence of the distance factor on the majority instances, and the near-miss method omits the inter-class(es) within the majority samples. To overcome these drawbacks, this study proposes an undersampling method that combines distance measurement and majority class clustering. Resampling methods are used to develop an ensemble-based imbalanced learning algorithm called the clustering and distance-based imbalance learning model (CDEILM). This algorithm combines distance-based undersampling, feature selection, and ensemble learning. In addition, a cluster size-based resampling (CSBR) method is proposed for preserving the original distribution of the majority class, and a hybrid imbalanced learning framework is constructed by fusing various types of resampling methods. The combination of CDEILM and CSBR can be considered a specific case of this hybrid framework. The experimental results show that the CDEILM and CSBR methods can achieve better performance than the benchmark methods, and that the hybrid model provides the best results under most circumstances. Therefore, the proposed model can be used as an alternative imbalanced learning method under specific circumstances, e.g., for providing a solution to credit evaluation problems in financial applications.
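One way to read "preserving the original distribution of the majority class" is to keep samples from each majority cluster in proportion to the cluster's size. The sketch below illustrates that reading only; it is not the paper's CSBR procedure, the cluster assignments are assumed to come from a prior clustering step, and the function names are hypothetical.

```python
import random


def proportional_quota(cluster_sizes, n_keep):
    """Split n_keep across clusters in proportion to size (largest-remainder rounding)."""
    total = sum(cluster_sizes.values())
    if not 0 <= n_keep <= total:
        raise ValueError("n_keep must be between 0 and the number of samples")
    raw = {c: n_keep * s / total for c, s in cluster_sizes.items()}
    quota = {c: int(r) for c, r in raw.items()}
    leftover = n_keep - sum(quota.values())
    # Hand the remaining slots to the clusters with the largest fractional parts.
    by_remainder = sorted(raw, key=lambda c: raw[c] - quota[c], reverse=True)
    for c in by_remainder[:leftover]:
        quota[c] += 1
    return quota


def undersample_by_cluster(samples, cluster_ids, n_keep, seed=0):
    """Randomly keep n_keep majority samples while preserving cluster proportions."""
    rng = random.Random(seed)
    groups = {}
    for sample, cid in zip(samples, cluster_ids):
        groups.setdefault(cid, []).append(sample)
    quota = proportional_quota({c: len(g) for c, g in groups.items()}, n_keep)
    kept = []
    for cid, members in groups.items():
        kept.extend(rng.sample(members, quota[cid]))
    return kept


# Hypothetical majority class of 10 samples already assigned to clusters a, b, c.
kept = undersample_by_cluster(list(range(10)), ["a"] * 5 + ["b"] * 3 + ["c"] * 2, n_keep=4)
```

A distance-based variant would replace the random draw inside each cluster with a selection by distance to the minority samples.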
Funding: Supported by the National Natural Science Foundation of China (No. 42061065) and the Third Xinjiang Comprehensive Scientific Expedition, China (No. 2022xjkk03010102).
Abstract: Root zone soil moisture (RZSM) plays a critical role in land-atmosphere hydrological cycles and serves as the primary water source for vegetation growth. However, the correlations between RZSM and its associated variables, including surface soil moisture (SSM), often exhibit nonlinearities that are challenging to identify and quantify using conventional statistical techniques. Therefore, this study presents a hybrid convolutional neural network (CNN)-long short-term memory neural network (LSTM)-attention (CLA) model for predicting RZSM. Owing to the scarcity of soil moisture (SM) observation data, the physical model Hydrus-1D was employed to simulate a comprehensive dataset of spatial-temporal SM. Meteorological data and moderate resolution imaging spectroradiometer vegetation characterization parameters were used as predictor variables for the training and validation of the CLA model. The results of the CLA model for SM prediction in the root zone were significantly enhanced compared with those of the traditional LSTM and CNN-LSTM models. This was particularly notable at the depth of 80–100 cm, where the fitness (R^2) reached nearly 0.9298. Moreover, the root mean square error of the CLA model was reduced by 49% and 57% compared with those of the LSTM and CNN-LSTM models, respectively. This study demonstrates that the integration of physical modeling and deep learning methods provides a more comprehensive and accurate understanding of spatial-temporal SM variations in the root zone.
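The attention stage of a CNN-LSTM-attention model typically pools the sequence of hidden states into one context vector using softmax weights. The sketch below shows dot-product attention pooling only; the abstract does not specify the attention variant, so the scoring function, the `query` vector, and the toy hidden states are assumptions.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of scores."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Weight each time step by softmax(h_t . query) and return (weights, context)."""
    scores = [sum(h * q for h, q in zip(state, query)) for state in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    context = [sum(w * state[d] for w, state in zip(weights, hidden_states))
               for d in range(dim)]
    return weights, context


# Hypothetical hidden states for three time steps of a two-unit recurrent layer.
states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, context = attention_pool(states, query=[0.5, 2.0])
```

In a trained model the query (or an equivalent scoring layer) is learned, so the weights indicate which time steps the prediction relies on most.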
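Abstract: A hybrid algorithm that combines an improved genetic algorithm (GA) with the error back-propagation (BP) algorithm is used to train artificial neural networks. The hybrid algorithm effectively overcomes the slow convergence of network weights and the tendency to fall into local minima of the conventional BP algorithm, as well as the slow speed of training a neural network with GA alone. Its application to power transformer fault diagnosis is simulated. The simulation results show that the algorithm converges quickly and computes accurately, and the fault diagnosis results confirm its effectiveness for power transformer fault diagnosis.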
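The division of labor in such a hybrid, a genetic algorithm for coarse global search followed by gradient-based refinement, can be sketched on a toy problem. This is an illustration under stated assumptions: a plain GA is used rather than the improved variant, a single linear neuron stands in for the network (so back-propagation reduces to ordinary gradient descent), and the data and hyperparameters are hypothetical.

```python
import random

# Hypothetical training data generated from y = 2x + 1.
XS = [0.0, 1.0, 2.0, 3.0]
YS = [2.0 * x + 1.0 for x in XS]


def loss(params):
    """Mean squared error of the single linear neuron y = w*x + b."""
    w, b = params
    return sum((w * x + b - y) ** 2 for x, y in zip(XS, YS)) / len(XS)


def gradient(params):
    """Analytic gradient of the loss with respect to (w, b)."""
    w, b = params
    n = len(XS)
    dw = sum(2.0 * (w * x + b - y) * x for x, y in zip(XS, YS)) / n
    db = sum(2.0 * (w * x + b - y) for x, y in zip(XS, YS)) / n
    return dw, db


def ga_search(pop_size=30, generations=20, sigma=0.5, seed=0):
    """Stage 1: elitist GA with uniform crossover and Gaussian mutation."""
    rng = random.Random(seed)
    pop = [(rng.uniform(-5, 5), rng.uniform(-5, 5)) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=loss)
        parents = pop[:pop_size // 2]            # keep the fitter half unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            child = tuple(rng.choice(genes) + rng.gauss(0.0, sigma)
                          for genes in zip(p1, p2))
            children.append(child)
        pop = parents + children
    return min(pop, key=loss)


def gradient_refine(params, lr=0.1, steps=500):
    """Stage 2: gradient descent starting from the GA solution."""
    w, b = params
    for _ in range(steps):
        dw, db = gradient((w, b))
        w, b = w - lr * dw, b - lr * db
    return w, b


coarse = ga_search()
fine = gradient_refine(coarse)
```

On a non-convex network loss the GA stage matters more than it does here, since it supplies a starting point away from poor local minima before the gradient stage takes over.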
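Abstract: An unsupervised distance learning algorithm based on the DI-FCM (double indices fuzzy C-means) framework is proposed: HDDI-FCM (double indices fuzzy C-means with hybrid distance), a double indices fuzzy C-means algorithm based on hybrid distance learning. The unknown distance metric of a dataset is expressed as a linear combination of several existing distances, and HDDI-FCM is then run so that distance learning takes place while the dataset is being clustered effectively. To guarantee convergence of the iterative algorithm, Steffensen's iteration method is introduced to improve the iterative formula for computing the cluster centers. The choice of the algorithm's parameters is discussed. Experimental results on UCI (University of California, Irvine) datasets show that the algorithm is effective.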
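Expressing the unknown metric as a linear combination of existing distances, then using it inside a fuzzy membership update, can be sketched as follows. This shows only the standard fuzzy C-means membership formula with a fixed-weight hybrid distance; the double-indices formulation, the weight learning, and the Steffensen-based center update are not reproduced, and the chosen base distances and weights are assumptions.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def hybrid_distance(a, b, weights, metrics):
    """Linear combination of base distances with non-negative weights."""
    return sum(w * metric(a, b) for w, metric in zip(weights, metrics))


def memberships(point, centers, dist, m=2.0, eps=1e-12):
    """Standard FCM membership of `point` in each cluster for fuzzifier m > 1."""
    d = [max(dist(point, c), eps) for c in centers]  # eps avoids division by zero
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((di / dk) ** power for dk in d) for di in d]


def dist(a, b):
    """Hypothetical hybrid metric: equal parts Euclidean and Manhattan."""
    return hybrid_distance(a, b, [0.5, 0.5], [euclidean, manhattan])


u = memberships((0.0, 0.0), [(0.0, 1.0), (3.0, 4.0)], dist)
```

In a full distance-learning loop the combination weights would be updated alongside the memberships and centers rather than fixed in advance.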