To address the global issue of climate change and create focused mitigation plans, accurate CO₂ emissions forecasting is essential. Using CO₂ emissions data from 1990 to 2023, this study assesses the predictive performance of five models: Random Forest (RF), XGBoost, Support Vector Regression (SVR), Long Short-Term Memory networks (LSTM), and ARIMA. To evaluate the models thoroughly, Mean Absolute Error (MAE), Root Mean Square Error (RMSE), and Mean Absolute Percentage Error (MAPE) are used. To guarantee dependable model implementation, preprocessing procedures such as feature engineering and stationarity tests are carried out. Machine learning models outperform ARIMA in identifying complex patterns and long-term associations, but ARIMA does better with data that exhibits strong linear trends. These results show how well each model fits various forecasting scenarios, which helps develop data-driven carbon reduction programs. Predictive modeling should be incorporated into sustainable climate policy to encourage the adoption of low-carbon technologies and proactive decision-making. Achieving long-term environmental sustainability requires strengthening carbon trading systems, encouraging clean energy investments, and enacting stronger emission laws. In line with international climate goals, suggestions for lowering CO₂ emissions include switching to renewable energy, increasing energy efficiency, and putting afforestation initiatives into action.
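The three error measures named in the abstract above can be sketched as follows. This is a minimal illustration using the standard textbook definitions, not the study's own code; the function names and the sample values are hypothetical and are not taken from the emissions data.

```python
import math

def mae(actual, predicted):
    """Mean Absolute Error: average magnitude of the forecast errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root Mean Square Error: penalizes large errors more than MAE does."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )

def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent (undefined if an actual value is 0)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical values for illustration only.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 380.0]

print(mae(actual, predicted), rmse(actual, predicted), mape(actual, predicted))
```

MAPE is scale-free, which makes it convenient for comparing models across series of different magnitude, while MAE and RMSE stay in the units of the data.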
Accurate forecasting of oil production is essential for optimizing resource management and minimizing operational risks in the energy sector. Traditional time-series forecasting techniques, despite their widespread application, often encounter difficulties in handling the complexities of oil production data, which is characterized by non-linear patterns, skewed distributions, and the presence of outliers. To overcome these limitations, deep learning methods have emerged as more robust alternatives. However, while deep neural networks offer improved accuracy, they demand substantial amounts of data for effective training. Conversely, shallow networks with fewer layers lack the capacity to model complex data distributions adequately. To address these challenges, this study introduces a novel hybrid model called Transfer LSTM to GRU (TLTG), which combines the strengths of deep and shallow networks using transfer learning. The TLTG model integrates Long Short-Term Memory (LSTM) networks and Gated Recurrent Units (GRU) to enhance predictive accuracy while maintaining computational efficiency. Gaussian transformation is applied to the input data to reduce outliers and skewness, creating a more normal-like distribution. The proposed approach is validated on datasets from various wells in the Tahe oil field, China. Experimental results highlight the superior performance of the TLTG model, achieving 100% accuracy and faster prediction times (200 s) compared to eight other approaches, demonstrating its effectiveness and efficiency.
Deep learning (DL) has revolutionized time series forecasting (TSF), surpassing traditional statistical methods (e.g., ARIMA) and machine learning techniques in modeling complex nonlinear dynamics and long-term dependencies prevalent in real-world temporal data. This comprehensive survey reviews state-of-the-art DL architectures for TSF, focusing on four core paradigms: (1) Convolutional Neural Networks (CNNs), adept at extracting localized temporal features; (2) Recurrent Neural Networks (RNNs) and their advanced variants (LSTM, GRU), designed for sequential dependency modeling; (3) Graph Neural Networks (GNNs), specialized for forecasting structured relational data with spatial-temporal dependencies; and (4) Transformer-based models, leveraging self-attention mechanisms to capture global temporal patterns efficiently. We provide a rigorous analysis of the theoretical underpinnings, recent algorithmic advancements (e.g., TCNs, attention mechanisms, hybrid architectures), and practical applications of each framework, supported by extensive benchmark datasets (e.g., ETT, traffic flow, financial indicators) and standardized evaluation metrics (MAE, MSE, RMSE). Critical challenges, including handling irregular sampling intervals, integrating domain knowledge for robustness, and managing computational complexity, are thoroughly discussed. Emerging research directions highlighted include diffusion models for uncertainty quantification, hybrid pipelines combining classical statistical and DL techniques for enhanced interpretability, quantile regression with Transformers for risk-aware forecasting, and optimizations for real-time deployment. This work serves as an essential reference, consolidating methodological innovations, empirical resources, and future trends to bridge the gap between theoretical research and practical implementation needs for researchers and practitioners in the field.
In this paper, a new multidimensional time series forecasting scheme based on the empirical orthogonal function (EOF) stepwise iteration process is introduced. The scheme is tested in a series of forecast experiments on Nino3 SST anomalies and the Tahiti-Darwin SO index. The results show that the scheme is feasible and that ENSO is predictable.
Time series forecasting has become an important aspect of data analysis and has many real-world applications. However, undesirable missing values are often encountered, which may adversely affect many forecasting tasks. In this study, we evaluate and compare the effects of imputation methods for estimating missing values in a time series. Our approach does not use a simulation to generate pseudo-missing data; instead, it performs imputation on actual missing data and measures the performance of the forecasting model built from the imputed data. In each experiment, several time series forecasting models are trained on different training datasets, each prepared with one of the imputation methods. The imputation methods are then evaluated by comparing the accuracy of the resulting forecasting models. The results from a total of four experimental cases show that the k-nearest neighbor technique is the most effective in reconstructing missing data and contributes positively to time series forecasting compared with the other imputation methods.
Background: Improving financial time series forecasting is one of the most challenging and vital issues facing numerous financial analysts and decision makers. Given its direct impact on related decisions, various attempts have been made to achieve more accurate and reliable forecasting results, of which the combining of individual models remains a widely applied approach. In general, individual models are combined under two main strategies: series and parallel. While it has been proven that these strategies can improve overall forecasting accuracy, the literature on time series forecasting remains vague on the choice of an appropriate strategy to generate a more accurate hybrid model. Methods: Therefore, this study's key aim is to evaluate the performance of series and parallel strategies to determine the more accurate one. Results: Accordingly, the predictive capabilities of five hybrid models constructed on the basis of series and parallel strategies are compared with each other and with their base models in forecasting stock prices. To do so, autoregressive integrated moving average (ARIMA) and multilayer perceptrons (MLPs) are used to construct two series hybrid models, ARIMA-MLP and MLP-ARIMA, and three parallel hybrid models: simple average, linear regression, and genetic algorithm models. Conclusion: The empirical forecasting results for two benchmark datasets, that is, the closing of the Shenzhen Integrated Index (SZII) and that of Standard and Poor's 500 (S&P 500), indicate that although all hybrid models perform better than at least one of their individual components, the series combination strategy produces more accurate hybrid models for financial time series forecasting.
Today, the COVID-19 pandemic has become the greatest worldwide threat, as it spreads rapidly among individuals in most countries around the world. This study concerns the problem of daily prediction of new COVID-19 cases in Italy, aiming to find the best predictive model for the daily infection number in countries with a large number of confirmed cases. Finding the most accurate forecasting model would help allocate medical resources, handle the spread of the pandemic, and better prepare health care systems. We compare the forecasting performance of linear and nonlinear forecasting models using daily COVID-19 data for the period between 22 February 2020 and 10 January 2022. We discuss various forecasting approaches, including an Autoregressive Integrated Moving Average (ARIMA) model, a Nonlinear Autoregressive Neural Network (NARNN) model, a TBATS model, and Exponential Smoothing, fit them to the data collected from 22 February 2020 to 10 January 2022, and compare their accuracy using the data collected from 26 March 2020 to 04 April 2020, choosing the model with the lowest Mean Absolute Percentage Error (MAPE) value. Since the linear models seem not to easily follow the nonlinear patterns of daily confirmed COVID-19 cases, an Artificial Neural Network (ANN) has been successfully applied to solve problems of forecasting nonlinear models. The model has been used for daily prediction of COVID-19 cases for the next 20 days without any additional intervention. The prediction model can be applied to other countries struggling with the COVID-19 pandemic and to any possible future pandemics.
Time series forecasting is essential for generating predictive insights across various domains, including healthcare, finance, and energy. This study focuses on forecasting patient health data by comparing the performance of traditional linear time series models, namely Autoregressive Integrated Moving Average (ARIMA), Seasonal ARIMA, and Moving Average (MA), against neural network architectures. The primary goal is to evaluate the effectiveness of these models in predicting healthcare outcomes using patient records, specifically the Cancerpatient.xlsx dataset, which tracks variables such as patient age, symptoms, genetic risk factors, and environmental exposures over time. The proposed strategy involves training each model on historical patient data to predict age progression and other related health indicators, with performance evaluated using Mean Squared Error (MSE) and Root Mean Squared Error (RMSE) metrics. Our findings reveal that neural networks consistently outperform ARIMA and SARIMA by capturing non-linear patterns and complex temporal dependencies within the dataset, resulting in lower forecasting errors. This research highlights the potential of neural networks to enhance predictive accuracy in healthcare applications, supporting better resource allocation, patient monitoring, and long-term health outcome predictions.
Data breaches are widely reported in the media, attracting the attention of dedicated scientists and professionals working on solutions. Commercial organizations, businesses, and government agencies that acquire, handle, and retain personal or business-related data face the risk of client personal information and organizational intellectual property being compromised, resulting in potential legal, reputational, and financial harm. Data breaches represent a critical cybersecurity challenge that has led to financial losses and infringements of privacy, including the compromise of social security numbers. This underscores the necessity for a more profound understanding of the risks associated with data breaches. Despite much focus, some fundamental issues persist unaddressed. This study concentrates on the modeling and prediction of data breaches through time series forecasting algorithms. The forecasting techniques for time series data represent an emerging area of research, driven by the increasing complexity of such data. This paper analyzes modern modeling methodologies, compares different approaches, and outlines potential options for time series forecasting. This study aims to leverage ARIMA and its variations, such as SARIMA, for building robust prediction models using historical data to forecast the likelihood and magnitude of future data breaches. A comprehensive dataset, sourced from the Privacy Rights Clearinghouse (PRC), encompassing all documented instances of data breaches in the United States, was utilized as input for the predictive models. The ARIMA and SARIMA models demonstrated strong predictive capabilities, with minimal deviation from actual occurrences, highlighting their potential in accurately forecasting data breach incidences. These findings provide valuable insights for organizations aiming to enhance their cybersecurity strategies through data-driven forecasting approaches.
Predictive maintenance often involves imbalanced multivariate time series datasets with scarce failure events, posing challenges for model training due to the high dimensionality of the data and the need for domain-specific preprocessing, which frequently leads to the development of large and complex models. Inspired by the success of Large Language Models (LLMs), transformer-based foundation models have been developed for time series (TSFM). These models have been proven to reconstruct time series in a zero-shot manner, being able to capture different patterns that effectively characterize time series. This paper proposes the use of TSFM to generate embeddings of the input data space, making them more interpretable for machine learning models. To evaluate the effectiveness of our approach, we trained three classical machine learning algorithms and one neural network using the embeddings generated by the TSFM called Moment for predicting the remaining useful life of aircraft engines. We test the models trained with both the full training dataset and only 10% of the training samples. Our results show that training simple models, such as support vector regressors or neural networks, with embeddings generated by Moment not only accelerates the training process but also enhances performance in few-shot learning scenarios, where data is scarce. This suggests a promising alternative to complex deep learning architectures, particularly in industrial contexts with limited labeled data.
Long-term multivariate time series forecasting is an important task in engineering applications. It helps grasp the future development trend of data in real time, which is of great significance for a wide variety of fields. Due to the non-linear and unstable characteristics of multivariate time series, existing methods encounter difficulties in analyzing complex high-dimensional data and capturing latent relationships between the variables in a time series, which degrades long-term prediction performance. In this paper, we propose a novel time series forecasting model based on the multilayer perceptron that combines spatio-temporal decomposition and doubly residual stacking, namely the Spatio-Temporal Decomposition Neural Network (STDNet). We decompose the originally complex and unstable time series into two parts, a temporal term and a spatial term. We design a temporal module based on an auto-correlation mechanism to discover temporal dependencies at the sub-series level, and a spatial module based on a convolutional neural network and a self-attention mechanism to integrate multivariate information from two dimensions, global and local, respectively. We then integrate the results obtained from the different modules to get the final forecast. Extensive experiments on four real-world datasets show that STDNet significantly outperforms other state-of-the-art methods, providing an effective solution for long-term time series forecasting.
Tunnel boring machines (TBMs) have been widely utilised in tunnel construction due to their high efficiency and reliability. Accurately predicting TBM performance can improve project time management, cost control, and risk management. This study aims to use deep learning to develop real-time models for predicting the penetration rate (PR). The models are built using data from the Changsha metro project, and their performances are evaluated using unseen data from the Zhengzhou Metro project. In one-step forecast, the predicted penetration rate follows the trend of the measured penetration rate in both training and testing. The autoregressive integrated moving average (ARIMA) model is compared with the recurrent neural network (RNN) model. The results show that univariate models, which only consider historical penetration rate itself, perform better than multivariate models that take into account multiple geological and operational parameters (GEO and OP). Next, an RNN variant combining time series of penetration rate with the last-step geological and operational parameters is developed, and it performs better than other models. A sensitivity analysis shows that the penetration rate is the most important parameter, while other parameters have a smaller impact on time series forecasting. It is also found that smoothed data are easier to predict with high accuracy. Nevertheless, over-simplified data can lose real characteristics in time series. In conclusion, the RNN variant can accurately predict the next-step penetration rate, and data smoothing is crucial in time series forecasting. This study provides practical guidance for TBM performance forecasting in practical engineering.
Various mathematical models have been commonly used in time series analysis and forecasting. In these processes, academic researchers and business practitioners often come up against two important problems. One is whether to select an appropriate modeling approach for prediction purposes or to combine these different individual approaches into a single forecast for the different/dissimilar modeling approaches. Another is whether to select the best candidate model for forecasting or to mix the various candidate models with different parameters into a new forecast for the same/similar modeling approaches. In this study, we propose a set of computational procedures to solve the above two issues via two judgmental criteria. Meanwhile, in view of the problems presented in the literature, a novel modeling technique is also proposed to overcome the drawbacks of existing combined forecasting methods. To verify the efficiency and reliability of the proposed procedure and modeling technique, simulations and real data examples are conducted in this study. The results reveal that the proposed procedure and modeling technique can be used as a feasible solution for time series forecasting with multiple candidate models.
Time series forecasting research mainly focuses on developing effective forecasting models to improve prediction accuracy. An ensemble model composed of autoregressive integrated moving average (ARIMA), artificial neural network (ANN), restricted Boltzmann machines (RBM), and discrete wavelet transform (DWT) is presented in this paper. In the proposed model, DWT first decomposes the time series into approximation and detail. Then Khashei and Bijari's model, which is an ensemble model of ARIMA and ANN, is applied to the approximation and detail to extract both their linear and nonlinear components and to fit the relationship between the components as a function instead of an additive relationship. Furthermore, RBM is used to perform pre-training, generating initial weights and biases for the ANN based on the input features. Finally, the forecasted approximation and detail are combined to obtain the final forecast. The forecasting capability of the proposed model is tested with three well-known time series: sunspot, Canadian lynx, and exchange rate. The prediction performance is compared to that of six other forecasting models. The results indicate that the proposed model gives the best performance on all three data sets and all three measures (i.e., MSE, MAE and MAPE).
Consumption of electric power depends highly on the season under consideration. The various methods of power generation using renewable resources such as sunlight, wind, rain, tides, and waves are season dependent. This paves the way for analyzing the demand for electric power by season. Many traditional methods have been used previously for season-based electricity demand forecasting. With the development of advanced tools, these methods are being replaced by more efficient forecasting techniques. In this paper, WEKA time series forecasting is carried out for electric power demand in three seasons: summer, winter, and the rainy season. The monthly electric consumption data of the domestic category is collected from the Tamil Nadu Electricity Board (TNEB). The collected data has been pruned based on the three seasons. WEKA learning algorithms such as Multilayer Perceptron, Support Vector Machine, Linear Regression, and Gaussian Process are used for implementation. The Mean Absolute Error (MAE) and Direction Accuracy (DA) are calculated for the WEKA learning algorithms, which are compared to find the best learning algorithm. The Support Vector Machine algorithm exhibits lower Mean Absolute Error and higher Direction Accuracy than the other WEKA learning algorithms. Hence, the Support Vector Machine is found to be the best WEKA learning algorithm for season-based electricity demand forecasting. The need of the hour is to predict and act on power deficits. This paper is a prelude to such activity and an eye opener in this field.
Fuzzy set theory cannot describe the neutrality degree of data, which has largely limited the objectivity of fuzzy time series in uncertain data forecasting. In this regard, a multi-factor high-order intuitionistic fuzzy time series forecasting model is built. In the new model, a fuzzy clustering algorithm is used to get unequal intervals, and a more objective technique for ascertaining the membership and non-membership functions of the intuitionistic fuzzy set is proposed. On these bases, forecast rules based on multidimensional intuitionistic fuzzy modus ponens inference are established. Finally, contrast experiments on the daily mean temperature of Beijing are carried out, which show that the novel model has a clear advantage in improving the forecast accuracy.
Multivariate Time Series (MTS) forecasting is an essential problem in many fields. Accurate forecasting results can effectively help in making decisions. To date, many MTS forecasting methods have been proposed and widely applied. However, these methods assume that the predicted value of a single variable is affected by all other variables, ignoring the causal relationship among variables. To address the above issue, we propose a novel end-to-end deep learning model, termed graph neural network with neural Granger causality, namely CauGNN, in this paper. To characterize the causal information among variables, we introduce the neural Granger causality graph in our model. Each variable is regarded as a graph node, and each edge represents the causal relationship between variables. In addition, convolutional neural network filters with different perception scales are used for time series feature extraction, to generate the feature of each node. Finally, the graph neural network is adopted to tackle the forecasting problem of the graph structure generated by the MTS. Three benchmark datasets from the real world are used to evaluate the proposed CauGNN, and comprehensive experiments show that the proposed method achieves state-of-the-art results in the MTS forecasting task.
Statistical analyses and descriptive characterizations are sometimes assumed to offer information on time series forecastability. Despite the scientific interest suggested by such assumptions, the relationships between descriptive time series features (e.g., temporal dependence, entropy, seasonality, trend and linearity features) and actual time series forecastability (quantified by issuing and assessing forecasts for the past) are scarcely studied and quantified in the literature. In this work, we aim to fill this gap by investigating such relationships, and the way that they can be exploited for understanding hydroclimatic forecastability and its patterns. To this end, we follow a systematic framework bringing together a variety of concepts and methods that are mostly new for hydrology, including 57 descriptive features and nine seasonal time series forecasting methods (i.e., one simple, five exponential smoothing, two state space and one automated autoregressive fractionally integrated moving average methods). We apply this framework to three global datasets originating from the larger Global Historical Climatology Network (GHCN) and Global Streamflow Indices and Metadata (GSIM) archives. As these datasets comprise over 13,000 monthly temperature, precipitation and river flow time series from several continents and hydroclimatic regimes, they allow us to provide trustworthy characterizations and interpretations of 12-month ahead hydroclimatic forecastability at the global scale. We first find that the exponential smoothing and state space methods for time series forecasting are rather equally efficient in identifying an upper limit of this forecastability in terms of Nash-Sutcliffe efficiency, while the simple method is shown to be mostly useful in identifying its lower limit. We then demonstrate that the assessed forecastability is strongly related to several descriptive features, including seasonality, entropy, (partial) autocorrelation, stability, (non)linearity, spikiness and heterogeneity features, among others. We further (i) show that, if such descriptive information is available for a monthly hydroclimatic time series, we can even foretell the quality of its future forecasts with a considerable degree of confidence, and (ii) rank the features according to their efficiency in explaining and foretelling forecastability. We believe that the obtained rankings are of key importance for understanding forecastability. Spatial forecastability patterns are also revealed through our experiments, with East Asia (Europe) being characterized by larger (smaller) monthly temperature time series forecastability and the Indian subcontinent (Australia) being characterized by larger (smaller) monthly precipitation time series forecastability, compared to other continental-scale regions, and less notable differences characterizing monthly river flow from continent to continent. A comprehensive interpretation of such patterns through massive feature extraction and feature-based time series clustering is shown to be possible. Indeed, continental-scale regions characterized by different degrees of forecastability are also attributed to different clusters or mixtures of clusters (because of their essential differences in terms of descriptive features).
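The Nash-Sutcliffe efficiency (NSE) used above as the forecastability measure can be sketched as follows. This is a minimal illustration of the standard definition, NSE = 1 − Σ(obs − fcst)² / Σ(obs − mean(obs))², and not the authors' implementation; the function name and the sample values are hypothetical.

```python
def nash_sutcliffe(observed, forecast):
    """Nash-Sutcliffe efficiency: 1 is a perfect forecast, 0 matches the
    skill of always forecasting the observed mean, negative is worse."""
    mean_obs = sum(observed) / len(observed)
    squared_error = sum((o - f) ** 2 for o, f in zip(observed, forecast))
    variance_sum = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - squared_error / variance_sum

# Hypothetical monthly values for illustration only.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
forecast = [1.5, 2.0, 3.0, 4.0, 4.5]
mean_forecast = [3.0] * 5  # always forecasting the observed mean

print(nash_sutcliffe(observed, forecast))
print(nash_sutcliffe(observed, mean_forecast))
```

Because NSE is normalized by the variance of the observations, it allows forecast skill to be compared across series with very different scales, such as temperature, precipitation and river flow.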
In today's rapidly evolving internet landscape, prominent companies across various industries face increasingly complex business operations, leading to significant cluster-scale growth. However, this growth brings about challenges in cluster management and the inefficient utilization of vast amounts of data due to its low value density. This paper, based on the large-scale cluster virtualization and monitoring system of the data center of the Bureau of Geophysical Prospecting (BGP), utilizes time series data of host resources from the monitoring system's time series database to propose a multivariate multi-step time series forecasting model, MUL-CNN-BiGRU-Attention, for forecasting CPU load on virtual cluster hosts. The model undergoes extensive offline training using a large volume of time series data, followed by deployment using TensorFlow Serving. Recent small-batch data are employed for fine-tuning model parameters to better adapt to current data patterns. Comparative experiments are conducted between the proposed model and other baseline models, demonstrating notable improvements in Mean Absolute Error (MAE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and R² metrics by up to 35.2%, 56.1%, 32.5%, and 10.3%, respectively. Additionally, ablation experiments are designed to investigate the impact of different factors on the performance of the forecasting model, providing valuable insights for parameter optimization based on experimental results.
A nonlinear feedback term is introduced into the evaluation equation of weights of the backpropagation algorithm for neural network, the network becomes a chaotic one. For the purpose of that we can investigate how th...A nonlinear feedback term is introduced into the evaluation equation of weights of the backpropagation algorithm for neural network, the network becomes a chaotic one. For the purpose of that we can investigate how the different feedback terms affect the process of learning and forecasting, we use the model to forecast the nonlinear time series which is produced by Makey-Glass equation. By selecting the suitable feedback term, the system can escape from the local minima and converge to the global minimum or its approximate solutions, and the forecasting results are better than those of backpropagation algorithm.展开更多
Abstract: To address the global issue of climate change and create focused mitigation plans, accurate CO₂ emissions forecasting is essential. Using CO₂ emissions data from 1990 to 2023, this study assesses the predictive performance of five sophisticated models: Random Forest (RF), XGBoost, Support Vector Regression (SVR), Long Short-Term Memory networks (LSTM), and ARIMA. To give a thorough evaluation of the models' performance, measures including Mean Absolute Error (MAE), Root Mean Square Error (RMSE), and Mean Absolute Percentage Error (MAPE) are used. To guarantee dependable model implementation, preprocessing procedures are carried out, such as feature engineering and stationarity tests. Machine learning models outperform ARIMA in identifying complex patterns and long-term associations, but ARIMA does better with data that exhibits strong linear trends. These results provide important information about how well each model suits various forecasting scenarios, which helps develop data-driven carbon reduction programs. Predictive modeling should be incorporated into sustainable climate policy to encourage the adoption of low-carbon technologies and proactive decision-making. Achieving long-term environmental sustainability requires strengthening carbon trading systems, encouraging clean energy investments, and enacting stronger emission laws. In line with international climate goals, suggestions for lowering CO₂ emissions include switching to renewable energy, increasing energy efficiency, and putting afforestation initiatives into action.
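The three error measures named in the abstract above (MAE, RMSE, MAPE) follow their standard textbook definitions. The sketch below is only an illustration of those definitions; the function names and the sample numbers are hypothetical and are not taken from the study.

```python
import math

def mae(actual, predicted):
    """Mean Absolute Error: average magnitude of the forecast errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root Mean Square Error: square root of the mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent (undefined if any actual value is zero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical values, used only to exercise the functions.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]
print(mae(actual, predicted), rmse(actual, predicted), mape(actual, predicted))
```

RMSE penalizes large errors more heavily than MAE, while MAPE is scale-free, which is one reason studies of this kind tend to report all three together.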
Abstract: Accurate forecasting of oil production is essential for optimizing resource management and minimizing operational risks in the energy sector. Traditional time-series forecasting techniques, despite their widespread application, often encounter difficulties in handling the complexities of oil production data, which is characterized by non-linear patterns, skewed distributions, and the presence of outliers. To overcome these limitations, deep learning methods have emerged as more robust alternatives. However, while deep neural networks offer improved accuracy, they demand substantial amounts of data for effective training. Conversely, shallow networks with fewer layers lack the capacity to model complex data distributions adequately. To address these challenges, this study introduces a novel hybrid model called Transfer LSTM to GRU (TLTG), which combines the strengths of deep and shallow networks using transfer learning. The TLTG model integrates Long Short-Term Memory (LSTM) networks and Gated Recurrent Units (GRU) to enhance predictive accuracy while maintaining computational efficiency. Gaussian transformation is applied to the input data to reduce outliers and skewness, creating a more normal-like distribution. The proposed approach is validated on datasets from various wells in the Tahe oil field, China. Experimental results highlight the superior performance of the TLTG model, achieving 100% accuracy and faster prediction times (200 s) compared to eight other approaches, demonstrating its effectiveness and efficiency.
Funding: Funded by the Natural Science Foundation of Heilongjiang Province, grant number LH2023F020.
Abstract: Deep learning (DL) has revolutionized time series forecasting (TSF), surpassing traditional statistical methods (e.g., ARIMA) and machine learning techniques in modeling complex nonlinear dynamics and long-term dependencies prevalent in real-world temporal data. This comprehensive survey reviews state-of-the-art DL architectures for TSF, focusing on four core paradigms: (1) Convolutional Neural Networks (CNNs), adept at extracting localized temporal features; (2) Recurrent Neural Networks (RNNs) and their advanced variants (LSTM, GRU), designed for sequential dependency modeling; (3) Graph Neural Networks (GNNs), specialized for forecasting structured relational data with spatial-temporal dependencies; and (4) Transformer-based models, leveraging self-attention mechanisms to capture global temporal patterns efficiently. We provide a rigorous analysis of the theoretical underpinnings, recent algorithmic advancements (e.g., TCNs, attention mechanisms, hybrid architectures), and practical applications of each framework, supported by extensive benchmark datasets (e.g., ETT, traffic flow, financial indicators) and standardized evaluation metrics (MAE, MSE, RMSE). Critical challenges, including handling irregular sampling intervals, integrating domain knowledge for robustness, and managing computational complexity, are thoroughly discussed. Emerging research directions highlighted include diffusion models for uncertainty quantification, hybrid pipelines combining classical statistical and DL techniques for enhanced interpretability, quantile regression with Transformers for risk-aware forecasting, and optimizations for real-time deployment. This work serves as an essential reference, consolidating methodological innovations, empirical resources, and future trends to bridge the gap between theoretical research and practical implementation needs for researchers and practitioners in the field.
Abstract: In this paper, a new multidimensional time series forecasting scheme based on the empirical orthogonal function (EOF) stepwise iteration process is introduced. The scheme is tested in a series of forecast experiments on Nino3 SST anomalies and the Tahiti-Darwin SO index. The results show that the scheme is feasible and that ENSO is predictable.
Funding: This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (Grant Number 2020R1A6A1A03040583).
Abstract: Time series forecasting has become an important aspect of data analysis and has many real-world applications. However, undesirable missing values are often encountered, which may adversely affect many forecasting tasks. In this study, we evaluate and compare the effects of imputation methods for estimating missing values in a time series. Our approach does not include a simulation to generate pseudo-missing data, but instead performs imputation on actual missing data and measures the performance of the forecasting model created therefrom. In the experiments, therefore, several time series forecasting models are trained using different training datasets prepared with each imputation method. Subsequently, the performance of the imputation methods is evaluated by comparing the accuracy of the forecasting models. The results obtained from a total of four experimental cases show that the k-nearest neighbor technique is the most effective in reconstructing missing data and contributes positively to time series forecasting compared with other imputation methods.
Abstract: Background: Improving financial time series forecasting is one of the most challenging and vital issues facing numerous financial analysts and decision makers. Given its direct impact on related decisions, various attempts have been made to achieve more accurate and reliable forecasting results, of which the combining of individual models remains a widely applied approach. In general, individual models are combined under two main strategies: series and parallel. While it has been proven that these strategies can improve overall forecasting accuracy, the literature on time series forecasting remains vague on the choice of an appropriate strategy to generate a more accurate hybrid model. Methods: Therefore, this study's key aim is to evaluate the performance of series and parallel strategies to determine the more accurate one. Results: Accordingly, the predictive capabilities of five hybrid models, constructed on the basis of series and parallel strategies, are compared with each other and with their base models in forecasting stock prices. To do so, autoregressive integrated moving average (ARIMA) and multilayer perceptrons (MLPs) are used to construct two series hybrid models, ARIMA-MLP and MLP-ARIMA, and three parallel hybrid models: simple average, linear regression, and genetic algorithm models. Conclusion: The empirical forecasting results for two benchmark datasets, that is, the closing values of the Shenzhen Integrated Index (SZII) and of the Standard and Poor's 500 (S&P 500), indicate that although all hybrid models perform better than at least one of their individual components, the series combination strategy produces more accurate hybrid models for financial time series forecasting.
Abstract: Today, the COVID-19 pandemic has become the greatest worldwide threat, as it spreads rapidly among individuals in most countries around the world. This study concerns the problem of daily prediction of new COVID-19 cases in Italy, aiming to find the best predictive model for the daily infection number in countries with a large number of confirmed cases. Finding the most accurate forecasting model would help allocate medical resources, handle the spread of the pandemic, and better prepare health care systems. We compare the forecasting performance of linear and nonlinear forecasting models using daily COVID-19 data for the period between 22 February 2020 and 10 January 2022. We discuss various forecasting approaches, including an Autoregressive Integrated Moving Average (ARIMA) model, a Nonlinear Autoregressive Neural Network (NARNN) model, a TBATS model, and Exponential Smoothing, on the data collected from 22 February 2020 to 10 January 2022, and compare their accuracy using the data collected from 26 March 2020 to 04 April 2020, choosing the model with the lowest Mean Absolute Percentage Error (MAPE) value. Since the linear models do not readily follow the nonlinear patterns of daily confirmed COVID-19 cases, an Artificial Neural Network (ANN) is applied, as such networks have been used successfully for nonlinear forecasting problems. The model has been used for daily prediction of COVID-19 cases for the next 20 days without any additional intervention. The prediction model can be applied to other countries struggling with the COVID-19 pandemic and to any possible future pandemics.
Abstract: Time series forecasting is essential for generating predictive insights across various domains, including healthcare, finance, and energy. This study focuses on forecasting patient health data by comparing the performance of traditional linear time series models, namely Autoregressive Integrated Moving Average (ARIMA), Seasonal ARIMA (SARIMA), and Moving Average (MA), against neural network architectures. The primary goal is to evaluate the effectiveness of these models in predicting healthcare outcomes using patient records, specifically the Cancerpatient.xlsx dataset, which tracks variables such as patient age, symptoms, genetic risk factors, and environmental exposures over time. The proposed strategy involves training each model on historical patient data to predict age progression and other related health indicators, with performance evaluated using Mean Squared Error (MSE) and Root Mean Squared Error (RMSE) metrics. Our findings reveal that neural networks consistently outperform ARIMA and SARIMA by capturing non-linear patterns and complex temporal dependencies within the dataset, resulting in lower forecasting errors. This research highlights the potential of neural networks to enhance predictive accuracy in healthcare applications, supporting better resource allocation, patient monitoring, and long-term health outcome predictions.
Abstract: Data breaches are widely reported in the media, attracting the attention of dedicated scientists and professionals working on solutions. Commercial organizations, businesses, and government agencies that acquire, handle, and retain personal or business-related data face the risk of client personal information and organizational intellectual property being compromised, resulting in potential legal, reputational, and financial harm. Data breaches represent a critical cybersecurity challenge that has led to financial losses and infringements of privacy, including the compromise of social security numbers. This underscores the necessity for a more profound understanding of the risks associated with data breaches. Despite much focus, some fundamental issues remain unaddressed. This study concentrates on the modeling and prediction of data breaches through time series forecasting algorithms. Forecasting techniques for time series data represent an emerging area of research, driven by the increasing complexity of such data. This paper analyzes modern modeling methodologies, compares different approaches, and outlines potential options for time series forecasting. This study aims to leverage ARIMA and its variations, such as SARIMA, for building robust prediction models using historical data to forecast the likelihood and magnitude of future data breaches. A comprehensive dataset, sourced from the Privacy Rights Clearinghouse (PRC) and encompassing all documented instances of data breaches in the United States, was utilized as input for the predictive models. The ARIMA and SARIMA models demonstrated strong predictive capabilities, with minimal deviation from actual occurrences, highlighting their potential in accurately forecasting data breach incidences. These findings provide valuable insights for organizations aiming to enhance their cybersecurity strategies through data-driven forecasting approaches.
Funding: Funded by the Spanish Government and FEDER funds (AEI/FEDER, UE) under grant PID2021-124502OB-C42 (PRESECREL) and by the predoctoral program "Concepción Arenal del Programa de Personal Investigador en formación Predoctoral" funded by Universidad de Cantabria and Cantabria's Government (BOC 18-10-2021).
Abstract: Predictive maintenance often involves imbalanced multivariate time series datasets with scarce failure events, posing challenges for model training due to the high dimensionality of the data and the need for domain-specific preprocessing, which frequently leads to the development of large and complex models. Inspired by the success of Large Language Models (LLMs), transformer-based foundation models have been developed for time series (TSFM). These models have been shown to reconstruct time series in a zero-shot manner, capturing different patterns that effectively characterize time series. This paper proposes the use of TSFM to generate embeddings of the input data space, making them more interpretable for machine learning models. To evaluate the effectiveness of our approach, we trained three classical machine learning algorithms and one neural network using the embeddings generated by the TSFM called Moment for predicting the remaining useful life of aircraft engines. We test the models trained with both the full training dataset and only 10% of the training samples. Our results show that training simple models, such as support vector regressors or neural networks, with embeddings generated by Moment not only accelerates the training process but also enhances performance in few-shot learning scenarios, where data is scarce. This suggests a promising alternative to complex deep learning architectures, particularly in industrial contexts with limited labeled data.
Funding: Supported by the National Key Research and Development Program of China (No. 2021YFB3300503), the Regional Innovation and Development Joint Fund of the National Natural Science Foundation of China (No. U22A20167), and the National Natural Science Foundation of China (No. 61872260).
Abstract: Long-term multivariate time series forecasting is an important task in engineering applications. It helps grasp the future development trend of data in real time, which is of great significance for a wide variety of fields. Due to the non-linear and unstable characteristics of multivariate time series, existing methods encounter difficulties in analyzing complex high-dimensional data and capturing latent relationships among the variables of a time series, which degrades long-term prediction performance. In this paper, we propose a novel time series forecasting model based on the multilayer perceptron that combines spatio-temporal decomposition and doubly residual stacking, namely the Spatio-Temporal Decomposition Neural Network (STDNet). We decompose the originally complex and unstable time series into two parts, a temporal term and a spatial term. We design a temporal module based on an auto-correlation mechanism to discover temporal dependencies at the sub-series level, and a spatial module based on a convolutional neural network and a self-attention mechanism to integrate multivariate information from two dimensions, global and local, respectively. We then integrate the results obtained from the different modules to get the final forecast. Extensive experiments on four real-world datasets show that STDNet significantly outperforms other state-of-the-art methods, providing an effective solution for long-term time series forecasting.
Abstract: Tunnel boring machines (TBMs) have been widely utilised in tunnel construction due to their high efficiency and reliability. Accurately predicting TBM performance can improve project time management, cost control, and risk management. This study aims to use deep learning to develop real-time models for predicting the penetration rate (PR). The models are built using data from the Changsha Metro project, and their performances are evaluated using unseen data from the Zhengzhou Metro project. In one-step forecasts, the predicted penetration rate follows the trend of the measured penetration rate in both training and testing. The autoregressive integrated moving average (ARIMA) model is compared with the recurrent neural network (RNN) model. The results show that univariate models, which only consider the historical penetration rate itself, perform better than multivariate models that take into account multiple geological and operational parameters (GEO and OP). Next, an RNN variant combining the time series of penetration rate with the last-step geological and operational parameters is developed, and it performs better than the other models. A sensitivity analysis shows that the penetration rate is the most important parameter, while other parameters have a smaller impact on time series forecasting. It is also found that smoothed data are easier to predict with high accuracy. Nevertheless, over-simplified data can lose the real characteristics of the time series. In conclusion, the RNN variant can accurately predict the next-step penetration rate, and data smoothing is crucial in time series forecasting. This study provides practical guidance for TBM performance forecasting in practical engineering.
Funding: This paper was partially supported by NSFC, CAS, RGC of Hong Kong, and the Ministry of Education and Technology of Japan.
Abstract: Various mathematical models have been commonly used in time series analysis and forecasting. In these processes, academic researchers and business practitioners often come up against two important problems. One is whether to select an appropriate modeling approach for prediction purposes or to combine the different individual approaches into a single forecast, for different/dissimilar modeling approaches. The other is whether to select the best candidate model for forecasting or to mix the various candidate models with different parameters into a new forecast, for the same/similar modeling approaches. In this study, we propose a set of computational procedures to solve the above two issues via two judgmental criteria. Meanwhile, in view of the problems presented in the literature, a novel modeling technique is also proposed to overcome the drawbacks of existing combined forecasting methods. To verify the efficiency and reliability of the proposed procedure and modeling technique, simulations and real-data examples are conducted in this study. The results obtained reveal that the proposed procedure and modeling technique can be used as a feasible solution for time series forecasting with multiple candidate models.
Abstract: The time series forecasting research area mainly focuses on developing effective forecasting models to improve prediction accuracy. An ensemble model composed of autoregressive integrated moving average (ARIMA), artificial neural network (ANN), restricted Boltzmann machines (RBM), and discrete wavelet transform (DWT) is presented in this paper. In the proposed model, DWT first decomposes the time series into approximation and detail. Then Khashei and Bijari's model, which is an ensemble model of ARIMA and ANN, is applied to the approximation and detail to extract both their linear and nonlinear components and to fit the relationship between the components as a function instead of an additive relationship. Furthermore, RBM is used to perform pre-training for generating initial weights and biases for the ANN based on the input features. Finally, the forecasted approximation and detail are combined to obtain the final forecast. The forecasting capability of the proposed model is tested with three well-known time series: sunspot, Canadian lynx, and exchange rate time series. The prediction performance is compared to that of six other forecasting models. The results indicate that the proposed model gives the best performance on all three data sets and all three measures (i.e., MSE, MAE, and MAPE).
Abstract: Consumption of electric power depends strongly on the season under consideration. The various means of power generation using renewable resources such as sunlight, wind, rain, tides, and waves are season dependent. This paves the way for analyzing the demand for electric power by season. Many traditional methods have previously been used for season-based electricity demand forecasting. With the development of advanced tools, these methods are being replaced by more efficient forecasting techniques. In this paper, WEKA time series forecasting is carried out for electric power demand in three seasons: summer, winter, and the rainy season. The monthly electricity consumption data of the domestic category is collected from the Tamil Nadu Electricity Board (TNEB). The collected data has been partitioned by the three seasons. The WEKA learning algorithms Multilayer Perceptron, Support Vector Machine, Linear Regression, and Gaussian Process are used for implementation. The Mean Absolute Error (MAE) and Direction Accuracy (DA) are calculated for the WEKA learning algorithms and compared to find the best learning algorithm. The Support Vector Machine algorithm exhibits lower Mean Absolute Error and higher Direction Accuracy than the other WEKA learning algorithms. Hence, the Support Vector Machine is found to be the most suitable WEKA learning algorithm for season-based electricity demand forecasting. The need of the hour is to predict power deficits and act on them. This paper is a prelude to such activity and an eye opener in this field.
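Direction Accuracy is not defined in the abstract above. One common reading, assumed here, is the share of time steps at which the forecast moves in the same direction (up or down, relative to the previous observed value) as the actual series. The sketch below follows that assumption; it is not WEKA's implementation, and the function name and sample values are hypothetical.

```python
def direction_accuracy(actual, predicted):
    """Fraction of steps where the predicted move from the previous
    actual value has the same sign as the actual move (assumed definition)."""
    hits = 0
    for t in range(1, len(actual)):
        actual_move = actual[t] - actual[t - 1]
        predicted_move = predicted[t] - actual[t - 1]
        if actual_move * predicted_move > 0:  # same sign, both non-zero
            hits += 1
    return hits / (len(actual) - 1)

# Hypothetical monthly demand values, used only to exercise the function.
actual = [10.0, 12.0, 11.0, 13.0]
predicted = [10.0, 11.0, 12.5, 12.0]
print(direction_accuracy(actual, predicted))
```

Under this definition a model can have a low MAE and still score poorly on direction, which is why the two measures are usually reported side by side.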
Funding: Supported by the National Natural Science Foundation of China (61309022).
Abstract: Fuzzy set theory cannot describe the neutrality degree of data, which has largely limited the objectivity of fuzzy time series in uncertain data forecasting. In this regard, a multi-factor high-order intuitionistic fuzzy time series forecasting model is built. In the new model, a fuzzy clustering algorithm is used to get unequal intervals, and a more objective technique for ascertaining the membership and non-membership functions of the intuitionistic fuzzy set is proposed. On these bases, forecast rules based on multidimensional intuitionistic fuzzy modus ponens inference are established. Finally, contrast experiments on the daily mean temperature of Beijing are carried out, which show that the novel model has a clear advantage in improving forecast accuracy.
Funding: Supported in part by the National Natural Science Foundation of China (No. 62002035) and the Natural Science Foundation of Chongqing (No. cstc2020jcyj-bshX0034).
Abstract: Multivariate Time Series (MTS) forecasting is an essential problem in many fields. Accurate forecasting results can effectively help in making decisions. To date, many MTS forecasting methods have been proposed and widely applied. However, these methods assume that the predicted value of a single variable is affected by all other variables, ignoring the causal relationships among variables. To address this issue, we propose a novel end-to-end deep learning model, a graph neural network with neural Granger causality, termed CauGNN. To characterize the causal information among variables, we introduce the neural Granger causality graph in our model. Each variable is regarded as a graph node, and each edge represents the causal relationship between variables. In addition, convolutional neural network filters with different perception scales are used for time series feature extraction, to generate the features of each node. Finally, the graph neural network is adopted to tackle the forecasting problem on the graph structure generated from the MTS. Three real-world benchmark datasets are used to evaluate the proposed CauGNN, and comprehensive experiments show that the proposed method achieves state-of-the-art results in the MTS forecasting task.
Funding: Funding from the Italian Ministry of Environment, Land and Sea Protection (MATTM) for the Sim PRO project (2020–2021) is acknowledged by (in alphabetical order) S. Grimaldi, G. Papacharalampous and E. Volpi; funding from the Italian Ministry of Education, University and Research (MIUR), in the frame of the Departments of Excellence Initiative 2018–2022, attributed to the Department of Engineering of Roma Tre University; funding from the EU Horizon 2020 project CLINT (Climate Intelligence: Extreme events detection, attribution and adaptation design using machine learning) under Grant Agreement 101003876.
Abstract: Statistical analyses and descriptive characterizations are sometimes assumed to offer information on time series forecastability. Despite the scientific interest suggested by such assumptions, the relationships between descriptive time series features (e.g., temporal dependence, entropy, seasonality, trend and linearity features) and actual time series forecastability (quantified by issuing and assessing forecasts for the past) are scarcely studied and quantified in the literature. In this work, we aim to fill this gap by investigating such relationships and the way that they can be exploited for understanding hydroclimatic forecastability and its patterns. To this end, we follow a systematic framework bringing together a variety of concepts and methods (mostly new for hydrology), including 57 descriptive features and nine seasonal time series forecasting methods (i.e., one simple, five exponential smoothing, two state space and one automated autoregressive fractionally integrated moving average methods). We apply this framework to three global datasets originating from the larger Global Historical Climatology Network (GHCN) and Global Streamflow Indices and Metadata (GSIM) archives. As these datasets comprise over 13,000 monthly temperature, precipitation and river flow time series from several continents and hydroclimatic regimes, they allow us to provide reliable characterizations and interpretations of 12-month-ahead hydroclimatic forecastability at the global scale. We first find that the exponential smoothing and state space methods for time series forecasting are rather equally efficient in identifying an upper limit of this forecastability in terms of Nash-Sutcliffe efficiency, while the simple method is shown to be mostly useful in identifying its lower limit. We then demonstrate that the assessed forecastability is strongly related to several descriptive features, including seasonality, entropy, (partial) autocorrelation, stability, (non)linearity, spikiness and heterogeneity features, among others. We further (i) show that, if such descriptive information is available for a monthly hydroclimatic time series, we can even foretell the quality of its future forecasts with a considerable degree of confidence, and (ii) rank the features according to their efficiency in explaining and foretelling forecastability. We believe that the obtained rankings are of key importance for understanding forecastability. Spatial forecastability patterns are also revealed through our experiments, with East Asia (Europe) being characterized by larger (smaller) monthly temperature time series forecastability and the Indian subcontinent (Australia) being characterized by larger (smaller) monthly precipitation time series forecastability, compared to other continental-scale regions, and less notable differences characterizing monthly river flow from continent to continent. A comprehensive interpretation of such patterns through massive feature extraction and feature-based time series clustering is shown to be possible. Indeed, continental-scale regions characterized by different degrees of forecastability are also attributed to different clusters or mixtures of clusters (because of their essential differences in terms of descriptive features).
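The abstract above measures forecastability with the Nash-Sutcliffe efficiency (NSE). As a reminder of what that score means, here is a minimal sketch of its standard definition: one minus the ratio of the forecast's squared error to the squared error of always forecasting the observed mean. The function name and sample values are hypothetical and are not taken from the study.

```python
def nash_sutcliffe_efficiency(observed, forecast):
    """NSE = 1 - sum((obs - fc)^2) / sum((obs - mean(obs))^2).
    1.0 is a perfect forecast; 0.0 is no better than the observed mean."""
    mean_obs = sum(observed) / len(observed)
    squared_error = sum((o - f) ** 2 for o, f in zip(observed, forecast))
    variance_sum = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - squared_error / variance_sum

# Hypothetical monthly values, used only to exercise the function.
observed = [2.0, 4.0, 6.0, 8.0]
print(nash_sutcliffe_efficiency(observed, observed))            # perfect forecast
print(nash_sutcliffe_efficiency(observed, [5.0, 5.0, 5.0, 5.0]))  # mean forecast
```

Because NSE is normalized by the variance of the observations, it can be compared across series with different units, which suits a study that pools temperature, precipitation and river flow.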
Abstract: In today's rapidly evolving internet landscape, prominent companies across various industries face increasingly complex business operations, leading to significant cluster-scale growth. However, this growth brings about challenges in cluster management and the inefficient utilization of vast amounts of data due to its low value density. This paper, based on the large-scale cluster virtualization and monitoring system of the data center of the Bureau of Geophysical Prospecting (BGP), utilizes time series data of host resources from the monitoring system's time series database to propose a multivariate multi-step time series forecasting model, MUL-CNN-BiGRU-Attention, for forecasting CPU load on virtual cluster hosts. The model undergoes extensive offline training using a large volume of time series data, followed by deployment using TensorFlow Serving. Recent small-batch data are employed for fine-tuning model parameters to better adapt to current data patterns. Comparative experiments are conducted between the proposed model and other baseline models, demonstrating notable improvements in Mean Absolute Error (MAE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and R² metrics by up to 35.2%, 56.1%, 32.5%, and 10.3%, respectively. Additionally, ablation experiments are designed to investigate the impact of different factors on the performance of the forecasting model, providing valuable insights for parameter optimization based on experimental results.
Abstract: When a nonlinear feedback term is introduced into the weight-update equation of the backpropagation algorithm for neural networks, the network becomes chaotic. To investigate how different feedback terms affect the process of learning and forecasting, we use the model to forecast the nonlinear time series produced by the Mackey-Glass equation. By selecting a suitable feedback term, the system can escape from local minima and converge to the global minimum or an approximate solution, and the forecasting results are better than those of the standard backpropagation algorithm.
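For readers unfamiliar with the benchmark named in the abstract above, the Mackey-Glass equation is the delay differential equation dx/dt = β·x(t−τ) / (1 + x(t−τ)^n) − γ·x(t). The sketch below generates a series from it with a simple Euler step of one time unit. The parameter values (β = 0.2, γ = 0.1, n = 10, τ = 17) are the ones commonly used in the forecasting literature and are an assumption here; the abstract does not state which settings, initial condition, or integration scheme the authors used.

```python
def mackey_glass(n_steps, tau=17, beta=0.2, gamma=0.1, n=10, x0=1.2):
    """Generate n_steps values of the Mackey-Glass series using an Euler
    step of one time unit. All parameter defaults are assumed, commonly
    cited values, not settings taken from the paper."""
    history = [x0] * (tau + 1)  # constant initial history covering the delay
    for _ in range(n_steps):
        x_now = history[-1]
        x_delayed = history[-1 - tau]
        dx = beta * x_delayed / (1.0 + x_delayed ** n) - gamma * x_now
        history.append(x_now + dx)  # Euler update with dt = 1
    return history[tau + 1:]

series = mackey_glass(500)
print(series[:5])
```

A series produced this way is typically split into a training segment and a held-out segment, with the network asked to predict a future value from a window of past values.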