In dense pedestrian tracking, frequent occlusions and small distances between objects make it difficult to estimate object trajectories accurately. In this study, a conditional random field tracking model is established by using a visual long short-term memory network in three-dimensional (3D) space, with motion estimation performed jointly on object trajectory segments. Object visual field information is added to the long short-term memory network to improve the accuracy of motion-related object pair selection and motion estimation. To address the uncertainty in the length and interval of trajectory segments, a multimode long short-term memory network is proposed for object motion estimation. Tracking performance is evaluated on the PETS2009 dataset. The experimental results show that the proposed method outperforms tracking methods based on independent motion estimation.
Speaker separation in complex acoustic environments is one of the challenging tasks in speech separation. In practice, speakers are very often stationary or moving slowly during normal communication. In this case, the spatial features of consecutive speech frames become highly correlated, which helps speaker separation by providing additional spatial information. To fully exploit this information, we design a separation system based on a recurrent neural network (RNN) with long short-term memory (LSTM), which effectively learns the temporal dynamics of spatial features. In detail, an LSTM-based speaker separation algorithm is proposed to extract the spatial features in each time-frequency (TF) unit and form the corresponding feature vector. We then treat speaker separation as a supervised learning problem, where a modified ideal ratio mask (IRM) is defined as the training target during LSTM learning. Simulations show that the proposed system achieves attractive separation performance in noisy and reverberant environments. Specifically, in tests under untrained acoustic conditions with limited priors, e.g., unmatched signal-to-noise ratio (SNR) and reverberation, the proposed LSTM-based algorithm still outperforms the existing DNN-based method in terms of PESQ and STOI. This indicates that our method is more robust in untrained conditions.
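The abstract above does not spell out its modified IRM, so the following is only a minimal sketch of the standard ideal ratio mask for a single time-frequency unit. The function names, the exponent parameter `beta`, and the toy power values are assumptions made for illustration, not details from the paper.

```python
def ideal_ratio_mask(speech_power, noise_power, beta=0.5):
    """Standard IRM for one time-frequency unit (not the paper's modified IRM).

    speech_power and noise_power are non-negative energies of the target
    speech and of the interference in that unit; beta=0.5 is a common choice.
    """
    total = speech_power + noise_power
    if total == 0.0:
        return 0.0  # silent unit: nothing to keep
    return (speech_power / total) ** beta


def irm_for_frame(speech_powers, noise_powers, beta=0.5):
    """Apply the mask definition to every frequency bin of one frame."""
    return [ideal_ratio_mask(s, n, beta) for s, n in zip(speech_powers, noise_powers)]


# Toy frame with three frequency bins (hypothetical values).
mask = irm_for_frame([4.0, 1.0, 0.0], [0.0, 1.0, 2.0])
print(mask)
```

A mask value near 1 marks a unit dominated by the target speaker and a value near 0 marks one dominated by interference; in a supervised setup the network is trained to predict these values from the spatial features.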
In today’s world, many people suffer from mental health problems such as depression and anxiety. If these conditions are not identified and treated early, they can get worse quickly and have far-reaching negative effects. Unfortunately, many people suffering from these conditions, especially depression and hypertension, are unaware of their existence until the conditions become chronic. Thus, this paper proposes a novel approach using the Bi-directional Long Short-Term Memory (Bi-LSTM) algorithm and the Global Vector (GloVe) algorithm for the prediction and treatment of these conditions. Smartwatches and fitness bands can be equipped with these algorithms and can share data with a variety of IoT devices and smart systems to better understand and analyze the user’s condition. We compared the accuracy and loss on the training dataset and the validation dataset for two models, namely Bi-LSTM without a global vector layer and Bi-LSTM with a global vector layer. The Bi-LSTM model without a global vector layer had an accuracy of 83%, while the Bi-LSTM model with a global vector layer had an accuracy of 86%, a precision of 86.4%, and an F1 score of 0.861. In addition to providing basic therapies for the treatment of identified cases, our model also helps prevent the deterioration of associated conditions, making our method a real-world solution.
Mental workload plays a vital role in cognitive impairment. The impairment refers to a person’s difficulty in remembering, receiving new information, learning new things, concentrating, or making decisions, to a degree that seriously affects everyday life. In this paper, a simultaneous capacity (SIMKAP) experiment-based EEG workload analysis is presented using 45 subjects for multitasking mental workload estimation, with subject-wise attention loss calculation as well as short-term memory loss measurement. Using an open-access preprocessed EEG dataset, the discrete wavelet transform (DWT) was used for feature extraction and the minimum redundancy maximum relevance (MRMR) technique was used to select the most relevant features. Wavelet decomposition was also used to decompose the EEG signals into five sub-bands. Fourteen statistical features were calculated from each sub-band signal to form a 5 × 14 window. The Neural Network (Narrow) classification algorithm was used to classify the dataset into low and high workload conditions, and a comparison was made with several other machine learning models. The results show a classifier accuracy of 86.7%, precision of 84.4%, F1 score of 86.33%, and recall of 88.37%, which surpasses the state-of-the-art methods in the literature. This prediction is expected to greatly facilitate the assessment of memory and attention loss impairments.
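The decomposition-plus-features step described above can be sketched as follows. The abstract names neither the wavelet nor the fourteen statistics, so this example assumes a Haar wavelet, a four-level decomposition (four detail bands plus one approximation band), and five illustrative statistics per band; the test signal is synthetic.

```python
import math
import statistics


def haar_step(signal):
    """One level of the orthonormal Haar DWT: returns (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal) - 1, 2)]
    return approx, detail


def five_sub_bands(signal, levels=4):
    """Decompose into `levels` detail bands plus the final approximation band."""
    bands = []
    current = list(signal)
    for _ in range(levels):
        current, detail = haar_step(current)
        bands.append(detail)
    bands.append(current)
    return bands


def band_features(band):
    """A few illustrative statistics (the paper uses fourteen, not listed here)."""
    return [
        statistics.fmean(band),
        statistics.pstdev(band),
        min(band),
        max(band),
        sum(x * x for x in band),  # band energy
    ]


# Synthetic stand-in for one EEG channel (64 samples).
signal = [math.sin(0.3 * n) + 0.5 * math.sin(2.1 * n) for n in range(64)]
bands = five_sub_bands(signal)
feature_matrix = [band_features(b) for b in bands]  # 5 rows, one per sub-band
print(len(feature_matrix), len(feature_matrix[0]))
```

With fourteen statistics per band instead of five, the same layout gives the 5 × 14 window mentioned in the abstract.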
To address the problem that some existing traffic flow prediction models cover only a single road segment and do not pre-process their input data, a heuristic threshold algorithm is used to de-noise the original traffic flow data after wavelet decomposition. The correlation coefficients of the road traffic flow data are calculated and a data compression matrix of road traffic flow is constructed. Data de-noising minimizes the interference of noisy data with the model, while the correlation analysis of road network data enables prediction at the road network level. To exploit the advantages of the long short-term memory (LSTM) network in processing time series data, the compression matrix is input into the constructed LSTM model for short-term traffic flow prediction. The LSTM-1 and LSTM-2 models were trained on the de-noised data and the original data, respectively. In simulation experiments with different prediction horizons, the results of the proposed prediction model were compared with those of other methods. The accuracy of the LSTM-2 model proposed in this paper is 10.278% higher on average than that of the other prediction methods, and its prediction accuracy reaches 95.58%, which shows that the proposed short-term traffic flow prediction method is efficient.
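The correlation step above can be illustrated with a small sketch. The abstract does not say how the compression matrix is built from the coefficients, so selecting segments whose Pearson correlation with a target segment exceeds a threshold is an assumption here, as are the segment names, the flow values, and the 0.8 threshold.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def correlated_segments(flows, target, threshold=0.8):
    """Names of segments whose flow correlates with the target segment."""
    return [
        name
        for name, series in flows.items()
        if pearson(flows[target], series) >= threshold
    ]


# Hypothetical vehicle counts per interval for three road segments.
flows = {
    "A": [10, 20, 30, 40],
    "B": [12, 22, 33, 41],
    "C": [40, 10, 35, 5],
}
selected = correlated_segments(flows, "A")
print(selected)
```

Only the selected segments would then be stacked into the input matrix, which reduces the input size while keeping the segments that carry information about the target.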
Predicting wind speed accurately is essential to ensure the stability of the wind power system and improve the utilization rate of wind energy. However, owing to the stochastic and intermittent nature of wind speed, accurate prediction is difficult. This paper proposes a new hybrid deep learning model for short-term wind speed prediction based on the empirical wavelet transform, recurrent neural networks, and error correction. The empirical wavelet transform is applied to decompose the original wind speed series. The long short-term memory network and the Elman neural network are adopted to predict the low-frequency and high-frequency wind speed sub-layers, respectively, to balance computational efficiency and prediction accuracy. An error correction strategy based on a deep long short-term memory network is developed to correct the prediction errors. Four actual wind speed series are used to verify the effectiveness of the proposed model. The empirical results indicate that the proposed method performs satisfactorily in wind speed prediction.
Traffic flow prediction in urban areas is essential in the Intelligent Transportation System (ITS). Short-Term Traffic Flow (STTF) prediction operates on traffic flow series, estimating the number of vehicles that will appear during the next hourly time interval. Precise STTF prediction is critical in the Intelligent Transportation System. Various existing systems aim at short-term traffic forecasting, and ensuring good precision has been a significant task over the past few years. The main objective of this paper is to propose a new model to predict STTF for every hour of a day. We propose a novel hybrid algorithm, named PALKNN, that uses Principal Component Analysis (PCA), Stacked Auto-Encoder (SAE), Long Short-Term Memory (LSTM), and K-Nearest Neighbors (KNN). Firstly, PCA removes unwanted information from the dataset and selects essential features. Secondly, SAE is used to reduce the dimension of the input data using one-hot encoding so that the model can be trained faster. Thirdly, LSTM takes the input from SAE, where the data is sorted in ascending order based on the important features, and generates the derived value. Finally, a KNN regressor takes information from LSTM to predict traffic flow. The forecasting performance of the PALKNN model is investigated with the Open Road Traffic Statistics dataset, Great Britain, UK. This paper improves traffic flow prediction for every hour of a day with a minimal error value. An extensive experimental analysis was performed on the benchmark dataset. The evaluated results indicate a significant improvement of the proposed PALKNN model over recent approaches such as KNN, SARIMA, Logistic Regression, RNN, and LSTM, with a root mean square error (RMSE) of 2.07%, a mean square error (MSE) of 4.1%, and a mean absolute error (MAE) of 2.04%.
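For reference, the three error measures quoted above can be computed as in this minimal sketch. The hourly counts are made-up values, and the abstract does not state how its figures were normalised into percentages, so the sketch reports the raw metrics only.

```python
import math


def mse(actual, predicted):
    """Mean square error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(mse(actual, predicted))


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical hourly vehicle counts and model forecasts.
actual = [120.0, 135.0, 150.0, 160.0]
predicted = [118.0, 138.0, 149.0, 164.0]
print(mse(actual, predicted), rmse(actual, predicted), mae(actual, predicted))
```

RMSE is in the same unit as the traffic counts and penalises large misses more heavily than MAE, which is why the two are usually reported together.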
Wind power volatility not only limits large-scale grid connection but also poses many challenges to safe grid operation. Accurate wind power prediction can mitigate the adverse effects of wind power volatility on wind power grid connections. Because the data before and after a given point in the wind power series jointly determine the accuracy of the prediction model, a short-term wind power prediction method based on a combined neural network is proposed. First, a Bi-directional Long Short-Term Memory (BiLSTM) network prediction model is constructed, and the bi-directional nature of the BiLSTM network is used to mine the wind power data in depth and find the correlation information within the data. Second, to avoid the limitation of a single prediction model when the wind power changes abruptly, a combined model of the Wavelet Transform-Improved Adaptive Genetic Algorithm-Back Propagation (WT-IAGA-BP) neural network and the BiLSTM network is constructed for short-term wind power prediction. Finally, a comparison with the LSTM, BiLSTM, WT-LSTM, WT-BiLSTM, WT-IAGA-BP, and WT-IAGA-BP&LSTM prediction models verifies that the short-term wind power prediction model combining the WT-IAGA-BP neural network and the BiLSTM network has higher prediction accuracy.
In the electricity market, real-time prices fluctuate unstably, and changes in short-term load are determined by many factors. By studying the timing of charging and discharging, as well as the economic benefits of energy storage participating in the power market, this paper treats energy storage scheduling as one of the factors that affect the short-term load time series, along with time-of-use price, holidays, and temperature. A deep learning network is used to predict the short-term load: a convolutional neural network (CNN) extracts the features, and a long short-term memory (LSTM) network learns the temporal characteristics of the load value, which effectively improves prediction accuracy. Taking the load data of a certain region as an example, the CNN-LSTM prediction model is compared with a single LSTM prediction model. The experimental results show that the CNN-LSTM deep learning network, with energy storage participating in dispatching, achieves high prediction accuracy for short-term power load forecasting.
A study shows that music lessons clearly enhance children's cognitive abilities, including short-term memory and planning, which leads to improved academic performance. The research is the first large-scale and long-term study to be adapted into the regular school curriculum. Visual arts lessons were also found to significantly improve children's visual memory.
Voice, motion, and mimicry are naturalistic control modalities that have replaced text or display-driven control in human-computer communication (HCC). Specifically, the voice carries a lot of information, revealing details about the speaker’s goals and desires, as well as their internal condition. Certain vocal characteristics reveal the speaker’s mood, intention, and motivation, while analysis of the words helps the speaker’s request to be understood. Voice emotion recognition has become an essential component of modern HCC networks. Integrating findings from the various disciplines involved in identifying vocal emotions is also challenging. Many sound analysis techniques were developed in the past. With the development of artificial intelligence (AI), and especially Deep Learning (DL) technology, research incorporating real data is becoming increasingly common. Thus, this research presents a novel selfish herd optimization-tuned long short-term memory (SHO-LSTM) strategy to identify vocal emotions in human communication. The RAVDESS public dataset is used to train the suggested SHO-LSTM technique. Wiener filter (WF) and Mel-frequency cepstral coefficient (MFCC) techniques are used, respectively, to remove noise and extract features from the data. LSTM is applied to the extracted features, and SHO is used to optimize the LSTM network’s parameters for effective emotion recognition. The proposed framework was implemented and tested in Python. In the assessment phase, several metrics are used to evaluate the proposed model’s detection capability: F1-score (95%), precision (95%), recall (96%), and accuracy (97%). The SHO-LSTM’s outcomes are compared with those of previously conducted research. Based on these comparative assessments, the suggested approach outperforms current approaches in vocal emotion recognition.
Text extraction from images using traditional techniques of image collection and pattern recognition with machine learning is time-consuming because of the number of features extracted from the images. Deep neural networks offer effective solutions for extracting text features from images with few techniques and can be trained on large image datasets with significant results. This study proposes using dual max pooling and concatenated Convolutional Neural Network (CNN) layers with the activation functions ReLU and the Optimized Leaky ReLU (OLRelu). The proposed method works by dividing the word image into slices that contain characters, and then passing them to deep learning layers to extract feature maps and re-form the predicted words. Bidirectional Long Short-Term Memory (BiLSTM) layers extract more compelling features and link the time sequence in the forward and backward directions during the training phase. The Connectionist Temporal Classification (CTC) function calculates the training and validation loss rates, in addition to decoding the extracted features to re-form characters and linking them according to their time sequence. The performance of the proposed model is evaluated using training and validation loss errors on the Mjsynth and Integrated Argument Mining Tasks (IAM) datasets. On IAM, the average loss error was 2.09% with the proposed dual max pooling and OLRelu. On the Mjsynth dataset, the best validation loss rate fell to 2.2% when applying concatenated CNN layers and ReLU.
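The CTC decoding step mentioned above (turning per-slice predictions back into characters in time order) can be illustrated with greedy, best-path decoding. This is a generic sketch rather than the paper's implementation; the alphabet, the blank index, and the helper `one_hot` are hypothetical.

```python
def ctc_greedy_decode(frame_probs, alphabet, blank=0):
    """Best-path CTC decoding: pick the top class per frame, merge
    consecutive repeats, then drop the blank symbol."""
    best = [max(range(len(p)), key=p.__getitem__) for p in frame_probs]
    decoded = []
    previous = None
    for idx in best:
        if idx != previous and idx != blank:
            decoded.append(alphabet[idx])
        previous = idx
    return "".join(decoded)


def one_hot(index, size=4):
    """Hypothetical helper: a probability row peaked at `index`."""
    row = [0.1] * size
    row[index] = 0.7
    return row


alphabet = ["-", "c", "a", "t"]  # index 0 is the CTC blank
# Frame-wise best path: c c - a a - t
frames = [one_hot(i) for i in (1, 1, 0, 2, 2, 0, 3)]
word = ctc_greedy_decode(frames, alphabet)
print(word)
```

The blank symbol is what lets CTC distinguish a genuinely doubled letter (for example `a - a`) from one letter that spans several slices (`a a`).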
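Objective: Errors in 2D pose estimation are the main cause of errors in 3D human pose estimation. The key to improving 3D human pose estimation is how to map a 2D pose to the optimal and most plausible 3D pose in the presence of 2D errors or noise. This paper proposes a 3D pose estimation method that combines sparse representation with a deep model, integrating geometric priors of the 3D pose space with temporal information to improve the accuracy of 3D pose estimation. Method: A 3D deformable shape model incorporating sparse representation is used to obtain a reliable initial 3D estimate for each single frame. A multichannel long short-term memory (MLSTM) denoising encoder/decoder is constructed, and the single-frame initial 3D estimates are fed into it as a time series. The MLSTM denoising encoder/decoder learns the temporal dependencies of human poses between adjacent frames and applies a temporal smoothness constraint to obtain the final optimized 3D pose. Results: Comparative experiments were conducted on the Human3.6M dataset. For two kinds of input data, namely the 2D coordinates provided by the dataset and the 2D coordinates estimated by a convolutional neural network, the average reconstruction error of video sequences optimized by the MLSTM denoising encoder/decoder decreased by 12.6% and 13%, respectively, compared with single-frame estimation. Compared with an existing video-based sparse model method, the average reconstruction error on video decreased by 6.4% and 9.1%. For the estimated 2D coordinates, compared with an existing deep model method, the average reconstruction error on video decreased by 12.8%. Conclusion: The proposed MLSTM denoising encoder/decoder based on temporal information, combined with a sparse model, makes effective use of prior knowledge of 3D poses and of the temporal and spatial dependencies of continuously changing human poses across video frames, and improves the accuracy of monocular video 3D pose estimation to a certain extent.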
Short-term traffic flow forecasting is a significant part of intelligent transportation systems. In some traffic control scenarios, obtaining future traffic flow in advance gives the highway management department sufficient time to formulate corresponding traffic flow control measures. Hence, it is meaningful to establish an accurate short-term traffic flow forecasting method and provide a reference for peak traffic flow warning. This paper proposes a new hybrid model for traffic flow forecasting, which is composed of the variational mode decomposition (VMD) method, the group method of data handling (GMDH) neural network, the bi-directional long short-term memory (BILSTM) network, and the ELMAN network, and is optimized by the imperialist competitive algorithm (ICA). To illustrate the performance of the proposed model, several comparative experiments are conducted between the proposed model and other models. The experimental results show that 1) the BILSTM, GMDH, and ELMAN networks have better predictive performance than other single models; and 2) VMD significantly improves the predictive performance of the ICA-GMDH-BILSTM-ELMAN model, and its effect is better than that of the EEMD and FEEMD methods. To conclude, the proposed model, made up of the VMD method, the ICA method, the BILSTM network, the GMDH network, and the ELMAN network, has excellent predictive ability for traffic flow series.
Rotating machinery is important to industrial production. Any failure of rotating machinery, especially the failure of rolling bearings, can lead to equipment shutdown and even more serious incidents. Therefore, accurate residual life prediction plays a crucial role in guaranteeing the safety and reliability of machine operation and in reducing maintenance costs. To increase the forecasting precision of the remaining useful life (RUL) of rolling bearings, an advanced approach combining the elastic net with the long short-term memory (LSTM) network is proposed, referred to as E-LSTM. The E-LSTM algorithm consists of an elastic net and an LSTM, and takes temporal-spatial correlation into consideration to forecast the RUL through the LSTM. To solve the over-fitting problem of the LSTM neural network during training, an elastic-net-based regularization term is introduced into the LSTM structure. In this way, the change of the output can be well characterized to express the bearing degradation mode. Experimental results on real-world data demonstrate that the proposed E-LSTM method obtains higher stability and relevant values that are useful for forecasting the RUL of bearings. Furthermore, these results also indicate that E-LSTM achieves better performance.
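As a hedged illustration of what an elastic-net-based regularization term usually looks like, the training objective can be written in the general form below. The symbols are assumptions for this sketch and are not taken from the paper: $W$ denotes the LSTM weights being regularized, $\mathcal{L}_{\text{data}}$ the prediction loss on the RUL targets, and $\lambda_1, \lambda_2 \ge 0$ the penalty strengths.

```latex
\mathcal{L}(W) \;=\; \mathcal{L}_{\text{data}}(W)
  \;+\; \lambda_1 \lVert W \rVert_1
  \;+\; \lambda_2 \lVert W \rVert_2^2
```

The $\ell_1$ term pushes small weights to exactly zero, while the squared $\ell_2$ term shrinks all weights smoothly; combining the two is the standard elastic net way of limiting over-fitting.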
With the increasing complexity of industrial systems, simply detecting and diagnosing a fault may be insufficient in some cases, and prognosing the fault ahead of time can be necessary. Accurate prediction of key alarm variables in a chemical process can indicate possible changes and thus reduce the probability of abnormal conditions. According to the characteristics of chemical process data, this work proposes a prediction model for key alarm variables in chemical processes based on dynamic-inner principal component analysis (DiPCA) and long short-term memory (LSTM). DiPCA is used to extract the most dynamic components for prediction, while LSTM is used to learn the relationship and predict the key alarm variables. This work applies the model to a simulation data set and a real hydrogenation process data set and explains the model's validity from the essential characteristics. A comparison with different models shows that our model has better prediction accuracy and performance, which can provide a basis for fault prognosis and health management.
Renewable energy has been emerging as a major source of energy, driven by its aggressive expansion and falling costs. Most renewable energy sources involve turbines, whose operation and maintenance are vital and difficult tasks. Condition monitoring and fault diagnosis have seen remarkable advances in approaches, practices, and technology during the last decade. Turbines mostly use rotating machinery, and analyzing its signals to localize a defect has been challenging. This paper proposes a new hybrid model in which multiple swarm intelligence models are evaluated to optimize the conventional Long Short-Term Memory (LSTM) model for classifying faults from vibration signal data acquired from the gearbox. This helps to analyze the performance and behavioral patterns of the system more effectively and efficiently, which in turn helps to recommend replacement of the unit with higher precision. The results demonstrate that the proposed hybrid modeling approach is effective in classifying gearbox faults from time series data and achieves higher diagnostic accuracy than conventional LSTM methods.
A deep-learning-based method, called ConvLSTMP3, is developed to predict sea surface heights (SSHs). ConvLSTMP3 is data-driven: it treats the SSH prediction problem as one of extracting the spatial-temporal features of SSHs, in which the spatial features are “learned” by convolutional operations while the temporal features are tracked by long short-term memory (LSTM). Trained on a reanalysis dataset of the South China Sea (SCS), ConvLSTMP3 is applied to SSH prediction in a region of the SCS east of the Vietnam coast that features eddies and offshore currents in summer. Experimental results show that ConvLSTMP3 achieves good prediction skill, with a mean RMSE of 0.057 m and an accuracy of 93.4% averaged over a 15-d prediction period. In particular, ConvLSTMP3 performs better than a full-dynamics ocean model in predicting the temporal evolution of mesoscale eddies in the region. Given the much lower computational cost of prediction with ConvLSTMP3, our study suggests that the deep learning technique is very useful and effective in SSH prediction and could be an alternative approach to operational prediction for ocean environments in the future.
Computer-empowered detection of possible faults in Heating, Ventilation and Air-Conditioning (HVAC) subsystems, e.g., chillers, is one of the most important applications of Artificial Intelligence (AI) integrated with the Internet of Things (IoT). The cyber-physical system greatly enhances the safety and security of working facilities, reducing time, saving energy, and protecting human health. Under the current trends of smart building design and energy management optimization, Automated Fault Detection and Diagnosis (AFDD) of chillers integrated with IoT is in high demand. Recent studies show that standard machine learning techniques, such as Principal Component Analysis (PCA), Support Vector Machine (SVM), and tree-structure-based algorithms, are useful in capturing various chiller faults with high accuracy rates. With the fast development of deep learning technology, Convolutional Neural Networks (CNNs) have been widely and successfully applied in various fields. However, for chiller AFDD, few existing works adopt CNNs and their extensions in the feature extraction and classification processes. In this study, we propose to perform chiller AFDD using a CNN-based approach. The proposed approach has two distinct advantages over existing machine learning-based chiller AFDD methods. First, the CNN-based approach does not require a separate feature selection/extraction process. Since CNNs are known for their feature extraction capability, the feature extraction and classification processes are merged, leading to a neater AFDD framework than traditional approaches. Second, the CNN-based approach significantly improves classification accuracy compared with traditional methods.
Traffic flow prediction, as the basis of signal coordination and travel time prediction, has become a research focus in the field of transportation. Researchers have proposed a variety of methods for traffic flow prediction, but most of them use only the time-domain information of traffic flow data and ignore the impact of spatial correlation on the predicted flow of the target road segment, which leads to poor prediction accuracy. In this paper, a combined traffic flow prediction model called long short-term memory and random forest (LSTM-RF) is proposed. In the prediction process, the long short-term memory (LSTM) model is used to extract the time sequence features of the target road segment. Then, the value predicted by the LSTM and the information collected from adjacent upstream and downstream sections are used together as the input features of the random forest model to analyze the spatial-temporal correlation of traffic flow and obtain the final prediction results. The method was tested and verified on traffic flow data from 132 urban road sections collected by the license plate recognition system in Guiyang City. The results show that the method achieves better prediction accuracy than a single model, with a clearly reduced prediction error.
Funding: This work is supported by the National Natural Science Foundation of China (NSFC) under Grant Nos. 61571106, 61501169, and 41706103, and by the Fundamental Research Funds for the Central Universities under Grant No. 2242013K30010.
Funding: This research is funded by Vellore Institute of Technology, Chennai, India.
Abstract: In today's world, many people suffer from mental health problems such as depression and anxiety. If these conditions are not identified and treated early, they can worsen quickly and have far-reaching negative effects. Unfortunately, many people suffering from these conditions, especially depression and hypertension, are unaware of their existence until the conditions become chronic. This paper therefore proposes a novel approach using the Bi-directional Long Short-Term Memory (Bi-LSTM) algorithm and the Global Vector (GloVe) algorithm for the prediction and treatment of these conditions. Smartwatches and fitness bands can be equipped with these algorithms and can share data with a variety of IoT devices and smart systems to better understand and analyze the user's condition. We compared the accuracy and loss on the training and validation datasets for two models, namely Bi-LSTM without a global vector layer and Bi-LSTM with a global vector layer. The Bi-LSTM model without a global vector layer had an accuracy of 83%, while Bi-LSTM with a global vector layer had an accuracy of 86%, a precision of 86.4%, and an F1 score of 0.861. In addition to providing basic therapies for the treatment of identified cases, our model also helps prevent the deterioration of associated conditions, making our method a real-world solution.
Abstract: Mental workload plays a vital role in cognitive impairment. The impairment refers to a person's difficulty in remembering, receiving new information, learning new things, concentrating, or making decisions, to a degree that seriously affects everyday life. In this paper, a simultaneous capacity (SIMKAP) experiment-based EEG workload analysis is presented using 45 subjects for multitasking mental workload estimation, with subject-wise attention loss calculation as well as short-term memory loss measurement. Using an open-access preprocessed EEG dataset, the discrete wavelet transform (DWT) was used for feature extraction and the minimum redundancy maximum relevance (MRMR) technique was used to select the most relevant features. Wavelet decomposition was also used to decompose the EEG signals into five sub-bands. Fourteen statistical features were calculated from each sub-band signal to form a 5 × 14 window. The Neural Network (Narrow) classification algorithm was used to classify the dataset into low and high workload conditions, and it was compared with several other machine learning models. The classifier achieves an accuracy of 86.7%, precision of 84.4%, F1 score of 86.33%, and recall of 88.37%, exceeding state-of-the-art methods in the literature. This prediction is expected to greatly facilitate the assessment of memory and attention loss impairments.
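As a quick consistency check on the metrics quoted above, the F1 score is the harmonic mean of precision and recall. The sketch below (the function name is an illustrative choice) applies that standard definition to the reported precision and recall.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision 84.4% and recall 88.37% as quoted in the abstract above.
reported_f1 = f1_score(0.844, 0.8837)
```

With those inputs the function returns roughly 0.8634, which agrees with the reported F1 score of 86.33% to within rounding.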
Funding: National Natural Science Foundation of China (No. 71961016); Planning Fund for the Humanities and Social Sciences of the Ministry of Education (Nos. 15XJAZH002, 18YJAZH148); Natural Science Foundation of Gansu Province (No. 18JR3RA125).
Abstract: Some existing traffic flow prediction models cover only a single road segment and do not pre-process the model input data. To address this, a heuristic threshold algorithm is used to de-noise the original traffic flow data after wavelet decomposition. The correlation coefficients of the road traffic flow data are calculated and a data compression matrix of road traffic flow is constructed. Data de-noising minimizes the interference of the data with the model, while the correlation analysis of road network data enables prediction at the road network level. Exploiting the advantages of the long short-term memory (LSTM) network in time series processing, the compression matrix is fed into the constructed LSTM model for short-term traffic flow prediction. The LSTM-1 and LSTM-2 models were trained on de-noised data and original data, respectively. In simulation experiments, different prediction horizons were set and the results of the proposed prediction model were compared with those of other methods. The accuracy of the LSTM-2 model proposed in this paper increases by 10.278% on average compared with other prediction methods, and the prediction accuracy reaches 95.58%, which shows that the proposed short-term traffic flow prediction method is efficient.
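The correlation step described above can be illustrated with a small sketch that computes Pearson correlation coefficients between road segments and keeps those strongly correlated with a target segment. The 0.8 threshold, the segment names, and the flow values are assumptions for illustration; the abstract does not say how its compression matrix is built.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no correlation information
    return cov / math.sqrt(var_x * var_y)


def correlated_segments(flows, target, threshold=0.8):
    """Names of segments whose flow correlates with the target at >= threshold."""
    return [
        name
        for name, series in flows.items()
        if name != target and pearson(flows[target], series) >= threshold
    ]


# Hypothetical flow counts for three road segments over four intervals.
flows = {
    "A": [10, 20, 30, 40],
    "B": [12, 22, 29, 41],
    "C": [40, 10, 35, 5],
}
selected = correlated_segments(flows, "A")
```

The series of the selected segments could then be stacked with the target series as model input, which is one plausible reading of a network-level input matrix.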
Funding: Gansu Province Soft Scientific Research Projects (No. 2015GS06516); Funds for Distinguished Young Scientists of Lanzhou University of Technology, China (No. J201304).
Abstract: Predicting wind speed accurately is essential to ensure the stability of the wind power system and improve the utilization rate of wind energy. However, owing to the stochastic and intermittent nature of wind speed, accurate prediction is difficult. This paper proposes a new hybrid deep learning model for short-term wind speed prediction based on the empirical wavelet transform, recurrent neural networks, and error correction. The empirical wavelet transform is applied to decompose the original wind speed series. The long short-term memory network and the Elman neural network are adopted to predict the low-frequency and high-frequency wind speed sub-layers, respectively, to balance computational efficiency and prediction accuracy. An error correction strategy based on a deep long short-term memory network is developed to correct the prediction errors. Four real wind speed series are used to verify the effectiveness of the proposed model. The empirical results indicate that the proposed method performs satisfactorily in wind speed prediction.
Abstract: Traffic flow prediction in urban areas is essential to the Intelligent Transportation System (ITS). Short-Term Traffic Flow (STTF) prediction works on traffic flow series, estimating the number of vehicles that will appear during the next time instance, per hour. Precise STTF prediction is critical in ITS. Various existing systems aim at short-term traffic forecasting, and ensuring good precision has been a significant task over the past few years. The main objective of this paper is to propose a new model that predicts STTF for every hour of a day. We propose a novel hybrid algorithm named PALKNN that uses Principal Component Analysis (PCA), Stacked Auto-Encoder (SAE), Long Short-Term Memory (LSTM), and K-Nearest Neighbors (KNN). First, PCA removes unwanted information from the dataset and selects essential features. Second, SAE reduces the dimension of the input data using one-hot encoding so that the model can be trained faster. Third, LSTM takes the input from SAE, where the data is sorted in ascending order based on the important features, and generates the derived value. Finally, a KNN regressor takes the output of LSTM to predict traffic flow. The forecasting performance of the PALKNN model is investigated on the Open Road Traffic Statistics dataset, Great Britain, UK. The paper improves traffic flow prediction for every hour of a day with a minimal error value. An extensive experimental analysis was performed on this benchmark dataset. The results indicate a significant improvement of the proposed PALKNN model over recent approaches such as KNN, SARIMA, Logistic Regression, RNN, and LSTM, with a root mean square error (RMSE) of 2.07%, mean square error (MSE) of 4.1%, and mean absolute error (MAE) of 2.04%.
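For reference, the three error measures named above have the following standard definitions. This is a minimal sketch with made-up hourly counts; the abstract quotes its values as percentages and does not state the normalization used, so none is applied here.

```python
import math


def mse(y_true, y_pred):
    """Mean square error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error: square root of the MSE."""
    return math.sqrt(mse(y_true, y_pred))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical observed and predicted hourly vehicle counts.
observed = [100, 120, 90, 110]
predicted = [98, 123, 91, 108]
errors = (mse(observed, predicted), rmse(observed, predicted), mae(observed, predicted))
```

Because RMSE squares the residuals before averaging, it penalizes occasional large misses more heavily than MAE, which is why papers usually report both.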
Funding: Support of the National Natural Science Foundation of China (No. 52067021); Natural Science Foundation of Xinjiang (2022D01C35); Excellent Youth Scientific and Technological Talents Plan of Xinjiang (No. 2019Q012); Major Science & Technology Special Project of Xinjiang Uygur Autonomous Region (2022A01002-2).
Abstract: Wind power volatility not only limits large-scale grid connection but also poses many challenges to safe grid operation. Accurate wind power prediction can mitigate the adverse effects of wind power volatility on wind power grid connection. Because the antecedent and precedent wind power data jointly determine the accuracy of the prediction model, a short-term wind power prediction method based on a combined neural network is proposed. First, a Bi-directional Long Short-Term Memory (BiLSTM) network prediction model is constructed, and the bi-directional nature of the BiLSTM network is used to mine the wind power data in depth and find the correlation information within the data. Second, to avoid the limitations of a single prediction model when the wind power changes abruptly, a combined model of the Wavelet Transform-Improved Adaptive Genetic Algorithm-Back Propagation (WT-IAGA-BP) neural network and the BiLSTM network is constructed for short-term wind power prediction. Finally, a comparison with the LSTM, BiLSTM, WT-LSTM, WT-BiLSTM, WT-IAGA-BP, and WT-IAGA-BP&LSTM prediction models verifies that the short-term wind power prediction model combining the WT-IAGA-BP neural network and the BiLSTM network has higher prediction accuracy.
Funding: Supported by a State Grid Zhejiang Electric Power Co., Ltd. Economic and Technical Research Institute Project (Key Technologies and Empirical Research of Diversified Integrated Operation of User-Side Energy Storage in Power Market Environment, No. 5211JY19000W) and by the National Natural Science Foundation of China (Research on Power Market Management to Promote Large-Scale New Energy Consumption, No. 71804045).
Abstract: In the electricity market, real-time prices fluctuate unstably, and changes in short-term load are determined by many factors. By studying the timing of charging and discharging, as well as the economic benefits of energy storage when participating in the power market, this paper treats energy storage scheduling as one factor affecting short-term power load, which affects the short-term load time series along with time-of-use price, holidays, and temperature. A deep learning network is used to predict the short-term load: a convolutional neural network (CNN) extracts the features, and a long short-term memory (LSTM) network learns the temporal characteristics of the load value, which can effectively improve prediction accuracy. Taking the load data of a certain region as an example, the CNN-LSTM prediction model is compared with a single LSTM prediction model. The experimental results show that the CNN-LSTM deep learning network, with energy storage participating in dispatching, achieves high prediction accuracy for short-term power load forecasting.
Abstract: A study shows that music lessons clearly enhance children's cognitive abilities, including short-term memory and planning, which leads to improved academic performance. The research is the first large-scale and long-term study to be adapted into the regular school curriculum. Visual arts lessons were also found to significantly improve children's visual memory.
Funding: The author Dr. Arshiya S. Ansari extends appreciation to the Deanship of Postgraduate Studies and Scientific Research at Majmaah University for funding this research work through project number R-2025-1538.
Abstract: Voice, motion, and mimicry are naturalistic control modalities that have replaced text or display-driven control in human-computer communication (HCC). Specifically, the voice contains a lot of knowledge, revealing details about the speaker's goals and desires, as well as their internal condition. Certain vocal characteristics reveal the speaker's mood, intention, and motivation, while word study assists the speaker's demand to be understood. Voice emotion recognition has become an essential component of modern HCC networks. Integrating findings from the various disciplines involved in identifying vocal emotions is also challenging. Many sound analysis techniques were developed in the past. With the development of artificial intelligence (AI), and especially Deep Learning (DL) technology, research incorporating real data is becoming increasingly common. This research therefore presents a novel selfish herd optimization-tuned long short-term memory (SHO-LSTM) strategy to identify vocal emotions in human communication. The RAVDESS public dataset is used to train the suggested SHO-LSTM technique. Wiener filter (WF) and Mel-frequency cepstral coefficient (MFCC) techniques are used, respectively, to remove noise and extract features from the data. LSTM and SHO are applied to the extracted data, with SHO optimizing the LSTM network's parameters for effective emotion recognition. The proposed framework was implemented in Python. In the assessment phase, several metrics are used to evaluate the proposed model's detection capability, namely F1-score (95%), precision (95%), recall (96%), and accuracy (97%). The SHO-LSTM's outcomes are contrasted with those of previously conducted research. Based on comparative assessments, the suggested approach outperforms current approaches in vocal emotion recognition.
Funding: This project was supported under the Fundamental Research Grant Scheme (FRGS), grant FRGS/1/2019/ICT02/UKM/02/9, entitled "Convolution Neural Network Enhancement Based on Adaptive Convexity and Regularization Functions for Fake Video Analytics". This grant was received by Prof. Assis. Dr. S.N.H. Sheikh Abdullah, https://www.ukm.my/spifper/research_news/instrumentfunds.
Abstract: Text extraction from images using traditional techniques of image collection and pattern recognition with machine learning is time-consuming because of the number of features extracted from the images. Deep neural networks offer effective solutions for extracting text features from images with only a few techniques, and they can be trained on large image datasets with significant results. This study proposes using dual max-pooling and concatenated Convolutional Neural Network (CNN) layers with the activation functions ReLU and Optimized Leaky ReLU (OLRelu). The proposed method works by dividing the word image into slices that contain characters and then passing them to deep learning layers to extract feature maps and re-form the predicted words. Bidirectional Long Short-Term Memory (BiLSTM) layers extract more compelling features and link the time sequence in the forward and backward directions during the training phase. The Connectionist Temporal Classification (CTC) function calculates the training and validation loss rates, in addition to decoding the extracted features to re-form the characters and link them according to their time sequence. The performance of the proposed model is evaluated using training and validation loss errors on the Mjsynth and Integrated Argument Mining Tasks (IAM) datasets. On IAM, the average loss error was 2.09% with the proposed dual max-pooling and OLRelu. On the Mjsynth dataset, the best validation loss rate shrank to 2.2% when concatenated CNN layers and ReLU were applied.
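The CTC decoding step mentioned above can be illustrated with the simplest variant, greedy (best-path) decoding: take the most likely label per time step, collapse consecutive repeats, then drop blanks. This is a sketch under assumptions (blank at index 0, a toy three-symbol label set, invented frame scores); the abstract does not say which decoder the authors use.

```python
def ctc_greedy_decode(frame_scores, labels, blank=0):
    """Greedy CTC decoding.

    frame_scores: list of per-time-step score lists, one score per label.
    labels: list of label strings; labels[blank] is a placeholder for the blank.
    """
    # 1) Best label index per time step.
    best_path = [max(range(len(scores)), key=scores.__getitem__) for scores in frame_scores]

    # 2) Collapse consecutive repeats, 3) remove blanks.
    decoded = []
    previous = None
    for index in best_path:
        if index != previous and index != blank:
            decoded.append(labels[index])
        previous = index
    return "".join(decoded)


# Toy example (hypothetical scores): best path is a, a, blank, a, b, b.
labels = ["-", "a", "b"]
frames = [
    [0.10, 0.80, 0.10],
    [0.20, 0.70, 0.10],
    [0.90, 0.05, 0.05],
    [0.10, 0.60, 0.30],
    [0.10, 0.20, 0.70],
    [0.20, 0.10, 0.70],
]
text = ctc_greedy_decode(frames, labels)
```

The blank between the second and third `a` frames is what allows a genuinely doubled character to survive the collapse step, so this path decodes to "aab" rather than "ab".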
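Abstract: Objective: Errors in 2D pose estimation are the main cause of errors in 3D human pose estimation. Mapping a 2D pose to the optimal, most plausible 3D pose in the presence of 2D errors or noise is the key to improving 3D human pose estimation. This paper proposes a 3D pose estimation method that combines sparse representation with a deep model, so as to combine the spatial geometric priors of 3D poses with temporal information and thereby improve 3D pose estimation accuracy. Methods: A 3D deformable shape model incorporating sparse representation is used to obtain a reliable initial 3D estimate for each single frame. A multichannel long short-term memory (MLSTM) denoising encoder/decoder is constructed, and the single-frame initial 3D estimates are fed into it as a time series. The MLSTM denoising encoder/decoder learns the temporal dependencies of the person's pose between adjacent frames and imposes a temporal smoothness constraint, yielding the final optimized 3D pose. Results: Comparative experiments were carried out on the Human3.6M dataset. For two kinds of input data, namely the 2D coordinates provided by the dataset and 2D coordinates estimated by a convolutional neural network, the average reconstruction error of video sequences optimized by the MLSTM denoising encoder/decoder decreased by 12.6% and 13%, respectively, compared with single-frame estimation. Compared with an existing video-based sparse model method, the average reconstruction error on video decreased by 6.4% and 9.1%, respectively. For the estimated 2D coordinate data, the average reconstruction error on video decreased by 12.8% compared with an existing deep model method. Conclusion: The proposed MLSTM denoising encoder/decoder based on temporal information, combined with a sparse model, makes effective use of 3D pose priors and of the temporal and spatial dependencies of continuously changing poses across video frames, and improves the accuracy of monocular video 3D pose estimation to a certain extent.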
Funding: Project (61873283) supported by the National Natural Science Foundation of China; Project (KQ1707017) supported by the Changsha Science & Technology Project, China; Project (2019CX005) supported by the Innovation Driven Project of the Central South University, China.
Abstract: Short-term traffic flow forecasting is a significant part of intelligent transportation systems. In some traffic control scenarios, obtaining future traffic flow in advance gives the highway management department sufficient time to formulate corresponding traffic flow control measures. Hence, it is meaningful to establish an accurate short-term traffic flow forecasting method and provide a reference for peak traffic flow warning. This paper proposes a new hybrid model for traffic flow forecasting, which is composed of the variational mode decomposition (VMD) method, the group method of data handling (GMDH) neural network, the bi-directional long short-term memory (BILSTM) network, and the ELMAN network, and is optimized by the imperialist competitive algorithm (ICA). To illustrate the performance of the proposed model, several comparative experiments between the proposed model and other models are carried out. The experimental results show that 1) the BILSTM, GMDH, and ELMAN networks have better predictive performance than other single models; 2) VMD can significantly improve the predictive performance of the ICA-GMDH-BILSTM-ELMAN model, and the VMD method is more effective than the EEMD and FEEMD methods. In conclusion, the proposed model, made up of the VMD method, the ICA method, the BILSTM network, the GMDH network, and the ELMAN network, has excellent predictive ability for traffic flow series.
Funding: National Natural Science Foundation of China (No. 61972443); National Key Research and Development Plan Program of China (No. 2019YFE0105300); Hunan Provincial Hu-Xiang Young Talents Project of China (No. 2018RS3095); Hunan Provincial Natural Science Foundation of China (No. 2020JJ5199).
Abstract: Rotating machinery is important to industrial production. Any failure of rotating machinery, especially the failure of rolling bearings, can lead to equipment shutdown and even more serious incidents. Therefore, accurate residual life prediction plays a crucial role in guaranteeing machine operation safety and reliability and in reducing maintenance cost. To increase the forecasting precision of the remaining useful life (RUL) of rolling bearings, an advanced approach combining the elastic net with a long short-term memory (LSTM) network is proposed; the new approach is referred to as E-LSTM. The E-LSTM algorithm consists of an elastic net and an LSTM, taking temporal-spatial correlation into consideration to forecast the RUL through the LSTM. To solve the over-fitting problem of the LSTM neural network during training, an elastic net based regularization term is introduced into the LSTM structure. In this way, the change of the output can be well characterized to express the bearing degradation mode. Experimental results on real-world data demonstrate that the proposed E-LSTM method obtains higher stability and relevant values that are useful for bearing RUL forecasting. These results also indicate that E-LSTM achieves better performance.
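An elastic net regularization term mixes an L1 and a squared L2 penalty on the weights. The sketch below uses one common parameterization (overall strength `lam`, mixing ratio `alpha`); both names and the example weights are assumptions, and the abstract does not give the exact form used in E-LSTM.

```python
def elastic_net_penalty(weights, lam, alpha):
    """Elastic net penalty: lam * (alpha * ||w||_1 + (1 - alpha) / 2 * ||w||_2^2).

    alpha = 1 gives a pure L1 (lasso) penalty, alpha = 0 a pure L2 (ridge) one.
    """
    l1 = sum(abs(w) for w in weights)
    l2_squared = sum(w * w for w in weights)
    return lam * (alpha * l1 + (1 - alpha) / 2 * l2_squared)


# Hypothetical weight vector; in training this value would be added to the data loss.
penalty = elastic_net_penalty([1.0, -2.0, 0.0], lam=0.1, alpha=0.5)
```

The L1 part pushes small weights to exactly zero while the L2 part keeps the remaining weights from growing large, which is the usual motivation for using the combination against over-fitting.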
Funding: Support from the National Natural Science Foundation of China (21878171).
Abstract: As industrial systems grow more complex, simply detecting and diagnosing a fault may be insufficient in some cases, and it can be necessary to prognose the fault ahead of time. Accurate prediction of key alarm variables in a chemical process can indicate possible changes and so reduce the probability of abnormal conditions. According to the characteristics of chemical process data, this work proposes a prediction model for key alarm variables in chemical processes based on dynamic-inner principal component analysis (DiPCA) and long short-term memory (LSTM). DiPCA is used to extract the most dynamic components for prediction, while LSTM is used to learn the relationship and predict the key alarm variables. This work applies the model to a simulation data set and a real hydrogenation process data set and explains the model's validity in terms of the essential characteristics. A comparison with different models shows that our model has better prediction accuracy and performance, which can provide a basis for fault prognosis and health management.
Abstract: Renewable energy has been emerging as a major source of energy, driven by its aggressive expansion and falling costs. Most renewable energy sources involve turbines, whose operation and maintenance are vital and difficult. Condition monitoring and fault diagnosis have seen remarkable upgrades in approaches, practices, and technology during the last decade. Turbines mostly use rotating machinery, and analyzing their signals to localize a defect has been challenging. This paper proposes a new hybrid model in which multiple swarm intelligence models are evaluated for optimizing the conventional Long Short-Term Memory (LSTM) model in classifying faults from vibration signal data acquired from the gearbox. This helps to analyze the performance and behavioral patterns of the system more effectively and efficiently, which in turn helps to recommend replacement of the unit with higher precision. The results demonstrate that the proposed hybrid modeling approach is effective in classifying gearbox faults from time series data and achieves higher diagnostic accuracy than conventional LSTM methods.
Funding: The National Key Research and Development Program under contract Nos 2018YFC1406204 and 2018YFC1406201; the Guangdong Special Support Program under contract No. 2019BT2H594; the Taishan Scholar Foundation under contract No. tsqn201812029; the National Natural Science Foundation of China under contract Nos U1811464, 61572522, 61572523, 61672033, 61672248, 61873280, 41676016 and 41776028; the Natural Science Foundation of Shandong Province under contract Nos ZR2019MF012 and 2019GGX101067; the Fundamental Research Funds of Central Universities under contract Nos 18CX02152A and 19CX05003A-6; the fund of the Shandong Province Innovation Researching Group under contract No. 2019KJN014; the Key Special Project for Introduced Talents Team of the Southern Marine Science and Engineering Guangdong Laboratory (Guangzhou) under contract No. GML2019ZD0303.
Abstract: A deep-learning-based method, called ConvLSTMP3, is developed to predict sea surface heights (SSHs). ConvLSTMP3 is data-driven: it treats SSH prediction as a problem of extracting the spatial-temporal features of SSHs, in which the spatial features are "learned" by convolutional operations while the temporal features are tracked by long short-term memory (LSTM). Trained on a reanalysis dataset of the South China Sea (SCS), ConvLSTMP3 is applied to SSH prediction in a region of the SCS east of the Vietnam coast that is characterized by eddies and offshore currents in summer. Experimental results show that ConvLSTMP3 achieves good prediction skill, with a mean RMSE of 0.057 m and an accuracy of 93.4% averaged over a 15-d prediction period. In particular, ConvLSTMP3 predicts the temporal evolution of mesoscale eddies in the region better than a full-dynamics ocean model. Given the much lower computational cost of prediction with ConvLSTMP3, our study suggests that deep learning is very useful and effective for SSH prediction and could become an alternative for operational prediction of ocean environments in the future.
Funding: Supported by two Ministry of Education (MoE) Singapore Tier 1 research grants under grant numbers R-296-000-208-133 and R-296-000-241-114.
Abstract: Computer-empowered detection of possible faults in Heating, Ventilation and Air-Conditioning (HVAC) subsystems, e.g., chillers, is one of the most important applications of Artificial Intelligence (AI) integrated with the Internet of Things (IoT). Such a cyber-physical system greatly enhances the safety and security of working facilities, reducing time, saving energy, and protecting human health. Under the current trends of smart building design and energy management optimization, Automated Fault Detection and Diagnosis (AFDD) of chillers integrated with IoT is in high demand. Recent studies show that standard machine learning techniques, such as Principal Component Analysis (PCA), Support Vector Machine (SVM), and tree-structure-based algorithms, are useful in capturing various chiller faults with high accuracy. With the fast development of deep learning, Convolutional Neural Networks (CNNs) have been widely and successfully applied in various fields. However, for chiller AFDD, few existing works adopt CNN and its extensions in the feature extraction and classification processes. In this study, we propose to perform chiller FDD using a CNN-based approach. The proposed approach has two distinct advantages over existing machine learning-based chiller AFDD methods. First, the CNN-based approach does not require a separate feature selection/extraction process. Since CNN is well regarded for its feature extraction capability, the feature extraction and classification processes are merged, leading to a neater AFDD framework than traditional approaches. Second, with the CNN-based approach the classification accuracy is significantly improved compared to traditional methods.
Abstract: Traffic flow prediction, as the basis of signal coordination and travel time prediction, has become a research focus in the field of transportation. Researchers have proposed a variety of methods for traffic flow prediction, but most of them use only the time-domain information of traffic flow data and ignore the impact of spatial correlation on the predicted flow of the target road segment, which leads to poor prediction accuracy. In this paper, a combined traffic flow prediction model based on long short-term memory and random forest (LSTM-RF) is proposed. In the prediction process, the long short-term memory (LSTM) model is used to extract the time sequence features of the target road segment. Then, the value predicted by the LSTM and the information collected from adjacent upstream and downstream sections are used together as the input features of the random forest model to analyze the spatial-temporal correlation of traffic flow and obtain the final prediction. The method was tested and verified on traffic flow data from 132 urban road sections collected by the license plate recognition system in Guiyang City. The results show that the method is more accurate than the single model, with a clearly reduced prediction error.