Based on the CNN-LSTM fusion deep neural network,this paper proposes a seismic velocity model building method that can simultaneously estimate the root mean square(RMS)velocity and interval velocity from the common-mi...Based on the CNN-LSTM fusion deep neural network,this paper proposes a seismic velocity model building method that can simultaneously estimate the root mean square(RMS)velocity and interval velocity from the common-midpoint(CMP)gather.In the proposed method,a convolutional neural network(CNN)Encoder and two long short-term memory networks(LSTMs)are used to extract spatial and temporal features from seismic signals,respectively,and a CNN Decoder is used to recover RMS velocity and interval velocity of underground media from various feature vectors.To address the problems of unstable gradients and easily fall into a local minimum in the deep neural network training process,we propose to use Kaiming normal initialization with zero negative slopes of rectifi ed units and to adjust the network learning process by optimizing the mean square error(MSE)loss function with the introduction of a freezing factor.The experiments on testing dataset show that CNN-LSTM fusion deep neural network can predict RMS velocity as well as interval velocity more accurately,and its inversion accuracy is superior to that of single neural network models.The predictions on the complex structures and Marmousi model are consistent with the true velocity variation trends,and the predictions on fi eld data can eff ectively correct the phase axis,improve the lateral continuity of phase axis and quality of stack section,indicating the eff ectiveness and decent generalization capability of the proposed method.展开更多
Action recognition is important for understanding the human behaviors in the video,and the video representation is the basis for action recognition.This paper provides a new video representation based on convolution n...Action recognition is important for understanding the human behaviors in the video,and the video representation is the basis for action recognition.This paper provides a new video representation based on convolution neural networks(CNN).For capturing human motion information in one CNN,we take both the optical flow maps and gray images as input,and combine multiple convolutional features by max pooling across frames.In another CNN,we input single color frame to capture context information.Finally,we take the top full connected layer vectors as video representation and train the classifiers by linear support vector machine.The experimental results show that the representation which integrates the optical flow maps and gray images obtains more discriminative properties than those which depend on only one element.On the most challenging data sets HMDB51 and UCF101,this video representation obtains competitive performance.展开更多
Action recognition is an important topic in computer vision. Recently, deep learning technologies have been successfully used in lots of applications including video data for sloving recognition problems. However, mos...Action recognition is an important topic in computer vision. Recently, deep learning technologies have been successfully used in lots of applications including video data for sloving recognition problems. However, most existing deep learning based recognition frameworks are not optimized for action in the surveillance videos. In this paper, we propose a novel method to deal with the recognition of different types of actions in outdoor surveillance videos. The proposed method first introduces motion compensation to improve the detection of human target. Then, it uses three different types of deep models with single and sequenced images as inputs for the recognition of different types of actions. Finally, predictions from different models are fused with a linear model. Experimental results show that the proposed method works well on the real surveillance videos.展开更多
Combining both visible and infrared object information, multispectral data is a promising source data for automatic maritime ship recognition. In this paper, in order to take advantage of deep convolutional neural net...Combining both visible and infrared object information, multispectral data is a promising source data for automatic maritime ship recognition. In this paper, in order to take advantage of deep convolutional neural network and multispectral data, we model multispectral ship recognition task into a convolutional feature fusion problem, and propose a feature fusion architecture called Hybrid Fusion. We fine-tune the VGG-16 model pre-trained on ImageNet through three channels single spectral image and four channels multispectral images, and use existing regularization techniques to avoid over-fitting problem. Hybrid Fusion as well as the other three feature fusion architectures is investigated. Each fusion architecture consists of visible image and infrared image feature extraction branches, in which the pre-trained and fine-tuned VGG-16 models are taken as feature extractor. In each fusion architecture, image features of two branches are firstly extracted from the same layer or different layers of VGG-16 model. Subsequently, the features extracted from the two branches are flattened and concatenated to produce a multispectral feature vector, which is finally fed into a classifier to achieve ship recognition task. Furthermore, based on these fusion architectures, we also evaluate recognition performance of a feature vector normalization method and three combinations of feature extractors. Experimental results on the visible and infrared ship (VAIS) dataset show that the best Hybrid Fusion achieves 89.6% mean per-class recognition accuracy on daytime paired images and 64.9% on nighttime infrared images, and outperforms the state-of-the-art method by 1.4% and 3.9%, respectively.展开更多
为了提高短期风电功率预测的准确性,提出一种基于贝叶斯优化和特征融合的xLSTM(extended Long Short-Term Memory)-Transformer模型。该模型综合应用长短期记忆(LSTM)网络的时序处理能力和Transformer的自注意力机制的动态特征融合能力...为了提高短期风电功率预测的准确性,提出一种基于贝叶斯优化和特征融合的xLSTM(extended Long Short-Term Memory)-Transformer模型。该模型综合应用长短期记忆(LSTM)网络的时序处理能力和Transformer的自注意力机制的动态特征融合能力。借助贝叶斯优化方法,模型可在较少的迭代次数条件下优化超参数,显著降低模型对计算资源的依赖。实验结果表明,内蒙古某风电场数据集上,与单一的LSTM模型、Transformer模型、门控循环单元(GRU)模型以及未采用贝叶斯优化和特征融合的xLSTM-Transformer模型相比,当步长(LookBack)为4和8时,所提模型的决定系数R2较基准模型平均提升1.2%~11.3%;平均绝对误差(MAE)平均降低12.8%~38.4%;均方根误差(RMSE)平均降低8.6%~35.8%。结果表明,所提模型在短历史输入条件下具有更高的预测精度与稳定性。展开更多
基金financially supported by the Key Project of National Natural Science Foundation of China (No. 41930431)the Project of National Natural Science Foundation of China (Nos. 41904121, 41804133, and 41974116)Joint Guidance Project of Natural Science Foundation of Heilongjiang Province (No. LH2020D006)
文摘Based on the CNN-LSTM fusion deep neural network,this paper proposes a seismic velocity model building method that can simultaneously estimate the root mean square(RMS)velocity and interval velocity from the common-midpoint(CMP)gather.In the proposed method,a convolutional neural network(CNN)Encoder and two long short-term memory networks(LSTMs)are used to extract spatial and temporal features from seismic signals,respectively,and a CNN Decoder is used to recover RMS velocity and interval velocity of underground media from various feature vectors.To address the problems of unstable gradients and easily fall into a local minimum in the deep neural network training process,we propose to use Kaiming normal initialization with zero negative slopes of rectifi ed units and to adjust the network learning process by optimizing the mean square error(MSE)loss function with the introduction of a freezing factor.The experiments on testing dataset show that CNN-LSTM fusion deep neural network can predict RMS velocity as well as interval velocity more accurately,and its inversion accuracy is superior to that of single neural network models.The predictions on the complex structures and Marmousi model are consistent with the true velocity variation trends,and the predictions on fi eld data can eff ectively correct the phase axis,improve the lateral continuity of phase axis and quality of stack section,indicating the eff ectiveness and decent generalization capability of the proposed method.
基金Supported by the National High Technology Research and Development Program of China(863 Program,2015AA016306)National Nature Science Foundation of China(61231015)+2 种基金Internet of Things Development Funding Project of Ministry of Industry in 2013(25)Technology Research Program of Ministry of Public Security(2016JSYJA12)the Nature Science Foundation of Hubei Province(2014CFB712)
文摘Action recognition is important for understanding the human behaviors in the video,and the video representation is the basis for action recognition.This paper provides a new video representation based on convolution neural networks(CNN).For capturing human motion information in one CNN,we take both the optical flow maps and gray images as input,and combine multiple convolutional features by max pooling across frames.In another CNN,we input single color frame to capture context information.Finally,we take the top full connected layer vectors as video representation and train the classifiers by linear support vector machine.The experimental results show that the representation which integrates the optical flow maps and gray images obtains more discriminative properties than those which depend on only one element.On the most challenging data sets HMDB51 and UCF101,this video representation obtains competitive performance.
文摘Action recognition is an important topic in computer vision. Recently, deep learning technologies have been successfully used in lots of applications including video data for sloving recognition problems. However, most existing deep learning based recognition frameworks are not optimized for action in the surveillance videos. In this paper, we propose a novel method to deal with the recognition of different types of actions in outdoor surveillance videos. The proposed method first introduces motion compensation to improve the detection of human target. Then, it uses three different types of deep models with single and sequenced images as inputs for the recognition of different types of actions. Finally, predictions from different models are fused with a linear model. Experimental results show that the proposed method works well on the real surveillance videos.
文摘Combining both visible and infrared object information, multispectral data is a promising source data for automatic maritime ship recognition. In this paper, in order to take advantage of deep convolutional neural network and multispectral data, we model multispectral ship recognition task into a convolutional feature fusion problem, and propose a feature fusion architecture called Hybrid Fusion. We fine-tune the VGG-16 model pre-trained on ImageNet through three channels single spectral image and four channels multispectral images, and use existing regularization techniques to avoid over-fitting problem. Hybrid Fusion as well as the other three feature fusion architectures is investigated. Each fusion architecture consists of visible image and infrared image feature extraction branches, in which the pre-trained and fine-tuned VGG-16 models are taken as feature extractor. In each fusion architecture, image features of two branches are firstly extracted from the same layer or different layers of VGG-16 model. Subsequently, the features extracted from the two branches are flattened and concatenated to produce a multispectral feature vector, which is finally fed into a classifier to achieve ship recognition task. Furthermore, based on these fusion architectures, we also evaluate recognition performance of a feature vector normalization method and three combinations of feature extractors. Experimental results on the visible and infrared ship (VAIS) dataset show that the best Hybrid Fusion achieves 89.6% mean per-class recognition accuracy on daytime paired images and 64.9% on nighttime infrared images, and outperforms the state-of-the-art method by 1.4% and 3.9%, respectively.
文摘现有的基于深度学习的医学图像分割方法,大多是利用大量的训练数据拟合检测网络,以获得优异的检测性能。这些方法往往需要较大的模型参数,导致检测实时性较差。为此,提出了基于局部上下文引导特征深度融合轻量级医学分割网络(local context guided feature deep fusion lightweight medical segmentation network,LCGML-net)。LCGML-net通过精确的特征选择与特征融合来减少模型拟合所需的参数数量,从而在保证检测精度的同时实现更小的模型参数。在特征提取阶段和映射阶段,分别通过提取和融合目标的多层次多尺度局部上下文特征来丰富特征表达和精准分割。在STARE、CHASEDB1和KITS19等多个基准数据集上开展的实验证明,与其他方法相比,所提出的LCGML-net具有最佳的检测性能和最小的模型参数。
文摘为了提高短期风电功率预测的准确性,提出一种基于贝叶斯优化和特征融合的xLSTM(extended Long Short-Term Memory)-Transformer模型。该模型综合应用长短期记忆(LSTM)网络的时序处理能力和Transformer的自注意力机制的动态特征融合能力。借助贝叶斯优化方法,模型可在较少的迭代次数条件下优化超参数,显著降低模型对计算资源的依赖。实验结果表明,内蒙古某风电场数据集上,与单一的LSTM模型、Transformer模型、门控循环单元(GRU)模型以及未采用贝叶斯优化和特征融合的xLSTM-Transformer模型相比,当步长(LookBack)为4和8时,所提模型的决定系数R2较基准模型平均提升1.2%~11.3%;平均绝对误差(MAE)平均降低12.8%~38.4%;均方根误差(RMSE)平均降低8.6%~35.8%。结果表明,所提模型在短历史输入条件下具有更高的预测精度与稳定性。