Efficient Convolution Operator (ECO) algorithms have achieved impressive performance in visual tracking. However, the feature extraction network of ECO is poorly suited to capturing the correlation features of occluded and blurred targets across long-range frames in complex scenes. Moreover, its fixed-weight fusion strategy does not exploit the complementary properties of deep and shallow features. In this paper, we propose a new target tracking method, ECO++, which uses adaptive fusion of deep features in complex scenes and makes two contributions. First, we construct a new temporal convolution mode and use it to replace the underlying convolution layer in the Conformer network, obtaining an improved Conformer network. Second, we adaptively fuse the deep features output by the improved Conformer network by combining the Peak to Sidelobe Ratio (PSR), frame smoothness scores, and adaptive adjustment weights. Extensive experiments on the OTB-2013, OTB-2015, UAV123, and VOT2019 benchmarks demonstrate that the proposed approach outperforms state-of-the-art algorithms in tracking accuracy and robustness in complex scenes with occluded, blurred, and fast-moving targets.
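The Peak to Sidelobe Ratio mentioned in this abstract can be illustrated with a minimal sketch. It assumes the common correlation-filter definition, PSR = (peak - mean of sidelobe) / standard deviation of sidelobe, where the sidelobe is everything outside a small window around the peak. The function name and the `exclude` parameter are hypothetical, and the abstract does not state how ECO++ turns the PSR into fusion weights, so that step is not shown.

```python
from statistics import mean, pstdev


def peak_to_sidelobe_ratio(response, exclude=1):
    """PSR of a 2-D correlation response given as a list of rows.

    `exclude` is the half-width of the window around the peak that is
    left out of the sidelobe (an assumed, tunable parameter).
    """
    rows, cols = len(response), len(response[0])
    # Locate the peak of the response map.
    peak_r, peak_c = max(
        ((r, c) for r in range(rows) for c in range(cols)),
        key=lambda rc: response[rc[0]][rc[1]],
    )
    peak = response[peak_r][peak_c]
    # Sidelobe: every cell outside the (2*exclude+1)^2 window around the peak.
    sidelobe = [
        response[r][c]
        for r in range(rows)
        for c in range(cols)
        if abs(r - peak_r) > exclude or abs(c - peak_c) > exclude
    ]
    sigma = pstdev(sidelobe)
    if sigma == 0:
        return float("inf")  # perfectly flat sidelobe
    return (peak - mean(sidelobe)) / sigma
```

A sharp, isolated peak gives a high PSR, while a response flattened by occlusion or blur gives a low one, which is why the PSR is a natural confidence signal for weighting feature responses.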
With the rapid development of information technology, many fields place increasing demands on the speed and efficiency of image retrieval, and an effective image retrieval method is critical for the development of information technology. Feature extraction based on deep learning has become dominant in image retrieval because the resulting features are more discriminative, carry more complementary information, and yield higher precision. However, the high-dimensional deep features extracted by convolutional neural networks (CNNs) limit retrieval efficiency and make it difficult to satisfy the requirements of existing image retrieval systems. To solve this problem, a high-dimensional feature reduction technique is proposed that combines an improved CNN with PCA in a two-stage dimensionality reduction. First, in the last layer of classical networks, this study adds a carefully designed dimensionality reduction module (DR-Module) that compresses the number of channels of the feature map as much as possible while preserving the amount of information. Second, the deep features are compressed again with Principal Component Analysis (PCA), and the compression ratio of each of the two dimensionality reductions is reduced. As a result, retrieval efficiency is dramatically improved. Finally, experiments on the Cifar100 and Caltech101 datasets show that the method improves both retrieval accuracy and retrieval efficiency. The experimental results demonstrate that the proposed method performs well on small and medium-sized datasets.
Individual identification of dairy cows is the prerequisite for automatic analysis and intelligent perception of dairy cow behavior. At present, individual identification of dairy cows based on deep convolutional neural networks has the disadvantage of prolonged retraining whenever samples of new cows are added. Therefore, a cow individual identification framework based on deep feature extraction and matching was proposed, which avoids repeated training. Firstly, a trained convolutional neural network model was used as the feature extractor; secondly, the feature extractor was used to extract features, which were stored in the template feature library to complete enrollment; finally, the identities of the dairy cows were determined. With this framework, enrollment could be completed quickly when new cows joined the herd. To evaluate the performance of this method in closed-set and open-set individual identification of dairy cows, back images of 524 cows were collected, of which the back images of 150 cows were selected as training data for the feature extractor. The data of the remaining 374 cows were used to generate the template data set and the data to be identified. The experimental results showed that in closed-set individual identification, the highest top-1 identification accuracy was 99.73%, the highest identification accuracy from top-2 to top-5 was 100%, and the identification time for a single cow was 0.601 s, which verified that the method was effective. In open-set individual identification, the recall was 90.38% and the accuracy was 89.46%. When the false accept rate (FAR) was 0.05, the true accept rate (TAR) was 84.07%. These results indicate that the method has research value for open-set individual identification of dairy cows and offers an idea for applying individual identification in intelligent animal husbandry.
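The enroll-then-match idea in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: cosine similarity as the matching score, the `TemplateLibrary` class, and the `threshold` parameter for open-set rejection are all hypothetical choices, and the feature vectors would in practice come from the trained CNN.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class TemplateLibrary:
    """Stores enrolled feature vectors; no retraining is needed to add a cow."""

    def __init__(self):
        self.templates = {}  # identity -> list of feature vectors

    def enroll(self, identity, feature):
        self.templates.setdefault(identity, []).append(list(feature))

    def identify(self, feature, threshold=None):
        """Return (identity, score) of the best match.

        With `threshold` set (open-set mode), a best score below the
        threshold is rejected and the identity is returned as None.
        """
        best_id, best_score = None, -1.0
        for identity, feats in self.templates.items():
            score = max(cosine_similarity(feature, f) for f in feats)
            if score > best_score:
                best_id, best_score = identity, score
        if threshold is not None and best_score < threshold:
            return None, best_score
        return best_id, best_score
```

Closed-set identification calls `identify` without a threshold, so every query is assigned to some enrolled cow. Open-set identification passes a threshold, and moving that threshold trades the false accept rate against the true accept rate.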
To address the problem of pedestrian tracking under frequent or long-term occlusion in complex scenes, an anti-occlusion pedestrian tracking algorithm based on location prediction and deep feature rematching is proposed. Firstly, occlusion is judged by extracting and using deep features of the pedestrian's appearance, and a scale-adaptive kernelized correlation filter is introduced to track the pedestrian when there is no occlusion. Secondly, a Kalman filter is introduced to predict the location of the occluded pedestrian. Finally, the deep features are used to rematch the pedestrian when it reappears. Simulation experiments and analysis show that the proposed algorithm can effectively detect and rematch pedestrians under frequent or long-term occlusion.
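The location-prediction step can be illustrated with a minimal one-dimensional constant-velocity Kalman filter. The abstract does not give the state model or noise settings, so the class name, the constant-velocity assumption, and the parameters `p0`, `q`, and `r` are all assumptions made for this sketch. A 2-D pedestrian position could be handled with one filter per image axis.

```python
class ConstantVelocityKalman1D:
    """Kalman filter with state [position, velocity] and position-only measurements."""

    def __init__(self, position=0.0, velocity=0.0, p0=1000.0, q=0.01, r=1.0):
        self.p, self.v = position, velocity
        self.P = [[p0, 0.0], [0.0, p0]]  # state covariance
        self.q, self.r = q, r            # process / measurement noise (assumed values)

    def predict(self, dt=1.0):
        """Propagate the state one step; used alone while the target is occluded."""
        self.p += self.v * dt
        (a, b), (c, d) = self.P
        # P <- F P F^T + q*I with F = [[1, dt], [0, 1]]
        self.P = [
            [a + dt * (b + c) + dt * dt * d + self.q, b + dt * d],
            [c + dt * d, d + self.q],
        ]
        return self.p

    def update(self, z):
        """Correct the state with a measured position z (target visible)."""
        (a, b), (c, d) = self.P
        s = a + self.r            # innovation covariance
        k0, k1 = a / s, c / s     # Kalman gain for position and velocity
        y = z - self.p            # innovation
        self.p += k0 * y
        self.v += k1 * y
        # P <- (I - K H) P with H = [1, 0]
        self.P = [
            [(1 - k0) * a, (1 - k0) * b],
            [c - k1 * a, d - k1 * b],
        ]
        return self.p
```

While the pedestrian is visible, each frame calls `predict` followed by `update`. During occlusion only `predict` is called, so the estimate keeps moving at the last estimated velocity until appearance features allow a rematch.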
The human ear has been substantiated as a viable nonintrusive biometric modality for identification or verification. Among many feasible techniques for ear biometric recognition, convolutional neural network (CNN) models have recently offered high-performance and reliable systems. However, their performance can still be further improved using the capabilities of soft biometrics, a research question yet to be investigated. This research aims to augment traditional CNN-based ear recognition performance by adding more discriminative ear soft biometric traits. It proposes a novel framework of augmented ear identification/verification that uses a group of discriminative categorical soft biometrics and derives new, more perceptive, comparative soft biometrics for feature-level fusion with hard biometric deep features. It conducts several identification and verification experiments for performance evaluation, analysis, and comparison while varying the ear image datasets, hard biometric deep-feature extractors, soft biometric augmentation methods, and classifiers used. The experimental work yields promising results, reaching up to 99.94% accuracy and up to 14% improvement using the AMI and AMIC datasets, along with their corresponding soft biometric label data. The results confirm the superiority of the proposed augmented approaches over their standard counterparts and emphasize the robustness of the new ear comparative soft biometrics over their categorical peers.
Human Activity Recognition (HAR) has become increasingly critical in civic surveillance, medical care monitoring, and institutional protection. Current deep learning-based approaches often suffer from excessive computational complexity, limited generalizability under varying conditions, and compromised real-time performance. To counter these problems, this paper introduces an Active Learning-aided Heuristic Deep Spatio-Textural Ensemble Learning (ALH-DSEL) framework. The model first identifies keyframes in the surveillance videos with a Multi-Constraint Active Learning (MCAL) approach, using features extracted from DenseNet121. The frames are then segmented with a Fuzzy C-Means clustering algorithm optimized by Firefly to identify areas of interest. A deep ensemble feature extractor, comprising DenseNet121, EfficientNet-B7, MobileNet, and GLCM, extracts varied spatial and textural features. The fused features are refined through PCA and Min-Max normalization and classified by a maximum voting ensemble of RF, AdaBoost, and XGBoost. The experimental results show that ALH-DSEL provides higher accuracy, precision, recall, and F1-score, validating its superiority for real-time HAR in surveillance scenarios.
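Two of the simpler steps named in this abstract, Min-Max normalization and maximum (majority) voting, can be sketched in a few lines. The function names are hypothetical, and the label lists stand in for the outputs of the RF, AdaBoost, and XGBoost classifiers, which are not reproduced here.

```python
from collections import Counter


def min_max_normalize(column):
    """Rescale one feature column to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0] * len(column)  # constant column carries no information
    return [(x - lo) / (hi - lo) for x in column]


def majority_vote(predictions):
    """Combine per-classifier label lists into one label per sample.

    `predictions` holds one list of labels per classifier, all of the
    same length. Ties go to the label seen first among the classifiers.
    """
    voted = []
    for labels in zip(*predictions):
        voted.append(Counter(labels).most_common(1)[0][0])
    return voted
```

For example, if three classifiers label a sample "walk", "walk", and "run", the ensemble outputs "walk". With an odd number of classifiers and two classes a tie cannot occur, but with more classes a tie-breaking rule like the one noted above is needed.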
The traditional Chinese-English translation model tends to translate some source words repeatedly while mistakenly ignoring others. Therefore, we propose a novel English-Chinese neural machine translation model based on a self-organizing mapping neural network and deep feature matching. In this model, word vectors, a two-way LSTM, a 2D neural network, and other deep learning models are used to extract the semantic matching features of question-answer pairs. Self-organizing mapping (SOM) is used to classify and identify the sentence features. An attention-based neural machine translation model is taken as the baseline system. The experimental results show that this framework significantly improves the adequacy of English-Chinese machine translation and achieves better results than the traditional attention-based English-Chinese machine translation model.
Face anti-spoofing has received a lot of attention because it plays a role in strengthening the security of face recognition systems. Face recognition is commonly used for authentication in surveillance applications. However, attackers try to compromise these systems with spoofing techniques, such as using photos or videos of users to gain access to services or information. Many existing face anti-spoofing methods struggle with new scenarios, especially when there are variations in background, lighting, and other environmental factors. Recent advances in deep learning with multi-modality methods have shown their effectiveness in face anti-spoofing, surpassing single-modal methods. However, these approaches often generate many features, which can lead to issues with data dimensionality. In this study, we introduce a multimodal deep fusion network for face anti-spoofing that incorporates cross-axial attention and deep reinforcement learning techniques. This network operates at three patch levels and analyzes images from three modalities (RGB, IR, and depth). First, our design includes an axial attention network (XANet) model that extracts deeply hidden features from multimodal images. Next, we use a bidirectional fusion technique that attends to both directions to combine features from each modality effectively. We further improve feature optimization with the Enhanced Pity Beetle Optimization (EPBO) algorithm, which selects features to address data dimensionality problems. Moreover, our proposed model employs a hybrid federated reinforcement learning (FDDRL) approach to detect and classify face spoofing, achieving a better tradeoff between detection rates and false positive rates. We evaluated the proposed approach on publicly available datasets, including CASIA-SURF and GREATFASD-S, and achieved 98.985% and 97.956% classification accuracy, respectively. In addition, the method outperforms other state-of-the-art methods in terms of precision, recall, and F-measure. Overall, the developed methodology boosts the effectiveness of our model in detecting various types of spoofing attempts.
Sepsis poses a serious threat to the health of children in the pediatric intensive care unit. Mortality from pediatric sepsis can be effectively reduced through timely diagnosis and therapeutic intervention. The bacterial culture detection method is too time-consuming to allow timely treatment. In this research, we propose a new framework, a deep encoding network with cross features (CF-DEN), that enables accurate early detection of sepsis. Cross features are automatically constructed via a gradient boosting decision tree and distilled into the deep encoding network (DEN) we designed. The DEN aims to learn a sufficiently effective representation from clinical test data. Each layer of the DEN filters the features involved in computation at the current layer via an attention mechanism and outputs the current prediction, which is accumulated layer by layer to obtain the embedding feature at the last layer. The framework combines the advantages of tree-based and neural network methods to extract an effective representation from a small clinical dataset and obtain accurate predictions, so that patients can receive timely treatment. We evaluate the performance of the framework on a dataset collected from Shanghai Children's Medical Center. Compared with common machine learning methods, our method increases the F1-score by 16.06% on the test set.
In the area of medical image processing, stomach cancer is one of the most important cancers that need to be diagnosed at an early stage. In this paper, an optimized deep learning method is presented for multiple stomach disease classification. The proposed method works in a few important steps: preprocessing using the fusion of filtered images along with Ant Colony Optimization (ACO), deep transfer learning-based feature extraction, optimization of the extracted deep features using nature-inspired algorithms, and finally fusion of the optimal vectors and classification using a Multi-Layered Perceptron Neural Network (MLNN). In the feature extraction step, a pretrained Inception V3 is retrained on the selected stomach infection classes using deep transfer learning. The activation function is then applied to the Global Average Pool (GAP) layer for feature extraction. The extracted features are optimized through two different nature-inspired algorithms: Particle Swarm Optimization (PSO) with a dynamic fitness function and the Crow Search Algorithm (CSA). The outputs of both methods are fused by a maximal value approach, and the fused feature vector is classified by the MLNN. Two datasets, CUI Wah Stomach Diseases and a Combined dataset, are used to evaluate the proposed method, which achieves an average accuracy of 99.5%. A comparison with existing techniques shows that the proposed method delivers significantly better performance.
Background: Human Gait Recognition (HGR) is a biometric approach that is widely used for surveillance and has been studied by researchers for several decades. Several factors affect system performance, such as walking variation due to clothes, a person carrying luggage, and variations in the view angle. Proposed: In this work, a new method is introduced to overcome different problems of HGR. A hybrid method is proposed for efficient HGR using deep learning and selection of the best features. The work involves four major steps: preprocessing of the video frames, adaptation of the pre-trained CNN model VGG-16 for feature computation, removal of redundant features extracted from the CNN model, and classification. For the reduction of irrelevant features, a Principal Score and Kurtosis based approach named PSbK is proposed. The PSbK features are then fused into one matrix. Finally, this fused vector is fed to the One against All Multi Support Vector Machine (OAMSVM) classifier for the final results. Results: The system is evaluated on the CASIA B database at six angles, 0°, 18°, 36°, 54°, 72°, and 90°, and attains accuracies of 95.80%, 96.0%, 95.90%, 96.20%, 95.60%, and 95.50%, respectively. Conclusion: The comparison with recent methods shows that the proposed method works better.
In this paper we consider the problem of "end-to-end" digital camera identification by considering sequences of images obtained from the cameras. The problem of digital camera identification is harder than the problem of identifying its analog counterpart, since the process of analog-to-digital conversion smooths out the intrinsic noise in the analog signal. However, it is known that identifying a digital camera is possible by analyzing the camera's intrinsic sensor artifacts that are introduced into the images/videos during photo/video capture. It is also known that such methods are computationally intensive and require expensive pre-processing steps. In this paper we propose an end-to-end deep feature learning framework for identifying cameras using images obtained from them. We conduct experiments using three custom datasets: the first contains two cameras in an indoor environment where each camera may observe different scenes having no overlapping features; the second contains images from four cameras in an outdoor setting where each camera observes scenes having overlapping features; and the third contains images from two cameras observing the same checkerboard pattern in an indoor setting. Our results show that it is possible to capture the intrinsic hardware signature of the cameras using deep feature representations in an end-to-end framework. These deep feature maps can in turn be used to distinguish the cameras from one another. Our system is end-to-end and requires no complicated pre-processing steps, and the trained model is computationally efficient during testing, paving the way for near-instantaneous decisions on digital camera identification in production environments. Finally, we present comparisons against the current state of the art in digital camera identification, which clearly establish the superiority of the end-to-end solution.
Objective To construct a precise model for identifying traditional Chinese medicine (TCM) constitutions, thereby offering optimized guidance for clinical diagnosis and treatment planning, and ultimately enhancing medical efficiency and treatment outcomes. Methods First, TCM full-body inspection data acquisition equipment was employed to collect full-body standing images of healthy people, from which the constitutions were labelled and defined in accordance with the Constitution in Chinese Medicine Questionnaire (CCMQ), and a dataset of labelled constitutions was constructed. Second, the hue-saturation-value (HSV) color space and an improved local binary patterns (LBP) algorithm were used to extract features such as facial complexion and body shape. In addition, a dual-branch deep network was employed to extract deep features from the full-body standing images. Last, the random forest (RF) algorithm was used to learn from the extracted multifeatures, which were subsequently employed to establish a TCM constitution identification model. Accuracy, precision, and F1 score were the three measures selected to assess the performance of the model. Results The accuracy, precision, and F1 score of the proposed multifeature model for identifying TCM constitutions were 0.842, 0.868, and 0.790, respectively. Compared with identification models using a single feature (the facial complexion feature, the body shape feature, or the deep features), the model incorporating all of these features raised accuracy by 0.105, 0.105, and 0.079, precision by 0.164, 0.164, and 0.211, and F1 score by 0.071, 0.071, and 0.084, respectively. Conclusion The research findings affirmed the viability of the proposed model, which incorporated multifeatures, including the facial complexion feature, the body shape feature, and the deep feature. In addition, by employing the proposed model, the objectification and intelligence of constitution identification in TCM practice could be improved.
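As background for the local binary patterns mentioned in this abstract, the following is a minimal sketch of the basic 3x3 LBP operator and its histogram. The abstract does not describe how its improved LBP differs, so this shows only the textbook variant. The function names and the clockwise bit ordering starting at the top-left neighbour are assumptions of this sketch.

```python
def lbp_code(image, r, c):
    """Basic 3x3 LBP code of the pixel at (r, c) in a grayscale image.

    `image` is a list of rows of intensities; (r, c) must not lie on the border.
    """
    center = image[r][c]
    # Neighbours visited clockwise, starting at the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        # A neighbour at least as bright as the centre sets its bit.
        if image[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist
```

The histogram, rather than the per-pixel codes, is what typically serves as the texture feature vector passed to a classifier such as a random forest.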
As petroleum exploration advances and as most of the oil-gas reservoirs in shallow layers have been explored, petroleum exploration starts to move toward deep basins, which has become an inevitable choice. In this paper, the petroleum geology features and research progress on oil-gas reservoirs in deep petroliferous basins across the world are characterized by using the latest results of worldwide deep petroleum exploration. Research has demonstrated that deep petroleum shows ten major geological features. (1) While oil-gas reservoirs have been discovered in many different types of deep petroliferous basins, most have been discovered in low heat flux deep basins. (2) Many types of petroliferous traps are developed in deep basins, and tight oil-gas reservoirs in deep basin traps are arousing increasing attention. (3) Deep petroleum normally has more natural gas than liquid oil, and the natural gas ratio increases with the burial depth. (4) The residual organic matter in deep source rocks reduces but the hydrocarbon expulsion rate and efficiency increase with the burial depth. (5) There are many types of rocks in deep hydrocarbon reservoirs, and most are clastic rocks and carbonates. (6) The age of deep hydrocarbon reservoirs is widely different, but those recently discovered are predominantly Paleogene and Upper Paleozoic. (7) The porosity and permeability of deep hydrocarbon reservoirs differ widely, but they vary in a regular way with lithology and burial depth. (8) The temperatures of deep oil-gas reservoirs are widely different, but they typically vary with the burial depth and basin geothermal gradient. (9) The pressures of deep oil-gas reservoirs differ significantly, but they typically vary with burial depth, genesis, and evolution period. (10) Deep oil-gas reservoirs may exist with or without a cap, and those without a cap are typically of unconventional genesis. Over the past decade, six major steps have been made in the understanding of deep hydrocarbon reservoir formation. (1) Deep petroleum in petroliferous basins has multiple sources and many different genetic mechanisms. (2) There are high-porosity, high-permeability reservoirs in deep basins, the formation of which is associated with tectonic events and subsurface fluid movement. (3) Capillary pressure differences inside and outside the target reservoir are the principal driving force of hydrocarbon enrichment in deep basins. (4) There are three dynamic boundaries for deep oil-gas reservoirs: a buoyancy-controlled threshold, hydrocarbon accumulation limits, and the upper limit of hydrocarbon generation. (5) The formation and distribution of deep hydrocarbon reservoirs are controlled by free, limited, and bound fluid dynamic fields. (6) Tight conventional, tight deep, tight superimposed, and related reconstructed hydrocarbon reservoirs formed in deep-limited fluid dynamic fields have great resource potential and vast scope for exploration. Compared with middle-shallow strata, the petroleum geology and accumulation in deep basins are more complex, overlapping the features of basin evolution in different stages. We recommend that further study should pay more attention to four aspects: (1) identification of deep petroleum sources and evaluation of their relative contributions; (2) preservation conditions and genetic mechanisms of deep high-quality reservoirs with high permeability and high porosity; (3) facies features and transformation of deep petroleum and their potential distribution; and (4) economic feasibility evaluation of deep tight petroleum exploration and development.
Human action recognition in complex environments is a challenging task. Recently, sparse representation has achieved excellent results on human action recognition under different conditions. The main idea of sparse representation classification is to construct a general classification scheme in which the training samples of each class serve as the dictionary used to express the query sample, and the minimal reconstruction error indicates the corresponding class. However, learning a discriminative dictionary is still difficult. In this work, we make two contributions. First, we build a new and robust human action recognition framework by combining a modified sparse classification model with deep convolutional neural network (CNN) features. Second, we construct a novel classification model that consists of a representation-constrained term and a coefficient incoherence term. Experimental results on benchmark datasets show that our modified model obtains competitive results in comparison with other state-of-the-art models.
Multi-object tracking (MOT) techniques have been increasingly applied in a diverse range of tasks, and unmanned aerial vehicles (UAVs) are one of their typical application scenarios. Due to the scene complexity and the low resolution of moving targets in UAV applications, it is difficult to extract target features and identify the targets. To solve this problem, we propose a new re-identification (re-ID) network to extract association features for tracking in the association stage. Moreover, to reduce the complexity of the detection model, we apply lightweight optimization to it. Experimental results show that the proposed re-ID network effectively reduces the number of identity switches and surpasses current state-of-the-art algorithms. Meanwhile, the optimized detector increases speed by 27% owing to its lightweight design, which enables it to better meet the requirements of UAV tracking tasks.
Gait recognition is an active research area that uses walking patterns to identify a subject correctly. Human Gait Recognition (HGR) is performed without any cooperation from the individual. In practice, however, it remains a challenging task across diverse walking sequences because of covariant factors such as normal walking versus walking while wearing a coat. Researchers have worked over the years on identifying subjects with different techniques, but there is still room for improvement in accuracy because of these covariant factors. This paper proposes an automated model-free framework for human gait recognition. The proposed method has four critical steps. First, optical flow-based motion region estimation and dynamic coordinates-based cropping are performed. Second, a pre-trained MobileNetV2 model is fine-tuned on both the original and the optical flow cropped frames; training is conducted using static hyperparameters. Third, a fusion technique known as normal-distribution serial fusion is applied. Fourth, an improved optimization algorithm is applied to select the best features, which are then classified using a Bi-Layered neural network. Three publicly available datasets, CASIA A, CASIA B, and CASIA C, were used in the experiments, yielding average accuracies of 99.6%, 91.6%, and 95.02%, respectively. The proposed framework achieves improved accuracy compared with the other methods.
Thunderstorm wind gusts are small in scale, typically occurring within a range of a few kilometers. It is extremely challenging to monitor and forecast thunderstorm wind gusts using only automatic weather stations. Therefore, it is necessary to establish thunderstorm wind gust identification techniques based on multisource high-resolution observations. This paper introduces a new algorithm, called the thunderstorm wind gust identification network (TGNet). It leverages multimodal feature fusion to fuse the temporal and spatial features of thunderstorm wind gust events. The shapelet transform is first used to extract the temporal features of wind speeds from automatic weather stations, with the aim of distinguishing thunderstorm wind gusts from those caused by synoptic-scale systems or typhoons. Then, an encoder structured upon the U-shaped network (U-Net) and incorporating recurrent residual convolutional blocks (R2U-Net) is employed to extract the corresponding spatial convective characteristics of satellite, radar, and lightning observations. Finally, by using a multimodal deep fusion module based on multi-head cross-attention, the temporal features of wind speed at each automatic weather station are incorporated into the spatial features to obtain a 10-minutely classification of thunderstorm wind gusts. TGNet products have high accuracy, with a critical success index reaching 0.77. Compared with those of U-Net and R2U-Net, the false alarm rate of TGNet products decreases by 31.28% and 24.15%, respectively. The new algorithm provides grid products of thunderstorm wind gusts with a spatial resolution of 0.01°, updated every 10 minutes. The results are finer and more accurate, thereby helping to improve the accuracy of operational warnings for thunderstorm wind gusts.
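The critical success index quoted in this abstract can be made concrete with a short sketch. It uses the standard forecast-verification definition, CSI = hits / (hits + misses + false alarms). The abstract's "false alarm rate" is not defined there, so the false alarm ratio below, false alarms / (hits + false alarms), is an assumed reading. The function names are hypothetical.

```python
def contingency_counts(observed, predicted):
    """Count hits, misses, false alarms, and correct negatives for binary labels."""
    hits = misses = false_alarms = correct_negatives = 0
    for obs, pred in zip(observed, predicted):
        if obs and pred:
            hits += 1
        elif obs:
            misses += 1
        elif pred:
            false_alarms += 1
        else:
            correct_negatives += 1
    return hits, misses, false_alarms, correct_negatives


def critical_success_index(hits, misses, false_alarms):
    """CSI = hits / (hits + misses + false alarms); correct negatives are ignored."""
    denom = hits + misses + false_alarms
    return hits / denom if denom else float("nan")


def false_alarm_ratio(hits, false_alarms):
    """Fraction of predicted events that did not occur (assumed definition)."""
    denom = hits + false_alarms
    return false_alarms / denom if denom else float("nan")
```

Because CSI ignores correct negatives, it suits rare events such as thunderstorm wind gusts, where most grid cells have no event and plain accuracy would look high regardless of skill.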
With the development of Deep Convolutional Neural Networks (DCNNs), the features extracted for image recognition tasks have shifted from low-level features to the high-level semantic features of DCNNs. Previous studies have shown that the deeper the network is, the more abstract the features are. However, the recognition ability of deep features is limited by insufficient training samples. To address this problem, this paper derives an improved Deep Fusion Convolutional Neural Network (DF-Net), which makes full use of the differences and complementarities during network learning and enhances feature expression when datasets are limited. Specifically, DF-Net organizes two identical subnets to extract features from the input image in parallel, and a well-designed fusion module is then introduced in the deep layer of DF-Net to fuse the subnets' features at multiple scales. Thus, more complex mappings are created, and more abundant and accurate fusion features can be extracted to improve recognition accuracy. Furthermore, a corresponding training strategy is proposed to speed up convergence and reduce the computation overhead of network training. Finally, DF-Nets based on the well-known ResNet, DenseNet and MobileNetV2 are evaluated on CIFAR100, Stanford Dogs, and UECFOOD-100. Theoretical analysis and experimental results demonstrate that DF-Net enhances the performance of DCNNs and increases the accuracy of image recognition.
Scene recognition is a popular open problem in the computer vision field. Among the many methods proposed in recent years, Convolutional Neural Network (CNN) based approaches achieve the best performance in scene recognition. In this paper, we propose an advanced feature fusion algorithm using Multiple Convolutional Neural Networks (Multi-CNN) for scene recognition. Unlike existing works, which usually use an individual convolutional neural network, a fusion of multiple different convolutional neural networks is applied. Firstly, we split the training images in two directions and feed them to three deep CNN models, and then extract features from the last fully connected (FC) layer and the probabilistic layer of each model. Finally, the feature vectors are fused in groups with different fusion strategies and forwarded into a SoftMax classifier. The proposed algorithm is evaluated on three scene datasets. The experimental results demonstrate the effectiveness of the proposed algorithm compared with other state-of-the-art approaches.
Funding: supported by the National Key R&D Plan "Intelligent Robots" Key Project of P.R. China (Grant No. 2018YFB1308602); the National Natural Science Foundation of P.R. China (Grant No. 61173184); the Chongqing Natural Science Foundation of P.R. China (Grant No. cstc2018jcyj AX0694); the Research Project of Chongqing Big Data Application and Development Administration Bureau (No. 22-30); the Basic and Advanced Research Projects of CSTC (No. cstc2019jcyj-zdxmX0008); and the Science and Technology Research Program of Chongqing Municipal Education Commission (Grant No. KJZD-K201900605).
Abstract: Efficient Convolution Operator (ECO) algorithms have achieved impressive performance in visual tracking. However, the feature extraction network of ECO is poorly suited to capturing the correlation features of occluded and blurred targets across long-range frames in complex scenes. Moreover, its fixed-weight fusion strategy does not exploit the complementary properties of deep and shallow features. In this paper, we propose a new target tracking method, ECO++, which uses adaptive fusion of deep features in complex scenes and contributes in two respects. First, we construct a new temporal convolution mode and use it to replace the underlying convolution layer in the Conformer network, obtaining an improved Conformer network. Second, we adaptively fuse the deep features output by the improved Conformer network by combining the Peak to Sidelobe Ratio (PSR), frame smoothness scores, and adaptive adjustment weights. Extensive experiments on the OTB-2013, OTB-2015, UAV123, and VOT2019 benchmarks demonstrate that the proposed approach outperforms state-of-the-art algorithms in tracking accuracy and robustness in complex scenes with occluded, blurred, and fast-moving targets.
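The Peak to Sidelobe Ratio mentioned in this abstract is commonly computed on a correlation response map as (peak − mean of sidelobe) / standard deviation of sidelobe, where the sidelobe is everything outside a small window around the peak. The sketch below shows only that common definition; the function names, the `exclude` window size, and the PSR-proportional weighting are illustrative assumptions and are not taken from the ECO++ paper, which also uses frame smoothness scores not modeled here.

```python
from statistics import mean, pstdev


def peak_to_sidelobe_ratio(response, exclude=1):
    """PSR of a 2-D response map given as a list of rows.

    Cells within `exclude` rows AND columns of the peak are treated as the
    main lobe; the rest form the sidelobe (window size is an assumption).
    """
    rows, cols = len(response), len(response[0])
    peak, pr, pc = max(
        (response[r][c], r, c) for r in range(rows) for c in range(cols)
    )
    sidelobe = [
        response[r][c]
        for r in range(rows)
        for c in range(cols)
        if abs(r - pr) > exclude or abs(c - pc) > exclude
    ]
    if not sidelobe:
        raise ValueError("response map too small for the chosen window")
    sigma = pstdev(sidelobe)
    if sigma == 0:
        return float("inf")
    return (peak - mean(sidelobe)) / sigma


def psr_weights(psr_values):
    """Hypothetical fusion weights: normalize PSRs so they sum to 1."""
    total = sum(psr_values)
    return [p / total for p in psr_values]


def make_map(peak):
    """Toy 5x5 response map with a checkerboard background and one peak."""
    grid = [[0.1 if (r + c) % 2 == 0 else 0.2 for c in range(5)] for r in range(5)]
    grid[2][2] = peak
    return grid


sharp = peak_to_sidelobe_ratio(make_map(1.0))
blurred = peak_to_sidelobe_ratio(make_map(0.5))
weights = psr_weights([sharp, blurred])
```

A sharper response peak gives a higher PSR, so under this illustrative weighting the more confident feature channel would contribute more to the fused response.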
Funding: supported by the National Natural Foundation of China (Grant No. 61772561) and the Key Research & Development Plan of Hunan Province (Grant No. 2018NK2012).
Abstract: With the rapid development of information technology, many fields place increasing demands on the speed and efficiency of image retrieval, and an effective image retrieval method is critical for the development of information. Feature extraction based on deep learning has become dominant in image retrieval because its features are more discriminative, carry more complementary information, and yield higher precision. However, the high-dimensional deep features extracted by convolutional neural networks (CNNs) limit retrieval efficiency and make it difficult to satisfy the requirements of existing image retrieval. To solve this problem, a high-dimensional feature reduction technique is proposed that combines an improved CNN with PCA in a two-stage dimensionality reduction. Firstly, in the last layer of the classical networks, this study adds a well-designed dimensionality reduction module (DR-Module) that compresses the number of channels of the feature map as much as possible while preserving the amount of information. Secondly, the deep features are compressed again with Principal Components Analysis (PCA), so that the compression ratio of each of the two dimensionality reductions is reduced. The retrieval efficiency is therefore dramatically improved. Finally, experiments on the Cifar100 and Caltech101 datasets show that the novel method improves both retrieval accuracy and retrieval efficiency. The experimental results demonstrate that the proposed method performs well on small and medium-sized datasets.
Funding: supported by the National Key Research and Development Program of China (2019YFE0125600) and the China Agriculture Research System (CARS-36).
Abstract: Individual identification of dairy cows is the prerequisite for automatic analysis and intelligent perception of dairy cow behavior. At present, individual identification of dairy cows based on deep convolutional neural networks has the disadvantage of prolonged retraining whenever new cow samples are added. Therefore, a cow individual identification framework based on deep feature extraction and matching was proposed; identification based on this framework avoids repeated training. Firstly, the trained convolutional neural network model was used as the feature extractor; secondly, the feature extractor was used to extract features, which were stored in the template feature library to complete enrollment; finally, the identities of dairy cows were determined. With this framework, enrollment could be completed quickly when new cows joined the herd. To evaluate the performance of this method in closed-set and open-set individual identification of dairy cows, back images of 524 cows were collected, of which the back images of 150 cows were selected as training data for the feature extractor. The data of the remaining 374 cows were used to generate the template data set and the data to be identified. The experimental results showed that in closed-set individual identification, the highest top-1 identification accuracy was 99.73%, the highest identification accuracy from top-2 to top-5 was 100%, and the identification time for a single cow was 0.601 s, which verified the effectiveness of the method. In open-set individual identification, the recall was 90.38% and the accuracy was 89.46%. At a false accept rate (FAR) of 0.05, the true accept rate (TAR) was 84.07%, which indicated that the method has research value for open-set individual identification of dairy cows and offers an idea for applying individual identification in intelligent animal husbandry.
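The enroll-then-match workflow described above can be sketched as a template library keyed by animal ID, with identification done by nearest-neighbor search over stored feature vectors. This is a minimal sketch under stated assumptions: cosine similarity, the `threshold` used for open-set rejection, and the function names `enroll` and `identify` are illustrative choices, not details confirmed by the abstract, and the feature vectors here are toy values rather than CNN outputs.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def enroll(library, cow_id, feature):
    """Store a template feature; no retraining of the extractor is needed."""
    library[cow_id] = list(feature)


def identify(library, query, top_k=5, threshold=None):
    """Return (best_id or None, ranked top-k list of (id, score)).

    With `threshold` set, a best score below it is rejected as an unknown
    animal (open-set); without it, the best match is always returned
    (closed-set).
    """
    ranked = sorted(
        ((cosine(query, feat), cid) for cid, feat in library.items()),
        reverse=True,
    )
    top = [(cid, score) for score, cid in ranked[:top_k]]
    if not top:
        return None, top
    if threshold is not None and top[0][1] < threshold:
        return None, top
    return top[0][0], top


library = {}
enroll(library, "cow_a", [1.0, 0.0, 0.0])
enroll(library, "cow_b", [0.0, 1.0, 0.0])

closed_set_id, _ = identify(library, [0.9, 0.1, 0.0])
unknown_id, _ = identify(library, [0.0, 0.0, 1.0], threshold=0.5)

# A new cow joins the herd: only its template is added.
enroll(library, "cow_c", [0.0, 0.0, 1.0])
new_id, _ = identify(library, [0.0, 0.0, 1.0], threshold=0.5)
```

Adding a new animal only adds a template entry, which is why enrollment can be fast compared with retraining a classifier.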
Funding: the National Natural Science Foundation of China (No. 61976080, 61771006) and the Key Project of Henan Province Education Department (No. 19A413006).
Abstract: To address the problem of pedestrian tracking under frequent or long-term occlusion in complex scenes, an anti-occlusion pedestrian tracking algorithm based on location prediction and deep feature rematching is proposed. Firstly, occlusion is detected by extracting and using deep features of the pedestrian's appearance, and a scale-adaptive kernelized correlation filter is introduced to track the pedestrian when there is no occlusion. Secondly, a Kalman filter is introduced to predict the location of the occluded pedestrian. Finally, the deep features are used to rematch the pedestrian when it reappears. Simulation experiments and analysis show that the proposed algorithm can effectively detect and rematch pedestrians under frequent or long-term occlusion.
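The location-prediction step can be illustrated with a constant-velocity Kalman filter: while the pedestrian is visible, each measured position updates the state, and during occlusion only the predict step runs. The sketch below is one-dimensional (it would be run once per image axis) and assumes a constant-velocity motion model and hand-picked noise variances; the class name and all parameter values are hypothetical and not taken from the paper.

```python
class ConstantVelocityKalman1D:
    """Kalman filter with state (position, velocity) along one axis."""

    def __init__(self, position, dt=1.0, process_var=1e-3,
                 meas_var=1e-2, init_var=100.0):
        self.p, self.v = float(position), 0.0
        self.dt, self.q, self.r = dt, process_var, meas_var
        # 2x2 state covariance, large at start because velocity is unknown.
        self.P = [[init_var, 0.0], [0.0, init_var]]

    def predict(self):
        """Propagate state and covariance one step: x = F x, P = F P F^T + Q."""
        dt = self.dt
        (p00, p01), (p10, p11) = self.P
        self.p += dt * self.v
        a = p00 + dt * p10
        b = p01 + dt * p11
        self.P = [[a + dt * b + self.q, b],
                  [p10 + dt * p11, p11 + self.q]]
        return self.p

    def update(self, z):
        """Correct the state with a measured position z (H = [1, 0])."""
        (p00, p01), (p10, p11) = self.P
        s = p00 + self.r                 # innovation variance
        k0, k1 = p00 / s, p10 / s        # Kalman gain
        y = z - self.p                   # innovation
        self.p += k0 * y
        self.v += k1 * y
        self.P = [[(1 - k0) * p00, (1 - k0) * p01],
                  [p10 - k1 * p00, p11 - k1 * p01]]


kf = ConstantVelocityKalman1D(position=0.0)
for z in [1.0, 2.0, 3.0, 4.0]:   # target visible: predict, then correct
    kf.predict()
    kf.update(z)

occluded_1 = kf.predict()        # target occluded: prediction only
occluded_2 = kf.predict()
```

During occlusion the predicted positions continue along the learned velocity, giving a search region for the rematching step when the pedestrian reappears.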
Funding: funded by WAQF at King Abdulaziz University, Jeddah, Saudi Arabia.
Abstract: The human ear has been substantiated as a viable nonintrusive biometric modality for identification or verification. Among many feasible techniques for ear biometric recognition, convolutional neural network (CNN) models have recently offered high-performance and reliable systems. However, their performance can still be further improved using the capabilities of soft biometrics, a research question yet to be investigated. This research aims to augment traditional CNN-based ear recognition performance by adding more discriminative ear soft biometric traits. It proposes a novel framework of augmented ear identification/verification using a group of discriminative categorical soft biometrics and deriving new, more perceptive, comparative soft biometrics for feature-level fusion with hard biometric deep features. It conducts several identification and verification experiments for performance evaluation, analysis, and comparison while varying the ear image datasets, hard biometric deep-feature extractors, soft biometric augmentation methods, and classifiers used. The experimental work yields promising results, reaching up to 99.94% accuracy and up to 14% improvement using the AMI and AMIC datasets, along with their corresponding soft biometric label data. The results confirm the proposed augmented approaches' superiority over their standard counterparts and emphasize the robustness of the new ear comparative soft biometrics over their categorical peers.
Abstract: Human Activity Recognition (HAR) has become increasingly critical in civic surveillance, medical care monitoring, and institutional protection. Current deep learning-based approaches often suffer from excessive computational complexity, limited generalizability under varying conditions, and compromised real-time performance. To counter these, this paper introduces an Active Learning-aided Heuristic Deep Spatio-Textural Ensemble Learning (ALH-DSEL) framework. The model first identifies keyframes from the surveillance videos with a Multi-Constraint Active Learning (MCAL) approach, using features extracted from DenseNet121. The frames are then segmented with a Fuzzy C-Means clustering algorithm optimized by Firefly to identify areas of interest. A deep ensemble feature extractor, comprising DenseNet121, EfficientNet-B7, MobileNet, and GLCM, extracts varied spatial and textural features. The fused features are enhanced through PCA and Min-Max normalization and classified by a maximum voting ensemble of RF, AdaBoost, and XGBoost. The experimental results show that ALH-DSEL provides higher accuracy, precision, recall, and F1-score, validating its superiority for real-time HAR in surveillance scenarios.
Abstract: The traditional Chinese-English translation model tends to translate some source words repeatedly while mistakenly ignoring others. Therefore, we propose a novel English-Chinese neural machine translation method based on a self-organizing mapping neural network and deep feature matching. In this model, word vectors, a two-way LSTM, a 2D neural network and other deep learning models are used to extract the semantic matching features of question-answer pairs. Self-organizing mapping (SOM) is used to classify and identify the sentence features. The attention mechanism-based neural machine translation model is taken as the baseline system. The experimental results show that this framework significantly improves the adequacy of English-Chinese machine translation and achieves better results than the traditional attention mechanism-based English-Chinese machine translation model.
Abstract: Face anti-spoofing has received a lot of attention because it strengthens the security of face recognition systems. Face recognition is commonly used for authentication in surveillance applications. However, attackers try to compromise these systems with spoofing techniques, such as using photos or videos of users to gain access to services or information. Many existing face anti-spoofing methods have difficulty with new scenarios, especially when there are variations in background, lighting, and other environmental factors. Recent advances in deep learning with multi-modality methods have shown their effectiveness in face anti-spoofing, surpassing single-modal methods. However, these approaches often generate many features, which can lead to issues with data dimensionality. In this study, we introduce a multimodal deep fusion network for face anti-spoofing that incorporates cross-axial attention and deep reinforcement learning techniques. This network operates at three patch levels and analyzes images from three modalities (RGB, IR, and depth). Initially, our design includes an axial attention network (XANet) model that extracts deeply hidden features from multimodal images. Further, we use a bidirectional fusion technique that attends in both directions to combine features from each modality effectively. We further improve feature optimization by using the Enhanced Pity Beetle Optimization (EPBO) algorithm, which selects features to address data dimensionality problems. Moreover, our proposed model employs a hybrid federated reinforcement learning (FDDRL) approach to detect and classify face spoofing, achieving a better tradeoff between detection rates and false positive rates. We evaluated the proposed approach on publicly available datasets, including CASIA-SURF and GREATFASD-S, and achieved 98.985% and 97.956% classification accuracy, respectively. In addition, the method outperforms other state-of-the-art methods in terms of precision, recall, and F-measure. Overall, the developed methodology boosts the effectiveness of our model in detecting various types of spoofing attempts.
Abstract: Sepsis poses a serious threat to the health of children in the pediatric intensive care unit. Mortality from pediatric sepsis can be effectively reduced through timely diagnosis and therapeutic intervention. The bacilliculture detection method is too time-consuming for patients to receive timely treatment. In this research, we propose a new framework, a deep encoding network with cross features (CF-DEN), that enables accurate early detection of sepsis. Cross features are automatically constructed via the gradient boosting decision tree and distilled into the deep encoding network (DEN) we designed. The DEN is aimed at learning a sufficiently effective representation from clinical test data. Each layer of the DEN filters the features involved in the computation at the current layer via an attention mechanism and outputs the current prediction, which is accumulated layer by layer to obtain the embedding feature at the last layer. The framework combines the advantages of tree-based and neural network methods to extract an effective representation from a small clinical dataset and obtain accurate predictions, so that patients can be treated promptly. We evaluate the performance of the framework on a dataset collected from Shanghai Children's Medical Center. Compared with common machine learning methods, our method increases the F1-score by 16.06% on the test set.
Funding: supported by a Korea Institute for Advancement of Technology (KIAT) grant funded by the Korea Government (MOTIE) (P0012724, The Competency Development Program for Industry Specialist) and the Soonchunhyang University Research Fund.
Abstract: In the area of medical image processing, stomach cancer is one of the most important cancers that need to be diagnosed at an early stage. In this paper, an optimized deep learning method is presented for multiple stomach disease classification. The proposed method works in a few important steps: preprocessing using the fusion of filtering images along with Ant Colony Optimization (ACO), deep transfer learning-based feature extraction, optimization of the deep extracted features using nature-inspired algorithms, and finally fusion of the optimal vectors and classification using a Multi-Layered Perceptron Neural Network (MLNN). In the feature extraction step, a pretrained Inception V3 is retrained on the selected stomach infection classes using deep transfer learning. Later on, the activation function is applied to the Global Average Pool (GAP) layer for feature extraction. The extracted features are then optimized through two different nature-inspired algorithms: Particle Swarm Optimization (PSO) with a dynamic fitness function and the Crow Search Algorithm (CSA). The outputs of both methods are fused by a maximal value approach, and the fused feature vector is classified by the MLNN. Two datasets are used to evaluate the proposed method, CUI WahStomach Diseases and a Combined dataset, on which it achieved an average accuracy of 99.5%. A comparison with existing techniques shows that the proposed method achieves significant performance.
Funding: this study was supported by grants of the Korea Health Technology R&D Project through the Korea Health Industry Development Institute (KHIDI), funded by the Ministry of Health & Welfare (HI18C1216), and by the Soonchunhyang University Research Fund.
Abstract: Background: Human Gait Recognition (HGR) is a biometric approach that is widely used for surveillance and has been studied by researchers for several decades. Several factors affect system performance, such as walking variation due to clothes, a person carrying luggage, and variations in the view angle. Proposed: In this work, a new method is introduced to overcome different problems of HGR. A hybrid method is proposed for efficient HGR using deep learning and selection of the best features. Four major steps are involved: preprocessing of the video frames, adaptation of the pre-trained CNN model VGG-16 for the computation of the features, removal of redundant features extracted from the CNN model, and classification. For the reduction of irrelevant features, a Principal Score and Kurtosis based approach named PSbK is proposed. After that, the features of PSbK are fused into one matrix. Finally, this fused vector is fed to the One against All Multi Support Vector Machine (OAMSVM) classifier for the final results. Results: The system is evaluated on the CASIA B database using six angles, 0°, 18°, 36°, 54°, 72°, and 90°, and attains accuracies of 95.80%, 96.0%, 95.90%, 96.20%, 95.60%, and 95.50%, respectively. Conclusion: The comparison with recent methods shows that the proposed method works better.
Abstract: In this paper we consider the problem of "end-to-end" digital camera identification from sequences of images obtained from the cameras. Digital camera identification is harder than identifying an analog counterpart, since analog-to-digital conversion smooths out the intrinsic noise in the analog signal. However, it is known that identifying a digital camera is possible by analyzing the camera's intrinsic sensor artifacts that are introduced into the images or videos during capture. Such methods are known to be computationally intensive and to require expensive pre-processing steps. In this paper we propose an end-to-end deep feature learning framework for identifying cameras using images obtained from them. We conduct experiments using three custom datasets: the first contains two cameras in an indoor environment, where each camera may observe different scenes with no overlapping features; the second contains images from four cameras in an outdoor setting, where each camera observes scenes with overlapping features; and the third contains images from two cameras observing the same checkerboard pattern in an indoor setting. Our results show that it is possible to capture the intrinsic hardware signature of the cameras using deep feature representations in an end-to-end framework. These deep feature maps can in turn be used to disambiguate the cameras from one another. Our system is end-to-end, requires no complicated pre-processing steps, and the trained model is computationally efficient during testing, paving the way to near-instantaneous decisions for digital camera identification in production environments. Finally, we present comparisons against the current state of the art in digital camera identification, which clearly establish the superiority of the end-to-end solution.
Funding: National Key Research and Development Program of China (2022YFC3502302); National Natural Science Foundation of China (82074580); Graduate Research Innovation Program of Jiangsu Province (KYCX23_2078).
Abstract: Objective: To construct a precise model for identifying traditional Chinese medicine (TCM) constitutions, thereby offering optimized guidance for clinical diagnosis and treatment planning, and ultimately enhancing medical efficiency and treatment outcomes. Methods: First, TCM full-body inspection data acquisition equipment was employed to collect full-body standing images of healthy people, from which the constitutions were labelled and defined in accordance with the Constitution in Chinese Medicine Questionnaire (CCMQ), and a dataset of labelled constitutions was constructed. Second, the hue-saturation-value (HSV) color space and an improved local binary patterns (LBP) algorithm were leveraged to extract features such as facial complexion and body shape. In addition, a dual-branch deep network was employed to collect deep features from the full-body standing images. Last, the random forest (RF) algorithm was used to learn the extracted multifeatures, which were subsequently employed to establish a TCM constitution identification model. Accuracy, precision, and F1 score were the three measures selected to assess the performance of the model. Results: The accuracy, precision, and F1 score of the proposed multifeature model for identifying TCM constitutions were 0.842, 0.868, and 0.790, respectively. In comparison with identification models based on a single feature (the facial complexion feature, the body shape feature, or the deep features), the accuracy of the model incorporating all of these features was higher by 0.105, 0.105, and 0.079; the precision by 0.164, 0.164, and 0.211; and the F1 score by 0.071, 0.071, and 0.084, respectively. Conclusion: The research findings affirmed the viability of the proposed model, which incorporates multifeatures including the facial complexion feature, the body shape feature, and the deep feature. In addition, by employing the proposed model, the objectification and intelligence of identifying constitutions in TCM practice could be improved.
Funding: the National Basic Research Program of China (973 Program, 2011CB201100), "Complex hydrocarbon accumulation mechanism and enrichment regularities of deep superimposed basins in Western China", and the National Natural Science Foundation of China (U1262205), under the guidance of related department heads and experts.
Abstract: As petroleum exploration advances and as most of the oil-gas reservoirs in shallow layers have been explored, petroleum exploration starts to move toward deep basins, which has become an inevitable choice. In this paper, the petroleum geology features and research progress on oil-gas reservoirs in deep petroliferous basins across the world are characterized by using the latest results of worldwide deep petroleum exploration. Research has demonstrated that deep petroleum shows ten major geological features. (1) While oil-gas reservoirs have been discovered in many different types of deep petroliferous basins, most have been discovered in low heat flux deep basins. (2) Many types of petroliferous traps are developed in deep basins, and tight oil-gas reservoirs in deep basin traps are attracting increasing attention. (3) Deep petroleum normally has more natural gas than liquid oil, and the natural gas ratio increases with the burial depth. (4) The residual organic matter in deep source rocks reduces, but the hydrocarbon expulsion rate and efficiency increase, with the burial depth. (5) There are many types of rocks in deep hydrocarbon reservoirs, and most are clastic rocks and carbonates. (6) The age of deep hydrocarbon reservoirs is widely different, but those recently discovered are predominantly Paleogene and Upper Paleozoic. (7) The porosity and permeability of deep hydrocarbon reservoirs differ widely, but they vary in a regular way with lithology and burial depth. (8) The temperatures of deep oil-gas reservoirs are widely different, but they typically vary with the burial depth and basin geothermal gradient. (9) The pressures of deep oil-gas reservoirs differ significantly, but they typically vary with burial depth, genesis, and evolution period. (10) Deep oil-gas reservoirs may exist with or without a cap, and those without a cap are typically of unconventional genesis.
Over the past decade, six major steps have been made in the understanding of deep hydrocarbon reservoir formation. (1) Deep petroleum in petroliferous basins has multiple sources and many different genetic mechanisms. (2) There are high-porosity, high-permeability reservoirs in deep basins, the formation of which is associated with tectonic events and subsurface fluid movement. (3) Capillary pressure differences inside and outside the target reservoir are the principal driving force of hydrocarbon enrichment in deep basins. (4) There are three dynamic boundaries for deep oil-gas reservoirs: a buoyancy-controlled threshold, hydrocarbon accumulation limits, and the upper limit of hydrocarbon generation. (5) The formation and distribution of deep hydrocarbon reservoirs are controlled by free, limited, and bound fluid dynamic fields. (6) Tight conventional, tight deep, tight superimposed, and related reconstructed hydrocarbon reservoirs formed in deep-limited fluid dynamic fields have great resource potential and vast scope for exploration. Compared with middle-shallow strata, the petroleum geology and accumulation in deep basins are more complex, overlapping the features of basin evolution in different stages. We recommend that further study pay more attention to four aspects: (1) identification of deep petroleum sources and evaluation of their relative contributions; (2) preservation conditions and genetic mechanisms of deep high-quality reservoirs with high permeability and high porosity; (3) facies features and transformation of deep petroleum and their potential distribution; and (4) economic feasibility evaluation of deep tight petroleum exploration and development.
Funding: this research was funded by the National Natural Science Foundation of China (21878124, 31771680 and 61773182).
Abstract: Human action recognition in complex environments is a challenging task. Recently, sparse representation has achieved excellent results on human action recognition under different conditions. The main idea of sparse representation classification is to construct a general classification scheme in which the training samples of each class serve as the dictionary to express the query sample, and the minimal reconstruction error indicates its corresponding class. However, learning a discriminative dictionary remains difficult. In this work, we make two contributions. First, we build a new and robust human action recognition framework by combining a modified sparse classification model with deep convolutional neural network (CNN) features. Secondly, we construct a novel classification model which consists of a representation-constrained term and a coefficients incoherence term. Experimental results on benchmark datasets show that our modified model obtains competitive results in comparison with other state-of-the-art models.
Funding: supported by the Research Foundation of Nanjing University of Posts and Telecommunications (No. NY219076).
Abstract: Multi-object tracking (MOT) techniques have been increasingly applied in a diverse range of tasks, and unmanned aerial vehicles (UAVs) are one of their typical application scenarios. Due to the scene complexity and the low resolution of moving targets in UAV applications, it is difficult to extract target features and identify targets. To solve this problem, we propose a new re-identification (re-ID) network to extract association features for tracking in the association stage. Moreover, to reduce the complexity of the detection model, we apply lightweight optimization to it. Experimental results show that the proposed re-ID network can effectively reduce the number of identity switches and surpass current state-of-the-art algorithms. Meanwhile, the optimized detector increases speed by 27% owing to its lightweight design, which enables it to better meet the requirements of UAV tracking tasks.
Funding: supported by the "Human Resources Program in Energy Technology" of the Korea Institute of Energy Technology Evaluation and Planning (KETEP), granted financial resources from the Ministry of Trade, Industry & Energy, Republic of Korea (No. 20204010600090).
Abstract: Gait recognition is an active research area that uses a person's walking pattern to identify the subject. Human Gait Recognition (HGR) is performed without any cooperation from the individual. In practice, however, it remains a challenging task under diverse walking sequences due to covariant factors such as normal walking versus walking while wearing a coat. Over the years, researchers have successfully identified subjects using different techniques, but there is still room for improvement in accuracy due to these covariant factors. This paper proposes an automated model-free framework for human gait recognition. The proposed method has a few critical steps. Firstly, optical flow-based motion region estimation and dynamic coordinates-based cropping are performed. The second step involves training a fine-tuned pre-trained MobileNetV2 model on both the original and the optical flow cropped frames; the training is conducted using static hyperparameters. The third step proposes a fusion technique known as normal distribution serially fusion. In the fourth step, a better optimization algorithm is applied to select the best features, which are then classified using a Bi-Layered neural network. Three publicly available datasets, CASIA A, CASIA B, and CASIA C, were used in the experimental process, yielding average accuracies of 99.6%, 91.6%, and 95.02%, respectively. The proposed framework achieved improved accuracy compared to the other methods.
Funding: supported by the National Key Research and Development Program of China (Grant No. 2022YFC3004104); the National Natural Science Foundation of China (Grant No. U2342204); the Innovation and Development Program of the China Meteorological Administration (Grant No. CXFZ2024J001); the Open Research Project of the Key Open Laboratory of Hydrology and Meteorology of the China Meteorological Administration (Grant No. 23SWQXZ010); the Science and Technology Plan Project of Zhejiang Province (Grant No. 2022C03150); the Open Research Fund Project of Anyang National Climate Observatory (Grant No. AYNCOF202401); and the Open Bidding for Selecting the Best Candidates Program (Grant No. CMAJBGS202318).
Abstract: Thunderstorm wind gusts are small in scale, typically occurring within a range of a few kilometers. It is extremely challenging to monitor and forecast thunderstorm wind gusts using only automatic weather stations. Therefore, it is necessary to establish thunderstorm wind gust identification techniques based on multisource high-resolution observations. This paper introduces a new algorithm called the thunderstorm wind gust identification network (TGNet). It leverages multimodal feature fusion to fuse the temporal and spatial features of thunderstorm wind gust events. The shapelet transform is first used to extract the temporal features of wind speeds from automatic weather stations, with the aim of distinguishing thunderstorm wind gusts from those caused by synoptic-scale systems or typhoons. Then, the encoder, structured upon the U-shaped network (U-Net) and incorporating recurrent residual convolutional blocks (R2U-Net), is employed to extract the corresponding spatial convective characteristics of satellite, radar, and lightning observations. Finally, a multimodal deep fusion module based on multi-head cross-attention incorporates the temporal features of wind speed at each automatic weather station into the spatial features to obtain a classification of thunderstorm wind gusts every 10 minutes. TGNet products have high accuracy, with a critical success index reaching 0.77. Compared with those of U-Net and R2U-Net, the false alarm rate of TGNet products decreases by 31.28% and 24.15%, respectively. The new algorithm provides grid products of thunderstorm wind gusts with a spatial resolution of 0.01°, updated every 10 minutes. The results are finer and more accurate, thereby helping to improve the accuracy of operational warnings for thunderstorm wind gusts.
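The two verification scores quoted above can be illustrated with a short sketch using their standard contingency-table definitions: CSI = hits / (hits + misses + false alarms). The abstract does not say how its "false alarm rate" is computed; the sketch assumes the false alarm ratio, false alarms / (hits + false alarms), and all function names and sample data are hypothetical.

```python
def contingency_counts(predicted, observed):
    """Count hits, misses, and false alarms for binary gust / no-gust labels."""
    hits = misses = false_alarms = 0
    for p, o in zip(predicted, observed):
        if p and o:
            hits += 1
        elif o and not p:
            misses += 1
        elif p and not o:
            false_alarms += 1
    return hits, misses, false_alarms


def critical_success_index(hits, misses, false_alarms):
    """CSI = hits / (hits + misses + false alarms)."""
    total = hits + misses + false_alarms
    return hits / total if total else 0.0


def false_alarm_ratio(hits, false_alarms):
    """FAR = false alarms / (hits + false alarms); an assumed reading
    of the abstract's 'false alarm rate'."""
    forecast_yes = hits + false_alarms
    return false_alarms / forecast_yes if forecast_yes else 0.0


if __name__ == "__main__":
    # Made-up labels for six grid cells, not data from the paper.
    predicted = [1, 1, 0, 1, 0, 0]
    observed = [1, 0, 1, 1, 0, 0]
    h, m, f = contingency_counts(predicted, observed)
    print(critical_success_index(h, m, f), false_alarm_ratio(h, f))
```

Correct negatives do not enter either score, which makes both suitable for rare events such as thunderstorm gusts, where most grid cells are event-free.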
Funding: This work is partially supported by the National Natural Science Foundation of China (Grant No. 61772561), the Key Research & Development Plan of Hunan Province (Grant No. 2018NK2012), the Degree & Postgraduate Education Reform Project of Hunan Province (Grant No. 2019JGYB154), the Postgraduate Excellent Teaching Team Project of Hunan Province (Grant [2019]370-133), and the Teaching Reform Project of Central South University of Forestry and Technology (Grant No. 20180682).
Abstract: With the development of Deep Convolutional Neural Networks (DCNNs), the features extracted for image recognition tasks have shifted from low-level features to the high-level semantic features of DCNNs. Previous studies have shown that the deeper the network is, the more abstract the features are. However, the recognition ability of deep features is limited by insufficient training samples. To address this problem, this paper derives an improved Deep Fusion Convolutional Neural Network (DF-Net) that makes full use of the differences and complementarities during network learning and enhances feature expression when datasets are limited. Specifically, DF-Net organizes two identical subnets to extract features from the input image in parallel, and a well-designed fusion module is then introduced in the deep layer of DF-Net to fuse the subnets' features at multiple scales. As a result, more complex mappings are created, and richer and more accurate fusion features can be extracted to improve recognition accuracy. Furthermore, a corresponding training strategy is proposed to speed up convergence and reduce the computational overhead of network training. Finally, DF-Nets based on the well-known ResNet, DenseNet, and MobileNetV2 are evaluated on CIFAR100, Stanford Dogs, and UECFOOD-100. Theoretical analysis and experimental results demonstrate that DF-Net enhances the performance of DCNNs and increases the accuracy of image recognition.
Abstract: Scene recognition is a popular open problem in the computer vision field. Among the many methods proposed in recent years, Convolutional Neural Network (CNN) based approaches achieve the best performance in scene recognition. In this paper, we propose an advanced feature fusion algorithm using Multiple Convolutional Neural Networks (Multi-CNN) for scene recognition. Unlike existing works, which usually use an individual convolutional neural network, a fusion of multiple different convolutional neural networks is applied for scene recognition. First, we split the training images in two directions and apply them to three deep CNN models, and then extract features from the last fully connected (FC) layer and the probabilistic layer of each model. Finally, the feature vectors are fused in groups using different fusion strategies and forwarded to a SoftMax classifier. The proposed algorithm is evaluated on three scene datasets. The experimental results demonstrate the effectiveness of the proposed algorithm compared with other state-of-the-art approaches.
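The abstract does not specify its fusion strategies, so the sketch below shows only two common options as an illustration: concatenating feature vectors from several models, and averaging their per-class probability vectors before picking a class. The function names, the choice of strategies, and the sample numbers are all assumptions, not the authors' implementation.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_concat(feature_vectors):
    """Fusion by concatenation: join the feature vectors end to end."""
    return [x for vec in feature_vectors for x in vec]


def fuse_average(prob_vectors):
    """Fusion by averaging: mean probability per class across models."""
    n = len(prob_vectors)
    return [sum(column) / n for column in zip(*prob_vectors)]


def predict(prob_vectors):
    """Return the class index with the highest averaged probability."""
    fused = fuse_average(prob_vectors)
    return max(range(len(fused)), key=fused.__getitem__)


if __name__ == "__main__":
    # Made-up class scores from three hypothetical CNNs for two scene classes.
    model_scores = [[2.0, 1.0], [0.5, 1.5], [0.2, 1.0]]
    probabilities = [softmax(s) for s in model_scores]
    print(fuse_concat(probabilities))
    print(predict(probabilities))
```

Concatenated features would normally feed a trained classifier, whereas averaged probabilities can be used directly, so the two strategies suit different stages of a pipeline.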