In order to directly construct the mapping between multiple state parameters and remaining useful life (RUL), and to reduce the interference of random error on prediction accuracy, a RUL prediction model of aeroengine based on principal component analysis (PCA) and a one-dimensional convolutional neural network (1D-CNN) is proposed in this paper. Firstly, multiple state parameters corresponding to massive cycles of the aeroengine are collected and fed into PCA for dimensionality reduction, and principal components are extracted for further time series prediction. Secondly, the 1D-CNN model is constructed to directly learn the mapping between principal components and RUL. Multiple convolution and pooling operations are applied for deep feature extraction, so that end-to-end RUL prediction of the aeroengine can be realized. Experimental results show that the most effective principal components of the multiple state parameters can be obtained by PCA, and the long time series of multiple state parameters can be directly mapped to RUL by the 1D-CNN, improving both the efficiency and accuracy of RUL prediction. Compared with other traditional models, the proposed method also has lower prediction error and better robustness.
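As an illustration of the dimensionality-reduction step described above, the following is a minimal numpy sketch of projecting multi-parameter cycle data onto its leading principal components. The function name, array shapes, and parameter counts are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project samples onto the leading principal components via SVD.

    X: (n_cycles, n_parameters) - each row holds the state parameters
    recorded for one engine cycle. Shapes here are illustrative only.
    """
    Xc = X - X.mean(axis=0)                    # center each state parameter
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T            # scores on the top components

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 14))                 # e.g. 14 hypothetical state parameters
Z = pca_reduce(X, 3)                           # keep 3 principal components
```

The reduced matrix `Z` (one row per cycle) is what a 1D-CNN would then consume as its input time series.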
Ultrasonic guided wave is an attractive monitoring technique for large-scale structures but is vulnerable to changes in environmental and operational conditions (EOC), which are inevitable in the normal inspection of civil and mechanical structures. This paper thus presents a robust guided wave-based method for damage detection and localization under complex environmental conditions using singular value decomposition (SVD)-based feature extraction and a one-dimensional convolutional neural network (1D-CNN). After SVD-based feature extraction, a temporally robust damage index (TRDI) is obtained, and the effect of EOCs is largely removed. Hence, even for signals with a very large temperature-varying range and low signal-to-noise ratios (SNRs), the final damage detection and localization accuracy remains 100%. Verification is conducted on two experimental datasets: the first consists of guided wave signals collected from a thin aluminum plate with artificial noise, and the second is a publicly available dataset of guided wave signals acquired on a composite plate with temperatures ranging from 20 °C to 60 °C. It is demonstrated that the proposed method can detect and localize damage accurately and rapidly, showing great potential for application under complex and unknown EOC.
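The SVD-based feature-extraction idea can be sketched as follows: baseline signals recorded under varying EOC span a low-rank subspace, and the energy of a new signal outside that subspace serves as a damage indicator. This is a hedged illustration of the general principle only, not the paper's TRDI:

```python
import numpy as np

def subspace_damage_index(baseline, signal, rank=2):
    """Toy damage index: relative energy of a new signal outside the
    subspace spanned by the dominant right singular vectors of the
    baseline records (which capture EOC-driven variation)."""
    _, _, Vt = np.linalg.svd(baseline, full_matrices=False)
    basis = Vt[:rank]                          # dominant EOC-driven patterns
    residual = signal - basis.T @ (basis @ signal)
    return float(np.linalg.norm(residual) / np.linalg.norm(signal))

t = np.linspace(0.0, 1.0, 500)
# baseline: the same wave mode under a slowly drifting phase (EOC proxy)
baseline = np.stack([np.sin(2 * np.pi * 5 * t + p)
                     for p in np.linspace(0.0, 0.3, 8)])
healthy = np.sin(2 * np.pi * 5 * t + 0.15)              # in the baseline span
damaged = healthy + 0.5 * np.sin(2 * np.pi * 40 * t)    # added scattered echo
```

A healthy signal lands almost entirely inside the baseline subspace (index near zero), while a damage-induced echo raises the index regardless of the EOC phase drift.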
Aiming at the problem that traditional moving-target behavior recognition methods achieve a low average recognition rate, a motion recognition method based on a deep convolutional neural network is proposed in this paper. A target model of the deep convolutional neural network is constructed, and the basic unit of the network is designed using this model. With this unit, the network output is converted into a standard density map, and the position of the moving target is determined by the local-maximum method to realize behavior identification of the moving target. The experimental results show that the multi-parameter SICNN256 model performs slightly better than the other model structures. The average recognition rate and the recognition rate of the proposed method are both higher than those of the traditional method, which demonstrates its effectiveness. However, since single-target recognition occurs more frequently than multi-target recognition and no target-similarity recognition is performed, false detection of similar targets cannot be ruled out.
Effective features are essential for fault diagnosis. Due to the faint characteristics of a single line-to-ground (SLG) fault, fault line detection has become a challenge in resonant grounding distribution systems. This paper proposes a novel fault line detection method using waveform fusion and one-dimensional convolutional neural networks (1-D CNN). After an SLG fault occurs, the first half-waves of the zero-sequence currents are collected and superimposed on each other to achieve waveform fusion. The compelling feature of the fused waveforms is extracted by the 1-D CNN to determine whether the fused waveform source contains the fault line. Then, the 1-D CNN output is used to update the value of a counter in order to identify the fault line. Given the lack of fault data in existing distribution systems, the proposed method needs only a small quantity of data for model training and fault line detection. In addition, the proposed method offers fault tolerance: even if a few samples are misjudged, the fault line can still be detected correctly based on the full output of the 1-D CNN. Experimental results verified that the proposed method works effectively under various fault conditions.
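The waveform-fusion step can be sketched in a few lines: clip each feeder's zero-sequence current to its first half-wave and superimpose the records point-wise. The signal parameters below are invented for illustration; the 1-D CNN classifier that consumes the fused waveform is not reproduced:

```python
import numpy as np

def fuse_waveforms(zero_seq_currents, n_half_wave):
    """Point-wise superposition of the first half-waves of several
    feeders' zero-sequence currents (the fusion step only).

    zero_seq_currents: (n_feeders, n_samples) array of recorded currents.
    n_half_wave: number of samples covering the first half-wave.
    """
    return zero_seq_currents[:, :n_half_wave].sum(axis=0)

t = np.linspace(0.0, 0.02, 400)                   # one 50 Hz cycle
feeders = np.stack([a * np.sin(2 * np.pi * 50 * t)
                    for a in (0.2, 0.3, -1.1)])   # toy zero-sequence currents
fused = fuse_waveforms(feeders, n_half_wave=200)  # first half-wave only
```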
Deep convolutional neural network (DCNN) based methods have recently kept setting new records on the task of predicting depth maps from monocular images. When dealing with video-based applications such as 2D-to-3D video conversion, however, these approaches tend to produce temporally inconsistent depth maps, since their CNN models are optimized over single frames. In this paper, we address this problem by introducing a novel spatial-temporal conditional random field (CRF) model into the DCNN architecture, which enforces temporal consistency between depth map estimates over consecutive video frames. In our approach, temporally consistent superpixels (TSP) are first extracted from an image sequence to establish the correspondence of targets in consecutive frames. A DCNN is then used to regress the depth value of each temporal superpixel, followed by a spatial-temporal CRF layer that models the relationships among the estimated depths in both the spatial and temporal domains. The parameters of the DCNN and CRF models are jointly optimized with backpropagation. Experimental results show that our approach not only significantly enhances the temporal consistency of the estimated depth maps over existing single-frame-based approaches, but also improves depth estimation accuracy in terms of various evaluation metrics.
In this paper, a new bolt fault diagnosis method based on one-dimensional depthwise separable convolutions is developed to solve the fault diagnosis problem of wind turbine flange bolts. The main idea is to use a one-dimensional convolutional neural network model to classify and identify the acoustic vibration signals of bolts, which represent different bolt damage states. Through knock tests and modal simulation, it is concluded that the damage state of a wind turbine flange bolt is related to the natural frequency distribution of its acoustic vibration signal: the damage state affects the modal shape of the structure, and in turn the natural frequency distribution of the bolt vibration signal. Therefore, the damage state can be obtained by identifying the natural frequency distribution of the bolt's acoustic vibration signal. In the proposed one-dimensional depthwise separable convolutional neural network model, the one-dimensional vector is first convolved into multiple channels, and each channel is then learned separately by a depthwise separable convolution, which effectively improves the feature quality and the data classification results. From the perspective of how the convolution operation is realized, the depthwise separable convolution has fewer parameters and faster computing speed, making it easier to build lightweight models and deploy them to mobile devices.
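The parameter saving claimed above can be seen by writing out a depthwise separable 1-D convolution: each channel is convolved with its own small kernel, then a 1x1 "pointwise" step mixes channels. A minimal numpy sketch, with made-up sizes:

```python
import numpy as np

def depthwise_separable_conv1d(x, dw_kernels, pw_weights):
    """Depthwise separable 1-D convolution written out with numpy.

    x:          (channels, length) input signal.
    dw_kernels: (channels, k) - one filter per input channel (depthwise step).
    pw_weights: (out_channels, channels) - 1x1 channel mixing (pointwise step).
    Note: np.convolve flips the kernel (true convolution); deep-learning
    libraries typically use cross-correlation, which only reverses weights.
    """
    c = x.shape[0]
    # depthwise: each channel is convolved only with its own kernel
    dw = np.stack([np.convolve(x[i], dw_kernels[i], mode="valid")
                   for i in range(c)])
    # pointwise: a 1x1 convolution mixes the per-channel outputs
    return pw_weights @ dw

# weight count: c*k + out*c for separable vs. out*c*k for a standard conv
c, k, out = 4, 5, 8
y = depthwise_separable_conv1d(np.ones((c, 100)),
                               np.ones((c, k)), np.ones((out, c)))
```

With these toy sizes the separable form needs 4·5 + 8·4 = 52 weights against 8·4·5 = 160 for a standard convolution, which is the source of the lightweight-model advantage.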
Emotion recognition from speech data is an active and emerging area of research that plays an important role in numerous applications, such as robotics, virtual reality, behavior assessment, and emergency call centers. Recently, researchers have developed many techniques in this field to improve accuracy by utilizing several deep learning approaches, but the recognition rate is still not convincing. Our main aim is to develop a new technique that increases the recognition rate with reasonable computational cost. In this paper, we propose a one-dimensional dilated convolutional neural network (1D-DCNN) for speech emotion recognition (SER) that combines hierarchical feature learning blocks (HFLBs) with a bi-directional gated recurrent unit (BiGRU). We designed a one-dimensional CNN to enhance the speech signals using spectral analysis and to extract hidden patterns from the speech signals, which are fed into a stack of one-dimensional dilated blocks called HFLBs. Each HFLB contains one dilated convolution layer (DCL), one batch normalization (BN) layer, and one leaky ReLU layer, extracting the emotional features using a hierarchical correlation strategy. The learned emotional features are then fed into a BiGRU to adjust the global weights and capture the temporal cues. The final state of the deep BiGRU is passed through a softmax classifier to produce the emotion probabilities. The proposed model was evaluated on three benchmark datasets, IEMOCAP, EMO-DB, and RAVDESS, achieving 72.75%, 91.14%, and 78.01% accuracy, respectively.
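The dilated convolution at the heart of each HFLB samples the input with gaps, so the receptive field grows to (k-1)·d + 1 taps without adding weights. A minimal sketch of the operation itself (illustrative only, not the paper's layer):

```python
import numpy as np

def dilated_conv1d(x, w, dilation):
    """'Valid' dilated 1-D cross-correlation (the deep-learning convention):
    kernel taps are spaced `dilation` samples apart, widening the
    receptive field to (k-1)*dilation + 1 with the same k weights."""
    k = len(w)
    span = (k - 1) * dilation + 1
    return np.array([sum(w[j] * x[i + j * dilation] for j in range(k))
                     for i in range(len(x) - span + 1)])

x = np.arange(20.0)
w = np.array([1.0, -2.0, 1.0])   # second-difference kernel, for illustration
```

With `dilation=1` this reduces to an ordinary valid cross-correlation; larger dilations let stacked HFLBs cover long speech contexts cheaply.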
In traditional well log depth matching tasks, manual adjustments are required, which is significantly labor-intensive for multiple wells and leads to low work efficiency. This paper introduces a multi-agent deep reinforcement learning (MARL) method to automate the depth matching of multi-well logs. The method defines multiple top-down dual sliding windows based on a convolutional neural network (CNN) to extract and capture similar feature sequences on well logs, and it establishes an interaction mechanism between the agents and the environment to control the depth matching process. Specifically, each agent selects an action to translate or scale a feature sequence based on the double deep Q-network (DDQN). Through the feedback of the reward signal, it evaluates the effectiveness of each action, aiming to obtain the optimal strategy and improve the accuracy of the matching task. Our experiments show that MARL can automatically perform depth matching for well logs in multiple wells and reduce manual intervention. In an oil field application, a comparative analysis of dynamic time warping (DTW), the deep Q-learning network (DQN), and DDQN revealed that the DDQN algorithm, with its dual-network evaluation mechanism, significantly improves performance by identifying and aligning more details in the well log feature sequences, thus achieving higher depth matching accuracy.
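The "dual-network evaluation mechanism" of DDQN mentioned above refers to decoupling action selection from action evaluation in the bootstrap target. A minimal sketch of that target computation, under the standard Double-DQN formulation (not the paper's full training loop):

```python
import numpy as np

def ddqn_target(reward, q_online_next, q_target_next, gamma=0.99, done=False):
    """Double-DQN bootstrap target: the online network *selects* the next
    action, the target network *evaluates* it. This decoupling reduces
    the overestimation bias of vanilla DQN's max-over-actions target."""
    if done:
        return reward
    a_star = int(np.argmax(q_online_next))         # selection (online net)
    return reward + gamma * q_target_next[a_star]  # evaluation (target net)
```

In the depth-matching setting, the Q-values would score translate/scale actions on a feature sequence; here they are just abstract arrays.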
Real-time hand gesture recognition technology significantly improves the user experience of virtual reality/augmented reality (VR/AR) applications, and it relies on identifying the orientation of the hand in captured images or videos. A new three-stage pipeline for fast and accurate hand segmentation from a single depth image is proposed. Firstly, a depth frame is segmented into several regions by a histogram-based threshold selection algorithm and by tracing the exterior boundaries of objects after thresholding. Secondly, each segmentation proposal is evaluated by a three-layer shallow convolutional neural network (CNN) to determine whether or not the boundary is associated with the hand. Finally, all hand components are merged to produce the hand segmentation result. Compared with algorithms based on random decision forests (RDF), the experimental results demonstrate that the approach achieves better performance, with high accuracy (88.34% mean intersection over union, mIoU) and a shorter processing time (≤8 ms).
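The abstract does not name its histogram-based threshold selection rule; Otsu's criterion (maximize between-class variance) is a common choice for this step and serves as a stand-in sketch here. The depth values below are synthetic:

```python
import numpy as np

def otsu_threshold(values, bins=256):
    """Histogram-based threshold selection via Otsu's criterion: pick the
    level that maximizes the between-class variance
    sigma_b^2(t) = (mu_T * w0(t) - mu(t))^2 / (w0(t) * w1(t))."""
    hist, edges = np.histogram(values, bins=bins)
    p = hist / hist.sum()
    centers = (edges[:-1] + edges[1:]) / 2
    w0 = np.cumsum(p)                 # probability mass of the "near" class
    mu = np.cumsum(p * centers)       # cumulative mean
    w1 = 1.0 - w0
    valid = (w0 > 0) & (w1 > 0)
    sigma_b = np.zeros_like(w0)
    sigma_b[valid] = (mu[-1] * w0 - mu)[valid] ** 2 / (w0 * w1)[valid]
    return centers[int(np.argmax(sigma_b))]

rng = np.random.default_rng(1)
depth = np.concatenate([rng.normal(0.2, 0.05, 2000),    # hand at ~0.2 m
                        rng.normal(0.8, 0.05, 2000)])   # background at ~0.8 m
thr = otsu_threshold(depth)   # lands between the two depth modes
```

Pixels below the returned threshold form a foreground region proposal that the shallow CNN would then accept or reject.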
Depth-image-based rendering (DIBR) is widely used in 3DTV, free-viewpoint video, and interactive 3D graphics applications. Typically, synthetic images generated by DIBR-based systems incorporate various distortions, particularly geometric distortions induced by object dis-occlusion. Ensuring the quality of synthetic images is critical to maintaining adequate system service. However, traditional 2D image quality metrics are ineffective for evaluating synthetic images because they are not sensitive to geometric distortion. In this paper, we propose a novel no-reference image quality assessment method for synthetic images based on convolutional neural networks, introducing local image saliency as prediction weights. Due to the lack of existing training data, we construct a new DIBR synthetic image dataset as part of our contribution. Experiments were conducted on both the public benchmark IRCCyN/IVC DIBR image dataset and our own dataset. The results demonstrate that the proposed metric outperforms traditional 2D image quality metrics and state-of-the-art DIBR-related metrics.
For traffic object detection in foggy environments based on convolutional neural networks (CNN), datasets collected in fog-free environments are generally used to train the network directly. As a result, the network cannot learn the object characteristics of the foggy environment from the training set, and the detection performance is poor. To improve traffic object detection in foggy environments, we propose a method of generating foggy images from fog-free images from the perspective of dataset construction. First, taking the KITTI object detection dataset as the original fog-free images, we generate the depth image of each original image using an improved Monodepth unsupervised depth estimation method. Then, a geometric prior depth template is constructed to fuse the image entropy, taken as a weight, with the depth image. After that, a foggy image is produced from the depth image based on the atmospheric scattering model. Finally, we take two typical object detection frameworks, the two-stage Faster region-based convolutional neural network (Faster-RCNN) and the one-stage network YOLOv4, and train them on the original dataset, the foggy dataset, and the mixed dataset, respectively. According to the test results on the RESIDE-RTTS dataset, captured in outdoor natural foggy environments, the model trained on the mixed dataset performs best: the mean average precision (mAP) values are increased by 5.6% and 5.0% under the YOLOv4 model and the Faster-RCNN network, respectively. This proves that the proposed method can effectively improve object identification ability in foggy environments.
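The atmospheric scattering model referred to above composes a foggy image as I = J·t + A·(1 - t), with transmission t = exp(-β·d) derived from the per-pixel depth d. A minimal numpy sketch; the β and airlight values are illustrative, and the paper's entropy-weighted depth template is not reproduced:

```python
import numpy as np

def add_fog(image, depth, beta=1.0, airlight=0.9):
    """Fog synthesis via the atmospheric scattering model
    I = J * t + A * (1 - t), transmission t = exp(-beta * depth).

    image: (H, W, 3) fog-free image J, values in [0, 1].
    depth: (H, W) per-pixel scene depth d.
    beta / airlight: scattering coefficient and airlight A (assumed values).
    """
    t = np.exp(-beta * depth)[..., None]    # per-pixel transmission map
    return image * t + airlight * (1.0 - t)

img = np.full((2, 2, 3), 0.4)               # a tiny fog-free "image"
near = add_fog(img, np.zeros((2, 2)))       # depth 0: unattenuated
far = add_fog(img, np.full((2, 2), 50.0))   # large depth: pure airlight
```

Distant pixels fade toward the airlight while nearby pixels keep their original appearance, which is exactly the depth-dependent degradation a detector must learn to tolerate.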
Funding: supported by the Jiangsu Social Science Foundation (No. 20GLD008), the Science and Technology Projects of Jiangsu Provincial Department of Communications (No. 2020Y14), and the Joint Fund for Civil Aviation Research (No. U1933202).
Funding: supported by the National Natural Science Foundation of China (Grant Nos. 52272433 and 11874110), the Jiangsu Provincial Key R&D Program (Grant No. BE2021084), and the Technical Support Special Project of the State Administration for Market Regulation (Grant No. 2022YJ11).
Funding: supported by the National Natural Science Foundation of China through the Project of Research of Flexible and Adaptive Arc-Suppression Method for Single-Phase Grounding Fault in Distribution Networks (No. 51677030).
Funding: supported in part by the Natural Science Foundation of Zhejiang Province of China under Grant No. LQ17F030001, the National Natural Science Foundation of China under Grant No. U1609215, the Qianjiang Talent Program of Zhejiang Province of China under Grant No. QJD1602021, the National Key Technology Research and Development Program of the Ministry of Science and Technology of China under Grant No. 2014BAK14B01, and the Beihang University Virtual Reality Technology and System National Key Laboratory Open Project under Grant No. BUAA-VR-16KF-17.
Funding: supported in part by the National Key R&D Program of China (Nos. 2021YFE0206100 and 2018YFB1702300), the National Natural Science Foundation of China (No. 62073321), the National Defense Basic Scientific Research Program (No. JCKY2019203C029), and the Science and Technology Development Fund, Macao SAR (No. 0015/2020/AMJ).
Funding: supported by the National Research Foundation of Korea, funded by the Korean Government through the Ministry of Science and ICT under Grant NRF-2020R1F1A1060659, and in part by the 2020 Faculty Research Fund of Sejong University.
Funding: supported by the China National Petroleum Corporation Limited-China University of Petroleum (Beijing) Strategic Cooperation Science and Technology Project (ZLZX2020-03).
Funding: sponsored by the National Key R&D Program of China (No. 2017YFB1002702) and the National Natural Science Foundation of China (Nos. 61572058, 61472363).