Funding: Financially supported by the National Key R&D Program of China (No. 2022YFC3090603) and the R&D Program of Beijing Municipal Education Commission (No. KZ202211417049).
Abstract: Flooding and heavy rainfall under extreme weather conditions pose significant challenges to target detection algorithms. Traditional methods often struggle to address issues such as image blurring, dynamic noise interference, and variations in target scale. Convolutional neural network (CNN)-based target detection approaches face notable limitations in such adverse weather scenarios, primarily because their fixed geometric sampling structures hinder adaptability to complex backgrounds and dynamically changing object appearances. To address these challenges, this paper proposes an optimized YOLOv9 model incorporating an improved deformable convolutional network (DCN) enhanced with a multi-scale dilated attention (MSDA) mechanism. Specifically, the DCN module enhances the model's adaptability to target deformation and noise interference by adaptively adjusting the sampling grid positions, while also integrating feature amplitude modulation to further improve robustness. Additionally, the MSDA module is introduced to capture contextual features across multiple scales, effectively addressing the target occlusion and scale variation commonly encountered in flood-affected environments. Experimental evaluations are conducted on the ISE-UFDS and UA-DETRAC datasets. The results demonstrate that the proposed model significantly outperforms state-of-the-art methods in key evaluation metrics, including precision, recall, F1-score, and mean average precision (mAP). Notably, the model exhibits superior robustness and generalization performance under simulated severe weather conditions, offering reliable technical support for disaster emergency response systems. This study contributes to enhancing the accuracy and real-time capabilities of flood early warning systems, thereby supporting more effective disaster mitigation strategies.
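The adaptive sampling grid and amplitude modulation described in this abstract can be illustrated with a minimal pure-Python sketch of a modulated deformable convolution evaluated at a single output location. It is not the paper's implementation: the function names, the 3x3 kernel size, and the hand-supplied `offsets` and `masks` are assumptions for illustration. In a real network the offsets and masks are predicted by an extra convolution branch.

```python
import math

def bilinear_sample(feature, y, x):
    """Bilinearly interpolate a 2-D list at fractional (y, x); zero outside the map."""
    h, w = len(feature), len(feature[0])
    y0, x0 = math.floor(y), math.floor(x)
    value = 0.0
    for yi in (y0, y0 + 1):
        for xi in (x0, x0 + 1):
            if 0 <= yi < h and 0 <= xi < w:
                weight = (1 - abs(y - yi)) * (1 - abs(x - xi))
                value += weight * feature[yi][xi]
    return value

def deformable_conv_at(feature, kernel, offsets, masks, cy, cx):
    """3x3 modulated deformable convolution output at one location (cy, cx).

    offsets[k] = (dy, dx) moves the k-th sampling point off the regular grid;
    masks[k] in [0, 1] scales (modulates) the amplitude of that sample.
    Both are hypothetical inputs here; a network would learn to predict them.
    """
    grid = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
    out = 0.0
    for k, (gy, gx) in enumerate(grid):
        oy, ox = offsets[k]
        sample = bilinear_sample(feature, cy + gy + oy, cx + gx + ox)
        out += kernel[k] * masks[k] * sample
    return out
```

With all offsets set to zero and all masks set to one, the function reduces to an ordinary 3x3 convolution, which is a convenient sanity check.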
Funding: National Key Research and Development Program of China (No. 2017YFC0405806).
Abstract: Deep convolutional neural networks have made great progress in the field of semantic segmentation. Because of their fixed convolution kernel geometry, standard convolutional neural networks have limited ability to model geometric transformations. Therefore, a deformable convolution is introduced to enhance the adaptability of convolutional networks to spatial transformation. In addition, deep convolutional neural networks cannot adequately segment local objects at the output layer because of the pooling layers in the network architecture. To overcome this shortcoming, the coarse segmentation predictions of the network output layer are processed by fully connected conditional random fields to improve segmentation quality. The proposed method can be trained end-to-end using standard backpropagation. Finally, the proposed method is tested on the ISPRS dataset. The results show that it effectively overcomes the influence of the complex structure of the segmentation object and obtains state-of-the-art accuracy on the ISPRS Vaihingen 2D semantic labeling dataset.
Funding: Supported by the National Natural Science Foundation of China under Grants 61872241, 62077037, and 62272298, and in part by the Shanghai Municipal Science and Technology Major Project under Grant 2021SHZDZX0102.
Abstract: Background: Exploring correspondences across multiview images is the basis of various computer vision tasks. However, most existing methods have limited accuracy under challenging conditions. Method: To learn more robust and accurate correspondences, we propose DSD-MatchingNet for local feature matching in this study. First, we develop a deformable feature extraction module to obtain multilevel feature maps, which harvest contextual information from dynamic receptive fields. The dynamic receptive fields provided by the deformable convolution network ensure that our method obtains dense and robust correspondence. Second, we utilize sparse-to-dense matching with symmetry of correspondence to implement accurate pixel-level matching, which enables our method to produce more accurate correspondences. Result: Experiments show that our proposed DSD-MatchingNet achieves better performance on the image matching benchmark, as well as on the visual localization benchmark. Specifically, our method achieved 91.3% mean matching accuracy on the HPatches dataset and 99.3% visual localization recall on the Aachen Day-Night dataset.
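One common way to enforce symmetric correspondences is a mutual nearest-neighbour check: a pair is kept only if each descriptor is the other's nearest neighbour. The sketch below shows that generic idea in pure Python. Whether DSD-MatchingNet uses exactly this test is an assumption, and the function name and the squared Euclidean distance are illustrative choices.

```python
def mutual_nearest_matches(desc_a, desc_b):
    """Return index pairs (i, j) where desc_a[i] and desc_b[j] pick each other.

    desc_a and desc_b are sequences of equal-length numeric descriptors.
    This is a generic symmetry check, not the paper's matching procedure.
    """
    def sq_dist(u, v):
        return sum((p - q) ** 2 for p, q in zip(u, v))

    def nearest(query, pool):
        # Index of the descriptor in `pool` closest to `query`.
        return min(range(len(pool)), key=lambda j: sq_dist(query, pool[j]))

    a_to_b = [nearest(d, desc_b) for d in desc_a]
    b_to_a = [nearest(d, desc_a) for d in desc_b]
    # Keep a match only when the nearest-neighbour relation holds both ways.
    return [(i, j) for i, j in enumerate(a_to_b) if b_to_a[j] == i]
```

Descriptors whose best match prefers a different partner are dropped, which trades some recall for fewer one-sided (and often wrong) correspondences.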
Funding: This study was supported by the National Natural Science Foundation of China (No. 61861007), the Guizhou Provincial Department of Education Innovative Group Project (QianJiaohe KY[2021]012), and the Guizhou Science and Technology Plan Project (Guizhou Science Support [2023] General 412).
Abstract: University laboratories are complex environments with heavy personnel traffic, where irregular behavior by personnel easily creates safety risks. Monitoring with mainstream detection algorithms suffers from low detection accuracy and slow speed. Therefore, the management of personnel behavior currently relies mainly on institutional constraints, education and training, on-site supervision, etc., which is time-consuming and ineffective. Given this situation, this paper proposes an improved You Only Look Once version 7 (YOLOv7) to quickly detect irregular behaviors of laboratory personnel while ensuring high detection accuracy. First, to better capture the shape features of the target, deformable convolutional networks (DCN) replace the traditional convolution in the backbone of the model to improve detection accuracy and speed. Second, to enhance the extraction of important features and suppress useless features, this paper proposes a new convolutional block attention module_efficient channel attention (CBAM_E), embedded in the neck network, to improve the model's ability to extract features from complex scenes. Finally, to reduce the influence of the angle factor and improve bounding box regression accuracy, this paper proposes a new α-SCYLLA intersection over union (α-SIoU) in place of the complete intersection over union (CIoU), which improves regression accuracy while increasing convergence speed. Comparison experiments on public and homemade datasets show that the improved algorithm outperforms the original algorithm in all evaluation indexes, with an increase of 2.92% in the precision rate, 4.14% in the recall rate, 0.0356 in the weighted harmonic mean, and 3.60% in the mAP@0.5 value, together with a reduction in the number of parameters and complexity. Compared with the mainstream algorithm, the improved algorithm has higher detection accuracy, faster convergence speed, and better actual recognition effect, indicating its effectiveness and its potential for practical application in laboratory scenarios.
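Two quantities named in this abstract have simple closed forms: the plain intersection over union (IoU) of two axis-aligned boxes, and the harmonic mean of precision and recall (the F1-score). The sketch below covers only these baseline definitions. The CIoU and α-SIoU losses add further penalty terms that are not reproduced here, and the `(x1, y1, x2, y2)` box layout is an assumption.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap rectangle, clamped at zero.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0

def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    total = precision + recall
    return 2 * precision * recall / total if total > 0 else 0.0
```

For example, two 2x2 boxes offset by one unit in each direction overlap in a 1x1 square, giving an IoU of 1/7.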
Funding: Supported by the Natural Science Foundation of Shandong Province (Nos. ZR2023LZH017 and ZR2024MF066), the Natural Science Foundation of Hebei Province (No. F2022511001), the Key Funding from the National Natural Science Foundation of China (No. 92067206), and the National Natural Science Foundation of China (No. 62471493).
Abstract: The YOLOv5 algorithm is widely used in edge computing systems for object detection. However, the limited computing resources of embedded devices and the large model size of existing deep-learning-based methods increase the difficulty of real-time object detection on edge devices. To address this issue, we propose a smaller, less computationally intensive, and more accurate algorithm for object detection. Multi-scale Feature Fusion-YOLO (MFF-YOLO) is built on top of the YOLOv5s framework, but it contains substantial improvements to YOLOv5s. First, we design the MFF module to improve the feature propagation path in the feature pyramid, which further integrates the semantic information from different paths of feature layers. Then, a large convolution-kernel module is used in the bottleneck. The structure enlarges the receptive field and preserves shallow semantic information, which overcomes the performance limitation arising from uneven propagation in Feature Pyramid Networks (FPN). In addition, a multi-branch downsampling method based on depthwise separable convolutions and a bottleneck structure with deformable convolutions are designed to reduce the complexity of the backbone network and minimize the real-time performance loss caused by the increased model complexity. The experimental results on the PASCAL VOC and MS COCO datasets show that, compared with YOLOv5s, MFF-YOLO reduces the number of parameters by 7% and the number of floating-point operations (FLOPs) by 11.8%. The mAP@0.5 has improved by 3.7% and 5.5%, and the mAP@0.5:0.95 has improved by 6.5% and 6.2%, respectively. Furthermore, compared with YOLOv7-tiny, PP-YOLO-tiny, and other mainstream methods, MFF-YOLO has achieved better results on multiple indicators.
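The parameter saving from depthwise separable convolutions follows from a short count. A standard k x k convolution needs k·k·C_in·C_out weights, while the depthwise-plus-pointwise pair needs k·k·C_in + C_in·C_out. The sketch below computes both, ignoring biases. The channel sizes in the usage note are arbitrary examples, not values from MFF-YOLO.

```python
def standard_conv_params(kernel_size, in_channels, out_channels):
    """Weight count of a standard k x k convolution (biases ignored)."""
    return kernel_size * kernel_size * in_channels * out_channels

def depthwise_separable_params(kernel_size, in_channels, out_channels):
    """Weight count of a depthwise k x k conv followed by a 1x1 pointwise conv."""
    depthwise = kernel_size * kernel_size * in_channels  # one k x k filter per input channel
    pointwise = in_channels * out_channels               # 1x1 conv that mixes channels
    return depthwise + pointwise
```

For a 3x3 layer mapping 64 to 128 channels, the standard form has 73,728 weights and the separable form has 8,768, roughly 12% of the original.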