Funding: Supported by the National Key R&D Program of China (Grant No. 2023YFC3010803), the National Natural Science Foundation of China (Grant No. 52272424), the Key R&D Program of Hubei Province of China (Grant No. 2023BCB123), and the Fundamental Research Funds for the Central Universities (Grant No. WUT: 2023IVB079).
Abstract: Side-scan sonar (SSS) is now a prevalent instrument for large-scale seafloor topography measurement, deployable on an autonomous underwater vehicle (AUV) to perform fully automated underwater acoustic scanning and imaging along a predetermined trajectory. However, SSS images often suffer from speckle noise caused by mutual interference between echoes, and limited AUV computational resources further hinder noise suppression. Existing approaches for SSS image processing and speckle noise reduction rely heavily on complex network structures and fail to combine the benefits of deep learning and domain knowledge. To address this problem, RepDNet, a novel and effective despeckling convolutional neural network, is proposed. RepDNet introduces two re-parameterized blocks, the Pixel Smoothing Block (PSB) and the Edge Enhancement Block (EEB), which preserve edge information while attenuating speckle noise. During training, PSB and EEB take the form of double-layered multi-branch structures that integrate first-order and second-order derivatives and smoothing functions. During inference, the branches are re-parameterized into a 3×3 convolution, enabling efficient inference without sacrificing accuracy. RepDNet comprises three computational operations: 3×3 convolution, element-wise summation, and Rectified Linear Unit activation. Evaluations on benchmark datasets, a real SSS dataset, and data collected at Lake Mulan establish RepDNet as a well-balanced network that meets AUV computational constraints in terms of performance and latency.
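The re-parameterization step described in this abstract rests on the linearity of convolution: the sum of several parallel 3×3 branches applied to the same input equals one 3×3 convolution whose kernel is the weighted sum of the branch kernels. The following is a minimal single-channel sketch of that idea in plain Python. The branch set (Sobel, Laplacian, box smoothing, identity), the scale values, and all function names are assumptions chosen for illustration; the abstract does not specify the exact branch layout, and the double-layered structure is not modeled here.

```python
# Illustrative only: branch kernels and names are assumptions, not the paper's design.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]        # first-order derivative
LAPLACIAN = [[0, 1, 0], [1, -4, 1], [0, 1, 0]]        # second-order derivative
BOX = [[1 / 9] * 3 for _ in range(3)]                 # smoothing function
IDENTITY = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]          # skip connection as a kernel


def conv3x3(image, kernel):
    """Zero-padded 3x3 cross-correlation of a 2-D list with a 3x3 kernel."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(3):
                for dj in range(3):
                    y, x = i + di - 1, j + dj - 1
                    if 0 <= y < h and 0 <= x < w:
                        acc += kernel[di][dj] * image[y][x]
            out[i][j] = acc
    return out


def multi_branch_forward(image, branches):
    """Training-time form: run every (scale, kernel) branch and sum the outputs."""
    h, w = len(image), len(image[0])
    total = [[0.0] * w for _ in range(h)]
    for scale, kernel in branches:
        out = conv3x3(image, kernel)
        for i in range(h):
            for j in range(w):
                total[i][j] += scale * out[i][j]
    return total


def merge_branches(branches):
    """Inference-time form: fold all (scale, kernel) branches into one 3x3 kernel."""
    merged = [[0.0] * 3 for _ in range(3)]
    for scale, kernel in branches:
        for r in range(3):
            for c in range(3):
                merged[r][c] += scale * kernel[r][c]
    return merged
```

Calling `conv3x3(image, merge_branches(branches))` should reproduce `multi_branch_forward(image, branches)` up to floating-point rounding, which is why the merged network can run with a single convolution per block at inference time.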
Funding: Supported by the Science and Technology Innovation Program of Hunan Province (Grant No. 2022RC1021), the Hunan Provincial Natural Science Foundation Project (Grant No. 2023JJ60124), the Changsha Natural Science Foundation Project (Grant No. kq2202265), and the key project of the Hunan Provincial of Education (Grant No. 22A0255).
Abstract: Automatic pancreas segmentation plays a pivotal role in assisting physicians with diagnosing pancreatic diseases, facilitating treatment evaluations, and designing surgical plans. Because of the pancreas's small size, significant variability in shape and location, and low contrast with surrounding tissues, achieving high segmentation accuracy remains challenging. To improve segmentation precision, we propose a novel network that uses EfficientNetV2 and multi-branch structures to automatically segment the pancreas from CT images. First, an EfficientNetV2 encoder is employed to extract complex, multi-level features, enhancing the model's ability to capture the pancreas's intricate morphology. Then, a residual multi-branch dilated attention (RMDA) module is designed to suppress irrelevant background noise and highlight useful pancreatic features. In addition, re-parameterization Visual Geometry Group (RepVGG) blocks with a multi-branch structure are introduced in the decoder to effectively integrate deep features and low-level details, improving segmentation accuracy. Furthermore, we apply re-parameterization to the model, reducing computations and parameters while accelerating inference and reducing memory usage. Our approach achieves an average Dice similarity coefficient (DSC) of 85.59%, intersection over union (IoU) of 75.03%, precision of 85.09%, and recall of 86.57% on the NIH pancreas dataset. Compared with other methods, our model has fewer parameters and faster inference speed, demonstrating its potential in practical applications of pancreatic segmentation.
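The four metrics reported in this abstract can all be derived from the true-positive, false-positive, and false-negative pixel counts of a predicted mask against a ground-truth mask. The sketch below shows the standard definitions for a single pair of binary masks. The function names and the handling of empty masks are assumptions made for illustration, and the abstract does not state how the per-case values were averaged.

```python
def confusion_counts(pred, truth):
    """Count true positives, false positives, and false negatives over two binary masks."""
    tp = fp = fn = 0
    for p_row, t_row in zip(pred, truth):
        for p, t in zip(p_row, t_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t and not p:
                fn += 1
    return tp, fp, fn


def segmentation_metrics(pred, truth):
    """Return DSC, IoU, precision, and recall for one predicted/ground-truth mask pair."""
    tp, fp, fn = confusion_counts(pred, truth)
    denom = tp + fp + fn
    # Assumption: two empty masks count as a perfect match for the overlap metrics.
    dsc = 2 * tp / (2 * tp + fp + fn) if denom else 1.0
    iou = tp / denom if denom else 1.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"dsc": dsc, "iou": iou, "precision": precision, "recall": recall}
```

For a single mask pair the two overlap metrics are tied by IoU = DSC / (2 − DSC). That identity need not hold exactly for dataset-level averages, since the mean of a ratio is not the ratio of the means.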
Abstract: To address current issues in construction site detection algorithms, such as missed detections, false positives, and high model complexity caused by occlusions and scale variations in dense environments, this paper proposes a lightweight multi-object detection model for construction sites based on YOLO-World, named LCS-YOLO, which balances detection efficiency and accuracy. We propose the RGNet (Re-parameterization GhostNet) module, which integrates re-parameterized convolutions and a multi-branch architecture. This approach addresses the issue of information redundancy in intermediate feature maps while enhancing feature extraction and gradient flow. Combined with the adaptive downsampling module ADown (Adaptive Downsampling), it better captures image features and achieves spatial compression, reducing model complexity while enhancing interaction between images and text. Experiments demonstrate that the LCS-YOLO model outperforms the comparison models in overall performance, achieving a balance between accuracy and efficiency.
Funding: Supported by the Graduate Research and Innovation Projects of Jiangsu Province (No. SJCX23_0320).
Abstract: A forest fire is a natural disaster characterized by rapid spread, difficulty in extinguishing, and widespread destruction, which requires an efficient response. Existing detection methods fail to balance global and local fire features, resulting in the false detection of small or hidden fires. In this paper, we propose a novel detection technique based on an improved YOLO v5 model to enhance the visual representation of forest fires and retain more information about global interactions. We add a plug-and-play global attention mechanism to improve the efficiency of neck and backbone feature extraction in the YOLO v5 model. Then, a re-parameterized convolutional module is designed, and a decoupled detection head is used to accelerate convergence. Finally, a weighted bi-directional feature pyramid network (BiFPN) is introduced to merge feature information for local information processing. In the evaluation, we use the complete intersection over union (CIoU) loss function to optimize the multi-task loss for different kinds of forest fires. Experiments show that precision, recall, and mean average precision are increased by 4.2%, 3.8%, and 4.6%, respectively, compared with the classic YOLO v5 model. In particular, the mAP@0.5:0.95 is 2.2% higher than that of the other detection methods, while meeting the requirements of real-time detection.
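The CIoU loss mentioned in this abstract extends plain IoU with a penalty for the distance between box centres and a penalty for aspect-ratio mismatch. The sketch below follows the commonly published CIoU formulation for two axis-aligned boxes given as `(x1, y1, x2, y2)`. The function name, the box format, and the `eps` stabiliser are assumptions for illustration; how the loss is weighted inside the paper's multi-task objective is not stated in the abstract.

```python
import math


def ciou_loss(box, gt, eps=1e-9):
    """CIoU loss between a predicted box and a ground-truth box, both (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    gx1, gy1, gx2, gy2 = gt
    w, h = x2 - x1, y2 - y1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Plain IoU from the overlap rectangle.
    inter_w = max(0.0, min(x2, gx2) - max(x1, gx1))
    inter_h = max(0.0, min(y2, gy2) - max(y1, gy1))
    inter = inter_w * inter_h
    union = w * h + gw * gh - inter + eps
    iou = inter / union

    # Squared distance between box centres, normalised by the squared diagonal
    # of the smallest box enclosing both.
    rho2 = ((x1 + x2) / 2 - (gx1 + gx2) / 2) ** 2 + ((y1 + y2) / 2 - (gy1 + gy2) / 2) ** 2
    cw = max(x2, gx2) - min(x1, gx1)
    ch = max(y2, gy2) - min(y1, gy1)
    c2 = cw ** 2 + ch ** 2 + eps

    # Aspect-ratio consistency term v and its trade-off weight alpha.
    v = (4 / math.pi ** 2) * (math.atan(gw / (gh + eps)) - math.atan(w / (h + eps))) ** 2
    alpha = v / (1 - iou + v + eps)

    return 1 - iou + rho2 / c2 + alpha * v
```

Unlike a plain `1 - IoU` loss, this value keeps growing as non-overlapping boxes move further apart, so it still provides a useful training signal when the prediction misses the target entirely.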