Abstract: This paper presents CW-HRNet, a high-resolution, lightweight crack segmentation network designed for complex scenes with slender, deformable, and blurred crack structures. The model incorporates two key modules: Constrained Deformable Convolution (CDC), which stabilizes geometric alignment by applying a tanh limiter and a learnable scaling factor to the predicted offsets, and the Wavelet Frequency Enhancement Module (WFEM), which decomposes features using Haar wavelets to preserve low-frequency structures while enhancing high-frequency boundaries and textures. On the CrackSeg9k benchmark, CW-HRNet achieves 82.39% mIoU with only 7.49M parameters and 10.34 GFLOPs, outperforming HrSegNet-B48 by 1.83% in segmentation accuracy with minimal complexity overhead. The model also shows strong cross-dataset generalization, achieving 60.01% mIoU and 66.22% F1 on Asphalt3k without fine-tuning. These results highlight CW-HRNet's favorable accuracy-efficiency trade-off for real-world crack segmentation tasks.
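The offset limiting described for CDC can be illustrated with a small sketch. The abstract does not give the exact formula, so the form `scale * tanh(raw)` below is an assumption, and `constrain_offsets` and `scale` are hypothetical names; in a trained network the scale would be a learnable parameter rather than a fixed float.

```python
import math


def constrain_offsets(raw_offsets, scale=1.0):
    """Bound each raw offset to [-scale, scale] using scale * tanh(raw).

    Assumed form of a tanh limiter with a scaling factor; here `scale`
    is a plain float standing in for a learnable parameter.
    """
    return [scale * math.tanh(value) for value in raw_offsets]


# Large raw offsets saturate near +/- scale; small ones pass almost linearly.
raw = [-50.0, -0.5, 0.0, 0.5, 50.0]
bounded = constrain_offsets(raw, scale=2.0)
for r, b in zip(raw, bounded):
    print(f"raw={r:+6.1f} -> bounded={b:+.4f}")
```

Because tanh is monotonic and bounded, the ordering of the offsets is kept while extreme sampling positions are ruled out, which is one plausible reading of "stabilizes geometric alignment".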
Funding: National Natural Science Foundation of China under Grant 61972267; National Natural Science Foundation of Hebei Province under Grant F2018210148; University Science Research Project of Hebei Province under Grant ZD2021334.
Abstract: Accurate and reliable crack segmentation is a challenging and meaningful task. In this article, targeting the characteristics of cracks in concrete images, the intensity frequency information of the source images, obtained by the Discrete Wavelet Transform (DWT), is fed into deep learning-based networks to enhance their crack segmentation ability. To integrate frequency information into the network, an effective and novel DWTA module based on the DWT and the scSE attention mechanism is proposed. The DWTA module enhances the semantic information of cracks and suppresses irrelevant information. It also balances the gap between frequency information and the network's convolution information, so that wavelet information can be fused well into the image segmentation network. Unet-DWTA is proposed to preserve information about crack boundaries and thin cracks in intermediate feature maps by adding the DWTA module to the encoder-decoder structure. In the decoder, feature maps from different levels are fused to capture both crack boundary information and abstract semantic information, which benefits crack pixel classification. The proposed method is verified on three classic datasets: CrackDataset, CrackForest, and DeepCrack. Compared with other crack segmentation methods, the proposed Unet-DWTA shows better performance in both subjective analysis and objective semantic segmentation metrics.
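To make the frequency decomposition concrete, the following is a minimal sketch of a single-level 2D DWT using the Haar wavelet (the abstract names the DWT but not the wavelet, so Haar is an assumption chosen for simplicity). The function name `haar_dwt2` and the sub-band names are illustrative, not taken from the paper.

```python
def haar_dwt2(image):
    """Single-level orthonormal 2D Haar transform of an even-sized 2D list.

    Returns four half-size sub-bands:
      approx      - local averages (low frequency)
      col_detail  - differences across columns (responds to vertical edges)
      row_detail  - differences across rows (responds to horizontal edges)
      diag_detail - diagonal differences
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("image height and width must be even")
    approx, col_detail, row_detail, diag_detail = [], [], [], []
    for r in range(0, rows, 2):
        a_row, c_row, r_row, d_row = [], [], [], []
        for c in range(0, cols, 2):
            tl, tr = image[r][c], image[r][c + 1]
            bl, br = image[r + 1][c], image[r + 1][c + 1]
            a_row.append((tl + tr + bl + br) / 2.0)
            c_row.append((tl - tr + bl - br) / 2.0)
            r_row.append((tl + tr - bl - br) / 2.0)
            d_row.append((tl - tr - bl + br) / 2.0)
        approx.append(a_row)
        col_detail.append(c_row)
        row_detail.append(r_row)
        diag_detail.append(d_row)
    return approx, col_detail, row_detail, diag_detail


# Toy "image" containing a one-pixel-wide vertical crack in column 1.
crack = [[0, 9, 0, 0] for _ in range(4)]
approx, col_detail, row_detail, diag_detail = haar_dwt2(crack)
print(approx)      # low-frequency structure
print(col_detail)  # the thin vertical crack shows up here
```

In this toy case the thin crack appears in the column-difference band while the row and diagonal bands stay zero, which illustrates why high-frequency sub-bands are useful for boundaries and thin cracks.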
Funding: National Natural Science Foundation of China (Grant Nos. 52090083 and 52378405); Key Technology R&D Plan of Yunnan Provincial Department of Science and Technology (Grant No. 202303AA080003).
Abstract: Contemporary demands necessitate the swift and accurate detection of cracks in critical infrastructure, including tunnels and pavements. This study proposes a transfer learning-based encoder-decoder method with visual explanations for infrastructure crack segmentation. First, a large dataset of 7089 images was developed, covering diverse conditions: simple and complex crack patterns as well as clean and rough backgrounds. Second, leveraging transfer learning, an encoder-decoder model with visual explanations was formulated, using various pre-trained convolutional neural networks (CNNs) as the encoder. Visual explanations were produced with gradient-weighted class activation mapping (Grad-CAM) to interpret the CNN segmentation model. Third, accuracy, complexity (computation and model), and memory usage were used to assess the feasibility of the CNNs in practical engineering. Model performance was gauged via prediction and visual explanation. The investigation covered hyperparameters, data augmentation, deep learning from scratch vs. transfer learning, segmentation model architectures, segmentation model encoders, and encoder pre-training strategies. The results show that transfer learning improves CNN accuracy for crack segmentation, surpassing deep learning from scratch. Notably, encoder classification accuracy bore no significant correlation with CNN segmentation accuracy. Among all tested models, UNet-EfficientNet_B7 excelled in crack segmentation, balancing accuracy, complexity, memory usage, prediction, and visual explanation.
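The core Grad-CAM computation can be sketched on toy data: each feature map is weighted by the spatial average of its gradient, the weighted maps are summed, and negative values are clipped. The activations and gradients below are invented numbers; how the gradients are obtained for a segmentation model (for example, from a score summed over crack pixels) is not stated in the abstract and is left outside this sketch.

```python
def grad_cam(activations, gradients):
    """Grad-CAM heatmap from K feature maps and their gradients.

    Both arguments are lists of K maps, each an H x W nested list.
    weights[k] is the global average of gradients[k]; the heatmap is
    ReLU(sum_k weights[k] * activations[k]).
    """
    height, width = len(activations[0]), len(activations[0][0])
    weights = [sum(sum(row) for row in grad) / (height * width)
               for grad in gradients]
    cam = [[max(0.0, sum(w * act[i][j] for w, act in zip(weights, activations)))
            for j in range(width)]
           for i in range(height)]
    return weights, cam


# Two toy 2x2 feature maps: the first supports the target score,
# the second opposes it (negative gradient), so it is clipped away.
toy_activations = [[[1.0, 0.0], [0.0, 0.0]],
                   [[0.0, 0.0], [0.0, 2.0]]]
toy_gradients = [[[0.4, 0.4], [0.4, 0.4]],
                 [[-1.0, -1.0], [-1.0, -1.0]]]
weights, cam = grad_cam(toy_activations, toy_gradients)
print(weights)
print(cam)
```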
Abstract: Detecting pavement cracks is critical for road safety and infrastructure management. Traditional methods, relying on manual inspection and basic image processing, are time-consuming and prone to errors. Recent deep-learning (DL) methods automate crack detection, but many still struggle with variable crack patterns and environmental conditions. This study addresses these limitations by introducing the Masker Transformer, a novel hybrid deep learning model that integrates the precise localization capabilities of the Mask Region-based Convolutional Neural Network (Mask R-CNN) with the global contextual awareness of the Vision Transformer (ViT). The research focuses on leveraging the strengths of both architectures to enhance segmentation accuracy and adaptability across different pavement conditions. We evaluated the Masker Transformer against other state-of-the-art models such as U-Net, Transformer U-Net (TransUNet), U-Net Transformer (UNETr), Swin U-Net Transformer (Swin-UNETr), You Only Look Once version 8 (YoloV8), and Mask R-CNN on two benchmark datasets: Crack500 and DeepCrack. The findings show that the Masker Transformer significantly outperforms the existing models, achieving the highest Dice Similarity Coefficient (DSC), precision, recall, and F1-score across both datasets. Specifically, the model attained a DSC of 80.04% on Crack500 and 91.37% on DeepCrack, demonstrating superior segmentation accuracy and reliability. The high precision and recall rates further substantiate its effectiveness in real-world applications, suggesting that the Masker Transformer can serve as a robust tool for automated pavement crack detection, potentially replacing more traditional methods.
Funding: The first author would like to thank the European Commission H2020-MSCA-RISE BESTOFRAC project for research funding.
Abstract: Identifying cracks and predicting crack propagation are critical processes for the risk assessment of engineering structures. Most traditional approaches to crack modeling suffer from high computational costs and excessive computing time. To address this issue, we explore the potential of deep learning (DL) to increase the efficiency of crack detection and crack growth forecasting. However, no single algorithm fits all datasets or applies in all cases, since specific tasks vary. In this paper, we present DL models for identifying cracks, especially in concrete surface images, and for predicting crack propagation. First, SegNet and U-Net networks are used to identify concrete cracks. Stochastic gradient descent (SGD) and adaptive moment estimation (Adam) algorithms are applied to minimize the loss function during training iterations. Second, time series algorithms, including the gated recurrent unit (GRU) and long short-term memory (LSTM), are used to predict crack propagation. The experimental findings indicate that U-Net is more robust and efficient than SegNet for crack segmentation and achieves the best results. For crack propagation, the GRU and LSTM predictions show good agreement with the experimental data.
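For readers unfamiliar with the two optimizers named here, the sketch below applies the standard SGD and Adam update rules to a one-parameter toy loss, (w - 3)^2. The loss, the learning rate, and the step count are illustrative assumptions and have no connection to the paper's training setup.

```python
import math


def sgd_step(w, grad, lr=0.1):
    """Plain gradient descent update: w <- w - lr * grad."""
    return w - lr * grad


def adam_step(w, grad, state, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update with bias-corrected first and second moments."""
    state["t"] += 1
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad * grad
    m_hat = state["m"] / (1 - beta1 ** state["t"])
    v_hat = state["v"] / (1 - beta2 ** state["t"])
    return w - lr * m_hat / (math.sqrt(v_hat) + eps)


def loss_grad(w):
    """Gradient of the toy loss (w - 3)^2."""
    return 2.0 * (w - 3.0)


w_sgd = w_adam = 0.0
adam_state = {"t": 0, "m": 0.0, "v": 0.0}
for _ in range(200):
    w_sgd = sgd_step(w_sgd, loss_grad(w_sgd))
    w_adam = adam_step(w_adam, loss_grad(w_adam), adam_state)

print(f"SGD:  w = {w_sgd:.4f}")
print(f"Adam: w = {w_adam:.4f}")
```

The first Adam step has magnitude close to the learning rate regardless of the gradient's scale, since the bias-corrected ratio reduces to roughly the sign of the gradient; SGD's step, by contrast, is proportional to the gradient.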
Funding: Support from the Ministry of Science and Technology of the People's Republic of China (Grant No. 2021YFB2600804), the Open Research Project Programme of the State Key Laboratory of Internet of Things for Smart City (University of Macao) (Grant No. SKL-IoTSC(UM)-2021-2023/ORPF/A19/2022), and the General Research Fund (GRF) project (Grant No. 15214722) from the Research Grants Council (RGC) of the Hong Kong Special Administrative Region Government of China is gratefully acknowledged.
Abstract: This research developed a hybrid position-channel network (named PCNet) by incorporating newly designed channel and position attention modules into U-Net to alleviate the crack discontinuity problem in the channel and spatial dimensions. In PCNet, U-Net is used as a baseline to extract informative spatial and channel-wise features from shield tunnel lining crack images. A channel attention module and a position attention module are designed and embedded after each convolution layer of U-Net to model the feature interdependencies in the channel and spatial dimensions. These attention modules allow U-Net to adaptively integrate local crack features with their global dependencies. Experiments were conducted on a dataset built from images of Shanghai metro shield tunnels. The results validate the effectiveness of the designed channel and position attention modules, since they individually increase balanced accuracy (BA) by 11.25% and 12.95%, intersection over union (IoU) by 10.79% and 11.83%, and F1 score by 9.96% and 10.63%, respectively. In comparison with state-of-the-art models (i.e., LinkNet, PSPNet, U-Net, PANet, and Mask R-CNN) on the testing dataset, the proposed PCNet outperforms the others in BA, IoU, and F1 score owing to the channel and position attention modules. These evaluation metrics indicate that the proposed PCNet delivers refined crack segmentation with improved performance and is a practicable approach to segmenting shield tunnel lining cracks in field practice.
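The three metrics reported here can be computed from pixel-level confusion counts. The sketch below uses the common definitions (BA as the mean of recall and specificity, IoU and F1 for the crack class); the abstract does not spell out its formulas, so these definitions are an assumption, and the masks are toy data.

```python
def crack_metrics(pred, truth):
    """Balanced accuracy, IoU, and F1 for binary masks (1 = crack pixel)."""
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1
            elif p:
                fp += 1
            elif t:
                fn += 1
            else:
                tn += 1
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    iou = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 0.0
    return {"BA": (recall + specificity) / 2, "IoU": iou, "F1": f1}


# Ground truth: a vertical crack in column 1. The prediction misplaces
# one crack pixel, which breaks the crack's continuity.
truth_mask = [[0, 1, 0, 0],
              [0, 1, 0, 0],
              [0, 1, 0, 0],
              [0, 1, 0, 0]]
pred_mask = [[0, 1, 0, 0],
             [0, 1, 0, 0],
             [0, 0, 1, 0],
             [0, 1, 0, 0]]
metrics = crack_metrics(pred_mask, truth_mask)
print(metrics)
```

Balanced accuracy is less dominated by the large background class than plain pixel accuracy, which is one reason it is often reported for thin structures such as cracks.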
Funding: This research was funded by the National Key R&D Program of China (Project: Key Technologies and Equipment for Multi-View Stereoscopic Disaster Detection and Emergency Response to Derived Disasters in Underground Spaces, 2022YFC3005600); the National Natural Science Foundation of China (52378402); the Shandong Provincial Natural Science Foundation Youth Project (ZR2022QE021 and ZR202211100077); the Shandong Province Higher Education Young Innovative Team Project (2022KJ037); the State Key Laboratory of Precision Blasting and Hubei Key Laboratory of Blasting Engineering, Jianghan University (PBSKL2022C03); and Shandong Railway Investment Holding Group Co., Ltd. ("Key Technologies for Rapid and Intelligent Construction of Large Section High-Speed Railway Tunnels in Low Mountain and Hilly Areas" and "Intelligent Construction Trolley Equipment and Key Technologies for the Lining of Ultra-Long Open Tunnel Sections").
Abstract: In underground engineering, detecting structural cracks on tunnel surfaces is a pivotal task for ensuring the health and reliability of tunnel structures. However, the dim and dusty environment inherent to underground engineering poses considerable challenges to crack segmentation. This paper proposes a crack segmentation algorithm, Focused Detection for Subsurface Cracks YOLOv8 (FDSC-YOLOv8), specifically designed for underground engineering structural surfaces. First, to improve the extraction of multi-layer convolutional features, the fixed convolutional module is replaced with a deformable convolutional module. Second, the model's receptive field is enlarged by introducing a multi-branch convolutional module, improving the extraction of shallow features for small targets. Next, the Dynamic Snake Convolution module is incorporated to enhance the extraction capability for slender and weak cracks. Finally, the Convolutional Block Attention Module (CBAM) is employed to achieve better target determination. On the test data, the FDSC-YOLOv8s algorithm reaches an mAP50 of 96.5% and an mAP50-95 of 66.4%.
Funding: Support from the Natural Science Foundation of Hunan Province (Grant No. 2024JJ8055) and the Hunan Yiduoyun Commodity Intelligence Project (Grant No. h2024-003).
Abstract: During the operation, maintenance, and upkeep of concrete buildings, surface cracks are often regarded as important warning signs of potential damage. Their precise segmentation plays a key role in assessing the health of a building. Traditional manual inspection is subjective, inefficient, and hazardous. Meanwhile, current mainstream computer vision-based crack segmentation methods still suffer from missed detections, false detections, and segmentation discontinuities. These problems are particularly evident with small cracks, complex backgrounds, and blurred boundaries. For this reason, this paper proposes a lightweight building surface crack segmentation method, HL-YOLO, based on YOLOv11n-seg, which integrates an attention mechanism and a dilation-wise residual structure. First, we design a lightweight backbone network, RCSAA-Net, which combines ResNet50, capable of multi-scale feature extraction, with a custom Channel-Spatial Aggregation Attention (CSAA) module. This design boosts the model's capacity to extract features of fine cracks and complex backgrounds. Within it, the CSAA module enhances the model's attention to critical crack areas by capturing global dependencies in feature maps. Second, we construct an enhanced Content-aware ReAssembly of FEatures (ProCARAFE) module. It introduces a larger receptive field and a dynamic kernel generation mechanism to reconstruct and accurately restore crack edge details. Finally, a Dilation-wise Residual (DWR) structure is introduced to reconstruct the C3k2 modules in the neck. It enhances multi-scale feature extraction and long-range contextual information fusion through multi-rate depthwise dilated convolutions. The improved model's superiority and generalization ability have been validated through experiments on the self-built dataset. Compared to the baseline model, HL-YOLO improves mean Average Precision at 0.5 IoU by 4.1% and mean Intersection over Union (mIoU) by 4.86%, with only 3.12 million parameters. These results indicate that HL-YOLO can efficiently and accurately identify cracks on building surfaces, meeting the demand for rapid detection and providing an effective technical solution for real-time crack monitoring.
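The idea behind multi-rate dilated convolution can be shown with a 1D toy: the same three-tap kernel covers a wider span as the dilation rate grows, without adding weights. This is only an illustration of dilation itself, not of the DWR structure; the function names, the kernel, and the rates 1, 3, and 5 are assumptions made for the example.

```python
def receptive_field(kernel_size, dilation):
    """Input span covered by one output of a dilated kernel."""
    return (kernel_size - 1) * dilation + 1


def dilated_conv1d(signal, kernel, dilation):
    """'Valid' 1D cross-correlation with `dilation` spacing between taps."""
    span = receptive_field(len(kernel), dilation)
    return [
        sum(kernel[k] * signal[i + k * dilation] for k in range(len(kernel)))
        for i in range(len(signal) - span + 1)
    ]


signal = list(range(10))
kernel = [1, 1, 1]
for rate in (1, 3, 5):  # hypothetical dilation rates
    out = dilated_conv1d(signal, kernel, rate)
    print(f"rate={rate}: span={receptive_field(len(kernel), rate)}, output={out}")
```

Running several such branches in parallel and combining their outputs lets a layer see both fine local detail (small rate) and longer-range context (large rate) at the same parameter cost per branch.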
Funding: This study was supported by the National Natural Science Foundation of China (Grant Nos. 50908234, 52208421); the Open Fund of the National Engineering Research Center of Highway Maintenance Technology, Changsha University of Science & Technology (No. kfj220101); the Natural Science Foundation of Hunan Province (No. 2020JJ4743); and the Research Innovation Project for Postgraduate of Central South University (No. 1053320213484).
Abstract: An algorithm based on deep semantic segmentation, called LC-DeepLab, is proposed for detecting the trends and geometries of cracks on tunnel linings at the pixel level. The proposed method addresses the low accuracy of tunnel crack segmentation and the slow detection speed of conventional models in complex backgrounds. The algorithm is based on the DeepLabv3+ network framework. A lighter backbone network is used for feature extraction. Next, an efficient shallow feature fusion module that extracts crack features across pixels is designed to improve the edges of crack segmentation. Finally, an efficient attention module that significantly improves the anti-interference ability of the model in complex backgrounds is validated. Four classic semantic segmentation algorithms (fully convolutional network, pyramid scene parsing network, U-Net, and DeepLabv3+) are selected for comparative analysis to verify the effectiveness of the proposed algorithm. The experimental results show that LC-DeepLab can accurately segment and highlight cracks on tunnel linings in complex backgrounds, with an accuracy (mean intersection over union) of 78.26%. LC-DeepLab achieves real-time segmentation of 416×416×3 defect images at 46.98 f/s with 21.85 Mb parameters.
Funding: National Natural Science Foundation of China (No. 61971005); Scientific Research Project of Department of Transport of Shaanxi Province in 2020 (No. 20-24K); Key Project of Baoji University of Arts and Science (ZK2018013); Research Project of Department of Education of Zhejiang Province (Y202146796); Major Scientific and Technological Innovation Project of Wenzhou City (ZG2021029).
Abstract: Cracks are a major sign of aging transportation infrastructure. Detecting and repairing cracks is key to ensuring the overall safety of that infrastructure. In recent years, owing to the remarkable success of deep learning (DL) in crack detection, much research has been devoted to developing DL-based pixel-level crack image segmentation (CIS) models to improve crack detection accuracy, but as far as we know there is no review of DL-based CIS methods yet. To address this gap, we present a comprehensive thematic survey of DL-based CIS techniques. Our review offers several contributions to the CIS area. First, more than 40 journal and top-conference papers, most published in the last three years, are identified and collected using the systematic literature review method. Second, according to the backbone network architecture of the proposed models, the papers are grouped into 10 topics for review: FCN, U-Net, encoder-decoder model, multi-scale, attention mechanism, transformer, two-stage detection, multi-modal fusion, unsupervised learning, and weakly supervised learning. Our survey also discusses the strengths and limitations of the models in each topic so as to reveal the latest research progress in the CIS field. Third, publicly accessible datasets, evaluation metrics, and loss functions that can be used for pixel-level crack detection are systematically introduced and summarized to help researchers select suitable components for their own research tasks. Finally, we discuss six common problems and their existing solutions in the field of DL-based CIS, and then suggest eight possible future research directions.