Convolutional neural networks (CNNs) are widely used in image classification tasks, but their increasing model size and computational cost make them challenging to deploy on embedded systems with constrained hardware resources. To address this issue, the MobileNetV1 network was developed, which employs depthwise convolution to reduce network complexity. MobileNetV1 uses a stride of 2 in several convolutional layers to decrease the spatial resolution of feature maps, thereby lowering computational cost. However, this stride setting can lead to a loss of spatial information, particularly affecting the detection and representation of smaller objects or finer details in images. To balance complexity against model performance, a lightweight convolutional neural network with hierarchical multi-scale feature fusion based on the MobileNetV1 network is proposed. The network consists of two main subnetworks. The first subnetwork uses a depthwise dilated separable convolution (DDSC) layer to learn imaging features with fewer parameters, which results in a lightweight and computationally inexpensive network. Furthermore, the depthwise dilated convolution in the DDSC layer effectively expands the field of view of the filters, allowing them to incorporate a larger context. The second subnetwork is a hierarchical multi-scale feature fusion (HMFF) module that processes the input feature map through parallel multi-resolution branches in order to extract multi-scale feature information from the input image. Experimental results on the CIFAR-10, Malaria, and KvasirV1 datasets demonstrate that the proposed method is efficient, reducing the network parameters and computational cost by 65.02% and 39.78%, respectively, while maintaining network performance compared to the MobileNetV1 baseline.
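The parameter saving and the enlarged field of view described in the abstract above can be illustrated with some simple arithmetic. The sketch below is not the authors' implementation: the function names and the channel counts (64 in, 128 out) are hypothetical values chosen for illustration. It only applies the standard weight-count formulas for a regular convolution and a depthwise separable convolution, plus the usual effective-kernel-size formula for dilation, k + (k - 1)(d - 1).

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    return k * k * c_in + c_in * c_out


def effective_kernel_size(k, dilation):
    """Spatial extent covered by a k x k kernel at the given dilation rate."""
    return k + (k - 1) * (dilation - 1)


if __name__ == "__main__":
    # Illustrative layer shape only; these numbers are not from the paper.
    c_in, c_out, k = 64, 128, 3

    std = standard_conv_params(c_in, c_out, k)
    sep = depthwise_separable_params(c_in, c_out, k)
    print(f"standard: {std}, separable: {sep}, ratio: {sep / std:.4f}")

    # Dilation widens the covered area without adding any weights.
    for d in (1, 2, 3):
        print(f"dilation {d}: covers {effective_kernel_size(k, d)} pixels per side")
```

For this hypothetical layer the separable form needs roughly 12% of the weights of the standard convolution, and a 3 x 3 kernel with dilation 2 spans a 5 x 5 area with the same nine weights per channel. The 65.02% and 39.78% reductions reported in the abstract are whole-network figures and cannot be derived from this single-layer arithmetic.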
Detection of cracks in concrete structures is critical for their safety and for the sustainability of maintenance processes. Traditional inspection techniques are costly, time-consuming, and inefficient in their use of human resources. Deep learning architectures have become more widespread in recent years because they accelerate these processes and increase their efficiency. Deep learning models (DLMs) stand out as an effective solution for crack detection due to features such as end-to-end learning capability, model adaptation, and automatic learning processes. However, achieving an optimal balance between the performance and the computational efficiency of DLMs remains a vital research topic. In this article, three different methods are proposed for detecting cracks in concrete structures. In the first method, a Separable Convolutional with Attention and Multi-layer Enhanced Fusion Network (SCAMEFNet) deep neural network has been developed, which has a deep architecture and can balance the depth of DLMs against the number of model parameters. This model was designed using a convolutional neural network, multi-head attention, and various fusion techniques. The second method proposes a modified vision transformer (ViT) model. The third method proposes a deep feature-based two-stage ensemble model (DFTSEM), a two-stage ensemble learning model that combines deep features with machine learning methods. The proposed approaches are evaluated using the Concrete Cracks Image Data set, which was collected by the authors and contains concrete cracks on building surfaces. The results show that the SCAMEFNet model achieved an accuracy of 98.83%, the ViT model 97.33%, and the DFTSEM model 99.00%. These findings show that the proposed techniques successfully detect surface cracks and deformations and can provide practical solutions to real-world problems. In addition, the developed methods can serve as a tool for BIM platforms in smart cities for monitoring building health.