Funding: Supported by the Science and Technology Project of State Grid Corporation of China (No. 5700-202490330A-2-1-ZX).
Abstract: To address inconsistent image quality and data scarcity in bolt defect detection for transmission lines, this paper proposes a detection framework based on an improved sparse region-based convolutional neural network (RCNN) that integrates image quality evaluation and text-to-image data augmentation. First, a HyperNetwork-based image quality assessment module filters out low-quality inspection images according to clarity and structural integrity, yielding a high-quality training dataset. Second, a text-to-image diffusion model is used for sample augmentation. Given text prompts that describe various bolt defect types under diverse lighting and viewing conditions, the model automatically generates realistic synthetic samples. The generated images are then filtered with a combination of quality and perceptual similarity metrics to keep them consistent with the real data distribution. Building on the sparse RCNN baseline, a dynamic label assignment mechanism and a random decision path detection head are incorporated to improve bounding box matching and prediction accuracy. Experimental results show that the proposed method significantly improves detection accuracy (mAP@0.5) over the original sparse RCNN while maintaining low computational cost, enabling more efficient and intelligent inspection of transmission line components.
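The augmentation step described above (prompt design followed by filtering on quality and perceptual similarity) can be illustrated with a minimal sketch. It is not the authors' implementation: the prompt vocabularies, the `Candidate` record, the `build_prompts` and `filter_candidates` helpers, and both thresholds are hypothetical, and the scores are assumed to have been computed beforehand by a separate quality model and perceptual metric.

```python
from dataclasses import dataclass
from itertools import product

# Hypothetical vocabularies; the abstract does not list the actual prompt terms.
DEFECT_TYPES = ["a missing pin", "a loose nut", "a rusted surface"]
LIGHTING = ["in bright sunlight", "under an overcast sky"]
VIEWS = ["close-up view", "oblique aerial view"]


def build_prompts(defects=DEFECT_TYPES, lighting=LIGHTING, views=VIEWS):
    """Return one text prompt per (defect, lighting, view) combination."""
    return [
        f"a transmission line bolt with {d}, {l}, {v}"
        for d, l, v in product(defects, lighting, views)
    ]


@dataclass
class Candidate:
    """A generated image with scores assumed to be precomputed elsewhere."""
    name: str
    quality: float              # assumed: higher is better, in [0, 1]
    perceptual_distance: float  # assumed: lower is closer to real images


def filter_candidates(candidates, min_quality=0.6, max_distance=0.4):
    """Keep candidates that pass both the quality and the similarity check.

    Both thresholds are illustrative assumptions, not values from the paper.
    """
    return [
        c for c in candidates
        if c.quality >= min_quality and c.perceptual_distance <= max_distance
    ]


if __name__ == "__main__":
    prompts = build_prompts()
    print(len(prompts), "prompts, e.g.:", prompts[0])

    generated = [
        Candidate("gen_001.png", quality=0.82, perceptual_distance=0.21),
        Candidate("gen_002.png", quality=0.35, perceptual_distance=0.18),
        Candidate("gen_003.png", quality=0.77, perceptual_distance=0.65),
    ]
    for c in filter_candidates(generated):
        print("kept:", c.name)
```

Requiring both checks to pass mirrors the abstract's "combination" of metrics in the simplest way; a weighted score would be an equally plausible reading.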