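To address the low efficiency and high personnel risk of detecting equipment and work-behavior anomalies in substations through manual inspection, an improved you only look once version 11 nano (YOLOv11n) model is proposed. First, a convolutional three-scale kernel-adaptive dual-path self-attention mechanism (C3k2-SA) module is designed and placed where the smaller feature maps connect to the feature fusion part, which optimizes the network structure and strengthens global feature extraction. Then, an attention-based feature enhancement (FEN) module is introduced at the last layer of the backbone network to dynamically adjust the feature weights of different regions, achieving adaptive feature enhancement and alleviating the vanishing-gradient problem in deep networks. Finally, the concatenate (Concat) module is optimized: a convolutional layer adjusts the number of channels, and pooling together with a sigmoid activation function refines the features, which improves the model's adaptability to different types of features, strengthens feature fusion, and suppresses irrelevant or redundant features to prevent overfitting. The results show that, compared with the original YOLOv11n model, the improved YOLOv11n model raises precision, recall, and mean average precision by 1.7, 6.6, and 3.6 percentage points, respectively. The improved YOLOv11n model can improve the accuracy of abnormal-state detection in substations and provides a reference for anomaly detection in smart substations.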
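The optimized Concat step described in the abstract above (a convolution to adjust channels, then pooling and a sigmoid) can be illustrated with a minimal pure-Python sketch. The abstract does not give the layer layout, so everything here is an assumption made for illustration: the function names, the use of a 1x1 convolution, global average pooling as the pooling step, and a per-channel gate are hypothetical and are not the authors' implementation.

```python
import math

def concat_channels(a, b):
    """Concatenate two feature maps of shape [C][H][W] along the channel axis."""
    return a + b

def conv1x1(x, weights):
    """Pointwise convolution: mix channels with a [C_out][C_in] weight matrix."""
    h, w = len(x[0]), len(x[0][0])
    return [
        [[sum(wk * x[k][i][j] for k, wk in enumerate(row)) for j in range(w)]
         for i in range(h)]
        for row in weights
    ]

def global_avg_pool(x):
    """Return one mean value per channel."""
    return [sum(sum(r) for r in ch) / (len(ch) * len(ch[0])) for ch in x]

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def gated_concat(a, b, weights):
    """Hypothetical gated fusion: concat -> 1x1 conv -> pool -> sigmoid gate."""
    fused = conv1x1(concat_channels(a, b), weights)
    gates = [sigmoid(m) for m in global_avg_pool(fused)]
    # Scale every channel by its gate so weakly activated channels are damped.
    out = [[[g * v for v in row] for row in ch] for g, ch in zip(gates, fused)]
    return out, gates

# Two single-channel 2x2 maps and an identity channel-mixing matrix (toy values).
a = [[[1.0, 1.0], [1.0, 1.0]]]
b = [[[-1.0, -1.0], [-1.0, -1.0]]]
out, gates = gated_concat(a, b, [[1.0, 0.0], [0.0, 1.0]])
```

In this toy run the channel with the positive mean receives a gate above 0.5 and the channel with the negative mean a gate below 0.5, which is the sense in which such a gate can damp less useful channels before fusion.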
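To address the difficulty of detecting overheating faults of substation power equipment in infrared images with complex backgrounds, an improved you only look once version 11 nano (YOLOv11n) algorithm is proposed. First, a lightweight cross-scale feature fusion module (CCFM) is used to improve the original neck network, integrating feature-channel information efficiently while reducing the parameter count. Second, a cross stage partial with three-convolution blocks of variable kernel size two-switchable atrous convolution (C3k2-SAConv) module replaces the C3k2 modules throughout the network, which improves the algorithm's feature extraction capability. Finally, a cross stage partial with two convolutions and vision transformer of bi-level routing attention (C2BF) module replaces the cross stage partial with two convolutions and pointwise spatial attention (C2PSA) module, which improves detection accuracy on infrared images in complex environments. The results show that, compared with the original YOLOv11n algorithm, the improved YOLOv11n algorithm has 22.1% fewer parameters; its precision, recall, and mean average precision reach 91.1%, 85.5%, and 90.9%, gains of 3.0, 2.6, and 2.8 percentage points, respectively; and its detection speed reaches 128.2 frames/s. The improved YOLOv11n algorithm can effectively detect overheating faults of power equipment in infrared images and meets the requirements for a lightweight algorithm and real-time detection.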
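The figures reported in the abstract above can be unpacked with a little arithmetic. The sketch below derives the implied baseline scores from the reported values and their percentage-point gains, and converts the reported frame rate into a per-frame time. The helper names are hypothetical, and the per-frame time assumes frames are processed one at a time, which the abstract does not state.

```python
# Reported score (%) and gain in percentage points, taken from the abstract.
reported = {
    "precision": (91.1, 3.0),
    "recall": (85.5, 2.6),
    "mAP": (90.9, 2.8),
}

def baseline(value, gain_points):
    """Implied baseline score: reported value minus the absolute gain."""
    return round(value - gain_points, 1)

def relative_gain(value, gain_points):
    """Gain expressed as a percentage of the implied baseline."""
    return round(100.0 * gain_points / (value - gain_points), 2)

def ms_per_frame(fps):
    """Average time per frame in milliseconds, assuming sequential processing."""
    return round(1000.0 / fps, 2)

baselines = {name: baseline(v, g) for name, (v, g) in reported.items()}
latency_ms = ms_per_frame(128.2)
```

A gain of 3.0 percentage points on precision is therefore a relative gain of roughly 3.4% over the implied baseline, which is why percentage points and percent should not be used interchangeably when comparing such results.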
To address the challenges of real-world tomato ripeness detection, such as variable lighting conditions, complex backgrounds, and the trade-off between accuracy and a lightweight model, this study proposes a lightweight YOLOv11-MHS model. The improvements of the proposed model are reflected in three aspects: (1) the C3k2_MSCB module is designed, which integrates a multiscale convolutional block (MSCB) for multiscale feature extraction and fusion, thereby enhancing detection accuracy; (2) the neck of the model is redesigned as a high-level feature screening-fusion pyramid structure, which fuses key features to improve robustness in cluttered environments while reducing model size; and (3) the C2PSA module is enhanced by introducing the spatial and channel synergistic attention mechanism to improve the model's ability to handle complex scenes. Experimental results on the same data set show that, compared with the baseline model YOLOv11n, YOLOv11-MHS achieves improvements of 1.7% in mAP0.5 and 2.9% in mAP0.5-0.95, while reducing parameters and model size by 35.2% and 32.7%, respectively. These results demonstrate that YOLOv11-MHS combines high accuracy with a lightweight design in tomato ripeness detection, providing technical support for agricultural applications.
To address challenges in feature extraction and real-time processing during traffic police pose estimation, this paper proposes an improved YOLOv11-pose network for traffic police gesture recognition. Replacing the C3K2 module in the backbone network with an enhanced C3K2-Star-CAA module enables efficient extraction of traffic police posture features. A multi-branch star topology enables cross-level feature fusion and multi-scale information propagation, enhancing the model's perception of fine posture details and its robustness to complex background interference. Embedding the CAA attention mechanism at the key feature layer models critical locations and their spatial contextual relationships through contextual anchors, enhancing key-point feature representation while suppressing complex background interference. Experimental results demonstrate that the improved model achieves 78.6% mAP on the self-built dataset with a detection speed of 186.9 fps, outperforming comparison models in both accuracy and real-time performance. The findings indicate that this approach provides a robust, real-time practical solution for traffic police gesture recognition.
Funding: Financially supported by the National Natural Science Foundation of China (12364011); the Guangxi Science and Technology Plan, China (AD21220147, AD25069027); the Liuzhou Science and Technology Program, China (2023PRJ0103, 2024AA0204A001); and the Graduate Education Innovation Project, China (YCSW2024522).