Abstract: To address the poor performance of underwater object detection caused by underwater light attenuation, scattering, and related effects, an underwater object detection framework based on YOLOv8, named ERMS-YOLOv8, is proposed to improve detection performance. The backbone adopts the efficient vision transformer (EfficientViT) network, strengthening the model's ability to extract features of underwater organisms and reducing the loss of feature information. The neck adopts an efficient reparameterized generalized feature pyramid network (RepGFPN), improving the extraction and fusion of high-level semantic and low-level spatial features of underwater organisms so that the model obtains richer feature information. A mixed local channel attention (MLCA) mechanism is introduced so that the model fuses channel, spatial, local-channel, and global-channel information simultaneously, enhancing its representational capacity. A scalable intersection over union (SIoU) loss is introduced to improve the extraction of object boundary information and thereby further raise detection accuracy. Experimental results show that the improved algorithm reaches mAP values of 83.9% and 84.4% on the URPC2021 and DUO datasets, respectively, both improvements over the baseline YOLOv8 algorithm, demonstrating superior performance in underwater object detection.
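To make the attention mechanism concrete, the following is a simplified PyTorch sketch of a mixed local/global channel attention block in the spirit of MLCA. The pooling sizes, the ECA-style shared 1D convolution, and fusion by addition are illustrative assumptions, not the authors' exact module.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedLocalChannelAttention(nn.Module):
    """Simplified sketch of a mixed local/global channel attention block.
    A global branch (global average pooling) captures per-channel statistics
    for the whole image; a local branch (s x s adaptive pooling) keeps coarse
    spatial context. Both share an ECA-style 1D convolution over the channel
    dimension, and the resulting weights are fused by addition."""

    def __init__(self, kernel_size: int = 3, local_size: int = 5):
        super().__init__()
        self.local_pool = nn.AdaptiveAvgPool2d(local_size)
        self.conv = nn.Conv1d(1, 1, kernel_size, padding=kernel_size // 2, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        # global channel weights: (b, c) treated as a length-c 1D signal
        g = F.adaptive_avg_pool2d(x, 1).view(b, 1, c)
        g = self.conv(g).view(b, c, 1, 1)
        # local channel weights: one channel descriptor per pooled patch
        p = self.local_pool(x)                              # (b, c, s, s)
        s = p.shape[-1]
        l = p.permute(0, 2, 3, 1).reshape(b * s * s, 1, c)  # channel-last rows
        l = self.conv(l).reshape(b, s, s, c).permute(0, 3, 1, 2)
        # fuse local + global weights, upsample to input size, re-weight input
        weights = torch.sigmoid(l + g)
        weights = F.interpolate(weights, size=(h, w), mode="nearest")
        return x * weights
```

A quick shape check: `MixedLocalChannelAttention()(torch.randn(2, 64, 40, 40))` returns a tensor of the same shape, so the block can be dropped between existing layers without changing the surrounding architecture.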
Abstract: To address the low speed and accuracy of orange-segment appearance inspection in canned-citrus production, as well as the large parameter counts of mainstream detection models, a lightweight orange-segment appearance detection model, YOLOv7-VSS, is proposed. First, the model adopts as its backbone an EfficientViT network improved with the Hard-Swish activation function; by feeding features from different levels, it reduces the mapping similarity across detection heads to mitigate redundant computation, and it strengthens feature extraction through a cascaded group attention mechanism. Second, a slim-neck module is introduced, which combines the characteristics of standard convolution and depthwise separable convolution to shrink the model while maintaining high accuracy. Then, to further reduce the model size and speed up inference, the SPPCSPC structure is replaced with SPPF. Finally, to match the positional characteristics of orange segments in the dataset, the MPDIoU loss function is used to improve the regression accuracy of the predicted boxes. Experimental results show that the proposed model is 63.81% smaller than YOLOv7 while reaching a detection accuracy of 96.57%; in deployment tests on a Jetson Orin Nano, it achieves a markedly better balance between model size and detection accuracy than comparable methods, meeting the requirements of canned-citrus production lines.
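The MPDIoU loss mentioned above penalizes the squared distances between corresponding box corners, normalized by the image size. Below is a minimal PyTorch sketch following the published MPDIoU formulation; the (x1, y1, x2, y2) box layout and the function name `mpdiou_loss` are assumptions for illustration, not code from the paper.

```python
import torch

def mpdiou_loss(pred: torch.Tensor, target: torch.Tensor,
                img_w: int, img_h: int, eps: float = 1e-7) -> torch.Tensor:
    """MPDIoU sketch for (N, 4) boxes in (x1, y1, x2, y2) format:
    IoU minus the squared distances between corresponding top-left and
    bottom-right corners, normalized by the squared image diagonal."""
    # intersection rectangle
    ix1 = torch.max(pred[:, 0], target[:, 0])
    iy1 = torch.max(pred[:, 1], target[:, 1])
    ix2 = torch.min(pred[:, 2], target[:, 2])
    iy2 = torch.min(pred[:, 3], target[:, 3])
    inter = (ix2 - ix1).clamp(min=0) * (iy2 - iy1).clamp(min=0)
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    iou = inter / (area_p + area_t - inter + eps)
    # squared corner distances, normalized by the squared image diagonal
    d1 = (pred[:, 0] - target[:, 0]) ** 2 + (pred[:, 1] - target[:, 1]) ** 2
    d2 = (pred[:, 2] - target[:, 2]) ** 2 + (pred[:, 3] - target[:, 3]) ** 2
    diag2 = img_w ** 2 + img_h ** 2
    return 1.0 - (iou - d1 / diag2 - d2 / diag2)
```

Because the corner-distance terms directly tie the penalty to where each corner sits in the image, the loss remains informative even when predicted and ground-truth boxes share the same aspect ratio, which plain IoU variants cannot distinguish.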
Abstract: [Objective] The accurate identification of maize tassels is critical for the production of hybrid seed. Existing object detection models in complex farmland scenarios face limitations such as restricted data diversity, insufficient feature extraction, high computational load, and low detection efficiency. To address these challenges, a real-time field maize tassel detection model, LightTassel-YOLO (You Only Look Once), based on an improved YOLOv11n is proposed. The model is designed to identify maize tassels quickly and accurately, enabling efficient operation of detasseling unmanned aerial vehicles (UAVs) and reducing the need for manual intervention. [Methods] Data were continuously collected during the maize tasseling stage from 2023 to 2024 using UAVs, establishing a large-scale, high-quality maize tassel dataset that covered different tasseling stages, multiple varieties, varying flight altitudes, and diverse meteorological conditions. First, EfficientViT (efficient vision transformer) was applied as the backbone network to enhance the ability to perceive information across multi-scale features. Second, the C2PSA-CPCA (convolutional block with parallel spatial attention combined with channel prior convolutional attention) module was designed to dynamically assign attention weights to the channel and spatial dimensions of feature maps, effectively enhancing the network's capability to extract target features while reducing computational complexity. Finally, the C3k2-SCConv module was constructed to facilitate representative feature learning and achieve low-cost spatial feature reconstruction, thereby improving the model's detection accuracy. [Results and Discussions] The results demonstrated that LightTassel-YOLO provided a reliable method for maize tassel detection. The final model achieved a precision of 92.6%, a recall of 89.1%, and an AP@0.5 of 94.7%, representing improvements of 2.5, 3.8, and 4.0 percentage points over the baseline model YOLOv11n, respectively. The model had only 3.23 M parameters and a computational cost of 6.7 GFLOPs. In addition, LightTassel-YOLO was compared with mainstream object detection algorithms such as Faster R-CNN, SSD, and multiple versions of the YOLO series; it outperformed these algorithms in overall performance and exhibited excellent adaptability in typical field scenarios. [Conclusions] The proposed method provides an effective theoretical framework for precise maize tassel monitoring and holds significant potential for advancing intelligent field management practices.
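As an illustration of the "channel prior" idea behind attention modules such as the C2PSA-CPCA block above, the sketch below applies channel attention first and then refines the result with multi-scale depthwise spatial attention. The reduction ratio, the strip-kernel sizes, and the class name are illustrative assumptions under a CPCA-style design, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class ChannelPriorAttention(nn.Module):
    """Hedged sketch of a channel-prior convolutional attention block:
    channel attention is applied first (the "channel prior"), then a
    multi-scale depthwise spatial attention refines the result."""

    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        hidden = max(channels // reduction, 1)
        # squeeze-and-excitation style channel gate
        self.channel_gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, hidden, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(hidden, channels, 1),
            nn.Sigmoid(),
        )
        # depthwise strip convolutions approximate large-kernel spatial mixing
        self.dw5 = nn.Conv2d(channels, channels, 5, padding=2, groups=channels)
        self.dw1x7 = nn.Conv2d(channels, channels, (1, 7), padding=(0, 3), groups=channels)
        self.dw7x1 = nn.Conv2d(channels, channels, (7, 1), padding=(3, 0), groups=channels)
        self.proj = nn.Conv2d(channels, channels, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = x * self.channel_gate(x)            # channel prior applied first
        s = self.dw5(x)
        s = s + self.dw1x7(s) + self.dw7x1(s)   # multi-scale spatial context
        return x * torch.sigmoid(self.proj(s))  # spatial attention map
```

Applying the channel gate before the spatial branch lets the spatial attention operate on already re-weighted channels, which is the ordering that gives this family of modules its name; the depthwise strip convolutions keep the added parameter count and FLOPs small, consistent with the lightweight goals stated in the abstract.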