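Abstract: In real-world driving, a driver's face is often partially occluded, for example by glasses or a mask, and conventional Dlib-based fatigue detection that relies solely on extracted facial features no longer applies. This paper combines Dlib with YOLO11 and uses multi-threshold determination to improve the conventional Dlib fatigue detection algorithm, yielding a fatigue detection algorithm for face-occlusion scenarios such as drivers wearing glasses or masks. The improved algorithm is run on a Raspberry Pi 5 hardware platform, and public datasets are used to verify its accuracy for driver fatigue detection. In addition, the improved algorithm can detect distracted-driving behaviors such as smoking and phone use and issue voice alerts, providing more comprehensive detection and warning of both fatigue and distraction.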
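The abstract does not spell out which thresholds are used. As a rough illustration only, the sketch below assumes two of the more common choices for landmark-based fatigue detection: an eye aspect ratio (EAR) computed from six eye landmarks, and a minimum count of consecutive low-EAR frames. The function names, the landmark ordering, and the values `0.2` and `3` are hypothetical and are not taken from the paper.

```python
import math


def eye_aspect_ratio(eye):
    """Compute EAR from six (x, y) eye landmarks ordered p1..p6.

    Assumed ordering: p1 and p4 are the eye corners, p2/p3 lie on the
    upper lid, and p6/p5 lie on the lower lid directly below them.
    """
    p1, p2, p3, p4, p5, p6 = eye
    vertical = math.dist(p2, p6) + math.dist(p3, p5)
    horizontal = math.dist(p1, p4)
    return vertical / (2.0 * horizontal)


def is_fatigued(ear_per_frame, ear_threshold=0.2, min_closed_frames=3):
    """Flag fatigue when EAR stays below ear_threshold for at least
    min_closed_frames consecutive frames (both values are placeholders)."""
    closed = 0
    for ear in ear_per_frame:
        # Reset the counter whenever the eye is judged open again.
        closed = closed + 1 if ear < ear_threshold else 0
        if closed >= min_closed_frames:
            return True
    return False


# Hypothetical landmarks for a wide-open eye.
open_eye = [(0, 0), (1, 1), (2, 1), (3, 0), (2, -1), (1, -1)]
ear_open = eye_aspect_ratio(open_eye)
```

In an occlusion scenario such as the one the abstract describes, a detector-based cue would presumably take over when the landmarks are unreliable; that hand-off is not modeled here.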
Funding: Funded by the Jiangxi SASAC Science and Technology Innovation Special Project and the Key Technology Research and Application Promotion of Highway Overload Digital Solution.
Abstract: In response to the challenges in highway pavement distress detection, such as multiple defect categories, difficulties in feature extraction for different damage types, and slow identification speeds, this paper proposes an enhanced pavement crack detection model named Star-YOLO11. This improved algorithm modifies the YOLO11 architecture by substituting the original C3k2 backbone network with a Star-s50 feature extraction network. The enhanced structure adjusts the number of stacked layers in the StarBlock module to optimize detection accuracy and improve model efficiency. To enhance the accuracy of pavement crack detection and improve model efficiency, three key modifications to the YOLO11 architecture are proposed. Firstly, the original C3k2 backbone is replaced with a StarBlock-based structure, forming the Star-s50 feature extraction backbone network. This lightweight redesign reduces computational complexity while maintaining detection precision. Secondly, to address the inefficiency of the original Partial Self-attention (PSA) mechanism in capturing localized crack features, the convolutional prior-aware Channel Prior Convolutional Attention (CPCA) mechanism is integrated into the channel dimension, creating a hybrid CPC-C2PSA attention structure. Thirdly, the original neck structure is upgraded to a Star Multi-Branch Auxiliary Feature Pyramid Network (SMAFPN) based on the Multi-Branch Auxiliary Feature Pyramid Network architecture, which adaptively fuses high-level semantic and low-level spatial information through Star-s50 connections and C3k2 extraction blocks. Additionally, a composite dataset augmentation strategy combining traditional and advanced augmentation techniques is developed. This strategy is validated on a specialized pavement dataset containing five distinct crack categories for comprehensive training and evaluation. Experimental results indicate that the proposed Star-YOLO11 achieves an accuracy of 89.9% (3.5% higher than the baseline), a mean average precision (mAP) of 90.3% (+2.6%), and an F1-score of 85.8% (+0.5%), while reducing the model size by 18.8% and reaching a frame rate of 225.73 frames per second (FPS) for real-time detection. It shows potential for lightweight deployment in pavement crack detection tasks.
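Abstract: To address the low pose-estimation accuracy and poor map quality of visual simultaneous localization and mapping (SLAM) algorithms in dynamic scenes, a dynamic visual SLAM algorithm combined with deep learning is proposed. The algorithm adds the YOLO11n object detection network, which is lightweight and has a high recognition rate, to the ORB-SLAM3 front end to detect potentially dynamic regions, and uses Lucas-Kanade (LK) optical flow to identify the dynamic feature points within them. Dynamic feature points are thus removed while static ones are retained, improving feature-point utilization and pose-estimation accuracy. In addition, a new semantic map construction thread builds a static semantic map by removing the point clouds of dynamic objects identified by YOLO11n and fusing the semantic information extracted by the front end. Experiments on the TUM dataset show that, compared with ORB-SLAM3, the algorithm improves localization accuracy on highly dynamic sequences by 95.02%, which verifies its effectiveness in dynamic environments and shows that it can significantly improve the localization accuracy and map quality of visual SLAM systems.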
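The filtering idea described above (drop only the moving points inside detected regions, keep the static ones) can be sketched as follows. This is a minimal sketch under stated assumptions, not the paper's implementation: detection boxes and per-point optical-flow displacements are taken as already computed, the background motion is estimated as the median flow of points outside all boxes, and the `max_residual` threshold of 2.0 pixels is a hypothetical value.

```python
from statistics import median


def in_any_box(pt, boxes):
    """True if point (x, y) lies inside any (x1, y1, x2, y2) box."""
    x, y = pt
    return any(x1 <= x <= x2 and y1 <= y <= y2 for x1, y1, x2, y2 in boxes)


def filter_dynamic_points(points, flows, boxes, max_residual=2.0):
    """Return the feature points kept as static.

    points: list of (x, y) feature locations.
    flows:  list of (dx, dy) displacements, one per point.
    boxes:  detector boxes marking potentially dynamic regions.
    """
    # Assumption: points outside every box are static, so their median
    # flow approximates the motion induced by the camera alone.
    outside = [f for p, f in zip(points, flows) if not in_any_box(p, boxes)]
    if outside:
        bg = (median(f[0] for f in outside), median(f[1] for f in outside))
    else:
        bg = (0.0, 0.0)

    kept = []
    for p, (dx, dy) in zip(points, flows):
        residual = ((dx - bg[0]) ** 2 + (dy - bg[1]) ** 2) ** 0.5
        if in_any_box(p, boxes) and residual > max_residual:
            continue  # moving point inside a detected region: drop it
        kept.append(p)
    return kept


# Hypothetical data: one detected region and four tracked points.
points = [(10, 10), (200, 200), (50, 50), (60, 60)]
flows = [(1, 0), (1, 0), (1, 0), (8, 0)]
boxes = [(40, 40, 100, 100)]
static_points = filter_dynamic_points(points, flows, boxes)
```

Here the point at (50, 50) sits inside the box but moves with the background, so it is retained, which is the sense in which feature-point utilization improves over discarding everything inside a detection.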
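Abstract: [Objective] Mineral resources are an important material basis for human survival and economic development, and mine monitoring and the construction of mine monitoring models are of great significance for the efficient development of mineral resources and the environmental protection of mining areas. Given that open-pit mining areas have complex backgrounds, targets at diverse scales, and clusters of small targets, this study aims to build a lightweight model that balances monitoring accuracy and efficiency, so as to improve the accuracy and efficiency of monitoring target surface features in mining areas. [Methods] Existing remote sensing datasets suffer from homogeneous samples and limited geographic coverage. This paper therefore uses 0.9 m Tianditu imagery and 1.8 m Google imagery to build OMTSFD (Open-pit Mine Typical Surface Features Dataset), a dataset covering six major open-pit coal mine bases with different climatic backgrounds, large spatial extent, and multiple surface feature types, and proposes an improved YOLO11-DAE algorithm for model training and validation. First, the C3K2-DBB module is introduced into the backbone network and the feature pyramid to strengthen multi-scale feature capture. Second, the ADown module replaces the network's downsampling convolutions, which strengthens the representation of different features and reduces detail loss in low-contrast scenes. Finally, the efficient detection head E_Detect reduces model complexity and parameter count, making the model lightweight. [Results] Experiments show that YOLO11-DAE reaches 528.100 frames per second (FPS), indicating fast inference. Precision (P), recall (R), F1-score (F1), and mean average precision (mAP) reach 0.932, 0.894, 0.913, and 0.950, respectively, significantly better than YOLOv5n, YOLOv8n, and YOLOv10n, and 7.600%, 10.000%, 8.800%, and 8.000% higher, respectively, than YOLOv11n. [Conclusions] The YOLO11-DAE algorithm can meet the requirements of real-time monitoring in mining areas and is suited to target recognition in complex scenes with multiple scales and backgrounds. It achieves high accuracy and a low miss rate, balancing model applicability and real-time performance.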
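The reported F1 can be cross-checked against the reported precision and recall. The sketch below assumes that F1 is the harmonic mean of those two aggregate values, which is the usual definition; the abstract does not state how it was computed.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Values reported for YOLO11-DAE in the abstract.
f1 = f1_score(0.932, 0.894)
print(round(f1, 3))  # prints 0.913
```

Under that assumption the result agrees with the reported F1 of 0.913.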
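Abstract: To address the problem that various defects on the surface of solar cells severely reduce energy conversion efficiency, an efficient multi-scale feature learning model based on an improved YOLO11 (YOLO11-EMFL) is proposed, designed specifically for fast and accurate detection of surface defects in solar cells. The model introduces wavelet convolution into the backbone and neck networks to enlarge the receptive field, and adds a deformable attention mechanism to the backbone to improve adaptability to different image sizes and contents. In addition, a feature fusion layer is added to the neck network to strengthen multi-scale feature fusion, and a triplet attention mechanism is introduced at the small-object detection layer to improve detection accuracy for small objects. These improvements enable YOLO11-EMFL to handle different defect types, defect sizes, and complex backgrounds. Validation on a large-scale photovoltaic cell image dataset shows that YOLO11-EMFL achieves a precision of 91.8%, a recall of 93.8%, an F1-score of 92.0%, and mAP@50 and mAP@50-95 of 98.2% and 76.9%, respectively, with very high detection accuracy across 12 defect types. Compared with other current methods, the model improves on all aspects of performance.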