Funding: Supported by the Fundamental Research Grant Scheme (FRGS) of Universiti Sains Malaysia, Research Number: FRGS/1/2024/ICT02/USM/02/1.
Abstract: Classroom behavior recognition is an active research topic that plays a vital role in assessing and improving the quality of classroom teaching. However, existing classroom behavior recognition methods struggle to achieve high recognition accuracy on datasets that contain blurred images and inconsistent objects. To address this challenge, we propose an effective, lightweight object detector called the RFNet model (YOLO-FR). First, for efficient multi-scale feature extraction, we design a feature pyramid shared convolutional (FPSC) module in the backbone, which improves feature extraction by applying convolutional layers with varying dilation rates to features from the input image. Second, to address multi-scale variability in the scene, we design the Rep Ghost fusion Cross Stage Partial and Efficient Layer Aggregation Network (RGCSPELAN), which further improves network performance while reducing the amount of computation and the number of parameters. We evaluate the model experimentally on the SCB dataset3 and the STBD-08 dataset. The results indicate that, compared to the baseline model, RFNet increases mean average precision (mAP@50) from 69.6% to 71.0% on the SCB dataset3 and from 91.8% to 93.1% on the STBD-08 dataset. On the SCB dataset3, RFNet reaches a precision of 68.6%, surpassing the baseline method (YOLOv11) by 3.3%, and achieves the smallest model size (4.9 M). Finally, comparison with other algorithms shows that RFNet accurately detects student behavior in complex classroom environments, confirming that it is well suited to real-time, efficient recognition of classroom behaviors.
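The idea of reusing one convolution at several dilation rates can be illustrated with a minimal 1-D sketch in plain Python. It is not the authors' FPSC module: the function names, the toy signal, and the reading of "shared" as one kernel reused across all dilation rates are assumptions made here for illustration.

```python
def dilated_conv1d(signal, kernel, dilation):
    """'Valid' 1-D cross-correlation with taps spaced `dilation` samples apart."""
    span = (len(kernel) - 1) * dilation  # distance between first and last tap
    return [
        sum(w * signal[i + j * dilation] for j, w in enumerate(kernel))
        for i in range(len(signal) - span)
    ]


def receptive_field(kernel_size, dilation):
    """Number of input samples covered by one dilated kernel."""
    return (kernel_size - 1) * dilation + 1


# Hypothetical toy input: a linear ramp 0..9.
signal = list(range(10))
# One kernel (assumed shared) applied at three dilation rates.
shared_kernel = [1, 0, -1]
outputs = {d: dilated_conv1d(signal, shared_kernel, d) for d in (1, 2, 3)}

for d, out in outputs.items():
    print(d, receptive_field(len(shared_kernel), d), out)
```

The same three weights cover 3, 5, and 7 input samples at dilation rates 1, 2, and 3, so the receptive field grows without adding parameters.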
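Abstract: As an important part of the traction power supply system of high-speed railways, the catenary system transmits electrical energy to electric multiple units (EMUs). Operational experience shows that, under the continuous impact of pantograph-catenary interaction and the influence of the external environment, catenary support components may develop defects such as loosening, detachment, breakage, and cracking, which reduce the structural reliability of the catenary and seriously affect the stable operation of the catenary system. Timely and accurate localization of catenary support components (CSCs) is therefore of great significance for ensuring the safe operation of high-speed railways and for improving catenary inspection and maintenance strategies. However, CSC detection typically faces many component types, large scale differences, and some very small components. To address these problems, this paper proposes a catenary component detection algorithm based on a multi-scale fusion pyramid focus network, which combines a balance module with a feature pyramid module to improve small-object detection performance. First, a separable residual pyramid aggregation module (SRPAM) is designed to strengthen the model's multi-scale feature extraction and enlarge the receptive field, alleviating the multi-scale problem in CSC detection. Second, a path aggregation network based on a balanced feature pyramid (PA-BFPN) is designed to improve cross-layer feature fusion efficiency and small-object detection performance. Finally, comparative, visualization, and ablation experiments demonstrate the effectiveness and superiority of the proposed method. The proposed MFPFCOS achieves a detection accuracy (mAP) of 48.6% on the CSCs dataset while reaching 30 FLOPs (floating point operations per second), indicating that the method maintains a good balance between detection accuracy and detection speed.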
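The abstract does not describe how PA-BFPN is built. As a rough illustration of what "balancing" a feature pyramid can mean, the sketch below uses a common recipe: resize every level to one reference resolution, average them, then add the averaged feature back to each level. The 1-D features, the function names `resize` and `balance`, and the assumption that level sizes are integer multiples of each other are all hypothetical.

```python
def resize(feat, size):
    """Resize a 1-D feature: nearest-neighbour upsampling or average pooling.

    Assumes the longer length is an integer multiple of the shorter one.
    """
    n = len(feat)
    if size >= n:
        factor = size // n
        return [v for v in feat for _ in range(factor)]
    factor = n // size
    return [sum(feat[i * factor:(i + 1) * factor]) / factor for i in range(size)]


def balance(levels, ref_index):
    """Average all levels at a reference resolution, then add the result back."""
    ref_size = len(levels[ref_index])
    resized = [resize(f, ref_size) for f in levels]
    # Element-wise mean over levels: the "balanced" feature.
    balanced = [sum(vals) / len(levels) for vals in zip(*resized)]
    # Residual connection: each level receives the balanced feature at its own size.
    return [
        [a + b for a, b in zip(f, resize(balanced, len(f)))]
        for f in levels
    ]


# Hypothetical pyramid with three levels of decreasing resolution.
levels = [[1.0] * 8, [2.0] * 4, [3.0] * 2]
strengthened = balance(levels, ref_index=1)
print(strengthened)
```

Every level then carries the same share of information from all other levels, instead of being dominated by its nearest neighbours in the pyramid.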
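Abstract: To improve the accuracy and efficiency of object detection in motion-blurred images for missile-borne imaging guidance, a Lighter and More Effective Motion-blurred Image Object Detection (LEMBD) network is proposed. Based on an in-depth analysis of the causes of motion blur, a dedicated motion-blurred image dataset is constructed from the imaging mechanism. Without increasing the number of network parameters, a weight-sharing Siamese network design is adopted and prior knowledge is introduced, so that features learned from sharp images are used for feature extraction on blurred images, enabling accurate detection on both sharp and blurred images. In addition, a partial depthwise separable convolution is designed to replace standard convolution, significantly reducing the network's parameter count and computation while improving learning performance. To further improve the quality of feature fusion, a cross-layer path aggregation feature pyramid network is proposed, which makes effective use of the detail information in low-level features and the semantic information in high-level features. Experimental results show that the proposed LEMBD network outperforms traditional object detection methods and mainstream motion-blur detection algorithms on motion-blurred image object detection, and can provide more accurate relative target position information for precision guidance tasks.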
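The parameter savings from depthwise separable convolution can be checked with simple arithmetic. The sketch below counts weights for the standard depthwise separable form (a per-channel k x k convolution followed by a 1 x 1 pointwise convolution), not the "partial" variant the authors design, whose details the abstract does not give. The layer sizes are hypothetical and bias terms are ignored.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a depthwise k x k conv plus a 1 x 1 pointwise conv (bias ignored)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


# Hypothetical layer: 64 input channels, 128 output channels, 3 x 3 kernel.
c_in, c_out, k = 64, 128, 3
standard = standard_conv_params(c_in, c_out, k)
separable = depthwise_separable_params(c_in, c_out, k)
ratio = separable / standard  # equals 1 / c_out + 1 / k**2

print(standard, separable, round(ratio, 4))  # 73728 8768 0.1189
```

For this hypothetical layer the separable form needs about 12% of the weights of the standard convolution.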
Funding: Supported in part by the National Natural Science Foundation of China (62088102, 91748208, and 61973246), the Shaanxi Project (2018ZDCXLGY0607), and the program of the Ministry of Education.
Abstract: Small object detection is a fundamental and challenging topic in the computer vision community. To detect small objects in images, several methods rely on feature pyramid networks (FPN), which can alleviate the conflict between resolution and semantic information. However, FPN-based methods also have limitations. First, existing methods focus only on regions that are spatially close, which hinders effective long-range interactions. Second, element-wise addition ignores the different receptive fields of the two feature layers, causing higher-level features to introduce noise into lower-level features. To address these problems, we propose a cross-layer attention (CLA) block as a generic block for capturing long-range dependencies and reducing noise from high-level features. Specifically, the CLA block performs feature fusion by factoring in both the channel and spatial dimensions, which provides a reliable way of fusing features from different layers. Because CLA is a lightweight and general block, it can be plugged into most feature fusion frameworks. On the COCO 2017 dataset, we validated the CLA block by plugging it into several state-of-the-art FPN-based detectors. Experiments show that our approach achieves consistent improvements in both object detection and instance segmentation, which demonstrates its effectiveness.
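The contrast between plain element-wise addition and an attention-weighted fusion can be sketched as follows. This is not the CLA block: it gates only the channel dimension, and a fixed, unlearned gate (sigmoid of the channel mean) stands in for learned attention weights. The function names and toy features are hypothetical, and both inputs are assumed to be already resized to the same shape.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def add_fusion(low, high):
    """FPN-style fusion: element-wise addition of two [channel][position] features."""
    return [[l + h for l, h in zip(lc, hc)] for lc, hc in zip(low, high)]


def channel_gated_fusion(low, high):
    """Scale each high-level channel by a gate in (0, 1) before adding it."""
    fused = []
    for lc, hc in zip(low, high):
        # Assumed gate: sigmoid of the channel's global average (no learned weights).
        gate = sigmoid(sum(hc) / len(hc))
        fused.append([l + gate * h for l, h in zip(lc, hc)])
    return fused


# Hypothetical features with 2 channels and 2 spatial positions each.
low = [[1.0, 2.0], [3.0, 4.0]]
high = [[0.0, 0.0], [10.0, 10.0]]

print(add_fusion(low, high))
print(channel_gated_fusion(low, high))
```

With addition, every high-level channel contributes at full strength. With the gate, each channel's contribution is scaled, which is the kind of control an attention-based fusion block has over noise from higher levels.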