Funding: the National Natural Science Foundation of China (No. 61702323).
Abstract: Moving object segmentation (MOS) is one of the essential functions of the vision system of all robots, including medical robots. Deep learning-based MOS methods, especially deep end-to-end MOS methods, are actively investigated in this field. Foreground segmentation networks (FgSegNets) are representative deep end-to-end MOS methods proposed recently. This study explores a new mechanism to improve the spatial feature learning capability of FgSegNets while adding relatively few parameters. Specifically, we propose an enhanced attention (EA) module, a parallel connection of an attention module and a lightweight enhancement module, with sequential attention and residual attention as special cases. We also propose integrating EA with FgSegNet_v2 by taking the lightweight convolutional block attention module as the attention module and plugging the EA module in after the two max-pooling layers of the encoder. The derived new model is named FgSegNet_v2 EA. The ablation study verifies the effectiveness of the proposed EA module and integration strategy. The results on the CDnet2014 dataset, which depicts human activities and vehicles captured in different scenes, show that FgSegNet_v2 EA outperforms FgSegNet_v2 by 0.08% and 14.5% under scene-dependent and scene-independent evaluation, respectively, which indicates the positive effect of EA on the spatial feature learning capability of FgSegNet_v2.
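The abstract describes the EA module only as a parallel connection of an attention branch and an enhancement branch, so the following is a minimal sketch under stated assumptions rather than the authors' implementation. It assumes the two branches are merged by elementwise addition, which is one reading under which the two named special cases fall out naturally: a zero enhancement branch gives plain (sequential) attention, and an identity enhancement branch gives residual attention. The names `enhanced_attention`, `toy_attention`, `zero_branch`, and `identity_branch` are hypothetical, and the toy gate stands in for the convolutional block attention module.

```python
import math
from typing import Callable, List

Vector = List[float]
Module = Callable[[Vector], Vector]


def enhanced_attention(attention: Module, enhancement: Module) -> Module:
    """Hypothetical EA block: feed the same input to both branches and add
    their outputs elementwise. The additive merge is an assumption."""
    def forward(x: Vector) -> Vector:
        attended = attention(x)
        enhanced = enhancement(x)
        return [a + e for a, e in zip(attended, enhanced)]
    return forward


def toy_attention(x: Vector) -> Vector:
    """Stand-in attention: scale all features by a sigmoid gate on their mean."""
    mean = sum(x) / len(x)
    gate = 1.0 / (1.0 + math.exp(-mean))
    return [gate * v for v in x]


def zero_branch(x: Vector) -> Vector:
    """Enhancement branch that contributes nothing."""
    return [0.0] * len(x)


def identity_branch(x: Vector) -> Vector:
    """Enhancement branch that passes the input through unchanged."""
    return list(x)


# Special cases under the additive-merge assumption.
sequential_attention = enhanced_attention(toy_attention, zero_branch)
residual_attention = enhanced_attention(toy_attention, identity_branch)
```

In this sketch a learned lightweight enhancement branch would sit between the two extremes, which is consistent with the abstract's claim that sequential and residual attention are special cases.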
Funding: the National Natural Science Foundation of China (No. 61975015); the Research and Innovation Project for Graduate Students at Zhongyuan University of Technology (No. YKY2024ZK14).
Abstract: Human posture estimation is a prominent research topic in the fields of human-computer interaction, motion recognition, and other intelligent applications. However, the high key point localization accuracy that intelligent applications require is at odds with the low detection accuracy of human posture detection models in practical scenarios. To address this issue, a human pose estimation network called AT-HRNet has been proposed, which combines convolutional self-attention and cross-dimensional feature transformation. AT-HRNet adaptively captures significant feature information from various regions and aggregates it through convolutional operations within the local receptive field. The residual structures TripNeck and TripBlock of the high-resolution network are designed to further refine the key point locations, where the attention weight is adjusted by a cross-dimensional interaction to obtain more features. To validate the effectiveness of this network, AT-HRNet was evaluated on the COCO2017 dataset. The results show that AT-HRNet outperforms HRNet by 3.2% in mAP, 4.0% in AP75, and 3.9% in AP^M. This suggests that AT-HRNet can offer more beneficial solutions for human posture estimation.
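Abstract: As the resolution of satellite and aerial remote sensing images continues to improve, more and more useful data and information can be obtained from remote sensing imagery. Compared with ordinary images, remote sensing images are characterized by class imbalance, complex backgrounds, and difficulty in detecting small objects. To address these problems, MPSA-YOLOv5, a remote sensing object detection algorithm based on an improved YOLOv5s, is proposed. A Multi-dimensional information interaction Polarized Self-Attention (MPSA) module is designed, which fully accounts for the importance of capturing channel interaction for detail information, and is embedded into the backbone network. The feature enhancement structure is improved by switching to Softpool pooling, which retains more of the original information and achieves feature enhancement. Experimental results show that MPSA-YOLOv5 reaches a detection accuracy of 91.4% on the NWPUVHR-10 public remote sensing image dataset, which is 6.06, 2.8, 1.45, and 1.7 percentage points higher than SSD, YOLOv3, YOLOX-S, and the original YOLOv5s, respectively. MPSA-YOLOv5 thus effectively improves detection accuracy on remote sensing images.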
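To illustrate why Softpool pooling can retain more of the original information than max pooling, the following is a minimal pure-Python sketch of the general SoftPool operation: each window is reduced to the softmax-weighted sum of its own activations, so every value contributes while larger values dominate. It is not the MPSA-YOLOv5 code; the function name `softpool2d`, the list-of-lists input, and the default 2x2 window with stride 2 are assumptions made for the example.

```python
import math
from typing import List

Matrix = List[List[float]]


def softpool2d(x: Matrix, k: int = 2, stride: int = 2) -> Matrix:
    """SoftPool over a 2D feature map given as a list of rows.

    Each k x k window is replaced by sum_i softmax(a)_i * a_i, where a are
    the activations inside that window."""
    rows, cols = len(x), len(x[0])
    out: Matrix = []
    for r in range(0, rows - k + 1, stride):
        out_row: List[float] = []
        for c in range(0, cols - k + 1, stride):
            window = [x[r + i][c + j] for i in range(k) for j in range(k)]
            peak = max(window)  # subtract the max for numerical stability
            weights = [math.exp(v - peak) for v in window]
            total = sum(weights)
            pooled = sum(w * v for w, v in zip(weights, window)) / total
            out_row.append(pooled)
        out.append(out_row)
    return out
```

For a window with unequal values the result lies strictly between the average and the maximum of the window, which is the sense in which the operation keeps information that max pooling would discard.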