2,024 articles found
Towards Real-Time Multi-Person Pose Estimation via Feature Selection and Sharpening Mechanisms
1
Authors: Chengang Dong, Yongkang Ding, Jianwei Hu. Computer Modeling in Engineering & Sciences, 2026, No. 3, pp. 888-908 (21 pages)
Real-time multi-person pose estimation (MPE) built upon neural network architectures aims to simultaneously detect multiple human instances and regress joint coordinates in dynamic scenes. However, due to factors such as high model complexity and limited expression of keypoint information, both the efficiency and accuracy of real-time MPE remain to be improved. To mitigate these issues, this work develops FSEM-Pose, a real-time MPE model rooted in the YOLOv10 framework. First, FSEM-Pose upgrades the backbone of the baseline network by introducing Feature Shuffling-Convolution (FS-Conv), which reduces the backbone size while maximizing the retention of spatial information from the input image. Second, FSEM-Pose incorporates a Feature Saliency Enhancement Module (FSEM) to strengthen the feature encoding of human keypoints, thereby improving the accuracy of pose estimation. Finally, FSEM-Pose further enhances inference efficiency via a lightweight optimization of the head using shared convolutional layers. The method achieves competitive results across multiple accuracy and efficiency metrics on the MS COCO 2017 and CrowdPose datasets. While being lightweight in design, it improves average precision (AP) by 2.1% and 2.5% on the two datasets, respectively.
Keywords: pose estimation; feature sharpening; lightweight; YOLOv10
High-accuracy real-time satellite pose estimation for in-orbit applications
2
Authors: Zi WANG, Jinghao WANG, Jiyang YU, Zhang LI, Qifeng YU. Chinese Journal of Aeronautics, 2025, No. 6, pp. 130-142 (13 pages)
Vision-based relative pose estimation plays a pivotal role in various space missions. Deep learning enhances monocular spacecraft pose estimation, but high computational demands necessitate model simplification for onboard systems. In this paper, we aim to achieve an optimal balance between accuracy and computational efficiency. We present a Perspective-n-Point (PnP) based method for spacecraft pose estimation, leveraging lightweight neural networks to localize semantic keypoints and reduce computational load. Since the accuracy of keypoint localization is closely related to the heatmap resolution, we devise an efficient upsampling module to increase the resolution of heatmaps with minimal overhead. Furthermore, the heatmaps predicted by the lightweight models tend to show high levels of noise. To tackle this issue, we propose a weighting strategy based on the statistical characteristics of the predicted semantic keypoints, which substantially improves the pose estimation accuracy. Experiments on the SPEED dataset underscore the prospects of our method in engineering applications. We reduce the model parameters to 0.7 M, merely 2.5% of that required by the top-performing method, and achieve lower pose estimation error and better real-time performance.
Keywords: Keypoint detection; Lightweight models; Non-cooperative satellite; pose estimation; Weighted PnP
An Attention-Based 6D Pose Estimation Network for Weakly Textured Industrial Parts
3
Authors: Song Xu, Liang Xuan, Yifeng Li, Qiang Zhang. Computers, Materials & Continua, 2026, No. 2, pp. 2148-2166 (19 pages)
The 6D pose estimation of objects is of great significance for the intelligent assembly and sorting of industrial parts. In industrial robot production scenarios, the 6D pose estimation of industrial parts mainly faces two challenges: one is the loss of information and interference caused by occlusion and stacking in the sorting scenario, the other is the difficulty of feature extraction due to the weak texture of industrial parts. To address these problems, this paper proposes an attention-based pixel-level voting network for 6D pose estimation of weakly textured industrial parts, namely CB-PVNet. On the one hand, the voting scheme can predict the keypoints of affected pixels, which improves the accuracy of keypoint localization even in scenarios such as weak texture and partial occlusion. On the other hand, the attention mechanism can extract features of interest from the object while suppressing useless features of the surroundings. Extensive comparative experiments were conducted on public datasets (LINEMOD, Occlusion LINEMOD and T-LESS) and on self-made datasets. The experimental results indicate that the proposed CB-PVNet achieves ADD(-s) accuracy comparable to the state of the art using only RGB images while ensuring real-time performance. Additionally, we conducted robot grasping experiments in the real world. The balance between accuracy and computational efficiency makes the method well suited for applications in industrial automation.
Keywords: Industrial robots; pose estimation; industrial parts; attention mechanism; weak texture
Robust Human Pose Estimation and Action Recognition Utilizing Feature Extraction
4
Authors: Sheng Luo, Rashid Abbasi, Hao Wang, Jinghua Xu, Dongyang Lyu, Aaron Zhang, Farhan Amin, Isabel de la Torre, Gerardo Mendez Mezquita, Henry Fabian Gongora. Computer Modeling in Engineering & Sciences, 2026, No. 3, pp. 870-887 (18 pages)
Human pose estimation is crucial across diverse applications, from healthcare to human-computer interaction. Integrating inertial measurement units (IMUs) with monocular vision methods holds great potential for leveraging complementary modalities; however, existing approaches are often limited by IMU drift, noise, and underutilization of visual information. To address these limitations, we propose a novel dual-stream feature extraction framework that effectively combines temporal IMU data and single-view image features for improved pose estimation. Short-term dependencies in IMU sequences are captured with convolutional layers, while a Transformer-based architecture models long-range temporal dynamics. To mitigate IMU drift and inter-sensor inconsistencies, a complementary filtering module is introduced alongside a cross-channel interaction mechanism. Features from the IMU and image streams are then fused via a dedicated fusion module and further refined by a high-precision regression head for accurate pose prediction. Experimental results on benchmark datasets demonstrate that our method significantly outperforms existing techniques in terms of estimation accuracy and robustness, validating the effectiveness of our dual-stream architecture.
Keywords: Human pose estimation; dual-stream network; inertial measurement units (IMU)
Fuzzy Adaptive Admittance Control of Hexapod Wheeled-Legged Robot Based on Real-Time Estimation
5
Authors: CHEN Mengqi, LI Yan, XU Yang. Journal of Donghua University (English Edition), 2025, No. 6, pp. 650-660 (11 pages)
A fuzzy adaptive admittance control method based on real-time estimation is proposed for the motion of a hexapod wheeled-legged robot in various environments. Firstly, the mechanical structure of the robot is designed, and a control system framework is proposed according to the different motion environments. To address the adaptability of the robot foot's contact with the ground, a position-based admittance control method is proposed. Secondly, to improve the tracking performance of the foot contact force when the ground environment changes, a fuzzy adaptive admittance parameter adjustment method is proposed. Furthermore, to address sudden changes in the tracking error of the foot contact force when the ground environment changes, a real-time estimation method is proposed to estimate the dynamic foot contact force. Finally, a simulation experiment is conducted in MATLAB and Simscape to verify the effectiveness of the robot motion control system, the admittance control, the fuzzy adaptive admittance parameter adjustment, and the real-time estimation method. In multi-scenario experiments with the robot prototype, the control method demonstrates its effectiveness and adaptability in various environments.
Keywords: hexapod wheeled-legged robot; dynamic foot contact force; fuzzy adaptive; real-time estimation; admittance control
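For readers unfamiliar with position-based admittance control, the idea mentioned in this abstract can be sketched as a virtual mass-spring-damper that turns a contact-force error into a position correction for the foot. The sketch below assumes the textbook second-order model m*a + b*v + k*x = force_error; the function names, gains, and time step are illustrative assumptions and are not taken from the paper, which additionally tunes the parameters with fuzzy rules.

```python
def admittance_step(x, v, force_error, m=1.0, b=20.0, k=100.0, dt=0.001):
    """Advance the model m*a + b*v + k*x = force_error by one time step.

    x is the position correction added to the nominal foot trajectory and
    v is its velocity. All gains here are hypothetical example values.
    """
    a = (force_error - b * v - k * x) / m   # acceleration from the virtual dynamics
    v = v + a * dt                          # semi-implicit Euler: update velocity first
    x = x + v * dt                          # then position, using the new velocity
    return x, v


def simulate(force_error, steps=5000, **params):
    """Apply a constant force error and return the final position correction."""
    x, v = 0.0, 0.0
    for _ in range(steps):
        x, v = admittance_step(x, v, force_error, **params)
    return x
```

With a constant force error the correction settles at force_error / k, so a stiffer virtual spring yields a smaller positional give; an adaptive scheme would change m, b, and k online rather than keep them fixed as here.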
Toward Coordination Control of Multiple Fish-Like Robots: Real-Time Vision-Based Pose Estimation and Tracking via Deep Neural Networks (cited by: 3)
6
Authors: Tianhao Zhang, Jiuhong Xiao, Liang Li, Chen Wang, Guangming Xie. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2021, No. 12, pp. 1964-1976 (13 pages)
Controlling multiple multi-joint fish-like robots has long captivated the attention of engineers and biologists, for which a fundamental but challenging topic is to robustly track the postures of the individuals in real time. This requires detecting multiple robots, estimating multi-joint postures, and tracking identities, all with fast real-time processing. To the best of our knowledge, this challenge has not been tackled in previous studies. In this paper, to precisely track the planar postures of multiple swimming multi-joint fish-like robots in real time, we propose a novel deep neural network-based method, named TAB-IOL. Its TAB part fuses the top-down and bottom-up approaches for vision-based pose estimation, while the IOL part, with long short-term memory, considers the motion constraints among joints for precise pose tracking. The performance of TAB-IOL is verified by testing on a group of freely swimming fish-like robots in various scenarios with strong disturbances and by a detailed comparison of accuracy, speed, and robustness with state-of-the-art algorithms. Further, based on the precise pose estimation and tracking realized by TAB-IOL, several formation control experiments are conducted for the group of fish-like robots. The results demonstrate that TAB-IOL lays a solid foundation for the coordination control of multiple fish-like robots in a real working environment. We believe our proposed method will facilitate the growth and development of related fields.
Keywords: Deep neural networks; formation control; multiple fish-like robots; pose estimation; pose tracking
Acceleration of Points to Convex Region Correspondence Pose Estimation Algorithm on GPUs for Real-Time Applications
7
Authors: Raghu Raj P. Kumar, Suresh S. Muknahallipatna, John E. McInroy. Journal of Computer and Communications, 2016, No. 17, pp. 1-17 (18 pages)
In our previous work, a novel algorithm to perform robust pose estimation was presented. The pose was estimated using correspondences between points on the object and regions on the image. The laboratory experiments conducted in the previous work showed that the accuracy of the estimated pose was over 99% for position and 84% for orientation. However, for larger objects, the algorithm requires a high number of points to achieve the same accuracy. This requirement makes the algorithm computationally intensive, rendering it infeasible for real-time computer vision applications. In this paper, the algorithm is parallelized to run on NVIDIA GPUs. The results indicate that even for objects having more than 2000 points, the algorithm can estimate the pose in real time for each frame of high-resolution videos.
Keywords: pose estimation; Parallel Computing; GPU; CUDA; Real Time Image Processing
Hourglass-GCN for 3D Human Pose Estimation Using Skeleton Structure and View Correlation
8
Authors: Ange Chen, Chengdong Wu, Chuanjiang Leng. Computers, Materials & Continua (SCIE, EI), 2025, No. 1, pp. 173-191 (19 pages)
Previous multi-view 3D human pose estimation methods neither correlate different human joints in each view nor explicitly model learnable correlations between the same joints in different views, meaning that skeleton structure information is not utilized and multi-view pose information is not completely fused. Moreover, existing graph convolutional operations do not consider the specificity of different joints and different views of pose information when processing skeleton graphs, so the correlation weights between nodes in the graph and their neighborhood nodes are shared. Existing Graph Convolutional Networks (GCNs) cannot efficiently extract global and deep-level skeleton structure information and view correlations. To solve these problems, pre-estimated multi-view 2D poses are organized as a multi-view skeleton graph that explicitly fuses skeleton priors and view correlations to handle occlusion, with the skeleton-edge and symmetry-edge representing the structure correlations between adjacent joints in each view of the skeleton graph and the view-edge representing the view correlations between the same joints in different views. To let the graph convolution operation mine elaborate and sufficient skeleton structure information and view correlations, different correlation weights are assigned to different categories of neighborhood nodes and further assigned to each node in the graph. Based on this graph convolution operation, a Residual Graph Convolution (RGC) module is designed as the basic module and combined with a simplified Hourglass architecture to construct Hourglass-GCN as our 3D pose estimation network. Hourglass-GCN, with a symmetrical and concise architecture, processes three scales of multi-view skeleton graphs to efficiently extract local-to-global scale and shallow-to-deep level skeleton features. Experimental results on the common large 3D pose datasets Human3.6M and MPI-INF-3DHP show that Hourglass-GCN outperforms several strong methods in 3D pose estimation accuracy.
Keywords: 3D human pose estimation; multi-view skeleton graph; elaborate graph convolution operation; Hourglass-GCN
Self-Supervised Monocular Depth Estimation with Scene Dynamic Pose
9
Authors: Jing He, Haonan Zhu, Chenhao Zhao, Minrui Zhao. Computers, Materials & Continua, 2025, No. 6, pp. 4551-4573 (23 pages)
Self-supervised monocular depth estimation has emerged as a major research focus in recent years, primarily because it eliminates the dependence on ground-truth depth. However, the prevailing architectures in this domain suffer from inherent limitations: existing pose network branches infer camera ego-motion exclusively under static-scene and Lambertian-surface assumptions. These assumptions are often violated in real-world scenarios due to dynamic objects, non-Lambertian reflectance, and unstructured background elements, leading to pervasive artifacts such as depth discontinuities ("holes"), structural collapse, and ambiguous reconstruction. To address these challenges, we propose a novel framework that integrates scene dynamic pose estimation into the conventional self-supervised depth network, enhancing its ability to model complex scene dynamics. Our contributions are threefold: (1) a pixel-wise dynamic pose estimation module that jointly resolves the pose transformations of moving objects and localized scene perturbations; (2) a physically informed loss function that couples dynamic pose and depth predictions, designed to mitigate depth errors arising from high-speed distant objects and geometrically inconsistent motion profiles; (3) an efficient SE(3) transformation parameterization that streamlines network complexity and temporal pre-processing. Extensive experiments on the KITTI and NYU-V2 benchmarks show that our framework achieves state-of-the-art performance in both quantitative metrics and qualitative visual fidelity, significantly improving the robustness and generalization of monocular depth estimation under dynamic conditions.
Keywords: Monocular depth estimation; self-supervised learning; scene dynamic pose estimation; dynamic-depth constraint; pixel-wise dynamic pose
Animal Pose Estimation Based on YOLO-POSE
10
Authors: Binbin Zhou, Lei Liu. 《国际计算机前沿大会会议论文集》, 2025, No. 1, pp. 234-246 (13 pages)
With the development of computer vision technology, deep learning-based pose estimation and target detection have been widely used in human behavior analysis and intelligent security. However, owing to the complexity of animal poses and the diversity of species, existing pose estimation methods still face many challenges when applied to animal targets. To solve this problem, an improved YOLO-Pose model is proposed to improve the accuracy and efficiency of animal pose estimation. On the basis of the original YOLO-Pose model, a separable kernel attention mechanism is introduced and adapted to animal targets; combined with the spatial pyramid pooling of YOLO-Pose, it improves the multiscale feature fusion capability of the model. The experimental results show that the improved YOLO-Pose model achieves excellent performance on both the public animal pose dataset and the AP-10K dataset, significantly improving target detection and pose estimation.
Keywords: animal; pose estimation; LSKA; YOLO-pose
Manifold-Optimized Error-State Kalman Filter for Robust Pose Estimation in Unmanned Aerial Vehicles
11
Authors: Bolin Jia, Zongwen Bai, Yiqun Gao, Dong Wang, Meili Zhou, Peiqi Gao, Pei Zhang, Zhang Yang. Journal of Electronic Research and Application, 2025, No. 2, pp. 247-257 (11 pages)
This paper presents a manifold-optimized Error-State Kalman Filter (ESKF) framework for unmanned aerial vehicle (UAV) pose estimation, integrating Inertial Measurement Unit (IMU) data with GPS or LiDAR to enhance estimation accuracy and robustness. We employ a manifold-based optimization approach, leveraging exponential and logarithmic mappings to convert between rotation vectors and rotation matrices. The proposed ESKF framework ensures that state variables remain near the origin, effectively mitigating singularity issues and enhancing numerical stability. Additionally, due to the small magnitude of the state variables, second-order terms can be neglected, simplifying Jacobian matrix computation and improving computational efficiency. Furthermore, we introduce a novel Kalman filter gain computation strategy that dynamically adapts to low-dimensional and high-dimensional observation equations, enabling efficient processing across different sensor modalities. For resource-constrained UAV platforms in particular, this method significantly reduces computational cost, making it highly suitable for real-time UAV applications.
Keywords: UAV pose estimation; Error-State Kalman Filter; manifold; GPS; LiDAR
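The exponential and logarithmic mappings mentioned in this abstract are standard SO(3) tools, and a short sketch may help readers who have not met them. The code below uses the Rodrigues formula with NumPy; the function names are illustrative assumptions, and this is not the paper's implementation. The logarithm shown is only valid for rotation angles below pi.

```python
import numpy as np


def hat(w):
    """Return the 3x3 skew-symmetric matrix of a 3-vector."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])


def so3_exp(phi):
    """Exponential map: rotation vector -> rotation matrix (Rodrigues formula)."""
    phi = np.asarray(phi, dtype=float)
    theta = np.linalg.norm(phi)
    K = hat(phi)
    if theta < 1e-8:
        # First-order expansion near the identity avoids dividing by ~0
        return np.eye(3) + K
    return (np.eye(3)
            + np.sin(theta) / theta * K
            + (1.0 - np.cos(theta)) / theta**2 * (K @ K))


def so3_log(R):
    """Logarithmic map: rotation matrix -> rotation vector (angle < pi)."""
    cos_theta = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    theta = np.arccos(cos_theta)
    # R - R^T = 2 sin(theta) * hat(axis); read the axis part off the matrix
    vee = np.array([R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])
    if theta < 1e-8:
        return 0.5 * vee
    return theta / (2.0 * np.sin(theta)) * vee
```

In an error-state filter the estimated small rotation error `delta` is typically folded back into the nominal attitude as `R_nominal @ so3_exp(delta)` and then reset to zero, which is what keeps the state near the origin.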
Review of Pose Estimation Methods for Spacecraft Targets
12
Authors: LI Shoucheng, LI Jing, CHEN Qiang, LI Xindong, WANG Junzheng. Aerospace China, 2025, No. 1, pp. 53-58 (6 pages)
Pose estimation of spacecraft targets is a key technology for space operation tasks such as the removal of failed satellites and the detection and scanning of non-cooperative targets. This paper reviews target pose estimation methods based on image feature extraction and PnP, target pose estimation methods based on registration, and spacecraft target pose estimation methods based on deep learning, and introduces the corresponding research methods.
Keywords: spacecraft; pose estimation; non-cooperative targets; feature extraction; deep learning
High-Precision Fish Pose Estimation Method Based on Improved HRNet
13
Authors: PENG Qiujun, LI Weiran, LIU Yeqiang, LI Zhenbo. 《智慧农业(中英文)》, 2025, No. 3, pp. 160-172 (13 pages)
[Objective] Fish pose estimation (FPE) provides fish physiological information, facilitating health monitoring in aquaculture. It aids decision-making in areas such as fish behavior recognition. When fish are injured or deficient, they often display abnormal behaviors and noticeable changes in the positioning of their body parts. Moreover, the unpredictable posture and orientation of fish during swimming, combined with their rapid swimming speed, restrict the current scope of research in FPE. In this research, an FPE model named HPFPE is presented to capture the swimming posture of fish and accurately detect their key points. [Methods] On the one hand, the model incorporates the CBAM module into the HRNet framework. The attention module enhances accuracy without adding computational complexity, while effectively capturing a broader range of contextual information. On the other hand, the model incorporates dilated convolution to enlarge the receptive field, allowing it to capture more spatial context. [Results and Discussions] Experiments showed that, compared with the baseline method, the average precision (AP) of HPFPE with different backbones and input sizes on the Oplegnathus punctatus dataset increased by 0.62, 1.35, 1.76, and 1.28 percentage points, respectively, while the average recall (AR) increased by 0.85, 1.50, 1.40, and 1.00, respectively. Additionally, HPFPE outperformed other mainstream methods, including DeepPose, CPM, SCNet, and Lite-HRNet. Furthermore, compared with other methods on the ornamental fish data, HPFPE achieved the highest AP and AR values, 52.96% and 59.50%, respectively. [Conclusions] The proposed HPFPE can accurately estimate fish posture and assess swimming patterns, serving as a valuable reference for applications such as fish behavior recognition.
Keywords: aquaculture; computer vision; fish pose estimation; key point; attention mechanism
AARPose:Real-time and accurate drogue pose measurement based on monocular vision for autonomous aerial refueling
14
Authors: Shuyuan WEN, Yang GAO, Bingrui HU, Zhongyu LUO, Zhenzhong WEI, Guangjun ZHANG. Chinese Journal of Aeronautics, 2025, No. 6, pp. 552-572 (21 pages)
Real-time and accurate drogue pose measurement during docking is basic and critical for Autonomous Aerial Refueling (AAR). Vision measurement is the most practicable technique, but its accuracy and robustness are easily affected by the limited computing power of airborne equipment, complex aerial scenes, and partial occlusion. To address these challenges, we propose a novel drogue keypoint detection and pose measurement algorithm based on monocular vision, and realize real-time processing on airborne embedded devices. Firstly, a lightweight network is designed with structural re-parameterization to reduce computational cost and improve inference speed, and a sub-pixel-level keypoint prediction head and loss functions are adopted to improve keypoint detection accuracy. Secondly, a closed-form solution of the drogue pose is computed based on double spatial circles, followed by a nonlinear refinement based on Levenberg-Marquardt optimization. Both virtual simulation and physical simulation experiments were used to test the proposed method. In the virtual simulation, the mean pixel error of the proposed method is 0.787 pixels, significantly better than that of other methods. In the physical simulation, the mean relative measurement error is 0.788%, and the mean processing time is 13.65 ms on embedded devices.
Keywords: Autonomous aerial refueling; Vision measurement; Deep learning; real-time; lightweight; accurate; Monocular vision; Drogue pose measurement
High-throughput markerless pose estimation and home-cage activity analysis of tree shrew using deep learning
15
Authors: Yangzhen Wang, Feng Su, Rixu Cong, Mengna Liu, Kaichen Shan, Xiaying Li, Desheng Zhu, Yusheng Wei, Jiejie Dai, Chen Zhang, Yonglu Tian. Animal Models and Experimental Medicine, 2025, No. 5, pp. 896-905 (10 pages)
Background: Quantifying the rich home-cage activities of tree shrews provides a reliable basis for understanding their daily routines and building disease models. However, due to the lack of effective behavioral methods, most efforts on tree shrew behavior are limited to simple measures, resulting in the loss of much behavioral information. Methods: To address this issue, we present a deep learning (DL) approach to achieve markerless pose estimation and recognize multiple spontaneous behaviors of tree shrews, including drinking, eating, resting, and staying in the dark house. Results: This high-throughput approach can monitor the home-cage activities of 16 tree shrews simultaneously over an extended period. Additionally, we demonstrated an innovative system with reliable apparatus, paradigms, and analysis methods for investigating food grasping behavior. The median duration of each bout of grasping was 0.20 s. Conclusion: This study provides an efficient tool for quantifying and understanding tree shrews' natural behaviors.
Keywords: deep learning; food grasping; home-cage activity; pose estimation; tree shrew
Lightweight Human Pose Estimation Based on Multi-Attention Mechanism
16
Authors: LIN Xiao, LU Meichen, GAO Mufeng, LI Yan. Journal of Shanghai Jiaotong University (Science), 2025, No. 5, pp. 899-910 (12 pages)
Human pose estimation has received much attention from the research community because of its wide range of applications. However, current pose estimation models are usually complex and computationally intensive, and suffer from feature loss during feature fusion. To address these problems, we propose a lightweight human pose estimation network based on a multi-attention mechanism (LMANet). In our method, the network parameters are significantly reduced by lightweighting the bottleneck blocks of the high-resolution network with depth-wise separable convolution. We also introduce a multi-attention mechanism to improve prediction accuracy: a channel attention module is added in the initial stage of the network to enhance local cross-channel information interaction. More importantly, we inject a spatial cross-awareness module in the multi-scale feature fusion stage to reduce the loss of spatial information during feature extraction. Extensive experiments on the COCO2017 and MPII datasets show that LMANet achieves higher prediction accuracy with fewer network parameters and less computation. Compared with the high-resolution network HRNet, the number of parameters and the computational complexity of the network are reduced by 67% and 73%, respectively.
Keywords: human pose estimation; attention mechanisms; multi-scale feature fusion; high-resolution networks
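To see why depth-wise separable convolution shrinks a network, it can help to count weights for a single layer. The sketch below compares a standard k x k convolution with a depth-wise k x k convolution followed by a 1 x 1 point-wise convolution; the channel counts and function names are illustrative assumptions, biases are ignored, and the per-layer saving shown is not the 67% whole-network figure reported in the abstract.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a depth-wise k x k conv followed by a 1 x 1 point-wise conv."""
    depthwise = k * k * c_in    # one k x k filter per input channel
    pointwise = c_in * c_out    # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer: 64 input channels, 64 output channels, 3 x 3 kernel
standard = conv_params(64, 64, 3)                    # 36864
separable = depthwise_separable_params(64, 64, 3)    # 4672
ratio = separable / standard                         # equals 1/c_out + 1/k**2
```

For this example layer the separable version keeps roughly 13% of the weights, which matches the closed-form ratio 1/c_out + 1/k^2.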
A Multi-Type Feature Fusion Network Based on Importance Weighting for Occluded Human Pose Estimation
17
Authors: Jiahong Jiang, Nan Xia, Siyao Zhou. IEEE/CAA Journal of Automatica Sinica, 2025, No. 4, pp. 789-805 (17 pages)
Human pose estimation is a challenging task in computer vision. Most algorithms perform well in regular scenes but poorly in occlusion scenarios. Therefore, we propose a multi-type feature fusion network based on importance weighting, which consists of three modules. In the first module, we propose a multi-resolution backbone with two feature enhancement sub-modules, which can extract features at different scales and enhance the feature expression ability. In the second module, we enhance the expressiveness of keypoint features by suppressing obstacle features and compensating for the unique and shared attributes of keypoints and topology. In the third module, we perform importance weighting on the adjacency matrix so that it describes the correlation among nodes, thereby improving the feature extraction ability. We conduct comparative experiments on the keypoint detection datasets Common Objects in Context 2017 (COCO2017), COCO-Wholebody, and CrowdPose, achieving accuracies of 78.9%, 67.1%, and 77.6%, respectively. Additionally, a series of ablation experiments is designed to show the performance of our work. Finally, we present visualizations of different scenarios to verify the effectiveness of our work.
Keywords: Human keypoint detection; human pose estimation; importance weighting; multi-type feature fusion; occlusion environments
VMHPE:Human Pose Estimation for Virtual Maintenance Tasks
18
作者 Shuo Zhang Hanwu He Yueming Wu 《Computers, Materials & Continua》 2025年第10期801-826,共26页
Virtual maintenance,as an important means of industrial training and education,places strict requirements on the accuracy of participant pose perception and assessment of motion standardization.However,existing resear... Virtual maintenance,as an important means of industrial training and education,places strict requirements on the accuracy of participant pose perception and assessment of motion standardization.However,existing research mainly focuses on human pose estimation in general scenarios,lacking specialized solutions for maintenance scenarios.This paper proposes a virtual maintenance human pose estimation method based on multi-scale feature enhancement(VMHPE),which integrates adaptive input feature enhancement,multi-scale feature correction for improved expression of fine movements and complex poses,and multi-scale feature fusion to enhance keypoint localization accuracy.Meanwhile,this study constructs the first virtual maintenance-specific human keypoint dataset(VMHKP),which records standard action sequences of professional maintenance personnel in five typical maintenance tasks and provides a reliable benchmark for evaluating operator motion standardization.The dataset is publicly available at.Using high-precision keypoint prediction results,an action assessment system utilizing topological structure similarity was established.Experiments show that our method achieves significant performance improvements:average precision(AP)reaches 94.4%,an increase of 2.3 percentage points over baseline methods;average recall(AR)reaches 95.6%,an increase of 1.3 percentage points.This research establishes a scientific four-level evaluation standard based on comparative motion analysis and provides a reliable solution for standardizing industrial maintenance training. 展开更多
Keywords: virtual maintenance; human pose estimation; multi-scale feature fusion
3D Hand Pose Estimation Using Semantic Dynamic Hypergraph Convolutional Networks
19
Authors: WU Yalei, LI Jinghua, KONG Dehui, LI Qianxing, YIN Baocai. Journal of Shanghai Jiaotong University (Science), 2025, Issue 5, pp. 855-865 (11 pages)
Due to self-occlusion and the high number of degrees of freedom, estimating 3D hand pose from a single RGB image is a highly challenging problem. Graph convolutional networks (GCNs) use graphs to describe the physical connections between hand joints and improve the accuracy of 3D hand pose regression. However, GCNs cannot effectively describe the relationships between non-adjacent hand joints. Recently, hypergraph convolutional networks (HGCNs) have received much attention because they can describe multi-dimensional relationships between nodes through hyperedges. This paper therefore proposes an HGCN-based framework for 3D hand pose estimation that can better extract correlations between both adjacent and non-adjacent hand joints. To overcome the shortcomings of predefined hypergraph structures, a dynamic hypergraph convolutional network is proposed, in which hyperedges are constructed dynamically based on hand joint feature similarity. To better explore the local semantic relationships between nodes, a semantic dynamic hypergraph convolution is proposed. The proposed method is evaluated on publicly available benchmark datasets. Qualitative and quantitative experimental results both show that the proposed HGCN and its improved variants outperform GCN for 3D hand pose estimation and achieve state-of-the-art performance compared with existing methods.
Keywords: hand pose estimation; hypergraph convolution; dynamic hypergraph convolution; semantic dynamic hypergraph convolution
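The abstract above describes hyperedges built dynamically from joint feature similarity. The following is a minimal sketch of that general idea, not the authors' implementation: it assumes one hyperedge per joint, formed from the joint and its k nearest neighbours in Euclidean feature space, followed by a commonly used normalized hypergraph convolution with unit hyperedge weights. The function names `knn_incidence` and `hypergraph_conv`, the choice of k, and the ReLU activation are all illustrative assumptions.

```python
import numpy as np


def knn_incidence(features, k):
    """Build an (N, N) incidence matrix H with one hyperedge per node.

    Hyperedge e contains node e and its k nearest neighbours in feature
    space (Euclidean distance is an assumption made for this sketch).
    """
    x = np.asarray(features, dtype=float)
    n = x.shape[0]
    diff = x[:, None, :] - x[None, :, :]
    d2 = (diff ** 2).sum(axis=-1)          # pairwise squared distances
    np.fill_diagonal(d2, np.inf)           # exclude self from the neighbour search
    neighbors = np.argsort(d2, axis=1)[:, :k]
    H = np.zeros((n, n))
    for e in range(n):
        H[e, e] = 1.0                      # the centre node always belongs to its hyperedge
        H[neighbors[e], e] = 1.0
    return H


def hypergraph_conv(x, H, theta):
    """One layer: ReLU(Dv^{-1/2} H De^{-1} H^T Dv^{-1/2} X Theta), unit edge weights."""
    x = np.asarray(x, dtype=float)
    dv = H.sum(axis=1)                     # node degrees
    de = H.sum(axis=0)                     # hyperedge degrees
    H_norm = H / np.sqrt(dv)[:, None]      # Dv^{-1/2} H
    A = (H_norm / de[None, :]) @ H_norm.T  # normalized node-to-node propagation
    return np.maximum(A @ x @ theta, 0.0)
```

For a hand with 21 joints and 8-dimensional joint features, `knn_incidence(feats, k=3)` yields a 21 x 21 incidence matrix that can be rebuilt at every layer from the current features, which is what makes the structure dynamic rather than predefined.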
Multi-Human Pose Estimation by Deep Learning-Based Sequential Approach for Human Keypoint Position and Human Body Detection
20
Authors: TAHIR Rizwana, CAI Yunze. Journal of Shanghai Jiaotong University (Science), 2025, Issue 6, pp. 1103-1113 (11 pages)
Recent multimedia and computer vision research has focused on analyzing human behavior and activity using images. Skeleton estimation, known as pose estimation, has received significant attention. For human pose estimation, deep learning approaches primarily emphasize keypoint features. However, for occluded or incomplete poses, the keypoint feature alone is insufficient, especially when there are multiple humans in a single frame. Other features, such as the body border and visibility conditions, can contribute to pose estimation in addition to the keypoint feature. Our model framework integrates multiple features via a mask region-based convolutional neural network (Mask-RCNN): the human body mask features, which can serve as a constraint on keypoint location estimation, the body keypoint features, and the keypoint visibility. A sequential multi-feature learning setup is formed to share multiple features across the structure, whereas in Mask-RCNN the only feature that can be shared through the system is the region-of-interest feature. By two-way up-scaling with the shared weight process to produce the mask, we have addressed the problems of improper segmentation, small intrusion, and object loss that arise when Mask-RCNN is used for instance segmentation. Accuracy is measured by the percentage of correct keypoints, and our model identifies 86.1% of the keypoints correctly.
Keywords: multiperson pose estimation; multi-feature learning; mask region-based convolutional neural network (RCNN); deep learning
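The abstract above reports accuracy as the percentage of correct keypoints. As a rough illustration of how such a metric is typically computed, the sketch below counts a predicted keypoint as correct when its distance to the ground truth is within a fraction `alpha` of a per-person reference length. The abstract does not state the threshold or the reference length the authors used, so `alpha`, `ref_size`, and the function name `pck` are assumptions for illustration only.

```python
import numpy as np


def pck(pred, gt, ref_size, alpha=0.5, visible=None):
    """Fraction of keypoints whose error is at most alpha * ref_size.

    pred, gt : arrays of shape (N, K, 2) with predicted and true coordinates.
    ref_size : array of shape (N,), a per-person reference length
               (for example a head or torso size; an assumption here).
    visible  : optional boolean array of shape (N, K); keypoints marked
               False are ignored in the average.
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    dist = np.linalg.norm(pred - gt, axis=-1)                # (N, K) pixel errors
    thresh = alpha * np.asarray(ref_size, dtype=float)[:, None]
    correct = dist <= thresh
    if visible is None:
        visible = np.ones_like(correct, dtype=bool)
    visible = np.asarray(visible, dtype=bool)
    return float(correct[visible].mean())
```

With one person, a reference length of 10, and `alpha=0.5`, a keypoint that is off by 1 pixel counts as correct and one that is off by 20 pixels does not, so the score over those two keypoints is 0.5.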