Spatio-temporal feature extraction with a global-local Transformer model for video scene graph generation
1
Authors: Rongsen Wu, Jie Xu, Hao Zheng, Zhiyuan Xu, Zixuan Li, Shixue Cheng, Shumao Zhang. Digital Communications and Networks, 2026, Issue 2, pp. 364-374 (11 pages)
In the field of video scene graph generation, spatio-temporal feature extraction and the long-tail effect in relationship classification are core research issues. This paper proposes extracting spatio-temporal features with a global-local Transformer model for video scene graph generation. Methods based on the Transformer architecture and attention mechanism enrich the semantic information of spatio-temporal features in videos, thereby improving the accuracy of relationship classification. In the feature processing module, pose features are introduced to strengthen the semantic representation of objects. In the spatial feature encoding module, a local spatial visibility matrix based on bounding boxes and key points of human pose features is proposed to address the insufficient attention to local details in traditional Transformer encoders. In the temporal feature encoding module, a global random frame extraction strategy is proposed, which considers global temporal features while also taking computational complexity into account. In the relation classification module, to address the uneven distribution of object and relation categories in the Action Genome dataset, a relation classification loss function based on bipartite graph matching and Focal Loss is proposed, which alleviates the long-tail effect in relation classification and improves accuracy.
Keywords: Video scene graph generation; Transformer; Pose features; Visibility matrix; Bipartite graph matching
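The abstract names Focal Loss as one ingredient of the relation classification loss but does not give its form. As a rough illustration only, the sketch below shows the standard multi-class focal loss for a single sample, which down-weights well-classified examples so that rare (long-tail) classes contribute relatively more. The function names, the `gamma` and `alpha` defaults, and the softmax formulation are assumptions for illustration; this is not the paper's loss, and the bipartite graph matching step is not shown.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of raw class scores.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def focal_loss(logits, target, gamma=2.0, alpha=1.0):
    # Standard focal loss for one sample: -alpha * (1 - p_t)^gamma * log(p_t),
    # where p_t is the predicted probability of the true class.
    # gamma and alpha are illustrative defaults, not values from the paper.
    p_t = softmax(logits)[target]
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)

# A confidently correct prediction is penalised far less than under
# plain cross-entropy (which is the gamma = 0 case).
easy_focal = focal_loss([5.0, 0.0], target=0)
easy_ce = focal_loss([5.0, 0.0], target=0, gamma=0.0)
```

Setting `gamma=0.0` recovers ordinary cross-entropy, which makes the down-weighting effect easy to compare on the same inputs.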
SPATIAL TRAJECTORY PREDICTION OF VISUAL SERVOING
2
Authors: Wang Gang, Qi Hui. Chinese Journal of Mechanical Engineering (indexed in SCIE, EI, CAS, CSCD), 2003, Issue 1, pp. 7-9, 12 (4 pages)
Target tracking is a typical application of visual servoing technology. Tracking a high-speed target with current visual servo systems is still difficult, so the visual servoing scheme needs improvement. A position-based visual servo parallel system is presented for tracking high-speed targets. A local Frenet frame is assigned to the sampling point of the spatial trajectory. Position estimation is formed from the differential features of the intrinsic geometry, and orientation estimation is formed by homogeneous transformation. The time spent on searching and processing can be greatly reduced by shifting the window according to feature location prediction. The simulation results demonstrate the system's ability to track a spatially moving object.
Keywords: Robot; Visual servo; Pose estimation; Feature location prediction; Target tracking
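The abstract mentions assigning a local Frenet frame to a sampling point of the spatial trajectory without saying how it is computed. One common way, sketched here under the assumption of three uniformly spaced 3D samples and central finite differences, is to take the tangent from the first difference, the binormal from the cross product of the first and second differences, and the normal as their cross product. The function `frenet_frame` and its helpers are hypothetical names for illustration, not the authors' method.

```python
import math

def _cross(a, b):
    # Cross product of two 3D vectors given as tuples.
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _unit(v):
    # Normalise a vector; fail loudly when the frame is undefined.
    norm = math.sqrt(sum(x * x for x in v))
    if norm < 1e-12:
        raise ValueError("degenerate samples: Frenet frame is undefined")
    return tuple(x / norm for x in v)

def frenet_frame(p_prev, p, p_next):
    # Central differences, assuming uniformly spaced samples:
    # d1 is proportional to the first derivative, d2 to the second.
    d1 = tuple(c - a for a, c in zip(p_prev, p_next))
    d2 = tuple(c - 2.0 * b + a for a, b, c in zip(p_prev, p, p_next))
    tangent = _unit(d1)
    binormal = _unit(_cross(d1, d2))
    normal = _cross(binormal, tangent)  # unit length, since B and T are orthonormal
    return tangent, normal, binormal

# Example: three samples on the unit circle in the xy-plane around (1, 0, 0).
step = 0.1
t, n, b = frenet_frame((math.cos(step), -math.sin(step), 0.0),
                       (1.0, 0.0, 0.0),
                       (math.cos(step), math.sin(step), 0.0))
```

For the circle example, the tangent points along +y, the normal points toward the circle's centre, and the binormal points along +z. Collinear samples have no unique normal, so the sketch raises a `ValueError` in that case.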