Journal Articles
5 articles found
1. A Scene Graph Generation Method Based on Relevant Information Enhancement and Relation Balancing (Cited: 1)
Authors: 李林昊 (Li Linhao), 韩冬 (Han Dong), 董永峰 (Dong Yongfeng), 李英双 (Li Yingshuang), 王振 (Wang Zhen). 《计算机应用》 (Journal of Computer Applications), Peking University Core, 2025, Issue 3, pp. 953-962 (10 pages).
Contextual information in a scene graph can help a model understand the interactions between objects; however, large numbers of irrelevant objects may introduce extra noise, interfering with information exchange and causing prediction bias. In noisy and diverse scenes, even a few simple related objects are enough to infer the environment an object sits in and to resolve the ambiguity introduced by other objects. Moreover, when faced with the long-tail-biased data of real scenes, the performance of Scene Graph Generation (SGG) remains unsatisfactory. To address context enhancement and prediction bias, an SGG method based on relevant Information Enhancement and Relation Balancing (IERB) is proposed. IERB adopts a secondary reasoning structure: based on the predictions of a biased scene graph, it reconstructs relevant information under different prediction perspectives and balances the prediction bias. First, strongly related objects under different perspectives are attended to in order to build contextual relevance information; second, a tree-structured balancing strategy strengthens the prediction of tail relations; finally, a prediction-guided scheme refines predictions on top of the existing scene graph. Experimental results on the widely used Visual Genome dataset show that, compared with three baseline models, VTransE (Visual Translation Embedding network), Motif, and VCTree (Visual Context Tree), the proposed method improves mean recall mR@100 on the Predicate Classification (PredCls) task by 11.66, 13.77, and 13.62 percentage points respectively, verifying its effectiveness.
Keywords: scene graph generation; information enhancement; biased prediction; relation balancing; prediction refinement
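The mR@100 figures quoted above are mean Recall@K, which averages Recall@K over predicate classes so that rare (tail) predicates count as much as frequent ones. A minimal sketch of the metric follows; the triplet data in the usage below is invented for illustration, and real SGG evaluation additionally matches subject and object boxes by IoU, which is omitted here:

```python
from collections import defaultdict

def mean_recall_at_k(gt_triplets, pred_triplets, k):
    """Mean Recall@K: recall computed per predicate class, then averaged.

    gt_triplets:   list of (subject, predicate, object) ground-truth tuples
    pred_triplets: list of (subject, predicate, object) predictions,
                   ranked by confidence (best first)
    """
    top_k = set(pred_triplets[:k])
    hits, totals = defaultdict(int), defaultdict(int)
    for trip in gt_triplets:
        predicate = trip[1]
        totals[predicate] += 1
        if trip in top_k:
            hits[predicate] += 1
    per_class = [hits[c] / totals[c] for c in totals]
    return sum(per_class) / len(per_class)
```

Because every predicate class contributes equally, a method that only improves frequent predicates such as "on" cannot raise mR@K by much, which is why long-tail-oriented papers report this metric rather than plain Recall@K.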
Dynamic Scene Graph Generation of Point Clouds with Structural Representation Learning (Cited: 1)
2. Authors: Chao Qi, Jianqin Yin, Zhicheng Zhang, Jin Tang. Tsinghua Science and Technology (SCIE, EI, CAS, CSCD), 2024, Issue 1, pp. 232-243 (12 pages).
Scene graphs of point clouds help to understand object-level relationships in 3D space. Most graph generation methods work on 2D structured data, which cannot be used for 3D unstructured point cloud data. Existing point-cloud-based methods generate the scene graph with an additional graph structure that needs labor-intensive manual annotation. To address these problems, we explore a method to convert point clouds into structured data and generate graphs without given structures. Specifically, we cluster points with similar augmented features into groups and establish their relationships, resulting in an initial structural representation of the point cloud. Besides, we propose a Dynamic Graph Generation Network (DGGN) to judge the semantic labels of targets of different granularity. It dynamically splits and merges point groups, resulting in a scene graph with high precision. Experiments show that our methods outperform other baseline methods. They output reliable graphs describing the object-level relationships without additional manually labeled data.
Keywords: scene graph generation; structural representation; point cloud
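The clustering step the abstract describes, grouping points with similar features to obtain an initial structure, can be illustrated with a simple greedy centroid-based grouping. This is only a sketch under assumed 2-D features and a hand-picked distance threshold; DGGN's actual grouping is learned and dynamic:

```python
import numpy as np

def group_points(features, threshold=0.5):
    """Greedily assign each point to the nearest existing group whose
    centroid lies within `threshold` (Euclidean distance); otherwise
    start a new group. Returns one group index per point."""
    centroids, counts, labels = [], [], []
    for f in features:
        best, best_d = -1, threshold
        for i, c in enumerate(centroids):
            d = np.linalg.norm(f - c)
            if d < best_d:
                best, best_d = i, d
        if best < 0:
            centroids.append(f.astype(float).copy())
            counts.append(1)
            labels.append(len(centroids) - 1)
        else:
            counts[best] += 1
            # running-mean update keeps the centroid current
            centroids[best] += (f - centroids[best]) / counts[best]
            labels.append(best)
    return labels
```

A split/merge network like DGGN would then refine these initial groups; the sketch only shows why feature-space proximity yields a usable starting structure.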
Comprehensive Relation Modelling for Image Paragraph Generation (Cited: 2)
3. Authors: Xianglu Zhu, Zhang Zhang, Wei Wang, Zilei Wang. Machine Intelligence Research (EI, CSCD), 2024, Issue 2, pp. 369-382 (14 pages).
Image paragraph generation aims to generate a long description composed of multiple sentences, unlike traditional image captioning, which produces only one sentence. Most previous methods are dedicated to extracting rich features from image regions and ignore modelling the visual relationships. In this paper, we propose a novel method to generate a paragraph by modelling visual relationships comprehensively. First, we parse an image into a scene graph, where each node represents a specific object and each edge denotes the relationship between two objects. Second, we enrich the object features by implicitly encoding visual relationships through a graph convolutional network (GCN). We further explore high-order relations between different relation features using another graph convolutional network. In addition, we obtain the linguistic features by projecting the predicted object labels and their relationships into a semantic embedding space. With these features, we present an attention-based topic generation network to select relevant features and produce a set of topic vectors, which are then utilized to generate multiple sentences. We evaluate the proposed method on the Stanford image-paragraph dataset, currently the only available dataset for image paragraph generation, and our method achieves competitive performance in comparison with other state-of-the-art (SOTA) methods.
Keywords: image paragraph generation; visual relationship; scene graph; graph convolutional network (GCN); long short-term memory
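This paper, like several others in this list, encodes relationships with a graph convolutional network. One GCN layer computes H' = ReLU(D^{-1/2}(A+I)D^{-1/2} H W), i.e., features propagated through a symmetrically normalized adjacency with self-loops. A minimal NumPy sketch on a toy three-object scene graph (random features and weights, not any paper's trained model):

```python
import numpy as np

def gcn_layer(A, H, W):
    """One graph-convolution layer: symmetrically normalized adjacency
    (with self-loops) times node features times weights, then ReLU."""
    A_hat = A + np.eye(A.shape[0])            # add self-loops
    d = A_hat.sum(axis=1)                     # node degrees
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))    # D^{-1/2}
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W, 0.0)

# Toy scene graph: 3 objects, relationship edges 0-1 and 1-2
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
H = np.random.default_rng(0).normal(size=(3, 4))  # object features
W = np.random.default_rng(1).normal(size=(4, 2))  # layer weights
H_out = gcn_layer(A, H, W)
```

After one layer, each object's feature already mixes in its relational neighbors; stacking layers (as in the high-order relation modelling above) propagates information over longer paths in the graph.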
CAGNet: a context-aware graph neural network for detecting social relationships in videos
4. Authors: Fan Yu, Yaqun Fang, Zhixiang Zhao, Jia Bei, Tongwei Ren, Gangshan Wu. Visual Intelligence, 2024, Issue 1, pp. 259-271 (13 pages).
Social relationships, such as parent-offspring and friends, are crucial and stable connections between individuals, especially at the person level, and are essential for accurately describing the semantics of videos. In this paper, we analogize such a task to scene graph generation, which we call video social relationship graph generation (VSRGG). It involves generating a social relationship graph for each video based on person-level relationships. We propose a context-aware graph neural network (CAGNet) for VSRGG, which effectively generates social relationship graphs through message passing, capturing the context of the video. Specifically, CAGNet detects persons in the video, generates an initial graph via relationship proposal, and extracts facial and body features to describe the detected individuals, as well as temporal features to describe their interactions. Then, CAGNet predicts pairwise relationships between individuals using graph message passing. Additionally, we construct a new dataset, VidSoR, to evaluate VSRGG, which contains 72 h of video with 6276 person instances and 5313 relationship instances of eight relationship types. Extensive experiments show that CAGNet can make accurate predictions with a comparatively high mean recall (mRecall) using only visual features.
Keywords: video analysis; social relationship detection; scene graph generation; message passing
Learning group interaction for sports video understanding from a perspective of athlete
5. Authors: Rui He, Zehua Fu, Qingjie Liu, Yunhong Wang, Xunxun Chen. Frontiers of Computer Science (SCIE, EI, CSCD), 2024, Issue 4, pp. 175-188 (14 pages).
Learning the interactions between small groups is a key step in understanding team sports videos. Recent research on team sports videos has largely taken the perspective of the audience rather than the athlete. For team sports videos such as volleyball and basketball videos, there are plenty of intra-team and inter-team relations. In this paper, a new task named Group Scene Graph Generation is introduced to better understand intra-team and inter-team relations in sports videos. To tackle this problem, a novel Hierarchical Relation Network is proposed. After all players in a video are divided into two teams, the features of the two teams' activities and interactions are enhanced by Graph Convolutional Networks and finally recognized to generate a Group Scene Graph. For evaluation, a Volleyball+ dataset is proposed, built on the Volleyball dataset with 9660 additional team activity labels. A baseline is set for comparison, and our experimental results demonstrate the effectiveness of our method. Moreover, the idea behind our method can be directly applied to another video-based task, Group Activity Recognition. Experiments show the advantage of our method and reveal the link between the two tasks. Finally, from the athlete's view, we present an interpretation that shows how to utilize the Group Scene Graph to analyze teams' activities and provide professional gaming suggestions.
Keywords: group scene graph; group activity recognition; scene graph generation; graph convolutional network; sports video understanding