Abstract
Cross-media association mining based on heterogeneous information networks has become a new research focus. Typically, the non-linear visual information and noisy, non-standard text in web videos make cross-modal associations extremely sparse. Existing methods usually enhance inter-media associations by embedding multiple semantic paths. However, these approaches overlook the associations between nodes within the local subgraph structures along those paths, so a node's subgraph neighborhood information is omitted and its embedding fails to capture associations with neighboring nodes, ultimately degrading web video event mining. To address this issue, this paper proposes a cross-media semantic association enhancement method based on subgraph neighborhood learning. Specifically, the method decomposes the heterogeneous graph into subgraphs of different types, captures the associations of neighboring nodes within each subgraph, and obtains association-enriched final node embeddings. First, the primary features of nodes from different modalities are projected into a shared latent space via type-specific linear transformations, and the heterogeneous graph is decomposed into homogeneous and heterogeneous subgraphs to obtain each node's meta-path-based homogeneous neighbors and first-order heterogeneous neighbors. Then, tailored attention mechanisms are applied independently to the homogeneous and heterogeneous subgraphs to embed neighboring nodes and capture the neighborhood information within each subgraph. Finally, graph-level attention aggregates the interactions and semantic information across the homogeneous and heterogeneous subgraphs to produce the final node embeddings, which are evaluated on the downstream task of web video event mining. Experiments on 10 real-world datasets demonstrate that the proposed method is highly reliable and outperforms existing methods.
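The three-stage pipeline the abstract describes (type-specific projection into a shared space, per-subgraph neighbor attention, graph-level fusion across subgraphs) can be sketched roughly as follows. This is a minimal illustrative sketch only, not the paper's implementation: the function names, the additive attention form, the random parameters, and the toy video/text node types are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def project(features_by_type, weights_by_type):
    """Step 1: map each node type's raw features into a shared d-dim space."""
    return {t: f @ weights_by_type[t] for t, f in features_by_type.items()}

def subgraph_attention(h_node, h_neighbors, a):
    """Step 2: aggregate a node's neighbors inside one subgraph with an
    additive-style attention whose scores also depend on the target node."""
    n = len(h_neighbors)
    pairs = np.concatenate(
        [np.repeat(h_node[None, :], n, axis=0), h_neighbors], axis=1
    )
    alpha = softmax(np.tanh(pairs) @ a)          # one weight per neighbor
    return (alpha[:, None] * h_neighbors).sum(axis=0)

def graph_level_fusion(subgraph_embeddings, q):
    """Step 3: fuse a node's per-subgraph embeddings with a graph-level
    attention vector q, weighting each subgraph by its importance."""
    Z = np.stack(subgraph_embeddings)            # (num_subgraphs, d)
    beta = softmax(Z @ q)                        # importance of each subgraph
    return (beta[:, None] * Z).sum(axis=0)

# Tiny demo: one video node, a meta-path-based (homogeneous) subgraph of
# other video nodes, and a first-order heterogeneous subgraph of text nodes.
d = 8
feats = {"video": rng.normal(size=(3, 5)), "text": rng.normal(size=(2, 6))}
W = {"video": rng.normal(size=(5, d)), "text": rng.normal(size=(6, d))}
H = project(feats, W)
h0 = H["video"][0]
z_homo = subgraph_attention(h0, H["video"][1:], rng.normal(size=2 * d))
z_het = subgraph_attention(h0, H["text"], rng.normal(size=2 * d))
z_final = graph_level_fusion([z_homo, z_het], rng.normal(size=d))  # (d,)
```

In the paper the fused embedding `z_final` would then feed a downstream event-mining objective; here it is just a vector in the shared space.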
Authors
张承德
周璇
ZHANG Cheng-De; ZHOU Xuan (School of Information Engineering, Zhongnan University of Economics and Law, Wuhan 430073)
Source
《计算机学报》
Peking University Core Journal (北大核心)
2025, No. 5, pp. 1134-1150 (17 pages)
Chinese Journal of Computers
Funding
Supported by a General Project of the National Social Science Fund of China (22BXW081).
Keywords
cross-media
web video
event mining
subgraph
subgraph neighborhood learning