Drug-drug interaction (DDI) refers to the interaction between two or more drugs in the body, altering their efficacy or pharmacokinetics. Fully considering and accurately predicting DDI has become an indispensable part of ensuring safe medication for patients. In recent years, many deep learning-based methods have been proposed to predict DDI. However, most existing computational models tend to oversimplify the fusion of drug structural and topological information, often relying on methods such as concatenation or weighted summation, which fail to adequately capture the potential complementarity between structural and topological features. This loss of information may lead to models that do not fully leverage these features, thus limiting their performance in DDI prediction. To address these challenges, we propose a relation-aware cross-adversarial network for predicting DDI, named RCAN-DDI, which combines a relation-aware structural feature learning module with a topological feature learning module based on DDI networks to capture multimodal drug features. To explore the correlations and complementarities among different information sources, a cross-adversarial network is introduced to fully integrate features from the various modalities, enhancing the model's predictive performance. Experimental results demonstrate that RCAN-DDI outperforms other methods; even when labelled DDIs are scarce, the method remains robust on the DDI prediction task. Furthermore, ablation experiments validate the effectiveness of the cross-adversarial module, demonstrating its superiority in learning complementary multimodal information.
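The cross-modal fusion idea can be illustrated with a toy sketch. This is not the authors' RCAN-DDI architecture: the dimensions, weights, and gating function below are invented for illustration, and the adversarial discriminator is replaced by a simple elementwise agreement gate between the two projected modalities.

```python
import numpy as np

rng = np.random.default_rng(0)

def cross_fuse(struct_feat, topo_feat, W_s, W_t):
    """Project each modality into a shared space and gate each one by
    its agreement with the other (a simplified stand-in for the
    cross-adversarial alignment described in the abstract)."""
    h_s = np.tanh(struct_feat @ W_s)           # structural view -> shared space
    h_t = np.tanh(topo_feat @ W_t)             # topological view -> shared space
    gate = 1.0 / (1.0 + np.exp(-(h_s * h_t)))  # elementwise agreement gate
    return np.concatenate([gate * h_s, (1 - gate) * h_t], axis=-1)

d_in, d_hid = 16, 8
W_s = rng.normal(size=(d_in, d_hid))
W_t = rng.normal(size=(d_in, d_hid))
drug_struct = rng.normal(size=(4, d_in))  # 4 drugs, structural features
drug_topo = rng.normal(size=(4, d_in))    # same drugs, DDI-network features
fused = cross_fuse(drug_struct, drug_topo, W_s, W_t)
print(fused.shape)  # (4, 16)
```

A pair of fused drug vectors would then feed a downstream classifier that scores each candidate interaction type.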
With the growing application of intelligent robots in the service, manufacturing, and medical fields, efficient and natural interaction between humans and robots has become key to improving collaboration efficiency and user experience. Gesture recognition, as an intuitive and contactless interaction method, can overcome the limitations of traditional interfaces and enable real-time control of, and feedback on, robot movements and behaviors. This study first reviews mainstream gesture recognition algorithms and their application on different sensing platforms (RGB cameras, depth cameras, and inertial measurement units). It then proposes a gesture recognition method based on multimodal feature fusion and a lightweight deep neural network that balances recognition accuracy with computational efficiency. At the system level, a modular human-robot interaction architecture is constructed, comprising perception, decision, and execution layers; gesture commands are transmitted and mapped to robot actions in real time via the ROS communication protocol. Comparative experiments on public gesture datasets and a self-collected dataset validate the proposed method's superiority in accuracy, response latency, and system robustness, while user-experience tests assess the interface's usability. The results provide a reliable technical foundation for robot collaboration and service in complex scenarios, offering broad prospects for practical application and deployment.
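The fuse-then-classify pipeline described above can be sketched as follows. This is a toy stand-in, not the paper's network: the per-modality feature dimensions, the weights, and the two-layer MLP are invented for illustration and far simpler than a real lightweight gesture model.

```python
import numpy as np

def fuse_and_classify(rgb_feat, depth_feat, imu_feat, W1, b1, W2, b2):
    """Concatenate per-modality feature vectors, then classify with a
    two-layer MLP (ReLU hidden layer, softmax output)."""
    x = np.concatenate([rgb_feat, depth_feat, imu_feat], axis=-1)
    h = np.maximum(x @ W1 + b1, 0.0)               # ReLU hidden layer
    logits = h @ W2 + b2
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)       # class probabilities

rng = np.random.default_rng(1)
n_classes, d_hid = 5, 32
# Hypothetical feature sizes: 64-D RGB, 64-D depth, 12-D IMU, batch of 2.
feats = [rng.normal(size=(2, d)) for d in (64, 64, 12)]
W1 = rng.normal(scale=0.1, size=(140, d_hid)); b1 = np.zeros(d_hid)
W2 = rng.normal(scale=0.1, size=(d_hid, n_classes)); b2 = np.zeros(n_classes)
probs = fuse_and_classify(*feats, W1, b1, W2, b2)
print(probs.shape)  # (2, 5)
```

In the described architecture, the predicted class index would then be published as a ROS message that the execution layer maps to a robot action.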
To fully exploit high-order interactions among features and improve the accuracy of click-through-rate (CTR) prediction, this paper proposes VBGA (vector-wise and bit-wise interaction model based on GNN and attention), a CTR prediction model built on graph neural networks and attention. Using a graph neural network and an attention mechanism, the model learns a fine-grained weight for each feature and feeds these fine-grained feature weights into a vector-wise interaction layer and a bit-wise interaction layer that jointly predict the click-through rate. VBGA consists mainly of these two layers. The vector-wise interaction layer uses a directed graph to construct vector-wise feature interactions, realizing explicit feature interactions without repetition; this reduces computation while enabling higher-order feature crosses for more accurate prediction. In addition, a cross network is proposed to construct bit-wise feature interactions. Compared with several state-of-the-art CTR prediction models on the Criteo and Avazu datasets, experimental results show that VBGA achieves strong predictive performance.
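The "explicit feature interactions without repetition" idea can be sketched minimally: traversing the edges of a directed acyclic graph over feature fields (i → j only for i < j) visits each unordered pair exactly once, so no i·j / j·i duplicate terms arise. The elementwise-product interaction, function name, and dimensions below are illustrative assumptions, not VBGA's actual layer.

```python
import numpy as np
from itertools import combinations

def vector_wise_interactions(field_embs):
    """Enumerate each unordered field pair exactly once (edges of a DAG
    over fields, i -> j for i < j) and take the elementwise product of
    their embeddings, avoiding duplicated symmetric terms."""
    return np.stack([field_embs[i] * field_embs[j]
                     for i, j in combinations(range(len(field_embs)), 2)])

rng = np.random.default_rng(2)
embs = rng.normal(size=(4, 8))           # 4 feature fields, dim-8 embeddings
inter = vector_wise_interactions(embs)
print(inter.shape)  # (6, 8): C(4,2) = 6 unique pairs
```

A bit-wise (element-level) cross network would instead mix individual embedding dimensions across fields rather than whole vectors.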
The rapid development of information technology and accelerated digitalization have led to an explosive growth of data across various fields. As a key technology for knowledge representation and sharing, knowledge graphs play a crucial role by constructing structured networks of relationships among entities. However, data sparsity and numerous unexplored implicit relations result in the widespread incompleteness of knowledge graphs. In static knowledge graph completion, most existing methods rely on linear operations or simple interaction mechanisms for triple encoding, making it difficult to fully capture the deep semantic associations between entities and relations. Moreover, many methods focus only on the local information of individual triples, ignoring the rich semantic dependencies embedded in the neighboring nodes of entities within the graph structure, which leads to incomplete embedding representations. To address these challenges, we propose Two-Stage Mixer Embedding (TSMixerE), a static knowledge graph completion method based on entity context. In the unit semantic extraction stage, TSMixerE leverages multi-scale circular convolution to capture local features at multiple granularities, enhancing the flexibility and robustness of feature interactions. A channel attention mechanism amplifies key channel responses to suppress noise and irrelevant information, thereby improving the discriminative power and semantic depth of feature representations. For contextual information fusion, a multi-layer self-attention mechanism enables deep interactions among contextual cues, effectively integrating local details with global context. Simultaneously, type embeddings clarify the semantic identities and roles of each component, enhancing the model's sensitivity and fusion capabilities for diverse information sources. Furthermore, TSMixerE constructs contextual unit sequences for entities, fully exploring neighborhood information within the graph structure to model complex semantic dependencies, thus improving the completeness and generalization of embedding representations.
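Circular convolution, the core operation of the unit semantic extraction stage, can be sketched in a few lines. This is a naive single-channel version with made-up signal and kernels; TSMixerE's actual multi-scale kernel sizes and channel layout are not specified here.

```python
import numpy as np

def circular_conv(x, kernel):
    """1-D circular convolution: indices wrap around the signal, so
    every position sees a full receptive field with no zero padding."""
    n, k = len(x), len(kernel)
    return np.array([sum(kernel[j] * x[(i + j) % n] for j in range(k))
                     for i in range(n)])

x = np.arange(6, dtype=float)            # a toy entity-relation feature vector
small = circular_conv(x, np.ones(2))     # fine-grained scale (kernel size 2)
large = circular_conv(x, np.ones(4))     # coarser scale (kernel size 4)
multi_scale = np.concatenate([small, large])
print(multi_scale.shape)  # (12,)
```

Note the wrap-around: with the size-2 kernel, the last output position sums x[5] and x[0], which ordinary zero-padded convolution would miss.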
Reconstructing 3-D interacting meshes of two hands from a single RGB image is a highly challenging task. Mutual occlusion between the hands and their high local appearance similarity make part of the feature extraction inaccurate, losing the interaction information between the hands and misaligning the reconstructed hand meshes with the input image. To address these problems, this paper first proposes a feature interaction-adaptation module with two parts: the feature-interaction part generates two new feature representations while preserving the separated left- and right-hand features, and captures the two hands' interaction features through an interaction attention module; the feature-adaptation part then adapts these interaction features to each hand via the interaction attention module, injecting global context into the left- and right-hand features. Second, a three-layer graph-convolution refinement network is introduced to accurately regress the mesh vertices of both hands, and an attention-based feature alignment module strengthens the alignment between vertex features and image features, and hence between the reconstructed hand meshes and the input image. A new multi-layer perceptron structure is also proposed that learns multi-scale feature information through down-sampling and up-sampling operations. Finally, a relative-offset loss is designed to constrain the spatial relationship between the two hands. Quantitative and qualitative experiments on the InterHand2.6M dataset show that the proposed method significantly improves performance over existing state-of-the-art methods, reducing the Mean Per Joint Position Error (MPJPE) and Mean Per Vertex Position Error (MPVPE) to 7.19 mm and 7.33 mm, respectively. Generalization experiments on the RGB2Hands and EgoHands datasets further show, qualitatively, that the method generalizes well to hand mesh reconstruction against different environmental backgrounds.
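The reported metrics are straightforward to compute: MPJPE is the mean Euclidean distance between predicted and ground-truth 3-D joints, and MPVPE is the same formula applied to mesh vertices. A minimal sketch (the joint count and the offset below are invented for illustration):

```python
import numpy as np

def mpjpe(pred, gt):
    """Mean Per Joint Position Error: average Euclidean distance between
    predicted and ground-truth 3-D joints. Applied to mesh vertices
    instead of joints, the same formula gives MPVPE."""
    return float(np.linalg.norm(pred - gt, axis=-1).mean())

gt = np.zeros((21, 3))                 # 21 hand joints, all at the origin
pred = gt + np.array([3.0, 4.0, 0.0])  # every joint offset by a 3-4-0 mm shift
print(mpjpe(pred, gt))  # 5.0
```

In practice the error is usually reported in millimetres after root-joint alignment of the prediction to the ground truth.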
Funding: Supported by the Natural Science Foundation of Shandong Province (Grant No. ZR2023MF053) and the National Natural Science Foundation of China (Grant No. 61902430).
Funding: Supported by the National Natural Science Foundation of China (No. 62267005), the Guangxi Natural Science Foundation (No. 2023GXNSFAA026493), the Guangxi Collaborative Innovation Center of Multi-Source Information Integration and Intelligent Processing, and the Guangxi Academy of Artificial Intelligence.