1,722 articles found
Probabilistic Graphical Model-Based Operational Reliability-Centric Design of Offshore Wind Farm Feeder Layouts
1
Authors: Qiuyu Lu, Yunqi Yan, Yang Liu, Ying Chen, Yinguo Yang, Tannan Xiao, Guobing Wu. Energy Engineering, 2025, No. 12, pp. 4799-4814 (16 pages).
The rapid expansion of offshore wind energy necessitates robust and cost-effective electrical collector system (ECS) designs that prioritize lifetime operational reliability. Traditional optimization approaches often simplify reliability considerations or fail to holistically integrate them with economic and technical constraints. This paper introduces a novel, two-stage optimization framework for offshore wind farm (OWF) ECS planning that systematically incorporates reliability. The first stage employs Mixed-Integer Linear Programming (MILP) to determine an optimal radial network topology, considering linearized reliability approximations and geographical constraints. The second stage enhances this design by strategically placing tie-lines using a Mixed-Integer Quadratically Constrained Program (MIQCP). This stage leverages a dynamic-aware adaptation of Multi-Source Multi-Terminal Network Reliability (MSMT-NR) assessment, with its inherent nonlinear equations successfully transformed into a solvable MIQCP form for loopy networks. A benchmark case study demonstrates the framework's efficacy, illustrating how increasing the emphasis on reliability leads to more distributed and interconnected network topologies, effectively balancing investment costs against enhanced system resilience.
Keywords: offshore wind farm; feeder layout optimization; network reliability; nonlinear optimization; probabilistic graphical model
Systemic Risk of Conventional and Islamic Banks: Comparison with Graphical Network Models
2
Authors: Shatha Qamhieh Hashem, Paolo Giudici. Applied Mathematics, 2016, No. 17, pp. 2079-2096 (19 pages).
The main aim of this paper is to compare the stability, in terms of systemic risk, of conventional and Islamic banking systems. To this aim, we propose correlation network models for stock market returns based on graphical Gaussian distributions, which allow us to capture the contagion effects that move across countries. We also consider Bayesian graphical models, to account for model uncertainty in the measurement of financial systems interconnectedness. Our proposed model is applied to the Middle East and North Africa (MENA) region banking sector, characterized by the presence of both conventional and Islamic banks, for the period from 2007 to the beginning of 2014. Our empirical findings show that there are differences in the systemic risk and stability of the two banking systems during crisis times. In addition, the differences are subject to country-specific effects that are amplified during crisis periods.
Keywords: Financial Stability; Centrality Measures; Graphical Gaussian Models; Islamic Banks; Conventional Banks; Systemic Risk
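In a graphical Gaussian model such as the one named in the abstract above, two variables are joined by an edge only when their partial correlation, given the remaining variables, is non-zero. The following is a minimal sketch in plain Python for the three-variable case; the function name `partial_corr` and the correlation values are illustrative assumptions, not figures from the paper.

```python
import math

def partial_corr(r_xy: float, r_xz: float, r_yz: float) -> float:
    """Partial correlation of x and y given z, from pairwise correlations."""
    denom = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    return (r_xy - r_xz * r_yz) / denom

# Hypothetical pairwise correlations between three return series a, b, c.
r_ab, r_ac, r_bc = 0.48, 0.8, 0.6

# a and b are correlated marginally, but the association disappears once c
# is conditioned on, so a graphical Gaussian model would omit the a-b edge.
rho_ab_given_c = partial_corr(r_ab, r_ac, r_bc)
```

With more than three variables the same quantities are usually read off the inverse covariance (precision) matrix rather than computed pair by pair.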
A graphical model for haloanhydrite components and P-wave velocity: A case study of haloanhydrites in Amu Darya Basin (cited by: 2)
3
Authors: Guo Tong-Cui, Wang Hong-Jun, Mu Long-Xin, Zhang Xing-Yang, Ma Zhi, Tian Yu, Li Hao-Chen. Applied Geophysics (SCIE, CSCD), 2016, No. 3, pp. 459-468, 579 (11 pages).
Wave velocities in haloanhydrites are difficult to determine and significantly depend on the mineralogy. We used petrophysical parameters to study the wave velocity in haloanhydrites in the Amu Darya Basin and constructed a template of the relation between haloanhydrite mineralogy (anhydrite, salt, mudstone, and pore water) and wave velocities. We used the relation between the P-wave moduli ratio and porosity as a constraint and constructed a graphical model (petrophysical template) for the relation between wave velocity, mineral content and porosity. We tested the graphical model using rock core and well logging data.
Keywords: salt; anhydrite; graphical model; P-wave velocity; Amu Darya Basin
Discrete-time dynamic graphical games: model-free reinforcement learning solution (cited by: 7)
4
Authors: Mohammed I. ABOUHEAF, Frank L. LEWIS, Magdi S. MAHMOUD, Dariusz G. MIKULSKI. Control Theory and Technology (EI, CSCD), 2015, No. 1, pp. 55-69 (15 pages).
This paper introduces a model-free reinforcement learning technique that is used to solve a class of dynamic games known as dynamic graphical games. The graphical game results from multi-agent dynamical systems, where pinning control is used to make all the agents synchronize to the state of a command generator or a leader agent. Novel coupled Bellman equations and Hamiltonian functions are developed for the dynamic graphical games. The Hamiltonian mechanics are used to derive the necessary conditions for optimality. The solution for the dynamic graphical game is given in terms of the solution to a set of coupled Hamilton-Jacobi-Bellman equations developed herein. The Nash equilibrium solution for the graphical game is given in terms of the solution to the underlying coupled Hamilton-Jacobi-Bellman equations. An online model-free policy iteration algorithm is developed to learn the Nash solution for the dynamic graphical game. This algorithm does not require any knowledge of the agents' dynamics. A proof of convergence for this multi-agent learning algorithm is given under a mild assumption about the inter-connectivity properties of the graph. A gradient descent technique with critic network structures is used to implement the policy iteration algorithm to solve the graphical game online in real time.
Keywords: dynamic graphical games; Nash equilibrium; discrete mechanics; optimal control; model-free reinforcement learning; policy iteration
Integrating models of real aboveground scene and underground geological structures at an open pit mine
5
Authors: Biao DONG, Wenjun TAN, Weichao CHANG, Baoting LI, Yanliang GUO, Quanxing HU, Guangwei LIU. 虚拟现实与智能硬件(中英文) [Virtual Reality & Intelligent Hardware], 2025, No. 4, pp. 406-420 (15 pages).
Background: As information technology has advanced and been popularized, open pit mining has rapidly developed toward integration and digitization. Three-dimensional reconstruction technology has been successfully applied to geological reconstruction and modeling of surface scenes in open pit mines. However, an integrated modeling method for surface and underground mine sites has not been reported. Methods: In this study, we propose an integrated modeling method for open pit mines that fuses a real scene on the surface with an underground geological model. Based on oblique photography, a real-scene model was established on the surface. Based on the proposed surface-stitching method, the upper and lower surfaces and sides of the model were constructed in stages to build a complete underground three-dimensional geological model, and the aboveground and underground models were registered together to build an integrated open pit mine model. Results: The oblique photography method reconstructed a surface model of an open pit mine from a real scene. The proposed surface-stitching algorithm was compared with the ball-pivoting and Poisson algorithms, and the integrity of the reconstructed model was markedly superior to that of the other two reconstruction methods. In addition, the surface-stitching algorithm was applied to the reconstruction of different formation models and showed good stability and reconstruction efficiency. Finally, the aboveground and underground models were accurately fitted after registration to form an integrated model. Conclusions: The proposed method can efficiently establish an integrated open pit model. Based on the integrated model, an open pit auxiliary planning system was designed and realized. It supports mining planning and output calculation, assists users in mining planning and operation management, and improves production efficiency and management levels.
Keywords: three-dimensional reconstruction; computer graphics; visualization; integrated model; open pit mine
Graphical model construction based on evolutionary algorithms
6
Authors: Youlong YANG, Yan WU, Sanyang LIU. 控制理论与应用(英文版) [Control Theory and Applications (English Edition)] (EI), 2006, No. 4, pp. 349-354 (6 pages).
Using Bayesian networks to model promising solutions from the current population of an evolutionary algorithm can ensure an efficient and intelligent search for the optimum. However, constructing a Bayesian network that fits a given dataset is an NP-hard problem, and it also consumes substantial computational resources. This paper develops a methodology for constructing a graphical model based on the Bayesian Dirichlet metric. Our approach is derived from a set of propositions and theorems obtained by studying the local metric relationship of networks matching a dataset. This paper presents an algorithm that uses this approach to construct a tree model from a set of potential solutions. The method is important not only for evolutionary algorithms based on graphical models, but also for machine learning and data mining. The experimental results show that the exact theoretical results and the approximations match very well.
Keywords: graphical model; evolutionary algorithms; Bayesian network; tree models; Bayesian Dirichlet metric
Graphical representations and worm algorithms for the O(N) spin model
7
Authors: Longxiang Liu, Lei Zhang, Xiaojun Tan, Youjin Deng. Communications in Theoretical Physics (SCIE, CAS, CSCD), 2023, No. 11, pp. 152-161 (10 pages).
We present a family of graphical representations for the O(N) spin model, where N≥1 represents the spin dimension, and N=1, 2, 3 corresponds to the Ising, XY and Heisenberg models, respectively. With an integer parameter 0≤ℓ≤N/2, each configuration is the coupling of ℓ copies of subgraphs consisting of directed flows and N−2ℓ copies of subgraphs constructed by undirected loops, which we call the XY and Ising subgraphs, respectively. On each lattice site, the XY subgraphs satisfy the Kirchhoff flow-conservation law and the Ising subgraphs obey the Eulerian bond condition. Then, we formulate worm-type algorithms and simulate the O(N) model on the simple-cubic lattice for N from 2 to 6 at all possible ℓ. It is observed that the worm algorithm has much higher efficiency than the Metropolis method, and, for a given N, the efficiency is an increasing function of ℓ. Besides Monte Carlo simulations, we expect that these graphical representations would provide a convenient basis for the study of the O(N) spin model by other state-of-the-art methods like the tensor network renormalization.
Keywords: Markov-chain Monte Carlo algorithms; continuous spin models; graphical representations
VIDEO MULTI-TARGET TRACKING BASED ON PROBABILISTIC GRAPHICAL MODEL
8
Authors: Xu Feng, Huang Chenrong, Wu Zhengjun, Xu Lizhong. Journal of Electronics (China), 2011, No. 4, pp. 548-557 (10 pages).
In video multi-target tracking, the common particle filter cannot deal well with uncertain relations among multiple targets. To solve this problem, many researchers use data association methods to reduce the multi-target uncertainty. However, traditional data association methods have difficulty tracking accurately when the target is occluded. To handle occlusion in the video, this paper combines the theory of data association with a probabilistic graphical model for multi-target modeling and analysis of the relationships among targets in the particle filter framework. Experimental results show that the proposed algorithm solves the occlusion problem better than the traditional algorithm.
Keywords: video tracking; multi-target tracking; data association; probabilistic graphical model; particle filter
Multi-Label Classification Model Using Graph Convolutional Neural Network for Social Network Nodes
9
Authors: Junmin Lyu, Guangyu Xu, Feng Bao, Yu Zhou, Yuxin Liu, Siyu Lu, Wenfeng Zheng. Computer Modeling in Engineering & Sciences, 2026, No. 2, pp. 1235-1256 (22 pages).
Graph neural networks (GNN) have shown strong performance in node classification tasks, yet most existing models rely on uniform or shared weight aggregation, lacking flexibility in modeling the varying strength of relationships among nodes. This paper proposes a novel graph coupling convolutional model that introduces an adaptive weighting mechanism to assign distinct importance to neighboring nodes based on their similarity to the central node. Unlike traditional methods, the proposed coupling strategy enhances the interpretability of node interactions while maintaining competitive classification performance. The model operates in the spatial domain, utilizing adjacency list structures for efficient convolution and addressing the limitations of weight sharing through a coupling-based similarity computation. Extensive experiments are conducted on five graph-structured datasets, including Cora, Citeseer, PubMed, Reddit, and BlogCatalog, as well as a custom topology dataset constructed from the Open University Learning Analytics Dataset (OULAD) educational platform. Results demonstrate that the proposed model achieves good classification accuracy, while significantly reducing training time through direct second-order neighbor fusion and data preprocessing. Moreover, analysis of neighborhood order reveals that considering third-order neighbors offers limited accuracy gains but introduces considerable computational overhead, confirming the efficiency of first- and second-order convolution in practical applications. Overall, the proposed graph coupling model offers a lightweight, interpretable, and effective framework for multi-label node classification in complex networks.
Keywords: GNN; social networks nodes; multi-label classification model; graphic convolution neural network; coupling principle
Graphical Simulator for Chinese Ink-Wash Drawing (cited by: 4)
10
Authors: Wang Xiujin (王秀锦), Jiao Jingshan (焦景山), Sun Jizhou (孙济洲). Transactions of Tianjin University (EI, CAS), 2002, No. 1, pp. 1-7 (7 pages).
Simulating traditional painting with computer graphics is a challenging and attractive subject. Drawing on experience in ink wash drawing, this paper expounds the artistic character of ink wash painting and analyzes in particular the characteristics of the materials used and the relationships between them. A simulation model is presented and some typical visual effects of ink wash painting are realized.
Keywords: Chinese ink wash drawing; computer art; simulation model; graphics
Optimization of a precise integration method for seismic modeling based on graphic processing unit (cited by: 2)
11
Authors: Jingyu Li, Genyang Tang, Tianyue Hu. Earthquake Science (CSCD), 2010, No. 4, pp. 387-393 (7 pages).
General-purpose graphic processing unit (GPU) computing is increasingly used in various fields. Its single-instruction, multiple-thread mode suits seismic numerical simulation, which involves a huge quantity of data and many calculation steps. In this study, we introduce a GPU-based parallel implementation of the precise integration method (PIM) for seismic forward modeling. Compared with single-core CPU calculation, GPU parallel calculation preserves the features of PIM (small bandwidth, high accuracy and the capability of modeling complex substructures) while bringing high computational efficiency, which means that high-performance GPU parallel calculation can make seismic forward modeling closer to real seismic records.
Keywords: precise integration method; seismic modeling; general purpose GPU; graphic processing unit
Recent advances in parametric neuroreceptor mapping with dynamic PET: basic concepts and graphical analyses (cited by: 1)
12
Authors: Seongho Seo, Su Jin Kim, Dong Soo Lee, Jae Sung Lee. Neuroscience Bulletin (SCIE, CAS, CSCD), 2014, No. 5, pp. 733-754 (22 pages).
Tracer kinetic modeling in dynamic positron emission tomography (PET) has been widely used to investigate the characteristic distribution patterns or dysfunctions of neuroreceptors in brain diseases. Its practical goal has progressed from regional data quantification to parametric mapping that produces images of kinetic-model parameters by fully exploiting the spatiotemporal information in dynamic PET data. Graphical analysis (GA) is a major parametric mapping technique that is independent of any compartmental model configuration, robust to noise, and computationally efficient. In this paper, we provide an overview of recent advances in the parametric mapping of neuroreceptor binding based on GA methods. The associated basic concepts in tracer kinetic modeling are presented, including commonly used compartment models and major parameters of interest. Technical details of GA approaches for reversible and irreversible radioligands are described, considering both plasma input and reference tissue input models. Their statistical properties are discussed in view of parametric imaging.
Keywords: dynamic positron emission tomography; graphical analysis; neuroreceptor imaging; parametric image; tracer kinetic modeling
Naxi-English Bilingual Word Alignment Based on Language Characteristics and Log-Linear Model
13
Authors: Yu Zhengtao, Xian Yantuan, Tian Wei, Guo Jianyi, Zhang Tao. China Communications (SCIE, CSCD), 2012, No. 3, pp. 78-86 (9 pages).
We propose a method for automatic Naxi-English bilingual word alignment based on a log-linear model. The method defines distinct Naxi-English structural feature functions, namely an English-Naxi interval switching function and a Naxi-English bilingual word position transformation function. With a manually labeled Naxi-English word alignment corpus, the parameters of the model are trained using the minimum error, so that Naxi-English bilingual word alignment is achieved automatically. Experiments are conducted with IBM Model 3 as a benchmark, and the Naxi language constraints are introduced. The final experimental results show that the proposed alignment method achieves very good results: introducing the language characteristic functions effectively improves the accuracy of Naxi-English bilingual word alignment.
Keywords: word alignment; Naxi language; English; log-linear model; interval switching function; position transformation function
Enhancing Feature Discretization in Alarm and Fire Detection Systems Using Probabilistic Inference Models
14
Author: Joe Essien. Journal of Computer and Communications, 2023, No. 7, pp. 140-155 (16 pages).
Sensors for fire alarms require a high level of predictive variables to ensure accurate detection, injury prevention, and loss prevention. Bayesian networks can aid in enhancing early fire detection capabilities and reducing the frequency of erroneous fire alerts, thereby enhancing the effectiveness of numerous safety monitoring systems. This research explores the development of optimized probabilistic graphic models for the discretization thresholds of alarm system predictor variables. The study presents a statistical model framework that increases the efficacy of fire detection by predicting the discretization thresholds of alarm system predictor variable fluctuations used to detect the onset of fire. The work applies Bayesian networks and probabilistic visual models to reveal the specific characteristics required to cope with fire detection strategies and patterns. The adopted methodology utilizes a combination of prior knowledge and statistical data to draw conclusions from observations. Utilizing domain knowledge to compute conditional dependencies between network variables enabled predictions to be made through the application of specialized analytical and simulation techniques.
Keywords: Neural Network; Discretization; Alarm Systems; Graphical Models; Machine Learning
Visual-Graphical Methods for Exploring Psychological Longitudinal Data
15
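Author: Hsiang-wei Ker. Psychology Research, 2012, No. 9, pp. 545-561 (17 pages).
Keywords: longitudinal data; graphical techniques; vision; psychology; change over time; heteroscedasticity; margin of error; cross-sectional data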
Reproducible Learning of Gaussian Graphical Models via Graphical Lasso Multiple Data Splitting
16
Authors: Kang Hu, Danning Li, Binghui Liu. Acta Mathematica Sinica, English Series, 2025, No. 2, pp. 553-568 (16 pages).
Gaussian graphical models (GGMs) are widely used as intuitive and efficient tools for data analysis in several application domains. To address the reproducibility issue of structure learning of a GGM, it is essential to control the false discovery rate (FDR) of the estimated edge set of the graph in terms of the graphical model. Hence, in recent years, the problem of GGM estimation with FDR control has received more and more attention. In this paper, we propose a new GGM estimation method by implementing multiple data splitting. Instead of using node-by-node regressions to estimate each row of the precision matrix, we suggest directly estimating the entire precision matrix using the graphical Lasso in the multiple data splitting, and our calculation speed is p times faster than that of the previous approach. We show that the proposed method can asymptotically control the FDR and has significant advantages in computational efficiency. Finally, we demonstrate the usefulness of the proposed method through a real data analysis.
Keywords: false discovery rate; Gaussian graphical model; multiple data splitting; graphical Lasso
Research on Graphics Visualization Engine Technology for Large-Scale Finite Element Models
17
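Authors: Wang Xiaohui (王晓辉), Xu Xiangyan (许向彦), Nie Xiaohua (聂小华), Chang Liang (常亮). 计算机应用与软件 [Computer Applications and Software] (PKU Core), 2026, No. 1, pp. 17-24, 41 (9 pages).
To address the weak visual interactivity of large-scale finite element models in refined simulation analysis of complex structures, this paper proposes key technologies and a software design for efficient model data management and display. Based on a lightweight finite element model data structure, an efficient model data management engine is implemented. A minimum-node associated face table method effectively culls the interior element faces of the mesh model, reducing the rendering load. A ray-picking algorithm based on a BVH structure, together with the Qt communication mechanism, enables graphical interaction with three-dimensional models. A high-performance visualization engine, SABRE.Visual, is developed using a three-layer software architecture. Comparative tests against other software show that the engine fully supports display and interactive operation of finite element models with tens of millions of elements/nodes, and that it has certain advantages in model display efficiency and applicability to large-scale problems.
Keywords: ten-million-element scale; finite element model visualization; data management engine; three-dimensional graphics rendering; SABRE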
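Interior-face culling of the kind mentioned in the abstract above rests on a simple observation: a face shared by two solid elements lies inside the mesh and never needs to be drawn. The following is a minimal sketch in Python, assuming tetrahedral elements given as tuples of node indices. It uses a generic hash-count approach, not the paper's minimum-node associated face table method, and the names `boundary_faces` and `tets` are illustrative.

```python
from collections import Counter
from itertools import combinations

def boundary_faces(tets):
    """Return the triangular faces that belong to exactly one tetrahedron."""
    counts = Counter()
    for tet in tets:
        # A tetrahedron has four faces: every 3-node subset of its 4 nodes.
        # Sorting the nodes gives each face a canonical, hashable key.
        for face in combinations(sorted(tet), 3):
            counts[face] += 1
    # A face counted twice is shared by two elements, hence interior.
    return sorted(face for face, n in counts.items() if n == 1)

# Two tetrahedra glued along the face (1, 2, 3).
tets = [(0, 1, 2, 3), (1, 2, 3, 4)]
faces = boundary_faces(tets)
```

Sorting the node indices discards the winding order, so a renderer that needs outward normals would have to store the original orientation alongside each key.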
Opportunities and Challenges for Architectural Graphics Teaching amid the Development of Digital Information Technology
18
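Authors: Zhou Xuefan (周雪帆), Gao Yuchen (高宇辰). 华中建筑 [Huazhong Architecture], 2026, No. 4, pp. 149-152 (4 pages).
Architectural Graphics and Descriptive Geometry is a foundational course for first-year students in architecture-related majors. It aims to develop students' ability to convert between three-dimensional space and two-dimensional drawings, to help them master the reading and drafting of architectural engineering drawings, and to help them understand the principles behind architectural renderings. However, with the development of digital information technology, some of the course content has fallen out of step with the times, while teaching hours for graphics courses have been cut substantially. In response, the course has been reformed in three respects (teaching content, teaching mode, and teaching methods), in the hope that these adjustments will improve teaching efficiency and effectiveness.
Keywords: architectural graphics; descriptive geometry; reduction of class hours; digital informatization; digital model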
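Core Symptoms of the Depression-Sleep Symptom Network in Bipolar Disorder Based on Gaussian Graphical Models and Directed Acyclic Graphs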
19
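Authors: Lan Weiwei (蓝卫卫), Li Dongmei (黎冬梅), Long Jianxiong (龙建雄), Lan Lan (蓝兰), Gong Zukang (龚祖康), Su Li (苏莉). 新医学 [New Medicine], 2026, No. 3, pp. 261-268 (8 pages).
Objective: To construct a network of depressive and sleep symptoms in patients with bipolar disorder and to explore core symptoms and potential causal relationships. Methods: A total of 212 inpatients with bipolar disorder hospitalized at the Fifth People's Hospital of Nanning from January 2022 to December 2024 were enrolled. Depressive symptoms were assessed with the depression dimension of the Symptom Checklist-90 (SCL-90), and sleep was assessed with the Pittsburgh Sleep Quality Index (PSQI). A Gaussian graphical model was used to analyze the correlations between depressive and sleep symptoms, and a directed acyclic graph was constructed to analyze potential causal relationships. Results: The SCL-90 depression factor score of the 212 patients was 21.50 (14.00, 38.75) points, and the PSQI score was 9.00 (5.00, 14.00) points. Partial correlation network analysis showed that depressive symptoms were positively correlated with sleep symptoms. Centrality analysis identified three core symptoms: feeling blue (S30), worrying too much (S31), and feeling worthless (S79). The directed acyclic graph showed that depressive symptoms potentially influence sleep symptoms, with feeling worthless (S79) acting as a key mediating node that exerts direct or indirect effects on other symptoms. Conclusion: Feeling worthless (S79) occupies a central position in the symptom network and readily triggers other symptoms. Clinical treatment and intervention should focus on this symptom to reduce its negative impact on patients' overall health.
Keywords: bipolar disorder; sleep symptoms; depressive symptoms; Gaussian graphical model; directed acyclic graph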
Bootstrapping Large Language Models with Outside Knowledge for Knowledge-based Visual Question Answering
20
Authors: Yanze Min, Yawei Sun, Yin Zhu, Jun Zhu, Bo Zhang. Machine Intelligence Research, 2026, No. 1, pp. 115-132 (18 pages).
Knowledge-based visual question answering (KB-VQA), requiring external world knowledge beyond the image for reasoning, is more challenging than traditional visual question answering. Recent works have demonstrated the effectiveness of using a large (vision) language model as an implicit knowledge source to acquire the necessary information. However, the knowledge stored in large models (LMs) is often coarse-grained and inaccurate, causing questions requiring finer-grained information to be answered incorrectly. In this work, we propose a variational expectation-maximization (EM) framework that bootstraps the VQA performance of LMs with their own answers. In contrast to former VQA pipelines, we treat the outside knowledge as a latent variable. In the E-step, we approximate the posterior with two components: first, a rough answer, e.g., a general description of the image, which is usually the strength of LMs, and second, a multi-modal neural retriever to retrieve question-specific knowledge from an external knowledge base. In the M-step, the training objective optimizes the ability of the original LMs to generate rough answers as well as refined answers based on the retrieved information. Extensive experiments show that our proposed framework, BootLM, has a strong retrieval ability and achieves state-of-the-art performance on knowledge-based VQA tasks.
Keywords: multi-modal large language models; visual question answering (VQA); knowledge retrieval; graphical models; machine learning