Journal Article

MKGViLT: visual-and-language transformer based on medical knowledge graph embedding

Abstract: Medical visual question answering (MedVQA) aims to enhance diagnostic confidence and deepen patients' understanding of their health conditions. While the Transformer architecture is widely used in multimodal fields, its application in MedVQA requires further enhancement. A critical limitation of contemporary MedVQA systems lies in their inability to integrate lifelong knowledge with specific patient data to generate human-like responses. Existing Transformer-based MedVQA models need stronger capabilities for interpreting answers through the application of medical image knowledge. The medical knowledge graph visual-and-language transformer (MKGViLT), designed around joint medical knowledge graphs (KGs), addresses this challenge. MKGViLT incorporates an enhanced Transformer structure to effectively extract features and combine modalities for MedVQA tasks. By drawing on richer background knowledge, MKGViLT delivers better-grounded answers and thereby improves performance. The efficacy of MKGViLT is evaluated on the SLAKE and P-VQA datasets; experimental results show that MKGViLT surpasses the most advanced methods on the SLAKE dataset.
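The abstract describes fusing image features, question tokens, and knowledge-graph entity embeddings in a joint Transformer. The paper's actual architecture is not detailed here; the following is only a minimal illustrative sketch of that general fusion pattern, with all dimensions, the single-head attention, and the three toy feature matrices chosen as assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16  # shared feature dimension (illustrative choice)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens, Wq, Wk, Wv):
    # Single-head scaled dot-product self-attention over the joint sequence,
    # so every token can attend to tokens from all three modalities.
    Q, K, V = tokens @ Wq, tokens @ Wk, tokens @ Wv
    weights = softmax(Q @ K.T / np.sqrt(d))
    return weights @ V

# Toy stand-ins for the three input streams (shapes are assumptions):
img_feats = rng.standard_normal((9, d))  # image patch embeddings
txt_feats = rng.standard_normal((5, d))  # question token embeddings
kg_feats = rng.standard_normal((3, d))   # KG entity embeddings

# Concatenate the modalities into one sequence, then attend across it.
tokens = np.vstack([img_feats, txt_feats, kg_feats])
Wq, Wk, Wv = (rng.standard_normal((d, d)) * 0.1 for _ in range(3))
fused = self_attention(tokens, Wq, Wk, Wv)
print(fused.shape)  # one fused vector per input token: (17, 16)
```

Stacking KG entity embeddings into the token sequence lets the attention mechanism mix background knowledge into every image and question token, which is the kind of knowledge-conditioned fusion the abstract attributes to MKGViLT.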
Authors: CUI Wencheng (崔文成), SHI Wentao, SHAO Hong
Source: 《High Technology Letters》 (English edition), 2025, Issue 1, pp. 73-85 (13 pages)
Funding: Supported by the National Natural Science Foundation of China (No. 62001313), the Liaoning Professional Talent Project (No. XLYC2203046), and the Shenyang Municipal Medical Engineering Cross Research Foundation of China (No. 22-321-32-09).