37,772 articles found
1. Advances in longitudinal studies of amnestic mild cognitive impairment and Alzheimer's disease based on multi-modal MRI techniques (cited by: 8)
Authors: Zhongjie Hu, Liyong Wu, Jianping Jia, Ying Han. Neuroscience Bulletin (SCIE, CAS, CSCD), 2014, No. 2, pp. 198-206 (9 pages).
Amnestic mild cognitive impairment (aMCI) is a prodromal stage of Alzheimer's disease (AD), and 75%-80% of aMCI patients eventually develop AD. Early identification of patients with aMCI or AD is therefore of great significance for prevention and intervention. Cross-sectional studies have shown that the hippocampus, posterior cingulate cortex, and corpus callosum are key areas in studies based on structural MRI (sMRI), functional MRI (fMRI), and diffusion tensor imaging (DTI), respectively. Recent longitudinal studies using each MRI modality have demonstrated that neuroimaging abnormalities generally involve the posterior brain regions at the very beginning and then gradually affect the anterior areas as aMCI progresses to AD. However, it is not known whether follow-up studies based on multi-modal neuroimaging techniques (e.g., sMRI, fMRI, and DTI) can help build effective MRI models that can be directly applied to the screening and diagnosis of aMCI and AD. Large-scale multi-center follow-up studies are therefore urgently needed, not only to build an MRI diagnostic model that can be used on a single person, but also to evaluate the variability and stability of the model in the general population. In this review, we present longitudinal studies using each MRI modality separately and then discuss future directions in this field.
Keywords: magnetic resonance imaging; amnestic mild cognitive impairment; Alzheimer's disease; multi-modality; longitudinal studies
2. SwinHCAD: A Robust Multi-Modality Segmentation Model for Brain Tumors Using Transformer and Channel-Wise Attention
Authors: Seyong Jin, Muhammad Fayaz, L. Minh Dang, Hyoung-Kyu Song, Hyeonjoon Moon. Computers, Materials & Continua, 2026, No. 1, pp. 511-533 (23 pages).
Brain tumors require precise segmentation for diagnosis and treatment planning because of their complex morphology and heterogeneous characteristics. While MRI-based automatic brain tumor segmentation reduces the burden on medical staff and provides quantitative information, existing methodologies and recent models still struggle to accurately capture and classify the fine boundaries and diverse morphologies of tumors. To address these challenges and maximize segmentation performance, this research introduces a novel SwinUNETR-based model that integrates a new decoder block, the Hierarchical Channel-wise Attention Decoder (HCAD), with a powerful SwinUNETR encoder. The HCAD decoder block uses hierarchical features and channel-specific attention mechanisms to further fuse information at different scales transmitted from the encoder and to preserve spatial details throughout the reconstruction phase. Rigorous evaluations on the recent BraTS GLI datasets demonstrate that the proposed SwinHCAD model achieved higher segmentation accuracy than baseline models on both the Dice score and HD95 metrics across all tumor subregions (WT, TC, and ET). Ablation studies verified the effectiveness of the proposed HCAD decoder block and clarified the rationale and contribution of the model design. The results of this study are expected to contribute greatly to the efficiency of clinical diagnosis and treatment planning by increasing the precision of automated brain tumor segmentation.
Keywords: attention mechanism; brain tumor segmentation; channel-wise attention decoder; deep learning; medical imaging; MRI; Transformer; U-Net
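Note on the Dice score named in the abstract above: it measures the overlap between a predicted mask and a reference mask as 2|A∩B| / (|A| + |B|). The following is a minimal Python sketch, assuming flat binary masks given as sequences of 0 and 1. The function name `dice_score` and the convention of returning 1.0 for two empty masks are illustrative assumptions, not taken from the SwinHCAD paper.

```python
def dice_score(pred, target):
    """Dice coefficient for two flat binary masks (sequences of 0/1).

    Hypothetical helper for illustration only; not the paper's evaluation code.
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    # Count voxels labelled 1 in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Assumed convention: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


if __name__ == "__main__":
    predicted = [1, 1, 0, 0]
    reference = [1, 0, 1, 0]
    print(dice_score(predicted, reference))
```

In practice the score would be computed per tumor subregion (WT, TC, ET) on 3D volumes; flattening each volume first reduces that case to the one sketched here.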
3. Developing a multi-modal MRI radiomics-based model to predict the long-term overall survival of patients with hypopharyngeal cancer receiving definitive radiotherapy
Authors: Xi-Wei Zhang, Dilinaer Wusiman, Ye Zhang, Xiao-Duo Yu, Su-Sheng Miao, Zhi Wang, Shao-Yan Liu, Zheng-Jiang Li, Ying Sun, Jun-Lin Yi, Chang-Ming An. World Journal of Otorhinolaryngology-Head and Neck Surgery, 2025, No. 3, pp. 440-448 (9 pages).
Objective: To develop a multimodal MRI radiomics-based model for predicting long-term overall survival in hypopharyngeal cancer patients undergoing definitive radiotherapy. Methods: We enrolled 207 hypopharyngeal cancer patients from two major cancer centers in China who underwent definitive radiotherapy and had 5-year overall survival outcomes. Pretreatment MRI images and clinical features were collected. Regions of interest (ROIs) for primary tumors and lymph node metastases (LNM) were delineated on T2 and contrast-enhanced T1 (CE-T1) sequences. Principal component analysis (PCA), support vector machine (SVM), and 5-fold cross-validation were used to develop and evaluate the models. Results: Multivariate Cox regression analysis identified age under 50 years, advanced T stage, and N stage as risk factors for overall survival. Predictive models based solely on clinical features (Model A), single radiomics features (Model B), and their combination (Model C) performed poorly, with mean AUC values in the validation set of 0.663, 0.772, and 0.779, respectively. The addition of multimodal LNM and CE-T1 radiomics features significantly improved prediction accuracy (Models D and E), with AUC values of 0.831 and 0.837 in the validation set. Conclusion: We developed a well-discriminating overall survival prediction model based on multimodal MRI radiomics, applicable to patients receiving definitive radiotherapy, which may contribute to personalized treatment strategies.
Keywords: hypopharyngeal cancer; machine learning; magnetic resonance imaging (MRI); radiomics; survival analysis
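4. Research progress on MRI of brain changes in chronic mountain sickness
Authors: 王学玲, 孙艳秋. 影像研究与医学应用 (Journal of Imaging Research and Medical Applications), 2026, No. 1, pp. 1-3 (3 pages).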
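Chronic mountain sickness is a clinical syndrome with multi-system involvement caused by long-term exposure to the hypoxic environment at high altitude; its main manifestations are excessive erythrocytosis, pulmonary hypertension, and hypoxemia. The brain is extremely sensitive to hypoxia and has high oxygen and energy consumption, so prolonged exposure to high-altitude hypoxia leads to a range of symptoms including headache, dizziness, insomnia, memory loss, and poor concentration. Based on MRI techniques, this review discusses the effects of chronic mountain sickness on brain structure and function, including brain atrophy, white matter lesions, cerebrovascular changes, and cognitive and emotional disorders, with the aim of providing health guidance for residents of high-altitude regions and directions for future research.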
Keywords: chronic mountain sickness; high altitude; hypoxemia; MRI; brain
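5. A preliminary study of regional quantitative assessment of patellar cartilage in different age groups using MRI T2* mapping
Authors: 陈曦, 胡杰, 杨献峰. 影像研究与医学应用 (Journal of Imaging Research and Medical Applications), 2026, No. 1, pp. 27-30, 34 (5 pages).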
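Objective: To explore the value of the quantitative MRI T2* mapping technique in the study of healthy patellar cartilage in different age groups. Methods: Clinical data were retrospectively collected from 100 subjects with healthy patellar cartilage who underwent knee cartilage imaging at Nanjing Drum Tower Hospital, Affiliated Hospital of Medical School, Nanjing University, between October 2022 and May 2025. Subjects were divided by age into five groups (10-19, 20-29, 30-39, 40-49, and 50-59 years), with 20 subjects per group. The patellar cartilage was divided into six subregions, T2* mapping was used to quantify the T2* value and cartilage thickness of each subregion, and the subregions were compared across age groups. Results: Patellar cartilage thickness did not differ significantly among age groups (P > 0.05). The T2* value of the medial inferior region in the 20-29 group was higher than in the 10-19, 40-49, and 50-59 groups (P < 0.05). The T2* value of the lateral inferior region in the 20-29 group was higher than in the 40-49 and 50-59 groups (P < 0.05). The T2* value of the medial middle region in the 50-59 group was lower than in the 20-29, 30-39, and 40-49 groups (P < 0.05). T2* values in the other cartilage subregions did not differ significantly among age groups (P > 0.05). Conclusion: Cartilage thickness showed no significant difference across age groups. The age dependence of T2* values has important clinical value and may aid the early diagnosis of patellar cartilage lesions and the formulation of treatment strategies.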
Keywords: MRI; patellar cartilage; age; T2* value; cartilage thickness
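6. MRI evaluation of the effect of the nerve root sedimentation sign on outcomes of percutaneous endoscopic decompression for lumbar spinal stenosis
Authors: 王楠, 陈双, 席志鹏, 钱宇章, 张啸宇, 顾军, 康然, 谢林. 中国组织工程研究 (Chinese Journal of Tissue Engineering Research; PKU Core), 2026, No. 9, pp. 2262-2268 (7 pages).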
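Background: The nerve root sedimentation sign, a new indicator for assessing lumbar spinal stenosis, has improved the imaging understanding of the condition, but whether it affects the prognosis of full-endoscopic lumbar canal decompression remains controversial. Objective: To investigate the effect of the nerve root sedimentation sign on the outcome of full-endoscopic lumbar canal decompression for lumbar spinal stenosis. Methods: The medical records of 69 patients with lumbar spinal stenosis who underwent full-endoscopic lumbar canal decompression at Jiangsu Province Hospital of Integrated Chinese and Western Medicine (江苏省中西医结合医院) between September 2018 and September 2022 were retrospectively analyzed. Patients were divided into two groups according to whether nerve root sedimentation was present on MRI: a positive group (45 patients) and a negative group (24 patients). General data, visual analog scale (VAS) scores for low back pain and leg pain, the Oswestry Disability Index, and the Macnab excellent-and-good rate were compared between the groups, as were changes before and after treatment in the sagittal diameter, transverse diameter, and area of the lumbar spinal canal and in the lumbar lordosis angle. Results and Conclusion: (1) In both groups, postoperative VAS scores for low back and leg pain and the Oswestry Disability Index were significantly lower than preoperative values (P < 0.05). Between groups, VAS scores for low back and leg pain at 1 week and 1 year after treatment were significantly lower in the positive group than in the negative group (P < 0.05). (2) In both groups, the postoperative spinal canal area, sagittal diameter, and transverse diameter were significantly larger than before surgery (P < 0.05). (3) The lumbar lordosis angle was not markedly affected in either group, with no significant difference between preoperative and postoperative values (P > 0.05). (4) Assessed by the modified MacNab criteria at 1 year after surgery, outcomes in the positive group were excellent in 30 patients, good in 11, fair in 3, and poor in 1, for an excellent-and-good rate of 91%; outcomes in the negative group were excellent in 16, good in 4, and fair in 4, for an excellent-and-good rate of 83%. The difference between the groups was not significant (P > 0.05). (5) These results indicate that full-endoscopic lumbar canal decompression is highly effective for lumbar spinal stenosis and achieves precise decompression that is well demonstrated on MRI, while the presence or absence of the cauda equina nerve root sedimentation sign has no marked effect on postoperative outcome.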
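Keywords: nerve root sedimentation sign; lumbar spinal stenosis; full-endoscopic technique; MRI; spinal canal area; lumbar lordosis angle; retrospective study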
7. Construction and evaluation of a predictive model for the degree of coronary artery occlusion based on adaptive weighted multi-modal fusion of traditional Chinese and western medicine data (cited by: 2)
Authors: Jiyu ZHANG, Jiatuo XU, Liping TU, Hongyuan FU. Digital Chinese Medicine, 2025, No. 2, pp. 163-173 (11 pages).
Objective: To develop a non-invasive predictive model for coronary artery stenosis severity based on adaptive multi-modal integration of traditional Chinese and western medicine data. Methods: Clinical indicators, echocardiographic data, traditional Chinese medicine (TCM) tongue manifestations, and facial features were collected from patients who underwent coronary computed tomography angiography (CTA) in the Cardiac Care Unit (CCU) of Shanghai Tenth People's Hospital between May 1, 2023 and May 1, 2024. An adaptive weighted multi-modal data fusion (AWMDF) model based on deep learning was constructed to predict the severity of coronary artery stenosis. The model was evaluated using accuracy, precision, recall, F1 score, and the area under the receiver operating characteristic (ROC) curve (AUC). Performance was further assessed through comparisons with six ensemble machine learning methods, data ablation, model component ablation, and various decision-level fusion strategies. Results: A total of 158 patients were included in the study. The AWMDF model achieved excellent predictive performance (AUC = 0.973, accuracy = 0.937, precision = 0.937, recall = 0.929, and F1 score = 0.933). It outperformed the variants in the model ablation and data ablation experiments as well as various traditional machine learning models. Moreover, the adaptive weighting strategy outperformed alternative approaches, including simple weighting, averaging, voting, and fixed-weight schemes. Conclusion: The AWMDF model demonstrates potential clinical value in the non-invasive prediction of coronary artery disease and could serve as a tool for clinical decision support.
Keywords: coronary artery disease; deep learning; multi-modal; clinical prediction; traditional Chinese medicine diagnosis
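Note on the metrics reported in the abstract above: the F1 score is the harmonic mean of precision and recall, so the reported values can be cross-checked against each other. The following is a minimal Python sketch under that standard definition. The function name `f1_from_precision_recall` is illustrative, and the sketch says nothing about how the authors computed or averaged their metrics.

```python
def f1_from_precision_recall(precision, recall):
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        # Assumed convention: F1 is 0 when both inputs are 0.
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Precision and recall as reported in the abstract.
    f1 = f1_from_precision_recall(0.937, 0.929)
    print(round(f1, 3))
```

With the reported precision of 0.937 and recall of 0.929, the harmonic mean rounds to 0.933, which agrees with the F1 score given in the abstract.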
8. TCM network pharmacology: new perspective integrating network target with artificial intelligence and multi-modal multi-omics technologies (cited by: 1)
Authors: Ziyi Wang, Tingyu Zhang, Boyang Wang, Shao Li. Chinese Journal of Natural Medicines, 2025, No. 11, pp. 1425-1434 (10 pages).
Traditional Chinese medicine (TCM) demonstrates distinctive advantages in disease prevention and treatment. However, because of its holistic approach, analyzing its biological mechanisms through the modern medical research paradigm of "single drug, single target" presents significant challenges. Network pharmacology and its core theory of network targets connect drugs and diseases from a holistic and systematic perspective based on biological networks, overcoming the limitations of reductionist research models and showing considerable value in TCM research. The recent integration of network target computational and experimental methods with artificial intelligence (AI) and multi-modal multi-omics technologies has substantially enhanced network pharmacology methodology. Advances in computational and experimental techniques provide complementary support for network target theory in decoding TCM principles. This review, centered on network targets, examines the progress of network target methods combined with AI in predicting disease molecular mechanisms and drug-target relationships, alongside the application of multi-modal multi-omics technologies in analyzing TCM formulae, syndromes, and toxicity. Looking forward, network target theory is expected to incorporate emerging technologies while developing novel approaches aligned with its unique characteristics, potentially leading to significant breakthroughs in TCM research and advancing scientific understanding and innovation in TCM.
Keywords: network pharmacology; traditional Chinese medicine; network target; artificial intelligence; multi-modal; multi-omics
9. Multi-modal intelligent situation awareness in real-time air traffic control: Control intent understanding and flight trajectory prediction (cited by: 1)
Authors: Dongyue GUO, Jianwei ZHANG, Bo YANG, Yi LIN. Chinese Journal of Aeronautics, 2025, No. 6, pp. 41-57 (17 pages).
With the advent of the next-generation Air Traffic Control (ATC) system, there is growing interest in using Artificial Intelligence (AI) techniques to enhance Situation Awareness (SA) for ATC Controllers (ATCOs), i.e., Intelligent SA (ISA). However, existing AI-based SA approaches often rely on unimodal data and lack a comprehensive description and benchmark of ISA tasks that use multi-modal data in real-time ATC environments. To address this gap, the situation awareness procedure of ATCOs is analyzed and the ISA task is refined to the processing of two primary elements: spoken instructions and flight trajectories. The ISA is then formulated as Controlling Intent Understanding (CIU) and Flight Trajectory Prediction (FTP) tasks. For the CIU task, an innovative automatic speech recognition and understanding framework is designed to extract the controlling intent from unstructured and continuous ATC communications. For the FTP task, single-horizon and multi-horizon FTP approaches are investigated to support high-precision prediction of the situation evolution. A total of 32 advanced unimodal and multi-modal methods with extensive evaluation metrics are benchmarked on a real-world multi-modal ATC situation dataset. Experimental results demonstrate the effectiveness of AI-based techniques in enhancing ISA for the ATC environment.
Keywords: air traffic control; automatic speech recognition and understanding; flight trajectory prediction; multi-modal; situation awareness
10. Personal Style Guided Outfit Recommendation with Multi-Modal Fashion Compatibility Modeling (cited by: 1)
Authors: WANG Kexin, ZHANG Jie, ZHANG Peng, SUN Kexin, ZHAN Jiamei, WEI Meng. Journal of Donghua University (English Edition), 2025, No. 2, pp. 156-167 (12 pages).
Personalized outfit recommendation has emerged as a hot research topic in the fashion domain. However, existing recommendation methods do not fully exploit user style preferences. Typically, users prefer particular styles, such as casual and athletic styles, and consider attributes like color and texture when selecting outfits. To achieve personalized outfit recommendations in line with user style preferences, this paper proposes a personal style guided outfit recommendation method with multi-modal fashion compatibility modeling, termed PSGNet. First, a style classifier is designed to categorize fashion images of various clothing types and attributes into distinct style categories. Second, a personal style prediction module extracts user style preferences by analyzing historical data. Then, to address the limitations of single-modal representations and enhance fashion compatibility, both fashion images and text data are used to extract multi-modal features. Finally, PSGNet integrates these components through Bayesian personalized ranking (BPR) to unify personal style and fashion compatibility, where the former serves as personal style features and guides the output of the personalized outfit recommendation tailored to the target user. Extensive experiments on large-scale datasets demonstrate that the proposed model is efficient for personalized outfit recommendation.
Keywords: personalized outfit recommendation; fashion compatibility modeling; style preference; multi-modal representation; Bayesian personalized ranking (BPR); style classifier
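Note on Bayesian personalized ranking (BPR), named in the abstract above: in its standard form, BPR trains a model so that an item a user preferred scores higher than one the user did not, by minimizing -log(sigmoid(s_pos - s_neg)) over such pairs. The following is a minimal Python sketch of that pairwise loss only. The function name `bpr_loss` and the scalar scores are illustrative assumptions, and the sketch does not reproduce how PSGNet computes its scores or combines style and compatibility features.

```python
import math


def bpr_loss(pos_score, neg_score):
    """Standard BPR pairwise loss: -log(sigmoid(pos_score - neg_score)).

    Written in a numerically stable form; illustrative only.
    """
    diff = pos_score - neg_score
    if diff >= 0:
        # -log(sigmoid(d)) = log(1 + exp(-d)); exp(-d) cannot overflow here.
        return math.log1p(math.exp(-diff))
    # For negative d, rewrite as -d + log(1 + exp(d)) to avoid overflow.
    return -diff + math.log1p(math.exp(diff))


if __name__ == "__main__":
    # Hypothetical scores: the preferred outfit should score higher.
    print(bpr_loss(2.0, 0.5))  # small loss: ranking is already correct
    print(bpr_loss(0.5, 2.0))  # larger loss: ranking is inverted
```

The loss shrinks toward zero as the preferred item's score pulls ahead of the other item's score, which is what makes it suitable for ranking rather than rating prediction.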
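11. Resting-state functional MRI evaluation of the locus coeruleus-norepinephrine system in mediating the pathogenesis of depression with insomnia
Authors: 李仲贤, 焦梓桐, 任涵月, 张潘, 彭敏, 黄颖欣, 李梦瑶, 胡玥琛, 梁峻铨, 阎路达, 符文彬, 周鹏. 中国组织工程研究 (Chinese Journal of Tissue Engineering Research; PKU Core), 2026, No. 12, pp. 3083-3090 (8 pages).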
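Background: Studies have found that patients with depression have low peripheral blood norepinephrine levels and that patients with insomnia show disordered norepinephrine metabolism, suggesting that dysfunction of the locus coeruleus-norepinephrine system may be the neurobiological basis of comorbid depression and insomnia. Objective: To observe the functional connectivity of the brainstem locus coeruleus in patients with depression accompanied by insomnia symptoms using resting-state functional MRI and, in combination with peripheral blood norepinephrine levels, to explore the potential pathogenesis of depression with insomnia symptoms. Methods: From March 2023 to September 2024, 60 patients with depression accompanied by insomnia symptoms (case group) were recruited from Shenzhen Bao'an District Traditional Chinese Medicine Hospital (深圳市宝安区中医院) and the community, and 30 healthy controls (healthy control group) were recruited over the same period. The depressive state and sleep quality of all participants were assessed with the 17-item Hamilton Depression Rating Scale (HAMD-17), the Self-Rating Depression Scale (SDS), the Pittsburgh Sleep Quality Index (PSQI), and the Insomnia Severity Index (ISI). Resting-state functional MRI was used to measure functional connectivity of the locus coeruleus region in all participants, and ELISA was used to measure peripheral blood norepinephrine levels. These indices were compared between groups, and Pearson correlation analysis was used to examine the correlations among brain regions with significantly different functional connectivity, peripheral blood norepinephrine levels, and clinical scale scores. Results and Conclusion: (1) HAMD-17, SDS, PSQI, and ISI scores were higher in the case group than in the healthy control group (P < 0.05), whereas the functional connectivity values of the left locus coeruleus-left precuneus and left locus coeruleus-left inferior parietal lobule and the peripheral blood norepinephrine level were lower than in the healthy control group (P < 0.05). Pearson correlation analysis showed that the peripheral blood norepinephrine level was positively correlated with the left locus coeruleus-left precuneus functional connectivity value (r = 0.40, P < 0.01) and the left locus coeruleus-inferior parietal lobule functional connectivity value (r = 0.36, P < 0.01), and negatively correlated with the HAMD-17 score (r = -0.42, P < 0.01) and the PSQI score (r = -0.46, P < 0.01). The left locus coeruleus-left precuneus functional connectivity value was negatively correlated with the HAMD-17 score (r = -0.41, P < 0.01) and the PSQI score (r = -0.44, P < 0.01), and the left locus coeruleus-inferior parietal lobule functional connectivity value was negatively correlated with the HAMD-17 score (r = -0.29, P < 0.01) and the PSQI score (r = -0.36, P < 0.01). (2) These results indicate that reduced functional connectivity of the left locus coeruleus with the left precuneus and left inferior parietal lobule, together with reduced peripheral blood norepinephrine levels, is closely associated with worsening depressive and insomnia symptoms. This suggests that dysfunction of the locus coeruleus-norepinephrine system may mediate the pathogenesis of depression with insomnia symptoms by affecting functional connectivity of the brain's default mode network (including the left precuneus and left inferior parietal lobule).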
Keywords: depression; insomnia; locus coeruleus; norepinephrine; functional MRI; functional connectivity; mechanism study
12. Tri-M2MT: Multi-modalities based effective acute bilirubin encephalopathy diagnosis through multi-transformer using neonatal Magnetic Resonance Imaging
Authors: Kumar Perumal, Rakesh Kumar Mahendran, Arfat Ahmad Khan, Seifedine Kadry. CAAI Transactions on Intelligence Technology, 2025, No. 2, pp. 434-449 (16 pages).
Acute Bilirubin Encephalopathy (ABE) is a significant threat to neonates and leads to disability and high mortality rates. Detecting and treating ABE promptly is important to prevent further complications and long-term issues. Recent studies have explored ABE diagnosis, but they often face limitations in classification because they rely on a single modality of Magnetic Resonance Imaging (MRI). To tackle this problem, the authors propose a Tri-M2MT model for precise ABE detection using tri-modality MRI scans. The scans include T1-weighted imaging (T1WI), T2-weighted imaging (T2WI), and apparent diffusion coefficient maps to obtain in-depth information. Initially, the tri-modality MRI scans are collected and preprocessed using an Advanced Gaussian Filter for noise reduction and Z-score normalisation for data standardisation. An Advanced Capsule Network is then used to extract relevant features, and the Snake Optimization Algorithm selects optimal features based on feature correlation, with the aim of minimising complexity and enhancing detection accuracy. Furthermore, a multi-transformer approach is used to fuse features and identify feature correlations effectively. Finally, accurate ABE diagnosis is achieved through a SoftMax layer. The performance of the proposed Tri-M2MT model is evaluated across various metrics, including accuracy, specificity, sensitivity, F1-score, and ROC curve analysis, and the proposed methodology provides better performance than existing methodologies.
Keywords: Acute Bilirubin Encephalopathy (ABE) diagnosis; feature extraction; MRI; multi-modality; multi-transformer; neonatal
13. Multi-Modal Named Entity Recognition with Auxiliary Visual Knowledge and Word-Level Fusion
Authors: Huansha Wang, Ruiyang Huang, Qinrang Liu, Xinghao Wang. Computers, Materials & Continua, 2025, No. 6, pp. 5747-5760 (14 pages).
Multi-modal Named Entity Recognition (MNER) aims to better identify meaningful textual entities by integrating information from images. Previous work has focused on extracting visual semantics at a fine-grained level or on obtaining entity-related external knowledge from knowledge bases or Large Language Models (LLMs). However, these approaches ignore the poor semantic correlation between the visual and textual modalities in MNER datasets and do not explore different multi-modal fusion approaches. In this paper, we present MMAVK, a multi-modal named entity recognition model with auxiliary visual knowledge and word-level fusion, which leverages a Multi-modal Large Language Model (MLLM) as an implicit knowledge base and extracts vision-based auxiliary knowledge from the image for more accurate and effective recognition. Specifically, we propose vision-based auxiliary knowledge generation, which uses target-specific prompts to guide the MLLM to extract external knowledge derived exclusively from images to aid entity recognition, thus avoiding the redundant recognition and cognitive confusion caused by processing image-text pairs simultaneously. Furthermore, we employ a word-level multi-modal fusion mechanism to fuse the extracted external knowledge with each word embedding produced by the transformer-based encoder. Extensive experimental results demonstrate that MMAVK outperforms or equals the state-of-the-art methods on the two classical MNER datasets, even when the large models employed have significantly fewer parameters than other baselines.
Keywords: multi-modal named entity recognition; large language model; multi-modal fusion
14. MMCSD: Multi-Modal Knowledge Graph Completion Based on Super-Resolution and Detailed Description Generation
Authors: Huansha Wang, Ruiyang Huang, Qinrang Liu, Shaomei Li, Jianpeng Zhang. Computers, Materials & Continua, 2025, No. 4, pp. 761-783 (23 pages).
Multi-modal knowledge graph completion (MMKGC) aims to complete missing entities or relations in multi-modal knowledge graphs, thereby discovering previously unknown triples. Because of the continuous growth of data and knowledge and the limitations of data sources, the visual knowledge within knowledge graphs is generally of low quality, and some entities lack a visual modality altogether. Nevertheless, previous studies of MMKGC have focused primarily on facilitating modality interaction and fusion while neglecting the problems of low modality quality and missing modalities. Mainstream MMKGC models use only pre-trained visual encoders to extract features and transfer the semantic information to the joint embeddings through modal fusion, which inevitably suffers from problems such as error propagation and increased uncertainty. To address these problems, we propose a Multi-modal knowledge graph Completion model based on Super-resolution and Detailed Description Generation (MMCSD). Specifically, we leverage a pre-trained residual network to enhance the resolution and improve the quality of the visual modality. Moreover, we design multi-level visual semantic extraction and entity description generation, thereby further extracting entity semantics from structural triples and visual images. Meanwhile, we train a variational multi-modal auto-encoder and use a pre-trained multi-modal language model to complement the missing visual features. We conducted experiments on FB15K-237 and DB13K, and the results showed that MMCSD can effectively perform MMKGC and achieve state-of-the-art performance.
Keywords: multi-modal knowledge graph; knowledge graph completion; multi-modal fusion
15. Transformers for Multi-Modal Image Analysis in Healthcare
Authors: Sameera V Mohd Sagheer, Meghana K H, P M Ameer, Muneer Parayangat, Mohamed Abbas. Computers, Materials & Continua, 2025, No. 9, pp. 4259-4297 (39 pages).
Integrating multiple medical imaging techniques, including Magnetic Resonance Imaging (MRI), Computed Tomography, Positron Emission Tomography (PET), and ultrasound, provides a comprehensive view of a patient's health status. Each of these methods contributes unique diagnostic insights, enhancing the overall assessment of the patient's condition. Nevertheless, combining data from multiple modalities is difficult because of disparities in resolution, data collection methods, and noise levels. While traditional models such as Convolutional Neural Networks (CNNs) excel in single-modality tasks, they struggle to handle multi-modal complexities because they lack the capacity to model global relationships. This research presents a novel approach for examining multi-modal medical imagery using a transformer-based system. The framework employs self-attention and cross-attention mechanisms to synchronize and integrate features across modalities. It also shows resilience to variations in noise and image quality, making it adaptable for real-time clinical use. To address the computational hurdles linked to transformer models, particularly in real-time clinical applications in resource-constrained environments, several optimization techniques have been integrated to boost scalability and efficiency. First, a streamlined transformer architecture was adopted to minimize the computational load while maintaining model effectiveness. Model pruning, quantization, and knowledge distillation were applied to reduce the parameter count and increase inference speed. Furthermore, efficient attention mechanisms such as linear or sparse attention were employed to alleviate the substantial memory and processing requirements of traditional self-attention operations. For further deployment optimization, hardware-aware acceleration strategies were implemented, including the use of TensorRT and ONNX-based model compression, to ensure efficient execution on edge devices. These optimizations allow the approach to function effectively in real-time clinical settings, ensuring viability even in environments with limited resources. Future research directions include integrating non-imaging data to facilitate personalized treatment and enhancing computational efficiency for implementation in resource-limited environments. This study highlights the transformative potential of transformer models in multi-modal medical imaging, offering improvements in diagnostic accuracy and patient care outcomes.
Keywords: multi-modal image analysis; medical imaging; deep learning; image segmentation; disease detection; multi-modal fusion; Vision Transformers (ViTs); precision medicine; clinical decision support
16. Multi-Modal Pre-Synergistic Fusion Entity Alignment Based on Mutual Information Strategy Optimization
Authors: Huayu Li, Xinxin Chen, Lizhuang Tan, Konstantin I. Kostromitin, Athanasios V. Vasilakos, Peiying Zhang. Computers, Materials & Continua, 2025, No. 11, pp. 4133-4153 (21 pages).
To address the challenge of missing modal information in entity alignment and to mitigate information loss or bias arising from modal heterogeneity during fusion, while also capturing shared information across modalities, this paper proposes a Multi-modal Pre-synergistic Entity Alignment model based on Cross-modal Mutual Information Strategy Optimization (MPSEA). The model first employs independent encoders to process multi-modal features, including text, images, and numerical values. Next, a multi-modal pre-synergistic fusion mechanism integrates graph structural and visual modal features into the textual modality as preparatory information. This pre-fusion strategy enables unified perception of heterogeneous modalities at the model's initial stage, reducing discrepancies during the fusion process. Finally, using cross-modal deep perception reinforcement learning, the model achieves adaptive multilevel feature fusion between modalities, which supports learning more effective alignment strategies. Extensive experiments on multiple public datasets show that, compared with existing state-of-the-art methods, MPSEA achieves gains of up to 7% in Hits@1 and 8.2% in MRR on the FBDB15K dataset, and up to 9.1% in Hits@1 and 7.7% in MRR on the FBYG15K dataset. These results confirm the effectiveness of the proposed model.
Keywords: knowledge graph; multi-modal; entity alignment; feature fusion; pre-synergistic fusion
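Note on the metrics reported in the abstract above: Hits@1 is the fraction of test entities whose correct counterpart is ranked first, and MRR (mean reciprocal rank) averages 1/rank of the correct counterpart over all test entities. The following is a minimal Python sketch under those standard definitions, assuming 1-based ranks. The function names and the sample ranks are illustrative and are not results from the MPSEA paper.

```python
def hits_at_k(ranks, k=1):
    """Fraction of queries whose correct answer is ranked within the top k."""
    if not ranks:
        raise ValueError("ranks must not be empty")
    return sum(1 for r in ranks if r <= k) / len(ranks)


def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all queries (ranks are 1-based)."""
    if not ranks:
        raise ValueError("ranks must not be empty")
    return sum(1.0 / r for r in ranks) / len(ranks)


if __name__ == "__main__":
    # Hypothetical ranks of the correct aligned entity for four test entities.
    sample_ranks = [1, 2, 4, 1]
    print(hits_at_k(sample_ranks, k=1))
    print(mean_reciprocal_rank(sample_ranks))
```

For the four hypothetical ranks, two of the four are first place, so Hits@1 is 0.5, and the reciprocal ranks 1, 0.5, 0.25, and 1 average to 0.6875.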
17. Research Progress on Multi-Modal Fusion Object Detection Algorithms for Autonomous Driving: A Review
Authors: Peicheng Shi, Li Yang, Xinlong Dong, Heng Qi, Aixi Yang. Computers, Materials & Continua, 2025, No. 6, pp. 3877-3917 (41 pages).
As the number and complexity of sensors in autonomous vehicles continue to rise, multimodal fusion-based object detection algorithms are increasingly used to detect 3D environmental information, significantly advancing perception technology in autonomous driving. To further promote the development of fusion algorithms and improve detection performance, this paper discusses the advantages and recent advancements of multimodal fusion-based object detection algorithms. Starting from single-modal sensor detection, the paper provides a detailed overview of typical sensors used in autonomous driving and introduces object detection methods based on images and point clouds. Image-based detection methods are categorized into monocular and binocular detection according to input type. Point cloud-based detection methods are classified into projection-based, voxel-based, point cluster-based, pillar-based, and graph structure-based approaches according to how point cloud features are processed. Multimodal fusion algorithms are divided into Camera-LiDAR fusion, Camera-Radar fusion, Camera-LiDAR-Radar fusion, and other sensor fusion methods according to the types of sensors involved. Finally, the paper identifies five key future research directions in this field, aiming to provide insights for researchers working on multimodal fusion-based object detection algorithms and to encourage broader attention to their research and application.
Keywords: multi-modal fusion; 3D object detection; deep learning; autonomous driving
18. A multi-modal hierarchical approach for Chinese spelling correction using multi-head attention and residual connections
Authors: SHAO Qing, DU Yiwei. High Technology Letters, 2025, No. 3, pp. 309-320 (12 pages).
The primary objective of Chinese spelling correction (CSC) is to detect and correct erroneous characters in Chinese text, which can result from various factors, such as inaccuracies in pinyin representation, character resemblance, and semantic discrepancies. However, existing methods often struggle to fully address these types of errors, which reduces overall correction accuracy. This paper introduces a multi-modal feature encoder designed to efficiently extract features from three distinct modalities: pinyin, semantics, and character morphology. Unlike previous methods that rely on direct fusion or fixed-weight summation to integrate multi-modal information, our approach employs a multi-head attention mechanism to focus more on relevant modal information while disregarding less pertinent data. To prevent issues such as gradient explosion or vanishing, the model incorporates a residual connection of the original text vector for fine-tuning. This approach ensures robust model performance by maintaining essential linguistic details throughout the correction process. Experimental evaluations on the SIGHAN benchmark dataset demonstrate that the proposed model outperforms baseline approaches across various metrics and datasets, confirming its effectiveness and feasibility.
Keywords: Chinese spelling correction; multiple-headed attention; multi-modal fusion; residual connection; pinyin encoder
19. Effectiveness of a multi-modal intervention protocol for preventing stress ulcers in critically ill older patients after gastrointestinal surgery
Authors: Hai-Ming Xi, Man-Li Tian, Ya-Li Tian, Hui Liu, Yun Wang, Min-Juan Chu. World Journal of Gastrointestinal Surgery, 2025, No. 4, pp. 316-323 (8 pages).
BACKGROUND: Stress ulcers are common complications in critically ill patients, with a higher incidence observed in older patients following gastrointestinal surgery. This study aimed to develop and evaluate the effectiveness of a multi-modal intervention protocol to prevent stress ulcers in this high-risk population. AIM: To assess the impact of a multi-modal intervention on preventing stress ulcers in older intensive care unit (ICU) patients postoperatively. METHODS: A randomized controlled trial was conducted in critically ill patients (aged ≥ 65 years) admitted to the ICU after gastrointestinal surgery. Patients were randomly assigned to either the intervention group, which received a multi-modal stress ulcer prevention protocol, or the control group, which received standard care. The primary outcome measure was the incidence of stress ulcers. The secondary outcomes included ulcer healing time, complication rates, and length of hospital stay. RESULTS: A total of 200 patients (100 in each group) were included in this study. The intervention group exhibited a significantly lower incidence of stress ulcers than the control group (15% vs 30%, P < 0.01). Additionally, the intervention group demonstrated shorter ulcer healing times (mean 5.2 vs 7.8 days, P < 0.05), lower complication rates (10% vs 22%, P < 0.05), and reduced length of hospital stay (mean 12.3 vs 15.7 days, P < 0.05). CONCLUSION: This multi-modal intervention protocol significantly reduced the incidence of stress ulcers and improved clinical outcomes in critically ill older patients after gastrointestinal surgery. This comprehensive approach may provide a valuable strategy for managing high-risk populations in intensive care settings.
Keywords: stress ulcers; older patients; gastrointestinal surgery; critical care; multi-modal intervention