期刊文献+
共找到1,174篇文章
< 1 2 59 >
每页显示 20 50 100
GaitMAFF:Adaptive Multi-Modal Fusion of Skeleton Maps and Silhouettes for Robust Gait Recognition in Complex Scenarios
1
作者 Zhongbin Luo Zhaoyang Guan +2 位作者 Wenxing You Yunteng Wang Yanqiu Bi 《Computers, Materials & Continua》 2026年第5期540-558,共19页
Gait recognition is a key biometric for long-distance identification,yet its performance is severely degraded by real-world challenges such as varying clothing,carrying conditions,and changing viewpoints.While combini... Gait recognition is a key biometric for long-distance identification,yet its performance is severely degraded by real-world challenges such as varying clothing,carrying conditions,and changing viewpoints.While combining silhouette and skeleton data is a promising direction,effectively fusing these heterogeneous modalities and adaptively weighting their contributions in response to diverse conditions remains a central problem.This paper introduces GaitMAFF,a novelMulti-modal Adaptive Feature Fusion Network,to address this challenge.Our approach first transforms discrete skeleton joints into a dense SkeletonMap representation to align with silhouettes,then employs an attention-based module to dynamically learn the fusion weights between the two modalities.These fused features are processed by a powerful spatio-temporal backbone withWeighted Global-Local Feature FusionModules(WFFM)to learn a discriminative representation.Extensive experiments on the challenging CCPG and Gait3D datasets show that GaitMAFF achieves state-of-the-art performance,with an average Rank-1 accuracy of 84.6%on CCPG and 58.7%on Gait3D.These results demonstrate that our adaptive fusion strategy effectively integrates complementary multimodal information,significantly enhancing gait recognition robustness and accuracy in complex scenes and providing a practical solution for real-world applications. 展开更多
关键词 Gait recognition multi-modal fusion adaptive feature fusion skeleton map SILHOUETTE
在线阅读 下载PDF
Laser additive manufacturing of high-resolution microscale shell lattices by toolpath engineering
2
作者 Junhao Ding Shuo Qu +8 位作者 Shengbiao Zhang Zongxin Hu Zhenyong Feng Tianyu Gao Ming Wang Fu Lei Zhang Chinnapat Panwisawas Wen Chen Xu Song 《International Journal of Extreme Manufacturing》 2026年第1期485-500,共16页
Laser additively manufactured microscale metallic lattices show great potential for high-performance applications,yet trade-offs among geometric precision,structural integrity,and computational efficiency still persis... Laser additively manufactured microscale metallic lattices show great potential for high-performance applications,yet trade-offs among geometric precision,structural integrity,and computational efficiency still persist.Here,we introduce a stereolithography file format-free(STL-free)hybrid toolpath generation method for laser-based powder bed fusion(PBF-LB)that synergizes implicit geometric modeling with optimized laser scanning strategy,overcoming these limitations.By circumventing traditional mesh-based workflows,our method directly translates implicit lattice geometries into laser toolpaths while precisely regulating energy deposition trajectories.This mesh-free process enables the fabrication of complex shell lattices with ultra-thin walls and enhanced surface quality.In addition to reducing memory usage and processing time by up to 90%,the method yields a synergistic enhancement in mechanical performance,notably improving both strength and toughness.By bridging computational design and fabrication,this framework enables the scalable production of high-performance microscale lattices and unlocks their potential for industrial applications. 展开更多
关键词 toolpath engineering STL-free hybrid toolpath high-resolution printing laser-based powder bed fusion microscale lattices
在线阅读 下载PDF
Data-Model Fusion Methods and Applications Toward Smart Manufacturing and Digital Engineering 被引量:1
3
作者 Fei Tao Yilin Li +2 位作者 Yupeng Wei Chenyuan Zhang Ying Zuo 《Engineering》 2025年第12期36-50,共15页
As pivotal supporting technologies for smart manufacturing and digital engineering,model-based and data-driven methods have been widely applied in many industrial fields,such as product design,process monitoring,and s... As pivotal supporting technologies for smart manufacturing and digital engineering,model-based and data-driven methods have been widely applied in many industrial fields,such as product design,process monitoring,and smart maintenance.While promising,both methods have issues that need to be addressed.For example,model-based methods are limited by low computational accuracy and a high computational burden,and data-driven methods always suffer from poor interpretability and redundant features.To address these issues,the concept of data-model fusion(DMF)emerges as a promising solution.DMF involves integrating model-based methods with data-driven methods by incorporating big data into model-based methods or embedding relevant domain knowledge into data-driven methods.Despite growing efforts in the field of DMF,a unanimous definition of DMF remains elusive,and a general framework of DMF has been rarely discussed.This paper aims to address this gap by providing a thorough overview and categorization of both data-driven methods and model-based methods.Subsequently,this paper also presents the definition and categorization of DMF and discusses the general framework of DMF.Moreover,the primary seven applications of DMF are reviewed within the context of smart manufacturing and digital engineering.Finally,this paper directs the future directions of DMF. 展开更多
关键词 Data-model fusion Model-based methods Data-driven methods Smart manufacturing Digital engineering
在线阅读 下载PDF
Multi-Modal Pre-Synergistic Fusion Entity Alignment Based on Mutual Information Strategy Optimization
4
作者 Huayu Li Xinxin Chen +3 位作者 Lizhuang Tan Konstantin I.Kostromitin Athanasios V.Vasilakos Peiying Zhang 《Computers, Materials & Continua》 2025年第11期4133-4153,共21页
To address the challenge of missing modal information in entity alignment and to mitigate information loss or bias arising frommodal heterogeneity during fusion,while also capturing shared information acrossmodalities... To address the challenge of missing modal information in entity alignment and to mitigate information loss or bias arising frommodal heterogeneity during fusion,while also capturing shared information acrossmodalities,this paper proposes a Multi-modal Pre-synergistic Entity Alignmentmodel based on Cross-modalMutual Information Strategy Optimization(MPSEA).The model first employs independent encoders to process multi-modal features,including text,images,and numerical values.Next,a multi-modal pre-synergistic fusion mechanism integrates graph structural and visual modal features into the textual modality as preparatory information.This pre-fusion strategy enables unified perception of heterogeneous modalities at the model’s initial stage,reducing discrepancies during the fusion process.Finally,using cross-modal deep perception reinforcement learning,the model achieves adaptive multilevel feature fusion between modalities,supporting learningmore effective alignment strategies.Extensive experiments on multiple public datasets show that the MPSEA method achieves gains of up to 7% in Hits@1 and 8.2% in MRR on the FBDB15K dataset,and up to 9.1% in Hits@1 and 7.7% in MRR on the FBYG15K dataset,compared to existing state-of-the-art methods.These results confirm the effectiveness of the proposed model. 展开更多
关键词 Knowledge graph multi-modal entity alignment feature fusion pre-synergistic fusion
在线阅读 下载PDF
Multi-Modal Named Entity Recognition with Auxiliary Visual Knowledge and Word-Level Fusion
5
作者 Huansha Wang Ruiyang Huang +1 位作者 Qinrang Liu Xinghao Wang 《Computers, Materials & Continua》 2025年第6期5747-5760,共14页
Multi-modal Named Entity Recognition(MNER)aims to better identify meaningful textual entities by integrating information from images.Previous work has focused on extracting visual semantics at a fine-grained level,or ... Multi-modal Named Entity Recognition(MNER)aims to better identify meaningful textual entities by integrating information from images.Previous work has focused on extracting visual semantics at a fine-grained level,or obtaining entity related external knowledge from knowledge bases or Large Language Models(LLMs).However,these approaches ignore the poor semantic correlation between visual and textual modalities in MNER datasets and do not explore different multi-modal fusion approaches.In this paper,we present MMAVK,a multi-modal named entity recognition model with auxiliary visual knowledge and word-level fusion,which aims to leverage the Multi-modal Large Language Model(MLLM)as an implicit knowledge base.It also extracts vision-based auxiliary knowledge from the image formore accurate and effective recognition.Specifically,we propose vision-based auxiliary knowledge generation,which guides the MLLM to extract external knowledge exclusively derived from images to aid entity recognition by designing target-specific prompts,thus avoiding redundant recognition and cognitive confusion caused by the simultaneous processing of image-text pairs.Furthermore,we employ a word-level multi-modal fusion mechanism to fuse the extracted external knowledge with each word-embedding embedded from the transformerbased encoder.Extensive experimental results demonstrate that MMAVK outperforms or equals the state-of-the-art methods on the two classical MNER datasets,even when the largemodels employed have significantly fewer parameters than other baselines. 展开更多
关键词 multi-modal named entity recognition large language model multi-modal fusion
在线阅读 下载PDF
Research Progress on Multi-Modal Fusion Object Detection Algorithms for Autonomous Driving:A Review
6
作者 Peicheng Shi Li Yang +2 位作者 Xinlong Dong Heng Qi Aixi Yang 《Computers, Materials & Continua》 2025年第6期3877-3917,共41页
As the number and complexity of sensors in autonomous vehicles continue to rise,multimodal fusionbased object detection algorithms are increasingly being used to detect 3D environmental information,significantly advan... As the number and complexity of sensors in autonomous vehicles continue to rise,multimodal fusionbased object detection algorithms are increasingly being used to detect 3D environmental information,significantly advancing the development of perception technology in autonomous driving.To further promote the development of fusion algorithms and improve detection performance,this paper discusses the advantages and recent advancements of multimodal fusion-based object detection algorithms.Starting fromsingle-modal sensor detection,the paper provides a detailed overview of typical sensors used in autonomous driving and introduces object detection methods based on images and point clouds.For image-based detection methods,they are categorized into monocular detection and binocular detection based on different input types.For point cloud-based detection methods,they are classified into projection-based,voxel-based,point cluster-based,pillar-based,and graph structure-based approaches based on the technical pathways for processing point cloud features.Additionally,multimodal fusion algorithms are divided into Camera-LiDAR fusion,Camera-Radar fusion,Camera-LiDAR-Radar fusion,and other sensor fusion methods based on the types of sensors involved.Furthermore,the paper identifies five key future research directions in this field,aiming to provide insights for researchers engaged in multimodal fusion-based object detection algorithms and to encourage broader attention to the research and application of multimodal fusion-based object detection. 展开更多
关键词 multi-modal fusion 3D object detection deep learning autonomous driving
在线阅读 下载PDF
RELIABILITY EVALUATION MODEL BASED ON DATA FUSION FOR AIRCRAFT ENGINES 被引量:3
7
作者 王华伟 吴海桥 《Transactions of Nanjing University of Aeronautics and Astronautics》 EI 2012年第4期318-324,共7页
Reliability evaluation for aircraft engines is difficult because of the scarcity of failure data. But aircraft engine data are available from a variety of sources. Data fusion has the function of maximizing the amount... Reliability evaluation for aircraft engines is difficult because of the scarcity of failure data. But aircraft engine data are available from a variety of sources. Data fusion has the function of maximizing the amount of valu- able information extracted from disparate data sources to obtain the comprehensive reliability knowledge. Consid- ering the degradation failure and the catastrophic failure simultaneously, which are competing risks and can affect the reliability, a reliability evaluation model based on data fusion for aircraft engines is developed, Above the characteristics of the proposed model, reliability evaluation is more feasible than that by only utilizing failure data alone, and is also more accurate than that by only considering single failure mode. Example shows the effective- ness of the proposed model. 展开更多
关键词 aircraft engine reliability evaluation data fusion competing failure condition monitoring
在线阅读 下载PDF
A Comprehensive Survey on Deep Learning Multi-Modal Fusion:Methods,Technologies and Applications 被引量:10
8
作者 Tianzhe Jiao Chaopeng Guo +2 位作者 Xiaoyue Feng Yuming Chen Jie Song 《Computers, Materials & Continua》 SCIE EI 2024年第7期1-35,共35页
Multi-modal fusion technology gradually become a fundamental task in many fields,such as autonomous driving,smart healthcare,sentiment analysis,and human-computer interaction.It is rapidly becoming the dominant resear... Multi-modal fusion technology gradually become a fundamental task in many fields,such as autonomous driving,smart healthcare,sentiment analysis,and human-computer interaction.It is rapidly becoming the dominant research due to its powerful perception and judgment capabilities.Under complex scenes,multi-modal fusion technology utilizes the complementary characteristics of multiple data streams to fuse different data types and achieve more accurate predictions.However,achieving outstanding performance is challenging because of equipment performance limitations,missing information,and data noise.This paper comprehensively reviews existing methods based onmulti-modal fusion techniques and completes a detailed and in-depth analysis.According to the data fusion stage,multi-modal fusion has four primary methods:early fusion,deep fusion,late fusion,and hybrid fusion.The paper surveys the three majormulti-modal fusion technologies that can significantly enhance the effect of data fusion and further explore the applications of multi-modal fusion technology in various fields.Finally,it discusses the challenges and explores potential research opportunities.Multi-modal tasks still need intensive study because of data heterogeneity and quality.Preserving complementary information and eliminating redundant information between modalities is critical in multi-modal technology.Invalid data fusion methods may introduce extra noise and lead to worse results.This paper provides a comprehensive and detailed summary in response to these challenges. 展开更多
关键词 multi-modal fusion REPRESENTATION TRANSLATION ALIGNMENT deep learning comparative analysis
在线阅读 下载PDF
Grain boundary and microstructure engineering of Inconel 690 cladding on stainless-steel 316L using electron-beam powder bed fusion additive manufacturing 被引量:7
9
作者 I.A.Segura L.E.Murr +7 位作者 C.A.Terrazas D.Bermudez J.Mireles V.S.V.Injeti K.Li B.Yu R.D.K.Misra R.B.Wicker 《Journal of Materials Science & Technology》 SCIE EI CAS CSCD 2019年第2期351-367,共17页
This research explores the prospect of fabricating a face-centered cubic(fcc) Ni-base alloy cladding(Inconel 690) on an fcc Fe-base alloy(316 L stainless-steel) having improved mechanical properties and reduced sensit... This research explores the prospect of fabricating a face-centered cubic(fcc) Ni-base alloy cladding(Inconel 690) on an fcc Fe-base alloy(316 L stainless-steel) having improved mechanical properties and reduced sensitivity to corrosion through grain boundary and microstructure engineering concepts enabled by additive manufacturing(AM) utilizing electron-beam powder bed fusion(EPBF). The unique solidification and associated constitutional supercooling phenomena characteristic of EPBF promotes[100] textured and extended columnar grains having lower energy grain boundaries as opposed to random, high-angle grain boundaries, but no coherent {111} twin boundaries characteristic of conventional thermo-mechanically processed fcc metals and alloys, including Inconel 690 and 316 L stainless-steel.In addition to [100] textured grains, columnar grains were produced by EPBF fabrication of Inconel 690 claddings on 316 L stainless-steel substrates. Also, irregular 2–3 μm diameter, low energy subgrains were formed along with dislocation densities varying from 108 to 109 cm^2, and a homogeneous distribution of Cr_(23)C_6 precipitates. Precipitates were formed within the grains(with ~3 μm interparticle spacing),but not in the subgrain or columnar grain boundaries. These inclusive, hierarchical microstructures produced a tensile yield strength of 0.527 GPa, elongation of 21%, and Vickers microindentation hardness of 2.33 GPa for the Inconel 690 cladding in contrast to a tensile yield strength of 0.327 GPa, elongation of 53%, and Vickers microindentation hardness of 1.78 GPa, respectively for the wrought 316 L stainlesssteel substrate. Aging of both the Inconel 690 cladding and the 316 L stainless-steel substrate at 685?C for50 h precipitated Cr_(23)C_6 carbides in the Inconel 690 columnar grain boundaries, but not in the low-angle(and low energy) subgrain boundaries. In contrast, Cr_(23)C_6 carbides precipitated in the 316 L stainless-steel grain boundaries, but not in the low energy coherent {111} twin boundaries. Consequently, the Inconel690 subgrain boundaries essentially serve as surrogates for coherent twin boundaries with regard to avoiding carbide precipitation and corrosion sensitization. 展开更多
关键词 Additive manufacturing ELECTRON-BEAM powder bed fusion (EPBF) INCONEL 690 CLADDING 316L STAINLESS steel Grain boundary engineering Materials characterization Mechanical properties
原文传递
PowerDetector:Malicious PowerShell Script Family Classification Based on Multi-Modal Semantic Fusion and Deep Learning 被引量:8
10
作者 Xiuzhang Yang Guojun Peng +2 位作者 Dongni Zhang Yuhang Gao Chenguang Li 《China Communications》 SCIE CSCD 2023年第11期202-224,共23页
Power Shell has been widely deployed in fileless malware and advanced persistent threat(APT)attacks due to its high stealthiness and live-off-theland technique.However,existing works mainly focus on deobfuscation and ... Power Shell has been widely deployed in fileless malware and advanced persistent threat(APT)attacks due to its high stealthiness and live-off-theland technique.However,existing works mainly focus on deobfuscation and malicious detection,lacking the malicious Power Shell families classification and behavior analysis.Moreover,the state-of-the-art methods fail to capture fine-grained features and semantic relationships,resulting in low robustness and accuracy.To this end,we propose Power Detector,a novel malicious Power Shell script detector based on multimodal semantic fusion and deep learning.Specifically,we design four feature extraction methods to extract key features from character,token,abstract syntax tree(AST),and semantic knowledge graph.Then,we intelligently design four embeddings(i.e.,Char2Vec,Token2Vec,AST2Vec,and Rela2Vec) and construct a multi-modal fusion algorithm to concatenate feature vectors from different views.Finally,we propose a combined model based on transformer and CNN-Bi LSTM to implement Power Shell family detection.Our experiments with five types of Power Shell attacks show that PowerDetector can accurately detect various obfuscated and stealth PowerShell scripts,with a 0.9402 precision,a 0.9358 recall,and a 0.9374 F1-score.Furthermore,through singlemodal and multi-modal comparison experiments,we demonstrate that PowerDetector’s multi-modal embedding and deep learning model can achieve better accuracy and even identify more unknown attacks. 展开更多
关键词 deep learning malicious family detection multi-modal semantic fusion POWERSHELL
在线阅读 下载PDF
MMGC-Net: Deep neural network for classification of mineral grains using multi-modal polarization images 被引量:1
11
作者 Jun Shu Xiaohai He +3 位作者 Qizhi Teng Pengcheng Yan Haibo He Honggang Chen 《Journal of Rock Mechanics and Geotechnical Engineering》 2025年第6期3894-3909,共16页
The multi-modal characteristics of mineral particles play a pivotal role in enhancing the classification accuracy,which is critical for obtaining a profound understanding of the Earth's composition and ensuring ef... The multi-modal characteristics of mineral particles play a pivotal role in enhancing the classification accuracy,which is critical for obtaining a profound understanding of the Earth's composition and ensuring effective exploitation utilization of its resources.However,the existing methods for classifying mineral particles do not fully utilize these multi-modal features,thereby limiting the classification accuracy.Furthermore,when conventional multi-modal image classification methods are applied to planepolarized and cross-polarized sequence images of mineral particles,they encounter issues such as information loss,misaligned features,and challenges in spatiotemporal feature extraction.To address these challenges,we propose a multi-modal mineral particle polarization image classification network(MMGC-Net)for precise mineral particle classification.Initially,MMGC-Net employs a two-dimensional(2D)backbone network with shared parameters to extract features from two types of polarized images to ensure feature alignment.Subsequently,a cross-polarized intra-modal feature fusion module is designed to refine the spatiotemporal features from the extracted features of the cross-polarized sequence images.Ultimately,the inter-modal feature fusion module integrates the two types of modal features to enhance the classification precision.Quantitative and qualitative experimental results indicate that when compared with the current state-of-the-art multi-modal image classification methods,MMGC-Net demonstrates marked superiority in terms of mineral particle multi-modal feature learning and four classification evaluation metrics.It also demonstrates better stability than the existing models. 展开更多
关键词 Mineral particles multi-modal image classification Shared parameters Feature fusion Spatiotemporal feature
暂未订购
Adaptive Multi-modal Fusion Instance Segmentation for CAEVs in Complex Conditions:Dataset,Framework and Verifications 被引量:3
12
作者 Pai Peng Keke Geng +3 位作者 Guodong Yin Yanbo Lu Weichao Zhuang Shuaipeng Liu 《Chinese Journal of Mechanical Engineering》 SCIE EI CAS CSCD 2021年第5期96-106,共11页
Current works of environmental perception for connected autonomous electrified vehicles(CAEVs)mainly focus on the object detection task in good weather and illumination conditions,they often perform poorly in adverse ... Current works of environmental perception for connected autonomous electrified vehicles(CAEVs)mainly focus on the object detection task in good weather and illumination conditions,they often perform poorly in adverse scenarios and have a vague scene parsing ability.This paper aims to develop an end-to-end sharpening mixture of experts(SMoE)fusion framework to improve the robustness and accuracy of the perception systems for CAEVs in complex illumination and weather conditions.Three original contributions make our work distinctive from the existing relevant literature.The Complex KITTI dataset is introduced which consists of 7481 pairs of modified KITTI RGB images and the generated LiDAR dense depth maps,and this dataset is fine annotated in instance-level with the proposed semi-automatic annotation method.The SMoE fusion approach is devised to adaptively learn the robust kernels from complementary modalities.Comprehensive comparative experiments are implemented,and the results show that the proposed SMoE framework yield significant improvements over the other fusion techniques in adverse environmental conditions.This research proposes a SMoE fusion framework to improve the scene parsing ability of the perception systems for CAEVs in adverse conditions. 展开更多
关键词 Connected autonomous electrified vehicles multi-modal fusion Semi-automatic annotation Sharpening mixture of experts Comparative experiments
在线阅读 下载PDF
Test method of laser paint removal based on multi-modal feature fusion 被引量:2
13
作者 HUANG Hai-peng HAO Ben-tian +2 位作者 YE De-jun GAO Hao LI Liang 《Journal of Central South University》 SCIE EI CAS CSCD 2022年第10期3385-3398,共14页
Laser cleaning is a highly nonlinear physical process for solving poor single-modal(e.g., acoustic or vision)detection performance and low inter-information utilization. In this study, a multi-modal feature fusion net... Laser cleaning is a highly nonlinear physical process for solving poor single-modal(e.g., acoustic or vision)detection performance and low inter-information utilization. In this study, a multi-modal feature fusion network model was constructed based on a laser paint removal experiment. The alignment of heterogeneous data under different modals was solved by combining the piecewise aggregate approximation and gramian angular field. Moreover, the attention mechanism was introduced to optimize the dual-path network and dense connection network, enabling the sampling characteristics to be extracted and integrated. Consequently, the multi-modal discriminant detection of laser paint removal was realized. According to the experimental results, the verification accuracy of the constructed model on the experimental dataset was 99.17%, which is 5.77% higher than the optimal single-modal detection results of the laser paint removal. The feature extraction network was optimized by the attention mechanism, and the model accuracy was increased by 3.3%. Results verify the improved classification performance of the constructed multi-modal feature fusion model in detecting laser paint removal, the effective integration of acoustic data and visual image data, and the accurate detection of laser paint removal. 展开更多
关键词 laser cleaning multi-modal fusion image processing deep learning
在线阅读 下载PDF
Multi-modality hierarchical fusion network for lumbar spine segmentation with magnetic resonance images 被引量:1
14
作者 Han Yan Guangtao Zhang +1 位作者 Wei Cui Zhuliang Yu 《Control Theory and Technology》 EI CSCD 2024年第4期612-622,共11页
For the analysis of spinal and disc diseases,automated tissue segmentation of the lumbar spine is vital.Due to the continuous and concentrated location of the target,the abundance of edge features,and individual diffe... For the analysis of spinal and disc diseases,automated tissue segmentation of the lumbar spine is vital.Due to the continuous and concentrated location of the target,the abundance of edge features,and individual differences,conventional automatic segmentation methods perform poorly.Since the success of deep learning in the segmentation of medical images has been shown in the past few years,it has been applied to this task in a number of ways.The multi-scale and multi-modal features of lumbar tissues,however,are rarely explored by methodologies of deep learning.Because of the inadequacies in medical images availability,it is crucial to effectively fuse various modes of data collection for model training to alleviate the problem of insufficient samples.In this paper,we propose a novel multi-modality hierarchical fusion network(MHFN)for improving lumbar spine segmentation by learning robust feature representations from multi-modality magnetic resonance images.An adaptive group fusion module(AGFM)is introduced in this paper to fuse features from various modes to extract cross-modality features that could be valuable.Furthermore,to combine features from low to high levels of cross-modality,we design a hierarchical fusion structure based on AGFM.Compared to the other feature fusion methods,AGFM is more effective based on experimental results on multi-modality MR images of the lumbar spine.To further enhance segmentation accuracy,we compare our network with baseline fusion structures.Compared to the baseline fusion structures(input-level:76.27%,layer-level:78.10%,decision-level:79.14%),our network was able to segment fractured vertebrae more accurately(85.05%). 展开更多
关键词 Lumbar spine segmentation Deep learning multi-modality fusion Feature fusion
原文传递
A multi-rate sensor fusion approach using information filters for estimating aero-engine performance degradation 被引量:5
15
作者 Feng LU Chunyu JIANG +1 位作者 Jinquan HUANG Xiaojie QIU 《Chinese Journal of Aeronautics》 SCIE EI CAS CSCD 2019年第7期1603-1617,共15页
Gas-path performance estimation plays an important role in aero-engine health management, and Kalman Filter(KF) is a well-known technique to estimate performance degradation. In previous studies, it is assumed that di... Gas-path performance estimation plays an important role in aero-engine health management, and Kalman Filter(KF) is a well-known technique to estimate performance degradation. In previous studies, it is assumed that different kinds of sensors are with the same sampling rate, and they are used for state estimation by the KF simultaneously. However, it is hard to achieve state estimation using various kinds of sensor measurements at the same sampling rate due to a complex network and physical characteristic differences between sensors, especially in an advanced multisensor architecture. For this purpose, a multi-rate sensor fusion using the information filtering approach is proposed based on the square-root cubature rule, which is called Multi-rate Squareroot Cubature Information Filter(MSCIF) to track engine performance degradation. Soft measurement synchronization of the MSCIF is designed to provide a sensor fusion condition for multiple sampling rates of measurement, and a fault sensor is isolated by maximum likelihood validation before state estimation. The contribution of this paper is to supply a novel multi-rate informationfilter approach for sensor fault tolerant health estimation of an aero-engine in a multi-sensor system. Tests are conducted for aero-engine performance degradation estimation with multiple sampling rates of sensor measurement on both digital simulation and semi-physical experiment.Experimental results illustrate the superiority of the proposed algorithm in terms of degradation estimation accuracy and robustness to sensor failure in a multi-sensor system. 展开更多
关键词 AERO-engine CUBATURE information filter Performance DEGRADATION Sensor fusion State estimation
原文传递
MMCSD:Multi-Modal Knowledge Graph Completion Based on Super-Resolution and Detailed Description Generation
16
作者 Huansha Wang Ruiyang Huang +2 位作者 Qinrang Liu Shaomei Li Jianpeng Zhang 《Computers, Materials & Continua》 2025年第4期761-783,共23页
Multi-modal knowledge graph completion(MMKGC)aims to complete missing entities or relations in multi-modal knowledge graphs,thereby discovering more previously unknown triples.Due to the continuous growth of data and ... Multi-modal knowledge graph completion(MMKGC)aims to complete missing entities or relations in multi-modal knowledge graphs,thereby discovering more previously unknown triples.Due to the continuous growth of data and knowledge and the limitations of data sources,the visual knowledge within the knowledge graphs is generally of low quality,and some entities suffer from the issue of missing visual modality.Nevertheless,previous studies of MMKGC have primarily focused on how to facilitate modality interaction and fusion while neglecting the problems of low modality quality and modality missing.In this case,mainstream MMKGC models only use pre-trained visual encoders to extract features and transfer the semantic information to the joint embeddings through modal fusion,which inevitably suffers from problems such as error propagation and increased uncertainty.To address these problems,we propose a Multi-modal knowledge graph Completion model based on Super-resolution and Detailed Description Generation(MMCSD).Specifically,we leverage a pre-trained residual network to enhance the resolution and improve the quality of the visual modality.Moreover,we design multi-level visual semantic extraction and entity description generation,thereby further extracting entity semantics from structural triples and visual images.Meanwhile,we train a variational multi-modal auto-encoder and utilize a pre-trained multi-modal language model to complement the missing visual features.We conducted experiments on FB15K-237 and DB13K,and the results showed that MMCSD can effectively perform MMKGC and achieve state-of-the-art performance. 展开更多
关键词 multi-modal knowledge graph knowledge graph completion multi-modal fusion
在线阅读 下载PDF
Transformers for Multi-Modal Image Analysis in Healthcare
17
作者 Sameera V Mohd Sagheer Meghana K H +2 位作者 P M Ameer Muneer Parayangat Mohamed Abbas 《Computers, Materials & Continua》 2025年第9期4259-4297,共39页
Integrating multiple medical imaging techniques,including Magnetic Resonance Imaging(MRI),Computed Tomography,Positron Emission Tomography(PET),and ultrasound,provides a comprehensive view of the patient health status... Integrating multiple medical imaging techniques,including Magnetic Resonance Imaging(MRI),Computed Tomography,Positron Emission Tomography(PET),and ultrasound,provides a comprehensive view of the patient health status.Each of these methods contributes unique diagnostic insights,enhancing the overall assessment of patient condition.Nevertheless,the amalgamation of data from multiple modalities presents difficulties due to disparities in resolution,data collection methods,and noise levels.While traditional models like Convolutional Neural Networks(CNNs)excel in single-modality tasks,they struggle to handle multi-modal complexities,lacking the capacity to model global relationships.This research presents a novel approach for examining multi-modal medical imagery using a transformer-based system.The framework employs self-attention and cross-attention mechanisms to synchronize and integrate features across various modalities.Additionally,it shows resilience to variations in noise and image quality,making it adaptable for real-time clinical use.To address the computational hurdles linked to transformer models,particularly in real-time clinical applications in resource-constrained environments,several optimization techniques have been integrated to boost scalability and efficiency.Initially,a streamlined transformer architecture was adopted to minimize the computational load while maintaining model effectiveness.Methods such as model pruning,quantization,and knowledge distillation have been applied to reduce the parameter count and enhance the inference speed.Furthermore,efficient attention mechanisms such as linear or sparse attention were employed to alleviate the substantial memory and processing requirements of traditional self-attention operations.For further deployment optimization,researchers have implemented hardware-aware acceleration strategies,including the use of TensorRT and ONNX-based model compression,to ensure efficient execution on edge devices.These optimizations allow the approach to function effectively in real-time clinical settings,ensuring viability even in environments with limited resources.Future research directions include integrating non-imaging data to facilitate personalized treatment and enhancing computational efficiency for implementation in resource-limited environments.This study highlights the transformative potential of transformer models in multi-modal medical imaging,offering improvements in diagnostic accuracy and patient care outcomes. 展开更多
关键词 multi-modal image analysis medical imaging deep learning image segmentation disease detection multi-modal fusion Vision Transformers(ViTs) precision medicine clinical decision support
在线阅读 下载PDF
A multi-modal hierarchical approach for Chinese spelling correction using multi-head attention and residual connections
18
作者 SHAO Qing DU Yiwei 《High Technology Letters》 2025年第3期309-320,共12页
The primary objective of Chinese spelling correction(CSC)is to detect and correct erroneous characters in Chinese text,which can result from various factors,such as inaccuracies in pinyin representation,character rese... The primary objective of Chinese spelling correction(CSC)is to detect and correct erroneous characters in Chinese text,which can result from various factors,such as inaccuracies in pinyin representation,character resemblance,and semantic discrepancies.However,existing methods often struggle to fully address these types of errors,impacting the overall correction accuracy.This paper introduces a multi-modal feature encoder designed to efficiently extract features from three distinct modalities:pinyin,semantics,and character morphology.Unlike previous methods that rely on direct fusion or fixed-weight summation to integrate multi-modal information,our approach employs a multi-head attention mechanism to focuse more on relevant modal information while dis-regarding less pertinent data.To prevent issues such as gradient explosion or vanishing,the model incorporates a residual connection of the original text vector for fine-tuning.This approach ensures robust model performance by maintaining essential linguistic details throughout the correction process.Experimental evaluations on the SIGHAN benchmark dataset demonstrate that the pro-posed model outperforms baseline approaches across various metrics and datasets,confirming its effectiveness and feasibility. 展开更多
关键词 Chinese spelling correction multiple-headed attention multi-modal fusion resid-ual connection pinyin encoder
在线阅读 下载PDF
Adaptive cross-fusion learning for multi-modal gesture recognition 被引量:1
19
作者 Benjia ZHOU Jun WAN +1 位作者 Yanyan LIANG Guodong GUO 《Virtual Reality & Intelligent Hardware》 2021年第3期235-247,共13页
Background Gesture recognition has attracted significant attention because of its wide range of potential applications.Although multi-modal gesture recognition has made significant progress in recent years,a popular m... Background Gesture recognition has attracted significant attention because of its wide range of potential applications.Although multi-modal gesture recognition has made significant progress in recent years,a popular method still is simply fusing prediction scores at the end of each branch,which often ignores complementary features among different modalities in the early stage and does not fuse the complementary features into a more discriminative feature.Methods This paper proposes an Adaptive Cross-modal Weighting(ACmW)scheme to exploit complementarity features from RGB-D data in this study.The scheme learns relations among different modalities by combining the features of different data streams.The proposed ACmW module contains two key functions:(1)fusing complementary features from multiple streams through an adaptive one-dimensional convolution;and(2)modeling the correlation of multi-stream complementary features in the time dimension.Through the effective combination of these two functional modules,the proposed ACmW can automatically analyze the relationship between the complementary features from different streams,and can fuse them in the spatial and temporal dimensions.Results Extensive experiments validate the effectiveness of the proposed method,and show that our method outperforms state-of-the-art methods on IsoGD and NVGesture. 展开更多
关键词 Gesture recognition multi-modal fusion RGB-D
在线阅读 下载PDF
A Hand Features Based Fusion Recognition Network with Enhancing Multi-Modal Correlation
20
作者 Wei Wu Yuan Zhang +2 位作者 Yunpeng Li Chuanyang Li YanHao 《Computer Modeling in Engineering & Sciences》 SCIE EI 2024年第7期537-555,共19页
Fusing hand-based features in multi-modal biometric recognition enhances anti-spoofing capabilities.Additionally,it leverages inter-modal correlation to enhance recognition performance.Concurrently,the robustness and ... Fusing hand-based features in multi-modal biometric recognition enhances anti-spoofing capabilities.Additionally,it leverages inter-modal correlation to enhance recognition performance.Concurrently,the robustness and recognition performance of the system can be enhanced through judiciously leveraging the correlation among multimodal features.Nevertheless,two issues persist in multi-modal feature fusion recognition:Firstly,the enhancement of recognition performance in fusion recognition has not comprehensively considered the inter-modality correlations among distinct modalities.Secondly,during modal fusion,improper weight selection diminishes the salience of crucial modal features,thereby diminishing the overall recognition performance.To address these two issues,we introduce an enhanced DenseNet multimodal recognition network founded on feature-level fusion.The information from the three modalities is fused akin to RGB,and the input network augments the correlation between modes through channel correlation.Within the enhanced DenseNet network,the Efficient Channel Attention Network(ECA-Net)dynamically adjusts the weight of each channel to amplify the salience of crucial information in each modal feature.Depthwise separable convolution markedly reduces the training parameters and further enhances the feature correlation.Experimental evaluations were conducted on four multimodal databases,comprising six unimodal databases,including multispectral palmprint and palm vein databases from the Chinese Academy of Sciences.The Equal Error Rates(EER)values were 0.0149%,0.0150%,0.0099%,and 0.0050%,correspondingly.In comparison to other network methods for palmprint,palm vein,and finger vein fusion recognition,this approach substantially enhances recognition performance,rendering it suitable for high-security environments with practical applicability.The experiments in this article utilized amodest sample database comprising 200 individuals.The subsequent phase involves preparing for the extension of the method to larger databases. 展开更多
关键词 BIOMETRICS multi-modal CORRELATION deep learning feature-level fusion
在线阅读 下载PDF
上一页 1 2 59 下一页 到第
使用帮助 返回顶部