From preparedness to action: Synthesising insights on robot usage in post-earthquake search operations
1
Authors: Rajashekhar V S Gowdham Prabhakar. Resilient Cities and Structures, 2026, No. 1, pp. 60-70 (11 pages)
Human life is invaluable, and timely efforts are crucial to rescue individuals trapped under debris following an earthquake. To identify opportunities for improving post-earthquake search and rescue (SAR) robotics, we draw insights from four different sources: (i) a literature review of disaster robotics and victim psychology, (ii) a public survey on earthquake awareness and views of rescue robots, (iii) semi-structured interviews with first responders, and (iv) responses from GenAI chatbots which were prompted to assume the role of expert rescuers. The triangulated analysis shows that there are challenges in mobility, autonomy, communication, situational awareness, and human-robot collaboration. The public respondents showed high acceptance of robot-assisted rescue and prioritised survivor detection, sensing, and communication as essential functionalities of robots. The experts pointed to limitations in current equipment, the need for improved victim localisation, and interest in XR-based training and robot-assisted debris handling. The GenAI chatbots highlighted structural risk assessment, multi-sensor fusion, and supervised autonomy. Therefore, this study identifies critical robot features, outlines multi-modal interaction requirements, and highlights gaps in current SAR practice. These findings offer robot design directions for developing effective, trustworthy SAR robots, which can be integrated into future disaster-response workflows.
Keywords: Robots Disasters Earthquake search multi-modal interaction Human-Robot interaction
MDGET-MER: Multi-Level Dynamic Gating and Emotion Transfer for Multi-Modal Emotion Recognition
2
Authors: Musheng Chen Qiang Wen Xiaohong Qiu Junhua Wu Wenqing Fu. Computers, Materials & Continua, 2026, No. 3, pp. 872-893 (22 pages)
In multi-modal emotion recognition, excessive reliance on historical context often impedes the detection of emotional shifts, while modality heterogeneity and unimodal noise limit recognition performance. Existing methods struggle to dynamically adjust cross-modal complementary strength to optimize fusion quality and lack effective mechanisms to model the dynamic evolution of emotions. To address these issues, we propose a multi-level dynamic gating and emotion transfer framework for multi-modal emotion recognition. A dynamic gating mechanism is applied across unimodal encoding, cross-modal alignment, and emotion transfer modeling, substantially improving noise robustness and feature alignment. First, we construct a unimodal encoder based on gated recurrent units and feature-selection gating to suppress intra-modal noise and enhance contextual representation. Second, we design a gated-attention cross-modal encoder that dynamically calibrates the complementary contributions of visual and audio modalities to the dominant textual features and eliminates redundant information. Finally, we introduce a gated enhanced emotion transfer module that explicitly models the temporal dependence of emotional evolution in dialogues via transfer gating and optimizes continuity modeling with a comparative learning loss. Experimental results demonstrate that the proposed method outperforms state-of-the-art models on the public MELD and IEMOCAP datasets.
Keywords: multi-modal emotion recognition dynamic gating emotion transfer module cross-modal dynamic alignment noise robustness
Lines Across Horizons: Hainan's island-wide special customs operations are redefining boundaries to elevate opening-up
3
Author: Yang Hangjun. China Report ASEAN, 2026, No. 3, pp. 16-19 (4 pages)
On December 18, 2025, Hainan Free Trade Port (Hainan FTP) officially began islandwide special customs operations. Although only two months have passed since this landmark step, the shift is already visible everywhere, from the bustling flow of international passengers at Haikou Meilan International Airport to the steady stream of cargo vessels calling at Yangpu Port, and even in the sustained attention investors are paying to “Hainan-related” stocks. Together, these signals point to one clear conclusion: China’s largest special economic zone has entered a new phase of development.
Keywords: free trade port Special Customs operations Hainan Free Trade Port Islandwide Special Customs operations International Passengers cargo vessels Investors Opening up
Preparation of digital-encoded and analog-encoded quantum states corresponding to matrix operations
4
Authors: Kaitian Gao Youlong Yang Zhenye Du. Chinese Physics B, 2026, No. 1, pp. 332-344 (13 pages)
Efficient implementation of fundamental matrix operations on quantum computers, such as matrix products and Hadamard operations, holds significant potential for accelerating machine learning algorithms. A critical prerequisite for quantum implementations is the effective encoding of classical data into quantum states. We propose two quantum computing frameworks for preparing the distinct encoded states corresponding to matrix operations, including the matrix product, matrix sum, matrix Hadamard product and division. Quantum algorithms based on the digital-encoding framework are capable of implementing the matrix Hadamard operation with a time complexity of O(poly log(mn/ε)) and the matrix product with a time complexity of O(poly log(mnl/ε)), achieving an exponential speedup over the classical methods of O(mn) and O(mnl). Quantum algorithms based on the analog-encoding framework are capable of implementing the matrix Hadamard operation with a time complexity of O(k_1 √mn · poly log(mn/ε)) and the matrix product with a time complexity of O(k_2 √1 · poly log(mnl/ε)), where k_1 and k_2 are coefficients correlated with the elements of the matrix, achieving a quadratic speedup over the classical counterparts. As applications, we construct an oracle that can access the trace of a matrix within logarithmic time, and propose several algorithms to estimate, respectively, the trace of a matrix, the trace of the product of two matrices, and the trace inner product of two matrices within logarithmic time.
Keywords: quantum algorithm matrix operation digital and analog-encoded states quantum computing
Splitting of Operations for Di-Associative Algebras and Tri-Associative Algebras
5
Authors: Wen TENG Xiansheng DAI. Journal of Mathematical Research with Applications, 2026, No. 1, pp. 21-32 (12 pages)
Loday introduced di-associative algebras and tri-associative algebras motivated by periodicity phenomena in algebraic K-theory. The purpose of this paper is to study the splittings of operations on di-associative algebras and tri-associative algebras. We introduce the notion of a quad-dendriform algebra, which is a splitting of a di-associative algebra. We show that a relative averaging operator on dendriform algebras gives rise to a quad-dendriform algebra. Furthermore, we introduce the notion of six-dendriform algebras, which are splittings of the tri-associative algebras, and demonstrate that homomorphic relative averaging operators induce six-dendriform algebras.
Keywords: dendriform algebra di-associative algebra quad-dendriform algebra tri-associative algebra six-dendriform algebra relative averaging operator
Gateway to China: Hainan's special customs operations facilitate access to the Chinese market
6
Author: Wang Junshan. China Report ASEAN, 2026, No. 3, pp. 34-35 (2 pages)
For ordinary tourists, simpler entry and exit procedures and a broader range of duty-free goods in Hainan create a better travel and shopping experience. For China’s earnest endeavor to deepen reform and opening-up, implementation of the special customs operations policy in Hainan represents a significant step forward. For businesses in Malaysia and other ASEAN member states, especially export-oriented small and medium-sized enterprises (SMEs), Hainan serves as a “transit hub” for accessing the Chinese market and even other Asian markets.
Keywords: SMEs simplified entry exit procedures duty free goods ASEAN special customs operations Hainan Chinese market reform opening up
Forging a New Trade Frontier: The special customs operations in Hainan Free Trade Port are blazing new opportunities for inbound and outbound investment
7
Author: Wang Ruohan. China Report ASEAN, 2026, No. 3, pp. 26-27 (2 pages)
With the gradual implementation of a series of institutional arrangements, Hainan is becoming a new hot spot for global investment and an ideal destination for starting businesses and developing industry. While attracting foreign investment projects, it is also creating more favorable conditions for local enterprises to expand into international markets.
Keywords: global investment series institutional arrangements inbound investment trade frontier customs operations developing industry outbound investment institutional arrangements
Sky's the Limit: In Wenchang, commercial aerospace is soaring to new heights, propelled by Hainan FTP's island-wide special customs operations
8
Author: Xia Yuanyuan. China Report ASEAN, 2026, No. 3, pp. 30-31 (2 pages)
At 11:00 p.m. on January 13, 2026, floodlights illuminated the launch pad at the Hainan Commercial Space Launch Site in Wenchang, south China’s Hainan Province. A Long March-8A (CZ-8A) carrier rocket lifted off with a steady roar, its exhaust lighting up the night sky as it delivered a satellite into its designated orbit. The development of China’s first commercial space launch site has been striking. Since its inaugural launch in 2024, it has completed 11 missions in less than 14 months, each a success.
Keywords: Long March space launch site satellite launch special customs operations commercial space launch launch pad Hainan Wenchang commercial aerospace
Construction and evaluation of a predictive model for the degree of coronary artery occlusion based on adaptive weighted multi-modal fusion of traditional Chinese and western medicine data (Cited by: 2)
9
Authors: Jiyu ZHANG Jiatuo XU Liping TU Hongyuan FU. Digital Chinese Medicine, 2025, No. 2, pp. 163-173 (11 pages)
Objective: To develop a non-invasive predictive model for coronary artery stenosis severity based on adaptive multi-modal integration of traditional Chinese and western medicine data. Methods: Clinical indicators, echocardiographic data, traditional Chinese medicine (TCM) tongue manifestations, and facial features were collected from patients who underwent coronary computed tomography angiography (CTA) in the Cardiac Care Unit (CCU) of Shanghai Tenth People's Hospital between May 1, 2023 and May 1, 2024. An adaptive weighted multi-modal data fusion (AWMDF) model based on deep learning was constructed to predict the severity of coronary artery stenosis. The model was evaluated using metrics including accuracy, precision, recall, F1 score, and the area under the receiver operating characteristic (ROC) curve (AUC). Further performance assessment was conducted through comparisons with six ensemble machine learning methods, data ablation, model component ablation, and various decision-level fusion strategies. Results: A total of 158 patients were included in the study. The AWMDF model achieved excellent predictive performance (AUC = 0.973, accuracy = 0.937, precision = 0.937, recall = 0.929, and F1 score = 0.933). Compared with model ablation, data ablation experiments, and various traditional machine learning models, the AWMDF model demonstrated superior performance. Moreover, the adaptive weighting strategy outperformed alternative approaches, including simple weighting, averaging, voting, and fixed-weight schemes. Conclusion: The AWMDF model demonstrates potential clinical value in the non-invasive prediction of coronary artery disease and could serve as a tool for clinical decision support.
Keywords: Coronary artery disease Deep learning multi-modal Clinical prediction Traditional Chinese medicine diagnosis
TCM network pharmacology: new perspective integrating network target with artificial intelligence and multi-modal multi-omics technologies (Cited by: 1)
10
Authors: Ziyi Wang Tingyu Zhang Boyang Wang Shao Li. Chinese Journal of Natural Medicines, 2025, No. 11, pp. 1425-1434 (10 pages)
Traditional Chinese medicine (TCM) demonstrates distinctive advantages in disease prevention and treatment. However, analyzing its biological mechanisms through the modern medical research paradigm of “single drug, single target” presents significant challenges due to its holistic approach. Network pharmacology and its core theory of network targets connect drugs and diseases from a holistic and systematic perspective based on biological networks, overcoming the limitations of reductionist research models and showing considerable value in TCM research. Recent integration of network target computational and experimental methods with artificial intelligence (AI) and multi-modal multi-omics technologies has substantially enhanced network pharmacology methodology. The advancement in computational and experimental techniques provides complementary support for network target theory in decoding TCM principles. This review, centered on network targets, examines the progress of network target methods combined with AI in predicting disease molecular mechanisms and drug-target relationships, alongside the application of multi-modal multi-omics technologies in analyzing TCM formulae, syndromes, and toxicity. Looking forward, network target theory is expected to incorporate emerging technologies while developing novel approaches aligned with its unique characteristics, potentially leading to significant breakthroughs in TCM research and advancing scientific understanding and innovation in TCM.
Keywords: Network pharmacology Traditional Chinese medicine Network target Artificial intelligence multi-modal Multi-omics
MMGC-Net: Deep neural network for classification of mineral grains using multi-modal polarization images (Cited by: 1)
11
Authors: Jun Shu Xiaohai He Qizhi Teng Pengcheng Yan Haibo He Honggang Chen. Journal of Rock Mechanics and Geotechnical Engineering, 2025, No. 6, pp. 3894-3909 (16 pages)
The multi-modal characteristics of mineral particles play a pivotal role in enhancing classification accuracy, which is critical for obtaining a profound understanding of the Earth's composition and ensuring effective exploitation and utilization of its resources. However, the existing methods for classifying mineral particles do not fully utilize these multi-modal features, thereby limiting the classification accuracy. Furthermore, when conventional multi-modal image classification methods are applied to plane-polarized and cross-polarized sequence images of mineral particles, they encounter issues such as information loss, misaligned features, and challenges in spatiotemporal feature extraction. To address these challenges, we propose a multi-modal mineral particle polarization image classification network (MMGC-Net) for precise mineral particle classification. Initially, MMGC-Net employs a two-dimensional (2D) backbone network with shared parameters to extract features from two types of polarized images to ensure feature alignment. Subsequently, a cross-polarized intra-modal feature fusion module is designed to refine the spatiotemporal features from the extracted features of the cross-polarized sequence images. Ultimately, the inter-modal feature fusion module integrates the two types of modal features to enhance the classification precision. Quantitative and qualitative experimental results indicate that, compared with the current state-of-the-art multi-modal image classification methods, MMGC-Net demonstrates marked superiority in terms of mineral particle multi-modal feature learning and four classification evaluation metrics. It also demonstrates better stability than the existing models.
Keywords: Mineral particles multi-modal image classification Shared parameters Feature fusion Spatiotemporal feature
Multi-modal intelligent situation awareness in real-time air traffic control: Control intent understanding and flight trajectory prediction (Cited by: 1)
12
Authors: Dongyue GUO Jianwei ZHANG Bo YANG Yi LIN. Chinese Journal of Aeronautics, 2025, No. 6, pp. 41-57 (17 pages)
With the advent of the next-generation Air Traffic Control (ATC) system, there is growing interest in using Artificial Intelligence (AI) techniques to enhance Situation Awareness (SA) for ATC Controllers (ATCOs), i.e., Intelligent SA (ISA). However, the existing AI-based SA approaches often rely on unimodal data and lack a comprehensive description and benchmark of the ISA tasks utilizing multi-modal data for real-time ATC environments. To address this gap, by analyzing the situation awareness procedure of the ATCOs, the ISA task is refined to the processing of the two primary elements, i.e., spoken instructions and flight trajectories. Subsequently, the ISA is further formulated into Controlling Intent Understanding (CIU) and Flight Trajectory Prediction (FTP) tasks. For the CIU task, an innovative automatic speech recognition and understanding framework is designed to extract the controlling intent from unstructured and continuous ATC communications. For the FTP task, the single- and multi-horizon FTP approaches are investigated to support the high-precision prediction of the situation evolution. A total of 32 unimodal/multi-modal advanced methods with extensive evaluation metrics are introduced to conduct the benchmarks on the real-world multi-modal ATC situation dataset. Experimental results demonstrate the effectiveness of AI-based techniques in enhancing ISA for the ATC environment.
Keywords: Air traffic control Automatic speech recognition and understanding Flight trajectory prediction multi-modal Situation awareness
Personal Style Guided Outfit Recommendation with Multi-Modal Fashion Compatibility Modeling (Cited by: 1)
13
Authors: WANG Kexin ZHANG Jie ZHANG Peng SUN Kexin ZHAN Jiamei WEI Meng. Journal of Donghua University (English Edition), 2025, No. 2, pp. 156-167 (12 pages)
Personalized outfit recommendation has emerged as a hot research topic in the fashion domain. However, existing recommendations do not fully exploit user style preferences. Typically, users prefer particular styles such as casual and athletic styles, and consider attributes like color and texture when selecting outfits. To achieve personalized outfit recommendations in line with user style preferences, this paper proposes a personal style guided outfit recommendation with multi-modal fashion compatibility modeling, termed PSGNet. Firstly, a style classifier is designed to categorize fashion images of various clothing types and attributes into distinct style categories. Secondly, a personal style prediction module extracts user style preferences by analyzing historical data. Then, to address the limitations of single-modal representations and enhance fashion compatibility, both fashion images and text data are leveraged to extract multi-modal features. Finally, PSGNet integrates these components through Bayesian personalized ranking (BPR) to unify the personal style and fashion compatibility, where the former is used as personal style features and guides the output of the personalized outfit recommendation tailored to the target user. Extensive experiments on large-scale datasets demonstrate that the proposed model is efficient for personalized outfit recommendation.
Keywords: personalized outfit recommendation fashion compatibility modeling style preference multi-modal representation Bayesian personalized ranking (BPR) style classifier
Multi-Modal Named Entity Recognition with Auxiliary Visual Knowledge and Word-Level Fusion
14
Authors: Huansha Wang Ruiyang Huang Qinrang Liu Xinghao Wang. Computers, Materials & Continua, 2025, No. 6, pp. 5747-5760 (14 pages)
Multi-modal Named Entity Recognition (MNER) aims to better identify meaningful textual entities by integrating information from images. Previous work has focused on extracting visual semantics at a fine-grained level, or obtaining entity-related external knowledge from knowledge bases or Large Language Models (LLMs). However, these approaches ignore the poor semantic correlation between visual and textual modalities in MNER datasets and do not explore different multi-modal fusion approaches. In this paper, we present MMAVK, a multi-modal named entity recognition model with auxiliary visual knowledge and word-level fusion, which aims to leverage the Multi-modal Large Language Model (MLLM) as an implicit knowledge base. It also extracts vision-based auxiliary knowledge from the image for more accurate and effective recognition. Specifically, we propose vision-based auxiliary knowledge generation, which guides the MLLM to extract external knowledge exclusively derived from images to aid entity recognition by designing target-specific prompts, thus avoiding redundant recognition and cognitive confusion caused by the simultaneous processing of image-text pairs. Furthermore, we employ a word-level multi-modal fusion mechanism to fuse the extracted external knowledge with each word embedding produced by the transformer-based encoder. Extensive experimental results demonstrate that MMAVK outperforms or equals the state-of-the-art methods on the two classical MNER datasets, even when the large models employed have significantly fewer parameters than other baselines.
Keywords: multi-modal named entity recognition large language model multi-modal fusion
MMCSD: Multi-Modal Knowledge Graph Completion Based on Super-Resolution and Detailed Description Generation
15
Authors: Huansha Wang Ruiyang Huang Qinrang Liu Shaomei Li Jianpeng Zhang. Computers, Materials & Continua, 2025, No. 4, pp. 761-783 (23 pages)
Multi-modal knowledge graph completion (MMKGC) aims to complete missing entities or relations in multi-modal knowledge graphs, thereby discovering more previously unknown triples. Due to the continuous growth of data and knowledge and the limitations of data sources, the visual knowledge within the knowledge graphs is generally of low quality, and some entities lack a visual modality altogether. Nevertheless, previous studies of MMKGC have primarily focused on how to facilitate modality interaction and fusion while neglecting the problems of low modality quality and missing modalities. In this case, mainstream MMKGC models only use pre-trained visual encoders to extract features and transfer the semantic information to the joint embeddings through modal fusion, which inevitably suffers from problems such as error propagation and increased uncertainty. To address these problems, we propose a Multi-modal knowledge graph Completion model based on Super-resolution and Detailed Description Generation (MMCSD). Specifically, we leverage a pre-trained residual network to enhance the resolution and improve the quality of the visual modality. Moreover, we design multi-level visual semantic extraction and entity description generation, thereby further extracting entity semantics from structural triples and visual images. Meanwhile, we train a variational multi-modal auto-encoder and utilize a pre-trained multi-modal language model to complement the missing visual features. We conducted experiments on FB15K-237 and DB13K, and the results showed that MMCSD can effectively perform MMKGC and achieve state-of-the-art performance.
Keywords: multi-modal knowledge graph knowledge graph completion multi-modal fusion
Transformers for Multi-Modal Image Analysis in Healthcare
16
Authors: Sameera V Mohd Sagheer Meghana K H P M Ameer Muneer Parayangat Mohamed Abbas. Computers, Materials & Continua, 2025, No. 9, pp. 4259-4297 (39 pages)
Integrating multiple medical imaging techniques, including Magnetic Resonance Imaging (MRI), Computed Tomography, Positron Emission Tomography (PET), and ultrasound, provides a comprehensive view of the patient's health status. Each of these methods contributes unique diagnostic insights, enhancing the overall assessment of the patient's condition. Nevertheless, the amalgamation of data from multiple modalities presents difficulties due to disparities in resolution, data collection methods, and noise levels. While traditional models like Convolutional Neural Networks (CNNs) excel in single-modality tasks, they struggle to handle multi-modal complexities, lacking the capacity to model global relationships. This research presents a novel approach for examining multi-modal medical imagery using a transformer-based system. The framework employs self-attention and cross-attention mechanisms to synchronize and integrate features across various modalities. Additionally, it shows resilience to variations in noise and image quality, making it adaptable for real-time clinical use. To address the computational hurdles linked to transformer models, particularly in real-time clinical applications in resource-constrained environments, several optimization techniques have been integrated to boost scalability and efficiency. Initially, a streamlined transformer architecture was adopted to minimize the computational load while maintaining model effectiveness. Methods such as model pruning, quantization, and knowledge distillation have been applied to reduce the parameter count and enhance the inference speed. Furthermore, efficient attention mechanisms such as linear or sparse attention were employed to alleviate the substantial memory and processing requirements of traditional self-attention operations. For further deployment optimization, researchers have implemented hardware-aware acceleration strategies, including the use of TensorRT and ONNX-based model compression, to ensure efficient execution on edge devices. These optimizations allow the approach to function effectively in real-time clinical settings, ensuring viability even in environments with limited resources. Future research directions include integrating non-imaging data to facilitate personalized treatment and enhancing computational efficiency for implementation in resource-limited environments. This study highlights the transformative potential of transformer models in multi-modal medical imaging, offering improvements in diagnostic accuracy and patient care outcomes.
Keywords: multi-modal image analysis medical imaging deep learning image segmentation disease detection multi-modal fusion Vision Transformers (ViTs) precision medicine clinical decision support
Classifying Network Flows through a Multi-Modal 1D CNN Approach Using Unified Traffic Representations
17
Authors: Ravi Veerabhadrappa Poornima Athikatte Sampigerayappa. Computer Systems Science & Engineering, 2025, No. 1, pp. 333-351 (19 pages)
In recent years, the analysis of encrypted network traffic has gained momentum due to the widespread use of the Transport Layer Security (TLS) and Quick UDP Internet Connections (QUIC) protocols, which complicate and prolong the analysis process. Classification models face challenges in understanding and classifying unknown traffic because of issues related to interpretability and the representation of traffic data. To tackle these complexities, multi-modal representation learning can be employed to extract meaningful features and represent them in a lower-dimensional latent space. Recently, auto-encoder-based multi-modal representation techniques have shown superior performance in representing network traffic. By combining the advantages of multi-modal representation with efficient classifiers, we can develop robust network traffic classifiers. In this paper, we propose a novel multi-modal encoder-decoder model to create unified representations of network traffic, paired with a robust 1D-CNN (one-dimensional convolutional neural network) classifier for effective traffic classification. The proposed model utilizes the ISCX Virtual Private Network-non Virtual Private Network (VPN-nonVPN) 2016 datasets to extract general multi-modal representations and to train both shallow and deep learning models, such as Random Forest and the 1D-CNN model, for traffic classification. We compare these learning approaches based on the multi-modal representations generated from the auto-encoder and the early feature fusion technique. For the classification task, both the Random Forest and 1D-CNN models, when trained on multi-modal representations, achieve over 90% accuracy on a highly imbalanced dataset.
Keywords: Encrypted network traffic multi-modal random forest 1D-CNN
Multi-Modal Pre-Synergistic Fusion Entity Alignment Based on Mutual Information Strategy Optimization
18
Authors: Huayu Li Xinxin Chen Lizhuang Tan Konstantin I. Kostromitin Athanasios V. Vasilakos Peiying Zhang. Computers, Materials & Continua, 2025, No. 11, pp. 4133-4153 (21 pages)
To address the challenge of missing modal information in entity alignment and to mitigate information loss or bias arising from modal heterogeneity during fusion, while also capturing shared information across modalities, this paper proposes a Multi-modal Pre-synergistic Entity Alignment model based on Cross-modal Mutual Information Strategy Optimization (MPSEA). The model first employs independent encoders to process multi-modal features, including text, images, and numerical values. Next, a multi-modal pre-synergistic fusion mechanism integrates graph structural and visual modal features into the textual modality as preparatory information. This pre-fusion strategy enables unified perception of heterogeneous modalities at the model’s initial stage, reducing discrepancies during the fusion process. Finally, using cross-modal deep perception reinforcement learning, the model achieves adaptive multilevel feature fusion between modalities, supporting the learning of more effective alignment strategies. Extensive experiments on multiple public datasets show that the MPSEA method achieves gains of up to 7% in Hits@1 and 8.2% in MRR on the FBDB15K dataset, and up to 9.1% in Hits@1 and 7.7% in MRR on the FBYG15K dataset, compared to existing state-of-the-art methods. These results confirm the effectiveness of the proposed model.
Keywords: Knowledge graph multi-modal entity alignment feature fusion pre-synergistic fusion
Research Progress on Multi-Modal Fusion Object Detection Algorithms for Autonomous Driving: A Review
19
Authors: Peicheng Shi Li Yang Xinlong Dong Heng Qi Aixi Yang. Computers, Materials & Continua, 2025, No. 6, pp. 3877-3917 (41 pages)
As the number and complexity of sensors in autonomous vehicles continue to rise, multimodal fusion-based object detection algorithms are increasingly being used to detect 3D environmental information, significantly advancing the development of perception technology in autonomous driving. To further promote the development of fusion algorithms and improve detection performance, this paper discusses the advantages and recent advancements of multimodal fusion-based object detection algorithms. Starting from single-modal sensor detection, the paper provides a detailed overview of typical sensors used in autonomous driving and introduces object detection methods based on images and point clouds. Image-based detection methods are categorized into monocular detection and binocular detection according to input type. Point cloud-based detection methods are classified into projection-based, voxel-based, point cluster-based, pillar-based, and graph structure-based approaches according to the technical pathway used to process point cloud features. Additionally, multimodal fusion algorithms are divided into Camera-LiDAR fusion, Camera-Radar fusion, Camera-LiDAR-Radar fusion, and other sensor fusion methods according to the types of sensors involved. Furthermore, the paper identifies five key future research directions in this field, aiming to provide insights for researchers engaged in multimodal fusion-based object detection algorithms and to encourage broader attention to the research and application of multimodal fusion-based object detection.
Keywords: multi-modal fusion 3D object detection deep learning autonomous driving