43 articles found
Deep learning-based multimodal data fusion in bone tumor management: Advances in clinical decision support
1
Authors: Tongtong Huo, Wei Wu, Xiaoliang Chen, Mingdi Xue, Pengran Liu, Jiayao Zhang, Yi Xie, Honglin Wang, Hong Zhou, Zineng Yan, Songxiang Liu, Lin Lu, Jiaming Yang, Jin Liu, Zhewei Ye. Intelligent Oncology, 2025, Issue 3, pp. 204-215 (12 pages)
Bone tumors (BTs), including osteosarcoma, Ewing sarcoma, and chondrosarcoma, are rare but biologically complex malignancies characterized by pronounced heterogeneity in anatomical location, histological subtype, and molecular alterations. Recent advances in artificial intelligence (AI), particularly deep learning, have enabled the integration of diverse clinical data modalities to support diagnosis, treatment planning, and prognostication in bone oncology. This review provides a comprehensive synthesis of AI-driven multimodal fusion strategies that incorporate radiological imaging, digital pathology, multi-omics profiling, and electronic health records. We conducted a structured review of peer-reviewed literature published between 2015 and early 2025, focusing on the development, validation, and clinical applicability of AI models for BT diagnosis, subtyping, treatment response prediction, and recurrence monitoring. Although multimodal models have demonstrated advantages over unimodal approaches, especially in handling missing data and improving generalizability, most remain constrained by single-center study designs, small sample sizes, and limited prospective or external validation. Persistent technical and translational challenges include semantic misalignment across modalities, incomplete datasets, limited model interpretability, and regulatory and infrastructural barriers to clinical integration. To address these limitations, we highlight emerging directions such as contrastive representation learning, generative data augmentation, transformer-based fusion architectures, and privacy-preserving federated learning. We also discuss the evolving role of foundation models and workflow-integrated AI agents in enhancing scalability and clinical usability. In summary, multimodal AI represents a promising paradigm for advancing precision care in BTs. Realizing its full clinical potential will require methodologically rigorous, biologically informed, and system-level approaches that bridge algorithmic innovation with real-world healthcare delivery.
Keywords: Bone tumors; multimodal data fusion; Artificial intelligence; Clinical decision support systems; Deep learning
A Root Cause Analysis Framework for Microservice Systems with Multimodal Data
2
Authors: LI Yingke, HAN Jing, SUN Yongqian, SHI Binpeng, GONG Zican. ZTE Communications, 2025, Issue 4, pp. 110-119 (10 pages)
In recent years, microservice architecture has gained increasing popularity. However, due to the complex and dynamically changing nature of microservice systems, failure detection has become more challenging. Traditional root cause analysis methods mostly rely on a single modality of data, which is insufficient to cover all failure information. Existing multimodal methods require collecting high-quality labeled samples and often face challenges in classifying unknown failure categories. To address these challenges, this paper proposes a root cause analysis framework based on a masked graph autoencoder (GAE). The main process involves feature extraction, feature dimensionality reduction based on GAE, and online clustering combined with expert input. The method is experimentally evaluated on two public datasets and compared with two baseline methods, demonstrating significant advantages even with 16% labeled samples.
Keywords: root cause analysis; multimodal data; self-supervised learning; online clustering
Construction of Human Digital Twin Model Based on Multimodal Data and Its Application in Locomotion Mode Identification (cited by: 3)
3
Authors: Ruirui Zhong, Bingtao Hu, Yixiong Feng, Hao Zheng, Zhaoxi Hong, Shanhe Lou, Jianrong Tan. Chinese Journal of Mechanical Engineering (indexed: SCIE, EI, CAS, CSCD), 2023, Issue 5, pp. 7-19 (13 pages)
With the increasing attention to the state and role of people in intelligent manufacturing, there is a strong demand for human-cyber-physical systems (HCPS) that focus on human-robot interaction. The existing intelligent manufacturing system cannot support efficient human-robot collaborative work. However, unlike machines equipped with sensors, human characteristic information is difficult to perceive and digitize instantly. In view of the high complexity and uncertainty of the human body, this paper proposes a framework for building a human digital twin (HDT) model based on multimodal data and expounds on the key technologies. A data acquisition system is built to dynamically acquire and update the body state data and physiological data of the human body and realize the digital expression of multi-source heterogeneous human body information. A bidirectional long short-term memory and convolutional neural network (BiLSTM-CNN) based network is devised to fuse multimodal human data and extract the spatiotemporal features, and human locomotion mode identification is taken as an application case. A series of optimization experiments are carried out to improve the performance of the proposed BiLSTM-CNN-based network model. The proposed model is compared with traditional locomotion mode identification models. The experimental results demonstrate the superiority of the HDT framework for human locomotion mode identification.
Keywords: Human digital twin; Human-cyber-physical system; Bidirectional long short-term memory; Convolutional neural network; multimodal data
Analysis of Emotions Using Multimodal Data: A Case Study
4
Authors: Toshiya Akiyama, Kyoko Osaka, Hirokazu Ito, Ryuichi Tanioka, Allan Paulo Blaquera, Leah Anne Christine Bollos, Tetsuya Tanioka. Journal of Biosciences and Medicines, 2023, Issue 12, pp. 54-68 (15 pages)
In this case study, we hypothesized that sympathetic nerve activity would be higher during conversation with the PALRO robot, and that conversation would result in an increase in cerebral blood flow near Broca’s area. The facial expressions of a human subject were recorded, and cerebral blood flow and heart rate variability were measured during interactions with the humanoid robot. These multimodal data were time-synchronized to quantitatively verify the change from the resting baseline in facial expression analysis, cerebral blood flow, and heart rate variability. In conclusion, sympathetic nervous activity was dominant in this subject, suggesting that the subject may have enjoyed and been excited while talking to the robot (normalized High Frequency < normalized Low Frequency: 0.22 ± 0.16 < 0.78 ± 0.16). Cerebral blood flow values were higher during conversation and in the resting state after the experiment than in the resting state before the experiment. Talking increased cerebral blood flow in the frontal region. As the subject was left-handed, it was confirmed that the right side of the brain, where Broca’s area is located, was particularly activated (Left < right: 0.15 ± 0.21 < 1.25 ± 0.17). In the sections where a “happy” facial emotion was recognized, the examiner-judged “happy” faces and the MTCNN “happy” results were also generally consistent.
Keywords: Humanoid Robots; multimodal data; Emotion Analysis
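The normalized HF and LF values quoted in this abstract are conventionally obtained by dividing each band's power by the sum of the two. The sketch below illustrates that arithmetic only; the function name and the band powers are hypothetical, chosen to reproduce the reported means, and the abstract does not state how the study computed them.

```python
def normalized_units(lf_power: float, hf_power: float) -> tuple[float, float]:
    """Return (LF, HF) in normalized units: each band's share of LF + HF power."""
    total = lf_power + hf_power
    if total <= 0:
        raise ValueError("LF + HF power must be positive")
    return lf_power / total, hf_power / total


# Hypothetical band powers (ms^2), picked only to match the reported mean ratio.
lf_nu, hf_nu = normalized_units(lf_power=780.0, hf_power=220.0)

# A larger LF share than HF share is what the abstract reads as sympathetic dominance.
sympathetic_dominant = lf_nu > hf_nu
```

Because the two normalized values always sum to 1, the reported pair 0.22 and 0.78 is internally consistent.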
Automated Machine Learning for Fault Diagnosis Using Multimodal Mel-Spectrogram and Vibration Data
5
Authors: Zehao Li, Xuting Zhang, Hongqi Lin, Wu Qin, Junyu Qi, Zhuyun Chen, Qiang Liu. Computer Modeling in Engineering & Sciences, 2026, Issue 2, pp. 471-498 (28 pages)
To ensure the safe and stable operation of rotating machinery, intelligent fault diagnosis methods hold significant research value. However, existing diagnostic approaches largely rely on manual feature extraction and expert experience, which limits their adaptability under variable operating conditions and strong noise environments, severely affecting the generalization capability of diagnostic models. To address this issue, this study proposes a multimodal fusion fault diagnosis framework based on Mel-spectrograms and automated machine learning (AutoML). The framework first extracts fault-sensitive Mel time–frequency features from acoustic signals and fuses them with statistical features of vibration signals to construct complementary fault representations. On this basis, automated machine learning techniques are introduced to enable end-to-end diagnostic workflow construction and optimal model configuration acquisition. Finally, diagnostic decisions are achieved by automatically integrating the predictions of multiple high-performance base models. Experimental results on a centrifugal pump vibration and acoustic dataset demonstrate that the proposed framework achieves high diagnostic accuracy under noise-free conditions and maintains strong robustness under noisy interference, validating its efficiency, scalability, and practical value for rotating machinery fault diagnosis.
Keywords: Automated machine learning; mechanical fault diagnosis; feature engineering; multimodal data
MILCAnet: a dominant feature attention framework for enhanced multimodal data analysis in depression detection
6
Authors: Qian RONG, Cheng SONG, Yaru ZHANG, Yao YU, Ping LIANG, Chuan PANG, Jie YU, Shuai DING. Frontiers of Computer Science, 2026, Issue 3, pp. 165-167 (3 pages)
1 Introduction. The development of information technology has promoted the application of multimodal, long-temporal, and multiscale data in healthcare [1]. However, the effective utilization of multimodal data still faces challenges related to feature redundancy within the modality of long temporal data and across multimodal data [2].
Keywords: information technology; multiscale data; long temporal data; feature redundancy; multimodal data
Multimodal data-driven approaches in retinal vein occlusion: A narrative review integrating machine learning and bioinformatics
7
Authors: Chunlan Liang, Lian Liu, Jingxiang Zhong. Advances in Ophthalmology Practice and Research, 2025, Issue 4, pp. 235-244 (10 pages)
Background: Retinal vein occlusion (RVO) is a leading cause of visual impairment on a global scale. Its pathological mechanisms involve a complex interplay of vascular obstruction, ischemia, and secondary inflammatory responses. Recent interdisciplinary advances, underpinned by the integration of multimodal data, have established a new paradigm for unraveling the pathophysiological mechanisms of RVO, enabling early diagnosis and personalized treatment strategies. Main text: This review critically synthesizes recent progress at the intersection of machine learning, bioinformatics, and clinical medicine, focusing on developing predictive models and deep analysis, exploring molecular mechanisms, and identifying markers associated with RVO. By bridging technological innovation with clinical needs, this review underscores the potential of data-driven strategies to advance RVO research and optimize patient care. Conclusions: Machine learning-bioinformatics integration has revolutionised RVO research through predictive modelling and mechanistic insights, particularly via deep learning-enhanced retinal imaging and multi-omics networks. Despite progress, clinical translation requires resolving data standardisation inconsistencies and model generalizability limitations. Establishing multicentre validation frameworks and interpretable AI tools, coupled with patient-focused data platforms through cross-disciplinary collaboration, could enable precision interventions to optimally preserve vision.
Keywords: Bioinformatics; Clinical prediction models; Deep learning; Markers; multimodal data; Machine learning; Retinal vein occlusion
Optimizing Multimodal Data Queries in Data Lakes
8
Authors: Runqun Xiong, Shiyuan Zhao, Ciyuan Chen, Zhuqing Xu. Tsinghua Science and Technology, 2025, Issue 6, pp. 2625-2637 (13 pages)
This paper addresses the challenge of efficiently querying multimodal related data in data lakes, large-scale storage and management systems that support heterogeneous data formats, including structured, semi-structured, and unstructured data. Multimodal data queries are crucial because they enable seamless retrieval of related data across modalities, such as tables, images, and text, which has applications in fields like e-commerce, healthcare, and education. However, existing methods primarily focus on single-modality queries, such as joinable or unionable table discovery, and struggle to handle the heterogeneity and lack of metadata in data lakes while balancing accuracy and efficiency. To tackle these challenges, we propose a Multimodal data Query mechanism for Data Lakes (MQDL), which employs a modality-adaptive indexing mechanism and contrastive-learning-based embeddings to unify representations across modalities. Additionally, we introduce product quantization to optimize candidate verification during queries, reducing computational overhead while maintaining precision. We evaluate MQDL using a table-image dataset across multiple business scenarios, measuring metrics such as precision, recall, and F1-score. Results show that MQDL achieves an accuracy rate of approximately 90%, while demonstrating strong scalability and reduced query response time compared to traditional methods. These findings highlight MQDL's potential to enhance multimodal data retrieval in complex data lake environments.
Keywords: multimodal data query; data lake; contrastive learning; related data query
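Product quantization, which this abstract names as its candidate-verification speed-up, can be illustrated generically. The sketch below is not MQDL's implementation: the codebooks are hypothetical fixed values (in practice they are learned, for example with k-means), and all function names are assumptions made for illustration.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def split(vector, num_subspaces):
    """Cut a vector into equally sized, contiguous sub-vectors."""
    size = len(vector) // num_subspaces
    return [vector[i * size:(i + 1) * size] for i in range(num_subspaces)]


def encode(vector, codebooks):
    """Replace each sub-vector by the index of its nearest centroid."""
    return [
        min(range(len(book)), key=lambda k: squared_distance(sub, book[k]))
        for sub, book in zip(split(vector, len(codebooks)), codebooks)
    ]


def asymmetric_distance(query, code, codebooks):
    """Approximate distance from an uncompressed query to a compressed vector."""
    subs = split(query, len(codebooks))
    # One lookup table per subspace: distance from the query part to every centroid.
    tables = [[squared_distance(sub, c) for c in book]
              for sub, book in zip(subs, codebooks)]
    return sum(table[idx] for table, idx in zip(tables, code))


# Hypothetical codebooks: 2 subspaces, 2 centroids each, 2 dimensions per subspace.
codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],
    [[0.0, 1.0], [1.0, 0.0]],
]
code = encode([0.9, 1.1, 0.1, 0.8], codebooks)
approx = asymmetric_distance([1.0, 1.0, 0.0, 1.0], code, codebooks)
```

Each stored vector shrinks to a few small integers, and a query is compared against candidates through table lookups instead of full distance computations, which is the usual source of the reduced overhead.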
Enhanced temporal encoding‑decoding for survival analysis of multimodal clinical data in smart healthcare
9
Authors: Xiaofeng Zhang, Zijie Pan, Yuhang Tian, Lili Wang, Tingting Xu, Li Chen, Xiangyun Liao, Tianyu Jiang. Visual Computing for Industry, Biomedicine, and Art, 2025, Issue 1, pp. 490-505 (16 pages)
Effective survival analysis is essential for identifying optimal preventive treatments within smart healthcare systems and leveraging digital health advancements; however, existing prediction models face limitations, primarily relying on ensemble classification techniques with suboptimal performance in both target detection and predictive accuracy. To address these gaps, this paper proposes a multimodal framework that integrates enhanced facial feature detection and temporal predictive modeling. For facial feature extraction, this study developed a lightweight face-region convolutional neural network (FRegNet) specialized in detecting key facial components, such as the eyes and lips, in clinical patients. FRegNet incorporates a residual backbone (Rstem) to enhance feature representation and a facial path aggregated feature pyramid network for multi-resolution feature fusion. Comparative experiments reveal that FRegNet outperforms state-of-the-art target detection algorithms, achieving average precision (AP) of 0.922, average recall of 0.933, mean average precision (mAP) of 0.987, and precision of 0.98, significantly surpassing other mask region-based convolutional neural network (RCNN) variants, such as mask RCNN-ResNeXt with AP of 0.789 and mAP of 0.957. Based on the extracted facial features and clinical physiological indicators, this study proposes an enhanced temporal encoding-decoding (ETED) model that integrates an adaptive attention mechanism and a gated weighting mechanism to improve predictive performance. Comparative results demonstrate that the ETED variant incorporating facial features (ETEncoding-Decoding-Face) outperforms traditional models, achieving an accuracy of 0.916, precision of 0.850, recall of 0.895, F1 of 0.884, and area under the curve (AUC) of 0.947, outperforming gradient boosting (accuracy of 0.922 but AUC of 0.669) and other classifiers in comprehensive metrics. The results confirm that the multimodal dataset (facial features + physiological indicators) significantly enhances the prediction accuracy of the seven-day survival conditions of patients. Correlation analysis reveals that chronic health evaluation and mean arterial pressure are positively correlated with survival, while temperature, Glasgow Coma Scale, and fibrinogen are negatively correlated.
Keywords: Digital health; Smart healthcare system; Facial feature detection; Encoding-decoding; multimodal clinical data
Seismic vulnerability and risk assessment using multimodal data and machine learning: a case study of the central urban area of Jinan City, China
10
Authors: Yaohui LIU, Xinyu ZHANG, Jie ZHOU, Xu HAN, Hao ZHENG. Frontiers of Earth Science, 2025, Issue 3, pp. 452-467 (16 pages)
Seismic hazards pose a major threat to life safety, social development, and the economy. Traditional seismic vulnerability and risk assessments, such as field survey methods, may not be suitable for densely built-up urban areas due to the limited availability of comprehensive data and potential subjectivity in judgment. To overcome these limitations, an integrated method for seismic vulnerability and risk assessment based on multimodal remote sensing data, support vector machine (SVM) and GIScience methods was proposed and applied to the central urban area of Jinan City, Shandong Province, China. First, an area with representative buildings was selected for field survey research, and an attribute information base was established. Then, the SVM method was used to establish the susceptibility proxies, which were applied to the whole study area after accuracy evaluation. Finally, the spatial distribution of seismic vulnerability and risk under different seismic intensity scenarios (from VI to X) was analyzed in GIScience. The results show that the average building vulnerability index in the central urban area of Jinan City is 0.53, indicating that the overall seismic performance of buildings is at a moderate level. Under the seismic intensity scenario of VIII, the buildings in the Starting area and New urban district of Jinan would mostly suffer ‘Moderate’ damage, while Old urban areas, with more seismic-resistant buildings, would experience only ‘Slight’ damage. This study aims to offer an efficient and accurate method for assessing seismic vulnerability in mid to large-sized cities characterized by concentrated population densities and rapid urbanization, as well as provide a valuable reference for efforts in urban renewal, seismic mitigation, and land planning, particularly in cities and regions of developing countries. Additionally, it contributes to the realization of Sustainable Development Goal 11, which seeks to make cities and human settlements inclusive, safe, resilient, and sustainable.
Keywords: seismic vulnerability assessment; GIScience; EMS-98; SVM; RISK-UE; multimodal remote sensing data
Early warning of emerging infectious diseases based on multimodal data
11
Authors: Haotian Ren, Yunchao Ling, Ruifang Cao, Zhen Wang, Yixue Li, Tao Huang. Biosafety and Health (indexed: CAS, CSCD), 2023, Issue 4, pp. 193-203 (11 pages)
The coronavirus disease 2019 (COVID-19) pandemic has dramatically increased the awareness of emerging infectious diseases. The advancement of multiomics analysis technology has resulted in the development of several databases containing virus information. Several scientists have integrated existing data on viruses to construct phylogenetic trees and predict virus mutation and transmission in different ways, providing prospective technical support for epidemic prevention and control. This review summarized the databases of known emerging infectious viruses and techniques focusing on virus variant forecasting and early warning. It focuses on the multi-dimensional information integration and database construction of emerging infectious viruses, virus mutation spectrum construction and variant forecast model, analysis of the affinity between mutation antigen and the receptor, propagation model of virus dynamic evolution, and monitoring and early warning for variants. As people have suffered from COVID-19 and repeated flu outbreaks, we focused on the research results of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and influenza viruses. This review comprehensively viewed the latest virus research and provided a reference for future virus prevention and control research.
Keywords: Emerging infectious disease; SARS-CoV-2; multimodal data; Early warning
Multimodal artificial intelligence technology in the precision diagnosis and treatment of gastroenterology and hepatology: Innovative applications and challenges
12
Authors: Yi-Mao Wu, Fei-Yang Tang, Zi-Xin Qi. World Journal of Gastroenterology, 2025, Issue 38, pp. 26-43 (18 pages)
With the rapid development of artificial intelligence (AI) technology, multimodal data integration has become an important means to improve the accuracy of diagnosis and treatment in gastroenterology and hepatology. This article systematically reviews the latest progress of multimodal AI technology in the diagnosis, treatment, and decision-making for gastrointestinal tumors, functional gastrointestinal diseases, and liver diseases, focusing on the innovative applications of endoscopic image AI, pathological section AI, multi-omics data fusion models, and wearable devices combined with natural language processing. Multimodal AI can significantly improve the accuracy of early diagnosis and the efficiency of individualized treatment planning by integrating imaging, pathological, molecular, and clinical phenotypic data. However, current AI technologies still face challenges such as insufficient data standardization, limited generalization of models, and ethical compliance. This paper proposes solutions, such as the establishment of a cross-center data sharing platform, the development of a federated learning framework, and the formulation of ethical norms, and looks ahead to the application prospects of multimodal large-scale models in the disease management process. This review provides a theoretical basis and practical guidance for promoting the clinical translation of AI technology in the field of gastroenterology and hepatology.
Keywords: Artificial intelligence; multimodal data; Gastroenterology; Hepatology; Precision medicine; Challenges and countermeasures
Multimodal Classification of Alzheimer’s Disease Based on Kolmogorov-Arnold Graph Attention Network
13
Authors: Xiaosheng Wu, Ruichao Tian, Zhaozhao Xu, Shuihua Wang, Yudong Zhang. Journal of Bionic Engineering, 2025, Issue 5, pp. 2717-2730 (14 pages)
Alzheimer’s Disease (AD), a prevalent neurodegenerative disorder characterized by memory loss and cognitive decline, poses significant challenges for individuals and society. Multimodal data fusion has emerged as a promising approach for AD diagnosis, with Graph Convolutional Networks (GCNs) effectively capturing irregular brain information. However, traditional GCN methods face limitations in representing and integrating multimodal data, often resulting in feature mismatch. In this study, we propose a novel Kolmogorov-Arnold Graph Attention Network (KAGAN) model to address this issue through semantic-level alignment. KAGAN incorporates a Multimodal Feature Construction method (MuStaF) to extract structural and functional features from T1- and T2-weighted images, and a Multimodal Graph Adjacency Matrix Construction method (MuGAC) to integrate clinical information, modeling intricate relationships across modalities. Experiments conducted on the ADNI dataset demonstrate the superiority of KAGAN in AD/CN/MCI classification, achieving an accuracy of 98.29 ± 1.21%. This highlights KAGAN’s potential for early AD diagnosis by enabling interactive learning and fusion of multimodal features at the semantic level. The source code of our proposed model and the related datasets are available at https://github.com/sheeprra/KAGAN.
Keywords: Alzheimer’s disease diagnosis; multimodal data fusion; Feature mismatch problem; GAT; KAN
Multimodal detection framework for financial fraud integrating LLMs and interpretable machine learning
14
Authors: Hui Nie, Zhao-hui Long, Ze-jun Fang, Lu-qiong Gao. Journal of Data and Information Science, 2025, Issue 4, pp. 291-315 (25 pages)
Purpose: This study aims to integrate large language models (LLMs) with interpretable machine learning methods to develop a multimodal data-driven framework for predicting corporate financial fraud, addressing the limitations of traditional approaches in long-text semantic parsing, model interpretability, and multisource data fusion, thereby providing regulatory agencies with intelligent auditing tools. Design/methodology/approach: Analyzing 5,304 Chinese listed firms’ annual reports (2015-2020) from the CSMAD database, this study leverages the Doubao LLMs to generate chunked summaries and 256-dimensional semantic vectors, developing textual semantic features. It integrates 19 financial indicators, 11 governance metrics, and linguistic characteristics (tone, readability) with fraud prediction models optimized through a group of Gradient Boosted Decision Tree (GBDT) algorithms. SHAP value analysis in the final model reveals the risk transmission mechanism by quantifying the marginal impacts of financial, governance, and textual features on fraud likelihood. Findings: The study found that LLMs effectively distill lengthy annual reports into semantic summaries, while GBDT algorithms (AUC > 0.850) outperform the traditional Logistic Regression model in fraud detection. Multimodal fusion improved performance by 7.4%, with financial, governance, and textual features providing complementary signals. SHAP analysis revealed financial distress, governance conflicts, and narrative patterns (e.g., tone anchoring, semantic thresholds) as key fraud indicators, highlighting managerial intent in report language. Research limitations: This study identifies three key limitations: 1) lack of interpretability for semantic features, 2) absence of granular fraud-type differentiation, and 3) unexplored comparative validation with other deep learning methods. Future research will address these gaps to enhance fraud detection precision and model transparency. Practical implications: The developed semantic-enhanced evaluation model provides a quantitative tool for assessing listed companies’ information disclosure quality and enables practical implementation through its derivative real-time monitoring system. This advancement significantly strengthens capital market risk early warning capabilities, offering actionable insights for securities regulation. Originality/value: This study presents three key innovations: 1) A novel “chunking-summarization-embedding” framework for efficient semantic compression of lengthy annual reports (30,000 words); 2) Demonstration of LLMs’ superior performance in financial text analysis, outperforming traditional methods by 19.3%; 3) A novel “language-psychology-behavior” triad model for analyzing managerial fraud motives.
Keywords: Financial fraud detection; Large language models; multimodal data fusion; Interpretable machine learning; Annual report
Low-Rank Adapter Layers and Bidirectional Gated Feature Fusion for Multimodal Hateful Memes Classification
15
Authors: Youwei Huang, Han Zhong, Cheng Cheng, Yijie Peng. Computers, Materials & Continua, 2025, Issue 7, pp. 1863-1882 (20 pages)
A hateful meme is a multimodal medium that combines images and text. The potential hate content of such memes has caused serious problems for social media security. The current hateful memes classification task faces significant data scarcity challenges, and direct fine-tuning of large-scale pre-trained models often leads to severe overfitting. In addition, it is a challenge to understand the underlying relationship between text and images in hateful memes. To address these issues, we propose a multimodal hateful memes classification model named LABF, which is based on low-rank adapter layers and bidirectional gated feature fusion. Firstly, low-rank adapter layers are adopted to learn the feature representation of the new dataset. This is achieved by introducing a small number of additional parameters while retaining prior knowledge of the CLIP model, which effectively alleviates overfitting. Secondly, a bidirectional gated feature fusion mechanism is designed to dynamically adjust the interaction weights of text and image features to achieve finer cross-modal fusion. Experimental results show that the method significantly outperforms existing methods on two public datasets, verifying its effectiveness and robustness.
Keywords: Hateful meme; multimodal fusion; multimodal data; deep learning
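The gated fusion idea in this abstract can be illustrated with a toy sketch. This is one plausible reading, not LABF's actual layer: each modality produces a sigmoid gate that scales the other modality's features, and the two gated streams are summed. The scalar `weight` and `bias` stand in for learned parameters and are assumptions, as are the function names.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def gate(source, context, weight, bias):
    """Scale each source feature by a gate in (0, 1) computed from the context feature."""
    return [sigmoid(weight * c + bias) * s for s, c in zip(source, context)]


def bidirectional_gated_fusion(text, image, weight=1.0, bias=0.0):
    """Sum of image features gated by text and text features gated by image."""
    image_gated_by_text = gate(image, text, weight, bias)
    text_gated_by_image = gate(text, image, weight, bias)
    return [a + b for a, b in zip(image_gated_by_text, text_gated_by_image)]


# Hypothetical 2-dimensional features for one meme.
text_features = [1.0, 2.0]
image_features = [3.0, 4.0]

# With zero weight and bias every gate is 0.5, so the result is the plain average.
neutral = bidirectional_gated_fusion(text_features, image_features, weight=0.0, bias=0.0)
```

With non-zero parameters the gates move away from 0.5, so the mix of text and image changes per feature and per input, which is what "dynamically adjusting the interaction weights" refers to.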
Holistic Hierarchical Predictive-Integration Theory (HHPIT): An Exploration of AI-Empowered Innovation and Empirical Research in Traditional Chinese Medicine Meridian Theory
16
Authors: Jiren Zhang, Yang Zhang, Bingna Hao, Junjie Hao, Leiming Wang, Fengtian Lao. Journal of Clinical and Nursing Research, 2026, Issue 2, pp. 278-285 (8 pages)
By 2025, research on Traditional Chinese Medicine (TCM) meridians has generated 12-15 macro-level theories and over 20 specific hypotheses, manifesting a highly fragmented research landscape. Objective: This paper proposes the “Holistic Hierarchical Predictive-Integration Hypothesis” (HHPIT) to construct a unified theoretical framework that integrates the rational components of existing meridian hypotheses. Methods: The HHPIT hypothesis systematically reviews current meridian theories, employs interdisciplinary methodologies, integrates artificial intelligence technology, and establishes a three-tier architecture encompassing structural, functional, and systemic layers. Results: HHPIT successfully integrates diverse meridian theories, proposes a computable algorithmic pipeline, and provides specific application protocols for chronic disease treatment, anti-aging, and enhancement of Zang-fu organ functions. Conclusion: HHPIT offers a novel, computable, and verifiable research paradigm for meridian studies, promoting the modernization and internationalization of TCM theory.
Keywords: Holographic hierarchical prediction integration; Meridian research; Artificial intelligence; Modernization of Traditional Chinese Medicine; multimodal data fusion
Fusion of color and hallucinated depth features for enhanced multimodal deep learning-based damage segmentation
17
Authors: Tarutal Ghosh Mondal, Mohammad Reza Jahanshahi. Earthquake Engineering and Engineering Vibration (SCIE, EI, CSCD), 2023, No. 1, pp. 55-68 (14 pages)
Recent advances in computer vision and deep learning have shown that the fusion of depth information can significantly enhance the performance of RGB-based damage detection and segmentation models. However, alongside these advantages, depth sensing also presents many practical challenges. For instance, depth sensors impose an additional payload burden on robotic inspection platforms, limiting the operation time and increasing the inspection cost. Additionally, some lidar-based depth sensors perform poorly outdoors because of sunlight contamination during the daytime. In this context, this study investigates the feasibility of eliminating depth sensing at test time without compromising segmentation performance. An autonomous damage segmentation framework is developed, based on recent advancements in vision-based multimodal sensing such as modality hallucination (MH) and monocular depth estimation (MDE), which require depth data only during model training. At deployment time, depth data becomes expendable because it can be simulated from the corresponding RGB frames. This makes it possible to reap the benefits of depth fusion without any depth perception per se. This study explored two different depth encoding techniques and three different fusion strategies in addition to a baseline RGB-based model. The proposed approach is validated on computer-generated RGB-D data of reinforced concrete buildings subjected to seismic damage. It was observed that the surrogate techniques can increase the segmentation IoU by up to 20.1% with a negligible increase in computation cost. Overall, this study is believed to make a positive contribution to enhancing the resilience of critical civil infrastructure.
Keywords: multimodal data fusion; depth sensing; vision-based inspection; UAV-assisted inspection; damage segmentation; post-disaster reconnaissance; modality hallucination; monocular depth estimation
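The abstract above reports its gain in terms of segmentation IoU (intersection over union). As a reminder of how that metric is usually computed for a single class, here is a small sketch on binary masks. The function name, the toy masks, and the convention of returning 1.0 when both masks are empty are assumptions for illustration and are not taken from the paper.

```python
def iou(pred, target):
    """Intersection over union of two binary masks given as nested lists."""
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            intersection += 1 if (p and t) else 0  # pixel marked in both masks
            union += 1 if (p or t) else 0          # pixel marked in either mask
    # Assumed convention: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


# Toy 2x3 masks: 1 marks a damaged pixel, 0 marks background.
pred_mask = [[1, 1, 0],
             [0, 1, 0]]
true_mask = [[1, 0, 0],
             [0, 1, 1]]
score = iou(pred_mask, true_mask)  # 2 shared pixels / 4 pixels in the union
```

For multi-class segmentation the same ratio is typically computed per class and then averaged.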
Deep Multimodal Learning and Fusion Based Intelligent Fault Diagnosis Approach (Cited by: 3)
18
Authors: Huifang Li, Jianghang Huang, Jingwei Huang, Senchun Chai, Leilei Zhao, Yuanqing Xia. Journal of Beijing Institute of Technology (EI, CAS), 2021, No. 2, pp. 172-185 (14 pages)
The Industrial Internet of Things (IoT), which connects society and industrial systems, represents a tremendous and promising paradigm shift. With IoT, multimodal and heterogeneous data from industrial devices can be easily collected and further analyzed to uncover latent knowledge about device maintenance and health. IoT data-based fault diagnosis for industrial devices is very helpful to the sustainability and applicability of an IoT ecosystem. However, how to efficiently use and fuse this multimodal heterogeneous data to realize intelligent fault diagnosis is still a challenge. In this paper, a novel Deep Multimodal Learning and Fusion (DMLF) based fault diagnosis method is proposed for handling heterogeneous data from IoT environments where industrial devices coexist. First, a DMLF model is designed by combining a Convolutional Neural Network (CNN) and a Stacked Denoising Autoencoder (SDAE) to capture more comprehensive fault knowledge and extract features from data of different modalities. Second, these multimodal features are integrated at a fusion layer, and the resulting fused features are used to train a classifier for recognizing potential faults. Third, a two-stage training algorithm is proposed that combines supervised pre-training and fine-tuning to simplify the training process for deep structure models. A series of experiments is conducted on multimodal heterogeneous data from a gear device to verify the proposed fault diagnosis method. The experimental results show that the method outperforms the benchmark methods in fault diagnosis accuracy.
Keywords: fault diagnosis; deep learning; multimodal heterogeneous data; multimodal fused features
Revolutionizing gastroenterology and hepatology with artificial intelligence: From precision diagnosis to equitable healthcare through interdisciplinary practice (Cited by: 2)
19
Authors: Zhi-Li Chen, Chao Wang, Fang Wang. World Journal of Gastroenterology, 2025, No. 24, pp. 25-49 (25 pages)
Artificial intelligence (AI) is driving a paradigm shift in gastroenterology and hepatology by delivering cutting-edge tools for disease screening, diagnosis, treatment, and prognostic management. Through deep learning, radiomics, and multimodal data integration, AI has achieved diagnostic parity with expert clinicians in endoscopic image analysis (e.g., early gastric cancer detection, colorectal polyp identification) and non-invasive assessment of liver pathologies (e.g., fibrosis staging, fatty liver typing), while demonstrating utility in personalized care scenarios such as predicting hepatocellular carcinoma recurrence and optimizing inflammatory bowel disease treatment responses. Despite these advancements, challenges persist, including limited model generalization due to fragmented datasets, algorithmic limitations in rare conditions (e.g., pediatric liver diseases) caused by insufficient training data, and unresolved ethical issues related to bias, accountability, and patient privacy. Mitigation strategies involve constructing standardized multicenter databases, validating AI tools through prospective trials, leveraging federated learning to address data scarcity, and developing interpretable systems (e.g., attention heatmap visualization) to enhance clinical trust. Integrating generative AI and digital twin technologies and establishing unified ethical/regulatory frameworks will accelerate AI adoption in primary care and foster equitable healthcare access. Interdisciplinary collaboration and evidence-based implementation remain critical for realizing AI's potential to redefine precision care for digestive disorders, improve global health outcomes, and reshape healthcare equity.
Keywords: Artificial intelligence; Precision medicine; Gastroenterology; Hepatology; Multimodal data integration; Deep learning; Microbiome
Uncovering differences in the spatial structure of intercity interactive networks described by multi-source migration flow: From the multi-hierarchical perspective
20
Authors: WEI Shimei, PAN Jinghu. Journal of Geographical Sciences, 2025, No. 5, pp. 1049-1079 (31 pages)
Population migration data derived from location-based services has often been used to delineate population flows between cities or to construct intercity relationship networks that reveal the complex interaction patterns underlying human activities. Nevertheless, the inherent heterogeneity in multimodal migration big data has been ignored. This study conducts an in-depth comparison and quantitative analysis through a comprehensive lens of spatial association. First, intercity interactive networks in China were constructed using migration data from Baidu and AutoNavi collected during the same time period. Subsequently, the characteristics and spatial structure similarities of the two types of intercity interactive networks were quantitatively assessed and analyzed from overall (network) and local (node) perspectives. Furthermore, the precision of these networks at the local scale was corroborated by constructing an intercity network from mobile phone (MP) data. Results indicate that the intercity interactive networks in China, as delineated by Baidu and AutoNavi migration flows, exhibit a high degree of structural equivalence. The correlation coefficient between these two networks is 0.874. Both networks exhibit a pronounced spatial polarization trend and hierarchical structure. This is evident in their distinct core and peripheral structures, as well as in the varying importance and influence of different nodes within the networks. Nevertheless, there are notable differences. The Baidu intercity interactive network exhibits pronounced cross-regional effects, and its high-level interactions are characterized by a "rich-club" phenomenon. The AutoNavi intercity interactive network presents a more significant distance attenuation effect, and its high-level interactions display a gradient distribution pattern. Notably, there is a substantial correlation between the AutoNavi and MP networks at the local scale, evidenced by a high correlation coefficient of 0.954. Furthermore, a "spatial dislocation" phenomenon was observed within the spatial structures at different levels extracted from the Baidu and AutoNavi intercity networks. However, the measured network spatial structure similarity along three dimensions, namely node location, node size, and local structure, indicates relatively high similarity and consistency between the two networks.
Keywords: network differences; interactive network; intercity migration; multimodal data; China