Journal Articles
59 articles found
1. Research Progress on Multi-Modal Fusion Object Detection Algorithms for Autonomous Driving: A Review
Authors: Peicheng Shi, Li Yang, Xinlong Dong, Heng Qi, Aixi Yang. Computers, Materials & Continua, 2025, No. 6, pp. 3877-3917.
As the number and complexity of sensors in autonomous vehicles continue to rise, multimodal fusion-based object detection algorithms are increasingly being used to detect 3D environmental information, significantly advancing the development of perception technology in autonomous driving. To further promote the development of fusion algorithms and improve detection performance, this paper discusses the advantages and recent advancements of multimodal fusion-based object detection algorithms. Starting from single-modal sensor detection, the paper provides a detailed overview of typical sensors used in autonomous driving and introduces object detection methods based on images and point clouds. Image-based detection methods are categorized into monocular and binocular detection according to input type. Point cloud-based detection methods are classified into projection-based, voxel-based, point cluster-based, pillar-based, and graph structure-based approaches according to the technical pathways for processing point cloud features. Additionally, multimodal fusion algorithms are divided into Camera-LiDAR fusion, Camera-Radar fusion, Camera-LiDAR-Radar fusion, and other sensor fusion methods based on the types of sensors involved. Furthermore, the paper identifies five key future research directions in this field, aiming to provide insights for researchers engaged in multimodal fusion-based object detection algorithms and to encourage broader attention to their research and application.
Keywords: multi-modal fusion, 3D object detection, deep learning, autonomous driving
2. Which is more faithful, seeing or saying? Multimodal sarcasm detection exploiting contrasting sentiment knowledge
Authors: Yutao Chen, Shumin Shi, Heyan Huang. CAAI Transactions on Intelligence Technology, 2025, No. 2, pp. 375-386.
Using sarcasm on social media platforms to express negative opinions towards a person or object has become increasingly common. However, detecting sarcasm in various forms of communication can be difficult due to conflicting sentiments. In this paper, we introduce a contrasting sentiment-based model for multimodal sarcasm detection (CS4MSD), which identifies inconsistent emotions by leveraging the CLIP knowledge module to produce sentiment features in both text and image. Then, five external sentiments are introduced to prompt the model to learn sentiment preferences among modalities. Furthermore, we highlight the importance of verbal descriptions embedded in illustrations and incorporate additional knowledge-sharing modules to fuse such image-like features. Experimental results demonstrate that our model achieves state-of-the-art performance on the public multimodal sarcasm dataset.
Keywords: CLIP, image-text classification, knowledge fusion, multi-modal sarcasm detection
3. Event-Aware Sarcasm Detection in Chinese Social Media Using Multi-Head Attention and Contrastive Learning
Authors: Kexuan Niu, Xiameng Si, Xiaojie Qi, Haiyan Kang. Computers, Materials & Continua, 2025, No. 10, pp. 2051-2070.
Sarcasm detection is a complex and challenging task, particularly in the context of Chinese social media, where it exhibits strong contextual dependencies and cultural specificity. To address the limitations of existing methods in capturing the implicit semantics and contextual associations in sarcastic expressions, this paper proposes an event-aware model for Chinese sarcasm detection, leveraging a multi-head attention (MHA) mechanism and contrastive learning (CL) strategies. The proposed model employs a dual-path Bidirectional Encoder Representations from Transformers (BERT) encoder to process comment text and event context separately, and integrates an MHA mechanism to facilitate deep interactions between the two, thereby capturing multidimensional semantic associations. Additionally, a CL strategy is introduced to enhance feature representation, further improving the model's performance under class imbalance and in complex contextual scenarios. The model achieves state-of-the-art performance on the Chinese sarcasm dataset, with significant improvements in accuracy (79.55%), F1-score (84.22%), and area under the curve (AUC, 84.35%).
Keywords: sarcasm detection, event-aware, multi-head attention, contrastive learning, NLP
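The core interaction step described above — comment features attending over event-context features — can be sketched with plain scaled dot-product attention. This is a single-head simplification of the paper's MHA mechanism; all vectors and names below are illustrative, not taken from the paper:

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of attention scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention: each query vector (e.g. a
    comment-text feature) attends over the event-context vectors."""
    d = len(keys[0])
    fused = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        fused.append([sum(w * v[j] for w, v in zip(weights, values))
                      for j in range(len(values[0]))])
    return fused

# Toy example: one comment vector attends over two event-context vectors.
comment = [[1.0, 0.0]]
event = [[1.0, 0.0], [0.0, 1.0]]
fused = cross_attention(comment, event, event)
```

A multi-head version would run several such attentions over learned projections of the inputs and concatenate the results.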
4. PKME-MLM: A Novel Multimodal Large Model for Sarcasm Detection
Authors: Jian Luo, Yaling Li, Xueyu Li, Xuliang Hu. Computers, Materials & Continua, 2025, No. 4, pp. 877-896.
Sarcasm detection in Natural Language Processing (NLP) has become increasingly important, particularly with the rise of social media and non-textual emotional expressions, such as images. Existing methods often rely on separate image and text modalities, which may not fully utilize the information available from both sources. To address this limitation, we propose a novel multimodal large model, the PKME-MLM (Prior Knowledge and Multi-label Emotion analysis based Multimodal Large Model for sarcasm detection). The PKME-MLM enhances sarcasm detection by integrating prior knowledge to extract useful textual information from images, which is then combined with text data for deeper analysis. This improves the integration of image and text data, addressing the limitation of previous models that process these modalities separately. Additionally, we incorporate multi-label sentiment analysis, refining sentiment labels to improve sarcasm recognition accuracy. This design overcomes the limitations of prior models that treated sentiment classification as a single-label problem, distinguishing subtle emotional cues in the text. Experimental results demonstrate that our approach achieves significant performance improvements in multimodal sarcasm detection, with an accuracy of 94.35%, and Macro-Average Precision and Recall of 93.92% and 94.21%, respectively. These results highlight the potential of multimodal models in improving sarcasm detection and suggest that further integration of modalities could advance future research. This work also paves the way for incorporating multimodal sentiment analysis into sarcasm detection.
Keywords: sarcasm detection, multimodal large model, prior knowledge, multi-label fusion
5. Multi-mode luminescence anti-counterfeiting and visual iron(III) ion RTP detection constructed by assembly of CDs & Eu3+ in porous RHO zeolite
Authors: Siyu Zong, Xiaowei Yu, Yining Yang, Xin Yang, Jiyang Li. Chinese Chemical Letters, 2025, No. 6, pp. 567-572.
Carbon dot (CD)-based composites have shown impressive performance in information encryption and sensing; however, it remains a great challenge to simultaneously implement multi-mode luminescence and room-temperature phosphorescence (RTP) detection in a single system because of the formidable synthesis. Herein, a multifunctional composite, Eu&CDs@pRHO, has been designed by a co-assembly strategy and prepared via facile calcination and impregnation treatment. Eu&CDs@pRHO exhibits intense fluorescence (FL) and RTP from two individual luminous centers: Eu3+ in the free pores and CDs in the interrupted structure of the RHO zeolite. Unique four-mode color outputs, including pink (Eu3+, ex. 254 nm), light violet (CDs, ex. 365 nm), blue (CDs, 254 nm off), and green (CDs, 365 nm off), can be realized; on this basis, a preliminary application in advanced information encoding is demonstrated. Given the free pores of the matrix and the stable RTP in water of the confined CDs, visual RTP detection of Fe3+ ions is achieved with a detection limit as low as 9.8 μmol/L. This work opens a new perspective on the strategic amalgamation of luminous vips with porous zeolites to construct advanced functional materials.
Keywords: carbon dots, zeolite, host-vip assembly, multi-mode luminescence, phosphorescence detection, information encryption
6. Why Transformers Outperform LSTMs: A Comparative Study on Sarcasm Detection
Authors: Palak Bari, Gurnur Bedi, Khushi Joshi, Anupama Jawale. Journal on Artificial Intelligence, 2025, No. 1, pp. 499-508.
This study investigates sarcasm detection in text using a dataset of 8095 sentences compiled from the MUStARD and HuggingFace repositories, balanced across sarcastic and non-sarcastic classes. A sequential baseline model (LSTM) is compared with transformer-based models (RoBERTa and XLNet) integrated with attention mechanisms. Transformers were chosen for their proven ability to capture long-range contextual dependencies, whereas the LSTM serves as a traditional benchmark for sequential modeling. Experimental results show that RoBERTa achieves 0.87 accuracy, XLNet 0.83, and LSTM 0.52. These findings confirm that transformer architectures significantly outperform recurrent models in sarcasm detection. Future work will incorporate multimodal features and error analysis to further improve robustness.
Keywords: attention mechanism, LSTM, natural language processing, sarcasm detection, sentiment analysis, transformer models, RoBERTa, XLNet
7. MMDistill: Multi-Modal BEV Distillation Framework for Multi-View 3D Object Detection
Authors: Tianzhe Jiao, Yuming Chen, Zhe Zhang, Chaopeng Guo, Jie Song. Computers, Materials & Continua (SCIE, EI), 2024, No. 12, pp. 4307-4325.
Multi-modal 3D object detection has achieved remarkable progress, but its practical industrial use is often limited by high cost and low efficiency. The multi-view camera-based method provides a feasible solution due to its low cost. However, camera data lacks geometric depth, and achieving high accuracy using camera data alone is challenging. This paper proposes a multi-modal Bird-Eye-View (BEV) distillation framework (MMDistill) to make a trade-off between them. MMDistill is a carefully crafted two-stage distillation framework based on teacher and student models for learning cross-modal knowledge and generating multi-modal features. It can improve the performance of unimodal detectors without introducing additional costs during inference. Specifically, our method can effectively bridge the cross-modal gap caused by the heterogeneity between data. Furthermore, we propose a Light Detection and Ranging (LiDAR)-guided geometric compensation module, which assists the student model in obtaining effective geometric features and reduces the gap between modalities. Our proposed method generally requires fewer computational resources and offers faster inference than traditional multi-modal models, enabling multi-modal technology to be applied more widely in practical scenarios. Through experiments, we validate the effectiveness and superiority of MMDistill on the nuScenes dataset, achieving an improvement of 4.1% mean Average Precision (mAP) and 4.6% nuScenes Detection Score (NDS) over the baseline detector. In addition, we present detailed ablation studies to validate our method.
Keywords: 3D object detection, multi-modal, knowledge distillation, deep learning, remote sensing
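Teacher-student distillation of the kind described above typically minimizes a KL divergence between temperature-softened teacher and student outputs. A minimal sketch of that loss follows — a generic Hinton-style formulation, not MMDistill's exact objective, with made-up logits:

```python
import math

def softmax_t(logits, temperature):
    # Temperature-softened softmax: higher T flattens the distribution.
    m = max(logits)
    exps = [math.exp((x - m) / temperature) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distill_kl(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2
    so gradients stay comparable across temperatures."""
    p = softmax_t(teacher_logits, temperature)
    q = softmax_t(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    return temperature ** 2 * kl

# Identical logits give zero loss; mismatched logits give a positive loss.
identical = distill_kl([2.0, 0.5, -1.0], [2.0, 0.5, -1.0])
mismatch = distill_kl([2.0, 0.5, -1.0], [-1.0, 0.5, 2.0])
```

In a BEV framework, the same loss would be applied over teacher and student BEV feature maps or detection heads rather than bare class logits.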
8. Adaptive multi-modal feature fusion for far and hard object detection
Authors: LI Yang, GE Hongwei. Journal of Measurement Science and Instrumentation (CAS, CSCD), 2021, No. 2, pp. 232-241.
To address the difficulty of detecting far and hard objects caused by the sparseness and insufficient semantic information of LiDAR point clouds, a 3D object detection network with adaptive multi-modal data fusion is proposed, which makes use of multi-neighborhood voxel information and image information. First, an improved ResNet is designed that maintains the structural information of far and hard objects in low-resolution feature maps, making it better suited to the detection task; meanwhile, the semantics of each image feature map are enhanced by semantic information from all subsequent feature maps. Second, multi-neighborhood context information with different receptive field sizes is extracted to compensate for the sparseness of the point cloud, improving the ability of voxel features to represent the spatial structure and semantic information of objects. Finally, a multi-modal feature adaptive fusion strategy is proposed that uses learnable weights to express the contribution of different modal features to the detection task, and voxel attention further enhances the fused feature representation of target objects. Experimental results on the KITTI benchmark show that this method outperforms VoxelNet by remarkable margins, increasing AP by 8.78% and 5.49% on the medium and hard difficulty levels. Our method also achieves greater detection performance than many mainstream multi-modal methods, outperforming MVX-Net by 1% AP on the medium and hard difficulty levels.
Keywords: 3D object detection, adaptive fusion, multi-modal data fusion, attention mechanism, multi-neighborhood features
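The adaptive fusion strategy — learnable weights expressing each modality's contribution — reduces at inference time to a softmax-weighted sum of per-modality features. A toy sketch, in which the fixed gate scores stand in for the output of a trained weighting network (all names and values are illustrative):

```python
import math

def softmax(scores):
    # Numerically stable softmax turning raw gate scores into weights.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def adaptive_fuse(features, gate_scores):
    """Fuse per-modality feature vectors with contribution weights.
    `features` is a list of equal-length vectors, one per modality."""
    weights = softmax(gate_scores)
    dim = len(features[0])
    return [sum(w * f[j] for w, f in zip(weights, features)) for j in range(dim)]

# Image features dominate when their gate score is higher.
image_feat = [1.0, 0.0]
lidar_feat = [0.0, 1.0]
fused = adaptive_fuse([image_feat, lidar_feat], gate_scores=[2.0, 0.0])
```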
9. Speed-up Multi-modal Near Duplicate Image Detection
Authors: Chunlei Yang, Jinye Peng, Jianping Fan. Open Journal of Applied Sciences, 2013, No. 1, pp. 16-21.
Near-duplicate image detection is a necessary operation for refining image search results for efficient user exploration. The existence of large numbers of near duplicates requires fast and accurate automatic near-duplicate detection methods. We have designed a coarse-to-fine near-duplicate detection framework to speed up the process and a multi-modal integration scheme for accurate detection. Duplicate pairs are detected with both a global feature (partition-based color histogram) and local features (CPAM and a SIFT bag-of-words model). Experimental results on a large-scale dataset demonstrate the effectiveness of the proposed design.
Keywords: near-duplicate detection, coarse-to-fine framework, multi-modal feature integration
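The coarse stage's global feature — a color histogram compared by intersection — can be sketched as follows. This is a single-channel simplification of the partition-based histogram (a real implementation would split the image into cells and histogram all channels); the pixel data is made up:

```python
def color_histogram(pixels, bins=4):
    """Coarse normalised histogram over the red channel of (r, g, b)
    pixels with channel values in 0..255."""
    step = 256 // bins
    hist = [0] * bins
    for r, _, _ in pixels:
        hist[min(r // step, bins - 1)] += 1
    return [count / len(pixels) for count in hist]

def intersection(h1, h2):
    # Histogram intersection similarity: 1.0 for identical histograms.
    return sum(min(a, b) for a, b in zip(h1, h2))

# A near duplicate lands in the same coarse bins; a different image does not.
img = [(10, 0, 0), (200, 0, 0), (70, 0, 0), (250, 0, 0)]
near_dup = [(12, 0, 0), (198, 0, 0), (68, 0, 0), (255, 0, 0)]
different = [(0, 0, 0)] * 4

sim_dup = intersection(color_histogram(img), color_histogram(near_dup))
sim_diff = intersection(color_histogram(img), color_histogram(different))
```

Pairs passing a threshold on this cheap similarity would then go to the fine stage (local features such as SIFT bag-of-words).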
10. PowerDetector: Malicious PowerShell Script Family Classification Based on Multi-Modal Semantic Fusion and Deep Learning (Cited: 8)
Authors: Xiuzhang Yang, Guojun Peng, Dongni Zhang, Yuhang Gao, Chenguang Li. China Communications (SCIE, CSCD), 2023, No. 11, pp. 202-224.
PowerShell has been widely deployed in fileless malware and advanced persistent threat (APT) attacks due to its high stealthiness and living-off-the-land technique. However, existing works mainly focus on deobfuscation and malicious detection, lacking malicious PowerShell family classification and behavior analysis. Moreover, state-of-the-art methods fail to capture fine-grained features and semantic relationships, resulting in low robustness and accuracy. To this end, we propose PowerDetector, a novel malicious PowerShell script detector based on multi-modal semantic fusion and deep learning. Specifically, we design four feature extraction methods to extract key features from characters, tokens, the abstract syntax tree (AST), and a semantic knowledge graph. Then, we design four embeddings (Char2Vec, Token2Vec, AST2Vec, and Rela2Vec) and construct a multi-modal fusion algorithm to concatenate feature vectors from the different views. Finally, we propose a combined model based on a transformer and CNN-BiLSTM to implement PowerShell family detection. Our experiments with five types of PowerShell attacks show that PowerDetector can accurately detect various obfuscated and stealthy PowerShell scripts, with a 0.9402 precision, a 0.9358 recall, and a 0.9374 F1-score. Furthermore, through single-modal and multi-modal comparison experiments, we demonstrate that PowerDetector's multi-modal embedding and deep learning model achieve better accuracy and can even identify more unknown attacks.
Keywords: deep learning, malicious family detection, multi-modal semantic fusion, PowerShell
11. Method for Detecting Multi-Modal Vibration Characteristics of Landmines (Cited: 5)
Authors: WANG Chi, DUAN Naiyuan, WU Zhiqiang, MA Hui, ZHU Jun. Instrumentation, 2018, No. 4, pp. 39-45.
The acoustic vibration characteristics of landmines are investigated by means of modal analysis. According to the mechanical structure of landmines, a number of points are marked on the landmine shell to analyze its multi-modal vibration characteristics. Based on a laser self-mixing interferometer, and taking a Type 69 plastic landmine as an example, a vibration detection experiment system is built to demonstrate the multi-modal testing method. The first- and second-order natural frequencies are 38 Hz and 106 Hz for bricks, 112 Hz and 232 Hz for plastic landmines, and 74 Hz and 290 Hz for metal landmines. The natural frequencies of the bricks are far lower than those of plastic and metal landmines, indicating that landmines exhibit multi-modal vibration characteristics under external excitation that differ significantly from those of bricks. The findings can be used for further research on acoustic landmine detection technology.
Keywords: multi-modal vibration, natural frequency, vibration mode, acoustic landmine detection
12. Dynamic GNN-based multimodal anomaly detection for spatial crowdsourcing drone services
Authors: Junaid Akram, Walayat Hussain, Rutvij H. Jhaveri, Rajkumar Singh Rathore, Ali Anaissi. Digital Communications and Networks, 2025, No. 5, pp. 1639-1656.
We introduce a pioneering anomaly detection framework for the spatial crowdsourcing Internet of Drone Things (IoDT), specifically designed to improve bushfire management in Australia's expanding urban areas. The framework innovatively combines Graph Neural Networks (GNNs) and advanced data fusion techniques to enhance IoDT capabilities. Through spatial crowdsourcing, drones collectively gather diverse, real-time data across multiple locations, creating a rich dataset for analysis. The method integrates spatial, temporal, and various other data modalities, facilitating early bushfire detection by identifying subtle environmental and operational changes. Utilizing a complex GNN architecture, our model effectively processes the intricacies of spatially crowdsourced data, significantly increasing anomaly detection accuracy. It incorporates modules for temporal pattern recognition and spatial analysis of environmental impacts, leveraging multimodal data to detect a wide range of anomalies, from temperature shifts to humidity variations. Our approach has been empirically validated, achieving an F1 score of 0.885, highlighting its superior anomaly detection performance. This integration of spatial crowdsourcing with IoDT not only establishes a new standard for environmental monitoring but also contributes significantly to disaster management and urban sustainability.
Keywords: anomaly detection, multi-modal data, GNN, IoDT, data fusion
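The graph side of such a framework boils down to message passing: each drone's reading is smoothed against its neighbours', and large deviations from the smoothed value flag potential anomalies. A bare-bones mean-aggregation layer — the data and the deviation-based scoring are illustrative, not the paper's architecture:

```python
def gnn_mean_layer(features, edges):
    """One mean-aggregation message-passing step: each node's new feature
    is the average of its own and its neighbours' features."""
    n = len(features)
    neighbours = {i: [i] for i in range(n)}  # include a self-loop
    for a, b in edges:
        neighbours[a].append(b)
        neighbours[b].append(a)
    dim = len(features[0])
    return [[sum(features[j][d] for j in neighbours[i]) / len(neighbours[i])
             for d in range(dim)] for i in range(n)]

# Three drones reporting temperature; drone 2 reads anomalously high.
feats = [[20.0], [21.0], [80.0]]
smoothed = gnn_mean_layer(feats, edges=[(0, 1), (1, 2)])
# Score each node by how far its reading deviates from the smoothed value.
deviations = [abs(f[0] - s[0]) for f, s in zip(feats, smoothed)]
```

A trained GNN would replace the plain mean with learned, weighted aggregation, but the neighbourhood-smoothing intuition is the same.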
13. TIDS: Tensor Based Intrusion Detection System (IDS) and Its Application in Large Scale DDoS Attack Detection
Authors: Hanqing Sun, Xue Li, Qiyuan Fan, Puming Wang. Computers, Materials & Continua, 2025, No. 7, pp. 1659-1679.
The era of big data brings new challenges for information network systems (INS), while simultaneously offering unprecedented opportunities for advancing intelligent intrusion detection systems. In this work, we propose a data-driven intrusion detection system for Distributed Denial of Service (DDoS) attack detection, approaching intrusion detection from a big data perspective. As intelligent information processing methods, big data and artificial intelligence have been widely used in information systems, and the INS is an important information system in cyberspace. In advanced INS, network architectures have become more complex, and smart devices collect network data at large scale. Improving the performance of a complex intrusion detection system with big data and artificial intelligence is a significant challenge. To address this problem, we design a novel intrusion detection system (IDS) from a big data perspective. The IDS represents large-scale, complex multi-source network data as a unified tensor. Then, a novel tensor decomposition (TD) method is developed for big data mining; the TD method works seamlessly with XGBoost (eXtreme Gradient Boosting) to complete the intrusion detection. To verify the proposed IDS, a series of experiments is conducted on two real network datasets. The results reveal that the proposed IDS attains an accuracy rate of over 98%. Additionally, when the scale of the datasets is altered, the proposed IDS still maintains excellent detection performance, demonstrating its robustness.
Keywords: intrusion detection system, big data, tensor decomposition, multi-modal feature, DDoS
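Representing multi-source traffic as a tensor and then decomposing it relies on basic operations such as mode-n unfolding, which flattens the tensor into a matrix on which factor matrices can be computed. A minimal mode-0 unfolding for a 3-way tensor stored as nested lists — a standard ordering convention, illustrative and unrelated to TIDS's actual code:

```python
def unfold_mode0(tensor):
    """Mode-0 unfolding of a 3-way tensor (nested lists, shape I x J x K)
    into an I x (J*K) matrix; the mode-1 index j varies fastest along
    the columns, so column index = k * J + j."""
    nj = len(tensor[0])
    nk = len(tensor[0][0])
    return [[slab[j][k] for k in range(nk) for j in range(nj)]
            for slab in tensor]

# A 2 x 2 x 2 tensor whose entry (i, j, k) encodes its own indices
# as i*100 + j*10 + k, making the unfolding easy to verify by eye.
t = [[[i * 100 + j * 10 + k for k in range(2)] for j in range(2)]
     for i in range(2)]
m = unfold_mode0(t)
```

Decompositions such as CP or Tucker are computed by alternating least-squares over these unfoldings; the resulting factors then feed a classifier such as XGBoost.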
14. Transformers for Multi-Modal Image Analysis in Healthcare
Authors: Sameera V Mohd Sagheer, Meghana K H, P M Ameer, Muneer Parayangat, Mohamed Abbas. Computers, Materials & Continua, 2025, No. 9, pp. 4259-4297.
Integrating multiple medical imaging techniques, including Magnetic Resonance Imaging (MRI), Computed Tomography, Positron Emission Tomography (PET), and ultrasound, provides a comprehensive view of patient health status. Each of these methods contributes unique diagnostic insights, enhancing the overall assessment of the patient's condition. Nevertheless, amalgamating data from multiple modalities presents difficulties due to disparities in resolution, data collection methods, and noise levels. While traditional models like Convolutional Neural Networks (CNNs) excel in single-modality tasks, they struggle with multi-modal complexities, lacking the capacity to model global relationships. This research presents a novel transformer-based system for examining multi-modal medical imagery. The framework employs self-attention and cross-attention mechanisms to synchronize and integrate features across modalities. It also shows resilience to variations in noise and image quality, making it adaptable for real-time clinical use. To address the computational hurdles linked to transformer models, particularly in real-time clinical applications in resource-constrained environments, several optimization techniques have been integrated to boost scalability and efficiency. A streamlined transformer architecture was adopted to minimize the computational load while maintaining model effectiveness. Model pruning, quantization, and knowledge distillation have been applied to reduce the parameter count and enhance inference speed. Furthermore, efficient attention mechanisms such as linear or sparse attention were employed to alleviate the substantial memory and processing requirements of traditional self-attention operations. For deployment, hardware-aware acceleration strategies, including TensorRT and ONNX-based model compression, ensure efficient execution on edge devices. These optimizations allow the approach to function effectively in real-time clinical settings, even in environments with limited resources. Future research directions include integrating non-imaging data to facilitate personalized treatment and further enhancing computational efficiency for resource-limited environments. This study highlights the transformative potential of transformer models in multi-modal medical imaging, offering improvements in diagnostic accuracy and patient care outcomes.
Keywords: multi-modal image analysis, medical imaging, deep learning, image segmentation, disease detection, multi-modal fusion, Vision Transformers (ViTs), precision medicine, clinical decision support
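Of the optimizations listed, post-training quantization is the most mechanical: weights are mapped to 8-bit integers with a shared scale and dequantized at run time. A symmetric int8 sketch — a generic scheme with made-up weights, since the abstract does not specify the exact quantizer used:

```python
def quantize_int8(weights):
    """Symmetric post-training quantization: map floats to int8 using a
    single per-tensor scale derived from the largest magnitude."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    # Recover approximate float weights for inference-time arithmetic.
    return [qi * scale for qi in q]

weights = [0.5, -1.27, 0.0, 1.27]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
```

This cuts weight storage by 4x versus float32; the reconstruction error per weight is bounded by half the scale, which is why quantization typically costs little accuracy.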
15. Research on Sarcasm Detection Technology Based on Image-Text Fusion
Authors: Xiaofang Jin, Yuying Yang, Yinan Wu, Ying Xu. Computers, Materials & Continua (SCIE, EI), 2024, No. 6, pp. 5225-5242.
The emergence of new media in various fields has continuously strengthened the social aspect of social media. Netizens tend to express emotions in social interactions, and many even use satire, metaphors, and other techniques to express negative emotions, so it is necessary to detect sarcasm in social comment data. For sarcasm, the more data modalities used as reference, the better the experimental effect. This paper studies sarcasm detection technology based on image-text fusion data. To effectively utilize the features of each modality, a feature reconstruction output algorithm is proposed. Based on the attention mechanism, the algorithm learns the low-rank features of the other modality through cross-modal attention, and the feature vectors are reconstructed for the corresponding modality through weighted averaging. When only the image modality of the dataset is used, the preprocessed data perform strongly with the reconstruction output model, with an accuracy of 87.6%. When only the text modality is used, the reconstruction output model is optimal, with an accuracy of 85.2%. To improve feature fusion between modalities for effective classification, a weight-adaptive learning algorithm is used. This algorithm uses a neural network combined with an attention mechanism to calculate the attention weight of each modality, achieving weight-adaptive learning with an accuracy of 87.9%. Extensive experiments on a benchmark dataset demonstrate the superiority of our proposed model.
Keywords: sentiment analysis, sarcasm detection, feature fusion, feature reconstruction
16. Fake News Detection Based on Text-Modal Dominance and Fusing Multiple Multi-Model Clues
Authors: Lifang Fu, Huanxin Peng, Changjin Ma, Yuhan Liu. Computers, Materials & Continua (SCIE, EI), 2024, No. 3, pp. 4399-4416.
In recent years, efficiently and accurately identifying multi-model fake news has become more challenging. First, multi-model data provides more evidence, but not all of it is equally important. Second, social structure information has proven effective in fake news detection, and combining it while reducing noise is critical. Unfortunately, existing approaches fail to handle these problems. This paper proposes a multi-model fake news detection framework based on Text-modal Dominance and fusing Multiple Multi-model Cues (TD-MMC), which utilizes three valuable multi-model clues: text-modal importance, text-image complementarity, and text-image inconsistency. TD-MMC is dominated by textual content and assisted by image information, while using social network information to enhance the text representation. To reduce interference from irrelevant social structure information, we use a unidirectional cross-modal attention mechanism to selectively learn social structure features. A cross-modal attention mechanism is adopted to obtain text-image cross-modal features while retaining textual features, reducing the loss of important information. In addition, TD-MMC employs a new multi-model loss to improve the model's generalization ability. Extensive experiments have been conducted on two public real-world English and Chinese datasets, and the results show that our proposed model outperforms state-of-the-art methods on classification evaluation metrics.
Keywords: fake news detection, cross-modal attention mechanism, multi-modal fusion, social network, transfer learning
Feature-Based Augmentation in Sarcasm Detection Using Reverse Generative Adversarial Network
17
作者 Derwin Suhartono Alif Tri Handoyo Franz Adeta Junior 《Computers, Materials & Continua》 SCIE EI 2023年第12期3637-3657,共21页
Sarcasm detection in text data is an increasingly vital area of research due to the prevalence of sarcastic content in online communication.This study addresses challenges associated with small datasets and class imba... Sarcasm detection in text data is an increasingly vital area of research due to the prevalence of sarcastic content in online communication.This study addresses challenges associated with small datasets and class imbalances in sarcasm detection by employing comprehensive data pre-processing and Generative Adversial Network(GAN)based augmentation on diverse datasets,including iSarcasm,SemEval-18,and Ghosh.This research offers a novel pipeline for augmenting sarcasm data with Reverse Generative Adversarial Network(RGAN).The proposed RGAN method works by inverting labels between original and synthetic data during the training process.This inversion of labels provides feedback to the generator for generating high-quality data closely resembling the original distribution.Notably,the proposed RGAN model exhibits performance on par with standard GAN,showcasing its robust efficacy in augmenting text data.The exploration of various datasets highlights the nuanced impact of augmentation on model performance,with cautionary insights into maintaining a delicate balance between synthetic and original data.The methodological framework encompasses comprehensive data pre-processing and GAN-based augmentation,with a meticulous comparison against Natural Language Processing Augmentation(NLPAug)as an alternative augmentation technique.Overall,the F1-score of our proposed technique outperforms that of the synonym replacement augmentation technique using NLPAug.The increase in F1-score in experiments using RGAN ranged from 0.066%to 1.054%,and the use of standard GAN resulted in a 2.88%increase in F1-score.The proposed RGAN model outperformed the NLPAug method and demonstrated comparable performance to standard GAN,emphasizing its efficacy in text data augmentation. 展开更多
Keywords: Data augmentation, Generative Adversarial Network (GAN), Reverse GAN (RGAN), sarcasm detection
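The label-inversion idea at the core of RGAN can be illustrated with a minimal sketch. The helper below is hypothetical (not taken from the paper): a standard GAN labels original samples 1 and synthetic samples 0 for the discriminator, while the RGAN scheme, as described in the abstract, inverts the labels between original and synthetic data, changing the feedback signal the generator receives.

```python
def make_discriminator_labels(n_real: int, n_fake: int, reverse: bool = False):
    """Build target labels for a batch of n_real original samples
    followed by n_fake synthetic samples.

    Standard GAN: original -> 1, synthetic -> 0.
    Reverse GAN (RGAN): labels are swapped between original and
    synthetic data, so original -> 0 and synthetic -> 1.
    """
    real_label, fake_label = (0, 1) if reverse else (1, 0)
    return [real_label] * n_real + [fake_label] * n_fake


# Labels for a batch of 2 original + 2 synthetic samples
standard = make_discriminator_labels(2, 2)                 # [1, 1, 0, 0]
inverted = make_discriminator_labels(2, 2, reverse=True)   # [0, 0, 1, 1]
```

In a full training loop these targets would feed a binary cross-entropy loss; the sketch only shows the label assignment that distinguishes the two schemes.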
Bis-naphthalimide-based supramolecular self-assembly system for selective and colorimetric detection of oxalyl chloride and phosgene in solution and gas phase
18
Authors: Qingqing Wang, Huijuan Wu, Aiping Gao, Xuefei Ge, Xueping Chang, Xinhua Cao. 《Chinese Chemical Letters》 SCIE CAS CSCD, 2023, No. 6, pp. 492-496 (5 pages)
Two bis-naphthalimide-based supramolecular gelators (NN-3 and NN-4), differing only slightly in the position of their amino groups, were designed and synthesized for the detection of oxalyl chloride and phosgene. Energy transfer could occur between the two naphthalimide groups in molecules NN-3 and NN-4. Yellow gels of NN-3 and NN-4 formed in certain mixed solvents, and nanofibers of different sizes were obtained in these gels. The self-assembly processes of NN-3 and NN-4 in different solvents were investigated by UV-vis absorption, fluorescence spectroscopy, SEM, FTIR, XRD, and NMR. Gelators NN-3 and NN-4 could selectively detect oxalyl chloride in both solution and film states, but detected phosgene only in solution. NN-3 exhibited ratiometric detection of oxalyl chloride and phosgene with low limits of detection (LOD) of 210 nmol/L and 90 nmol/L, respectively. NN-4, as the corresponding control sample, showed higher LODs toward oxalyl chloride and phosgene of 12.4 μmol/L and 64 μmol/L, respectively. Interestingly, films of NN-3 and NN-4 could sensitively detect gaseous oxalyl chloride with low LODs of 2.0 ppm and 8.34 ppm, respectively. The detection mechanisms of NN-3 and NN-4 were studied in detail by ¹H NMR titration, HRMS, and theoretical calculation.
Keywords: Bis-naphthalimide, self-assembly, colorimetric detection, multi-modes
Fake News Detection Based on Cross-Modal Message Aggregation and Gated Fusion Network
19
Authors: Fangfang Shan, Mengyao Liu, Menghan Zhang, Zhenyu Wang. 《Computers, Materials & Continua》 SCIE EI, 2024, No. 7, pp. 1521-1542 (22 pages)
Social media has become increasingly significant in modern society, but it has also turned into a breeding ground for the propagation of misleading information, potentially causing a detrimental impact on public opinion and daily life. Compared to pure text content, multimodal content significantly increases the visibility and shareability of posts. This has made the search for efficient modality representations and cross-modal information interaction methods a key focus in the field of multimodal fake news detection. To effectively address the critical challenge of accurately detecting fake news on social media, this paper proposes a fake news detection model based on cross-modal message aggregation and a gated fusion network (MAGF). MAGF first uses BERT to extract cumulative textual feature representations and word-level features, applies Faster Region-based Convolutional Neural Network (Faster R-CNN) to obtain image objects, and leverages ResNet-50 and Visual Geometry Group-19 (VGG-19) to obtain image region features and global features. The image region features and word-level text features are then projected into a low-dimensional space to compute a text-image affinity matrix for cross-modal message aggregation. The gated fusion network combines text and image region features to obtain adaptively aggregated features. The interaction matrix is derived through an attention mechanism and further integrated with global image features using a co-attention mechanism to produce multimodal representations. Finally, these fused features are fed into a classifier for news categorization. Experiments were conducted on two public datasets, Twitter and Weibo. Results show that the proposed model achieves accuracy rates of 91.8% and 88.7% on the two datasets, respectively, significantly outperforming traditional unimodal and existing multimodal models.
Keywords: Fake news detection, cross-modal message aggregation, gated fusion network, co-attention mechanism, multi-modal representation
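The affinity-based aggregation step that MAGF describes can be sketched minimally. The functions below are illustrative stand-ins (the dot-product similarity, softmax weighting, and names are assumptions, not the paper's implementation): projected word-level features and image-region features are compared pairwise, and each word then aggregates region features weighted by a softmax over its affinity row.

```python
import math

def affinity_matrix(text_feats, image_feats):
    """A[i][j]: dot-product affinity between projected word feature i
    and projected image-region feature j (the learned projection into
    a low-dimensional space is omitted here for brevity)."""
    return [[sum(a * b for a, b in zip(t, v)) for v in image_feats]
            for t in text_feats]

def aggregate_regions(image_feats, affinity_row):
    """Aggregate image-region features for one word, weighting each
    region by a softmax over that word's affinity scores."""
    exps = [math.exp(a) for a in affinity_row]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(image_feats[0])
    return [sum(weights[j] * image_feats[j][d]
                for j in range(len(image_feats)))
            for d in range(dim)]

# Toy example: 2 word features and 2 image-region features, 2-D each
text = [[1.0, 0.0], [0.0, 1.0]]
regions = [[1.0, 0.0], [0.0, 1.0]]
A = affinity_matrix(text, regions)          # word 0 aligns with region 0
word0_context = aggregate_regions(regions, A[0])
```

The aggregated vector for each word is a convex combination of region features, which is what makes the subsequent gated fusion with the original text features well-scaled.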
Cross-Modal Relation-Aware Networks for Fake News Detection
20
Authors: Hui Yu, Jinguang Wang. 《Journal of New Media》 2022, No. 1, pp. 13-26 (14 pages)
With the rapid development of the Internet and the widespread use of social multimedia, so many creators now publish posts on social multimedia platforms that fake news detection has become a challenging task. Although some works use deep learning methods to capture the visual and textual information of posts, most existing methods cannot explicitly model the binary relations among image regions or text tokens to deeply mine the global relation information within a single modality such as image or text. Moreover, they cannot fully exploit supplementary cross-modal information, including image and text relations, to supplement and enrich each modality. To address these problems, this paper proposes an innovative end-to-end Cross-modal Relation-aware Networks (CRAN) model, which jointly models visual and textual information with their corresponding relations in a unified framework. (1) To capture the global structural relations within a modality, a global relation-aware network is designed to explicitly model the relation-aware semantics of fragment features in the target modality from a global perspective. (2) To effectively fuse cross-modal information, a cross-modal co-attention network module is proposed for multimodal information fusion, which jointly exploits the intra-modality and inter-modality relationships among image regions and textual words so that the two modalities replenish and strengthen each other. Extensive experiments on two public real-world datasets demonstrate the superior performance of CRAN compared with other state-of-the-art baseline algorithms.
Keywords: Fake news detection, relation-aware networks, multi-modal fusion
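The global relation-aware idea, modeling pairwise relations among all fragments within one modality, can be sketched as a scaled dot-product self-attention pass. This is a generic stand-in written for illustration; CRAN's actual network architecture is not reproduced here.

```python
import math

def relation_aware(fragments):
    """For each fragment feature (e.g., an image region or text token),
    score its binary relation with every other fragment, then return
    relation-enhanced features as attention-weighted sums over all
    fragments in the modality."""
    d = len(fragments[0])
    scale = math.sqrt(d)
    enhanced = []
    for q in fragments:
        # Pairwise relation score between q and every fragment k
        scores = [sum(a * b for a, b in zip(q, k)) / scale
                  for k in fragments]
        exps = [math.exp(s) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Each output is a convex combination of all fragment features
        enhanced.append([sum(weights[j] * fragments[j][i]
                             for j in range(len(fragments)))
                         for i in range(d)])
    return enhanced

# Toy example: three 2-D fragment features from one modality
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out = relation_aware(feats)   # same shape: 3 relation-enhanced fragments
```

Because every output mixes information from all fragments, each feature carries global relation context rather than only its local content, which is the property the abstract attributes to the global relation-aware network.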