Attention Mechanisms and FFM Feature Fusion Module-Based Modification of the Deep Neural Network for Detection of Structural Cracks
1
Authors: Tao Jin, Zhekun Shou, Hongchao Liu, Yuchun Shao. Computer Modeling in Engineering & Sciences, 2026, No. 2, pp. 345-366 (22 pages).
This research centers on structural health monitoring of bridges, a critical transportation infrastructure. Owing to the cumulative action of heavy vehicle loads, environmental variations, and material aging, bridge components are prone to cracks and other defects, severely compromising structural safety and service life. Traditional inspection methods relying on manual visual assessment or vehicle-mounted sensors suffer from low efficiency, strong subjectivity, and high costs, while conventional image processing techniques and early deep learning models (e.g., UNet, Faster R-CNN) still perform inadequately in complex environments (e.g., varying illumination, noise, false cracks) due to poor perception of fine cracks and multi-scale features, limiting practical application. To address these challenges, this paper proposes CACNN-Net (CBAM-Augmented CNN), a novel dual-encoder architecture that couples a CNN for local detail extraction with a CBAM-Transformer for global context modeling. A key contribution is the dedicated Feature Fusion Module (FFM), which integrates multi-scale features and focuses attention on crack regions while suppressing irrelevant noise. Experiments on bridge crack datasets demonstrate that CACNN-Net achieves a precision of 77.6%, a recall of 79.4%, and an mIoU of 62.7%. These results significantly outperform several typical models (e.g., UNet-ResNet34, Deeplabv3), confirming the model's superior accuracy and robust generalization. The method provides a high-precision automated solution for bridge crack detection and a novel network design paradigm for structural surface defect identification in complex scenarios. Future research may integrate physical features such as depth information to advance intelligent infrastructure maintenance and digital twin management.
Keywords: Bridge crack diseases; structural health monitoring; convolutional neural network; feature fusion
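As a reading aid for the precision, recall, and mIoU figures quoted in entry 1, the following is a minimal sketch of how such segmentation metrics are commonly computed from pixel counts. It is not taken from the paper: the function names `segmentation_metrics` and `mean_iou` and the example counts are hypothetical, and the paper's exact evaluation protocol is not stated in the abstract.

```python
def segmentation_metrics(tp, fp, fn):
    """Return (precision, recall, IoU) for one class from pixel counts.

    tp, fp, fn are true-positive, false-positive and false-negative
    pixel counts; a zero denominator yields 0.0 instead of an error.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    iou = tp / (tp + fp + fn) if (tp + fp + fn) else 0.0
    return precision, recall, iou


def mean_iou(per_class_counts):
    """Average the IoU over classes, e.g. crack and background."""
    ious = [segmentation_metrics(tp, fp, fn)[2] for tp, fp, fn in per_class_counts]
    return sum(ious) / len(ious)


# Hypothetical counts for two classes, purely for illustration.
precision, recall, iou = segmentation_metrics(80, 20, 20)
miou = mean_iou([(80, 20, 20), (50, 0, 50)])
```

Because IoU divides by the union of predicted and true pixels, it is never higher than precision or recall for the same class, which is consistent with the reported mIoU sitting below both.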
VIFusion: A Complementary Fusion Model for Visible and Infrared Images in Low-Light Scenes
2
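Authors: 张晓滨, 牛燕皓, 陈金广. 《西安工程大学学报》 (Journal of Xi'an Polytechnic University), 2026, No. 1, pp. 126-135 (10 pages).
To address the loss of temporal information, redundant feature-map channels, and blurred details in existing visible and infrared image fusion algorithms for low-light scenes, this paper proposes VIFusion, a complementary fusion model for visible and infrared images in low-light scenes, built on the Vision Transformer framework. The model improves the quality and information expressiveness of fused images through three modules: a dual temporal feature aggregation (DTFA) module, a feature refinement feedforward network (FRFN) module, and a spatial channel attention (SCA) module. The DTFA module uses grouped convolution to preserve the spatial integrity of features and then performs temporal alignment and fusion, strengthening temporal consistency and reducing information loss. The FRFN module refines the extracted features layer by layer to reduce channel redundancy. The SCA module adaptively models the spatial and channel relationships of the image to highlight key features, improving information expressiveness and enhancing details such as edges and textures. Experimental results show that, on the LLVIP dataset, VIFusion outperforms traditional methods and deep learning models (such as GTF, TarDAL, and DenseFuse) on the objective metrics AG, CC, EN, SF, SSIM, VIF, and MI. In generalization experiments on the TNO dataset, the fused images it generates also perform better in detail preservation and target prominence. VIFusion offers an efficient and practical solution for multimodal image fusion in low-light scenes.
Keywords: dual temporal feature aggregation; feature refinement feedforward network; spatial channel attention; image fusion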
Self-FAGCFN: Graph-Convolution Fusion Network Based on Feature Fusion and Self-Supervised Feature Alignment for Pneumonia and Tuberculosis Diagnosis
3
Authors: Junding Sun, Wenhao Tang, Lei Zhao, Chaosheng Tang, Xiaosheng Wu, Zhaozhao Xu, Bin Pu, Yudong Zhang. Journal of Bionic Engineering, 2025, No. 4, pp. 2012-2029 (18 pages).
Feature fusion is an important technique in medical image classification that can improve diagnostic accuracy by integrating complementary information from multiple sources. Recently, Deep Learning (DL) has been widely used in pulmonary disease diagnosis, such as pneumonia and tuberculosis. However, traditional feature fusion methods often suffer from feature disparity, information loss, redundancy, and increased complexity, hindering the further extension of DL algorithms. To solve this problem, we propose a Graph-Convolution Fusion Network with Self-Supervised Feature Alignment (Self-FAGCFN) for deep learning-based medical image classification of respiratory diseases such as pneumonia and tuberculosis. The network integrates Convolutional Neural Networks (CNNs) for robust feature extraction from two-dimensional grid structures and Graph Convolutional Networks (GCNs) within a Graph Neural Network branch to capture features based on graph structure, focusing on significant node representations. Additionally, an Attention-Embedding Ensemble Block is included to capture critical features from GCN outputs. To ensure effective feature alignment between pre- and post-fusion stages, we introduce a feature alignment loss that minimizes disparities. Moreover, to address the limitations of proposed methods, such as inappropriate centroid discrepancies during feature alignment and class imbalance in the dataset, we develop a Feature-Centroid Fusion (FCF) strategy and a Multi-Level Feature-Centroid Update (MLFCU) algorithm, respectively. Extensive experiments on public datasets LungVision and Chest-Xray demonstrate that the Self-FAGCFN model significantly outperforms existing methods in diagnosing pneumonia and tuberculosis, highlighting its potential for practical medical applications.
Keywords: Feature fusion; Self-supervised feature alignment; Convolutional neural networks; Graph convolutional networks; Class imbalance; Feature-centroid fusion
DMF: A Deep Multimodal Fusion-Based Network Traffic Classification Model
4
Authors: Xiangbin Wang, Qingjun Yuan, Weina Niu, Qianwei Meng, Yongjuan Wang, Chunxiang Gu. Computers, Materials & Continua, 2025, No. 5, pp. 2267-2285 (19 pages).
With the rise of encrypted traffic, traditional network analysis methods have become less effective, leading to a shift towards deep learning-based approaches. Among these, multimodal learning-based classification methods have gained attention due to their ability to leverage diverse feature sets from encrypted traffic, improving classification accuracy. However, existing research predominantly relies on late fusion techniques, which hinder the full utilization of deep features within the data. To address this limitation, we propose a novel multimodal encrypted traffic classification model that synchronizes modality fusion with multiscale feature extraction. Specifically, our approach performs real-time fusion of modalities at each stage of feature extraction, enhancing feature representation at each level and preserving inter-level correlations for more effective learning. This continuous fusion strategy improves the model's ability to detect subtle variations in encrypted traffic, while boosting its robustness and adaptability to evolving network conditions. Experimental results on two real-world encrypted traffic datasets demonstrate that our method achieves classification accuracies of 98.23% and 97.63%, outperforming existing multimodal learning-based methods.
Keywords: Deep fusion; intrusion detection; multimodal learning; network traffic classification
Enhanced Multi-Object Dwarf Mongoose Algorithm for Optimization Stochastic Data Fusion Wireless Sensor Network Deployment
5
Authors: Shumin Li, Qifang Luo, Yongquan Zhou. Computer Modeling in Engineering & Sciences, 2025, No. 2, pp. 1955-1994 (40 pages).
Wireless sensor network deployment optimization is a classic NP-hard problem and a popular topic in academic research. However, the current research on wireless sensor network deployment problems uses overly simplistic models, and there is a significant gap between the research results and actual wireless sensor networks. Some scholars have now modeled data fusion networks to make them more suitable for practical applications. This paper explores the deployment problem of a stochastic data fusion wireless sensor network (SDFWSN), a model that reflects the randomness of environmental monitoring and uses data fusion techniques widely used in actual sensor networks for information collection. The deployment problem of SDFWSN is modeled as a multi-objective optimization problem. The network life cycle, spatiotemporal coverage, detection rate, and false alarm rate of SDFWSN are used as optimization objectives to optimize the deployment of network nodes. This paper proposes an enhanced multi-objective dwarf mongoose optimization algorithm (EMODMOA) to solve the deployment problem of SDFWSN. First, to overcome the shortcomings of the DMOA algorithm, such as its low convergence and tendency to get stuck in a local optimum, an encircling and hunting strategy is introduced into the original algorithm to obtain the EDMOA algorithm. EDMOA is then extended into EMODMOA by selecting reference points using the K-Nearest Neighbor (KNN) algorithm. To verify the effectiveness of the proposed algorithm, EMODMOA was tested on CEC 2020 and achieved good results. In the SDFWSN deployment problem, the algorithm was compared with the Non-dominated Sorting Genetic Algorithm II (NSGA-II), Multiple Objective Particle Swarm Optimization (MOPSO), Multi-Objective Evolutionary Algorithm based on Decomposition (MOEA/D), and Multi-Objective Grey Wolf Optimizer (MOGWO). A comparison of the performance evaluation metrics and the optimization results of the objective functions shows that the algorithm outperforms the other algorithms in the SDFWSN deployment results. To better demonstrate the superiority of the algorithm, simulations of diverse test cases were also performed, and good results were obtained.
Keywords: Stochastic data fusion wireless sensor networks; network deployment; spatiotemporal coverage; dwarf mongoose optimization algorithm; multi-objective optimization
Predictions of complete fusion cross-sections of ^(6,7)Li, ^(9)Be, and ^(10)B using a Bayesian neural network method
6
Authors: Kai-Xuan Cheng, Rong-Xing He, Chun-Yuan Qiao, Chun-Wang Ma. Nuclear Science and Techniques, 2025, No. 10, pp. 169-175 (7 pages).
A machine learning approach based on Bayesian neural networks was developed to predict the complete fusion cross-sections of weakly bound nuclei. This method was trained and validated using 475 experimental data points from 39 reaction systems induced by ^(6,7)Li, ^(9)Be, and ^(10)B. The constructed Bayesian neural network demonstrated a high degree of accuracy in evaluating complete fusion cross-sections. By comparing the predicted cross-sections with those obtained from a single-barrier penetration model, the suppression effect of ^(6,7)Li and ^(9)Be with a stable nucleus was systematically analyzed. In the cases of ^(6)Li and ^(7)Li, less suppression was predicted for relatively light-mass targets than for heavy-mass targets, and a notably distinct dependence relationship was identified, suggesting that the predominant breakup mechanisms might change in different mass target regions. In addition, minimum suppression factors were predicted to occur near target nuclei with a closed neutron shell.
Keywords: Fusion reaction; Weakly bound nuclei; Machine learning; Bayesian neural network
MMIF: Multimodal Medical Image Fusion Network Based on Multi-Scale Hybrid Attention
7
Authors: Jianjun Liu, Yang Li, Xiaoting Sun, Xiaohui Wang, Hanjiang Luo. Computers, Materials & Continua, 2025, No. 11, pp. 3551-3568 (18 pages).
Multimodal image fusion plays an important role in image analysis and applications. Multimodal medical image fusion helps to combine contrast features from two or more input imaging modalities to represent fused information in a single image. One of the critical clinical applications of medical image fusion is to fuse anatomical and functional modalities for rapid diagnosis of malignant tissues. This paper proposes a multimodal medical image fusion network (MMIF-Net) based on multiscale hybrid attention. The method first decomposes the original image to obtain the low-rank and significant parts. Then, to utilize the features at different scales, we add a multiscale mechanism that uses three filters of different sizes to extract the features in the encoded network. Also, a hybrid attention module is introduced to obtain more image details. Finally, the fused images are reconstructed by the decoding network. We conducted experiments with clinical images from brain computed tomography/magnetic resonance. The experimental results show that the multimodal medical image fusion network method based on multiscale hybrid attention works better than other advanced fusion methods.
Keywords: Medical image fusion; multiscale mechanism; hybrid attention module; encoded network
Dual-channel graph convolutional network with multi-order information fusion for skeleton-based action recognition
8
Authors: JIANG Tao, HU Zhentao, WANG Kaige, QIU Qian, REN Xing. High Technology Letters, 2025, No. 3, pp. 257-265 (9 pages).
Skeleton-based human action recognition focuses on identifying actions from dynamic skeletal data, which contains both temporal and spatial characteristics. However, this approach faces challenges such as viewpoint variations, low recognition accuracy, and high model complexity. Skeleton-based graph convolutional networks (GCNs) generally outperform other deep learning methods in recognition accuracy. However, they often underutilize temporal features and suffer from high model complexity, leading to increased training and validation costs, especially on large-scale datasets. This paper proposes a dual-channel graph convolutional network with multi-order information fusion (DM-AGCN) for human action recognition. The network integrates high frame rate skeleton channels to capture action dynamics and low frame rate channels to preserve static semantic information, effectively balancing temporal and spatial features. This dual-channel architecture allows for separate processing of temporal and spatial information. Additionally, DM-AGCN extracts joint keypoints and bidirectional bone vectors from skeleton sequences, and employs a three-stream graph convolutional structure to extract features that describe human movement. Experimental results on the NTU-RGB+D dataset demonstrate that DM-AGCN achieves an accuracy of 89.4% on X-Sub and 95.8% on X-View, while reducing model complexity to 3.68 GFLOPs (giga floating-point operations). On the Kinetics-Skeleton dataset, the model achieves a Top-1 accuracy of 37.2% and a Top-5 accuracy of 60.3%, further validating its effectiveness across different benchmarks.
Keywords: human action recognition; graph convolutional network; spatiotemporal fusion; feature extraction
Cross-feature fusion speech emotion recognition based on attention mask residual network and Wav2vec 2.0
9
Authors: Xiaoke Li, Zufan Zhang. Digital Communications and Networks, 2025, No. 5, pp. 1567-1577 (11 pages).
Speech Emotion Recognition (SER) has received widespread attention as a crucial way of understanding human emotional states. However, the impact of irrelevant information on speech signals and data sparsity limit the development of SER systems. To address these issues, this paper proposes a framework that incorporates the Attentive Mask Residual Network (AM-ResNet) and the self-supervised learning model Wav2vec 2.0 to obtain AM-ResNet features and Wav2vec 2.0 features respectively, together with a cross-attention module to interact and fuse these two features. The AM-ResNet branch mainly consists of maximum amplitude difference detection, a mask residual block, and an attention mechanism. Among them, the maximum amplitude difference detection and the mask residual block act on the pre-processing and the network, respectively, to reduce the impact of silent frames, and the attention mechanism assigns different weights to unvoiced and voiced speech to reduce redundant emotional information caused by unvoiced speech. In the Wav2vec 2.0 branch, this model is introduced as a feature extractor to obtain general speech features (Wav2vec 2.0 features) through pre-training with a large amount of unlabeled speech data, which can assist the SER task and cope with data sparsity problems. In the cross-attention module, AM-ResNet features and Wav2vec 2.0 features interact and are fused to obtain the cross-fused features, which are used to predict the final emotion. Furthermore, multi-label learning is also used to add ambiguous emotion utterances to deal with data limitations. Finally, experimental results illustrate the usefulness and superiority of our proposed framework over existing state-of-the-art approaches.
Keywords: Speech emotion recognition; Residual network; Mask; Attention; Wav2vec 2.0; Cross-feature fusion
Low-Light Image Enhancement Based on Wavelet Local and Global Feature Fusion Network
10
Authors: Shun Song, Xiangqian Jiang, Dawei Zhao. Journal of Contemporary Educational Research, 2025, No. 11, pp. 209-214 (6 pages).
A wavelet-based local and global feature fusion network (LAGN) is proposed for low-light image enhancement, aiming to enhance image details and restore colors in dark areas. This study focuses on addressing three key issues in low-light image enhancement: enhancing low-light images using LAGN to preserve image details and colors; extracting image edge information via wavelet transform to enhance image details; and extracting local and global features of images through convolutional neural networks and Transformer to improve image contrast. Comparisons with state-of-the-art methods on two datasets verify that LAGN achieves the best performance in terms of details, brightness, and contrast.
Keywords: Image enhancement; Feature fusion; Wavelet transform; Convolutional Neural Network (CNN); Transformer
A Prediction Method for Concrete Mixing Temperature Based on the Fusion of Physical Models and Neural Networks
11
Authors: Lei Zheng, Hong Pan, Yuelei Ruan, Guoxin Zhang, Lei Zhang, Jianda Xin, Zhenyang Zhu, Jianyao Zhang, Wei Liu. Computer Modeling in Engineering & Sciences, 2025, No. 12, pp. 3217-3241 (25 pages).
As a critical material in construction engineering, concrete requires accurate prediction of its outlet temperature to ensure structural quality and enhance construction efficiency. This study proposes a novel hybrid prediction method that integrates a heat conduction physical model with a multilayer perceptron (MLP) neural network, dynamically fused via a weighted strategy to achieve high-precision temperature estimation. Experimental results on an independent test set demonstrated the superior performance of the fused model, with a root mean square error (RMSE) of 1.59 ℃ and a mean absolute error (MAE) of 1.23 ℃, representing a 25.3% RMSE reduction compared to conventional physical models. Ambient temperature and coarse aggregate temperature were identified as the most influential variables. Furthermore, the model-based temperature control strategy reduced costs by 0.81 CNY/m^(3), showing significant potential for improving resource efficiency and supporting sustainable construction practices.
Keywords: Concrete outlet temperature prediction; physical model; neural network; dynamic weight fusion; temperature control
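The abstract of entry 11 describes a physical model and an MLP being "dynamically fused via a weighted strategy" and evaluated by RMSE and MAE. The sketch below illustrates one generic way such a fusion could work. It is an assumption, not the authors' method: weighting each model by the inverse of its recent absolute error is a stand-in rule, and the names `dynamic_weights`, `fuse`, `rmse`, and `mae` are hypothetical.

```python
import math


def dynamic_weights(err_phys, err_mlp, eps=1e-6):
    """Weight each model by the inverse of its recent absolute error.

    Assumed rule for illustration only; the abstract does not say how
    the paper derives its weights. eps avoids division by zero.
    """
    inv_phys = 1.0 / (err_phys + eps)
    inv_mlp = 1.0 / (err_mlp + eps)
    total = inv_phys + inv_mlp
    return inv_phys / total, inv_mlp / total


def fuse(pred_phys, pred_mlp, err_phys, err_mlp):
    """Convex combination of the two temperature predictions."""
    w_phys, w_mlp = dynamic_weights(err_phys, err_mlp)
    return w_phys * pred_phys + w_mlp * pred_mlp


def rmse(y_true, y_pred):
    """Root mean square error over paired observations."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error over paired observations."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Equal recent errors give equal weights, so the fused value is the midpoint.
fused = fuse(10.0, 20.0, 1.0, 1.0)
```

If the quoted 25.3% reduction is measured against the physical model's own RMSE, the implied baseline would be about 1.59 / (1 - 0.253) ≈ 2.13 ℃; the abstract does not state this figure, so it is only an inference.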
xCViT: Improved Vision Transformer Network with Fusion of CNN and Xception for Skin Disease Recognition with Explainable AI
12
Authors: Armughan Ali, Hooria Shahbaz, Robertas Damaševicius. Computers, Materials & Continua, 2025, No. 4, pp. 1367-1398 (32 pages).
Skin cancer is the most prevalent cancer globally, primarily due to extensive exposure to Ultraviolet (UV) radiation. Early identification of skin cancer enhances the likelihood of effective treatment, as delays may lead to severe tumor advancement. This study proposes a novel hybrid deep learning strategy to address the complex issue of skin cancer diagnosis, with an architecture that integrates a Vision Transformer, a bespoke convolutional neural network (CNN), and an Xception module. The model was evaluated using two benchmark datasets, HAM10000 and Skin Cancer ISIC. On HAM10000, the model achieves a precision of 95.46%, an accuracy of 96.74%, a recall of 96.27%, a specificity of 96.00%, and an F1-Score of 95.86%. On the Skin Cancer ISIC dataset, it obtains an accuracy of 93.19%, a precision of 93.25%, a recall of 92.80%, a specificity of 92.89%, and an F1-Score of 93.19%. The findings demonstrate that the proposed model is robust and trustworthy for skin lesion classification. In addition, Explainable AI techniques such as Grad-CAM visualizations help highlight the lesion areas that most influence the model's decisions.
Keywords: Skin lesions; vision transformer; CNN; Xception; deep learning; network fusion; explainable AI; Grad-CAM; skin cancer detection
Stochastic state of health estimation for lithium-ion batteries with automated feature fusion using quantum convolutional neural network
13
Authors: Chen Liang, Shengyu Tao, Xinghao Huang, Yezhen Wang, Bizhong Xia, Xuan Zhang. Journal of Energy Chemistry, 2025, No. 7, pp. 205-219 (15 pages).
The accurate state of health (SOH) estimation of lithium-ion batteries is crucial for efficient, healthy, and safe operation of battery systems. Extracting meaningful aging information from highly stochastic and noisy data segments while designing SOH estimation algorithms that efficiently handle the large-scale computational demands of cloud-based battery management systems presents a substantial challenge. In this work, we propose a quantum convolutional neural network (QCNN) model designed for accurate, robust, and generalizable SOH estimation with minimal data and parameter requirements, which is compatible with quantum computing cloud platforms in the Noisy Intermediate-Scale Quantum (NISQ) era. First, we utilize data from 4 datasets comprising 272 cells, covering 5 chemical compositions, 4 rated parameters, and 73 operating conditions. We design 5 voltage windows as small as 0.3 V for each cell from incremental capacity peaks to generate stochastic SOH estimation scenarios. We extract 3 effective health indicator (HI) sequences and develop an automated feature fusion method using quantum rotation gate encoding, achieving an R^(2) of 96%. Subsequently, we design a QCNN whose convolutional layer, constructed with variational quantum circuits, comprises merely 39 parameters. Additionally, we explore the impact of training set size, using strategies, and battery materials on the model's accuracy. Finally, the QCNN with quantum convolutional layers reduces root mean squared error by 28% and achieves an R^(2) exceeding 96% compared with three other commonly used algorithms. This work demonstrates the effectiveness of quantum encoding for automated feature fusion of HIs extracted from limited discharge data. It highlights the potential of QCNN in improving the accuracy, robustness, and generalization of SOH estimation while dealing with stochastic and noisy data with few parameters and a simple structure. It also suggests a new paradigm for leveraging quantum computational power in SOH estimation.
Keywords: Lithium-ion battery; State of health; Feature fusion; Quantum convolutional neural network; Quantum machine learning
Adaptive Fusion Neural Networks for Sparse-Angle X-Ray 3D Reconstruction
14
Authors: Shaoyong Hong, Bo Yang, Yan Chen, Hao Quan, Shan Liu, Minyi Tang, Jiawei Tian. Computer Modeling in Engineering & Sciences, 2025, No. 7, pp. 1091-1112 (22 pages).
3D medical image reconstruction has significantly enhanced diagnostic accuracy, yet the reliance on densely sampled projection data remains a major limitation in clinical practice. Sparse-angle X-ray imaging, though safer and faster, poses challenges for accurate volumetric reconstruction due to limited spatial information. This study proposes a 3D reconstruction neural network based on adaptive weight fusion (AdapFusionNet) to achieve high-quality 3D medical image reconstruction from sparse-angle X-ray images. To address the issue of spatial inconsistency in multi-angle image reconstruction, an adaptive fusion module was designed to score initial reconstruction results during the inference stage and perform weighted fusion, thereby improving the final reconstruction quality. The reconstruction network is built on an autoencoder (AE) framework and uses orthogonal-angle X-ray images (frontal and lateral projections) as inputs. The encoder extracts 2D features, which the decoder maps into 3D space. This study utilizes a lung CT dataset to obtain complete three-dimensional volumetric data, from which digitally reconstructed radiographs (DRR) are generated at various angles to simulate X-ray images. Since real-world clinical X-ray images rarely come with perfectly corresponding 3D "ground truth," using CT scans as the three-dimensional reference effectively supports the training and evaluation of deep networks for sparse-angle X-ray 3D reconstruction. Experiments conducted on the LIDC-IDRI dataset with simulated X-ray images (DRR images) as training data demonstrate the superior performance of AdapFusionNet compared to other fusion methods. Quantitative results show that AdapFusionNet achieves SSIM, PSNR, and MAE values of 0.332, 13.404, and 0.163, respectively, outperforming other methods (SingleViewNet: 0.289, 12.363, 0.182; AvgFusionNet: 0.306, 13.384, 0.159). Qualitative analysis further confirms that AdapFusionNet significantly enhances the reconstruction of lung and chest contours while effectively reducing noise during the reconstruction process. The findings demonstrate that AdapFusionNet offers significant advantages in 3D reconstruction of sparse-angle X-ray images.
Keywords: 3D reconstruction; adaptive fusion; X-ray imaging; medical imaging; deep learning; neural networks; sparse angles; autoencoder
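Entry 14 describes a module that scores initial reconstructions at inference time and then performs weighted fusion. The sketch below shows that general idea under stated assumptions: scores are turned into weights with a softmax, and volumes are represented as flat lists of voxel values. The softmax choice and the names `softmax` and `weighted_fusion` are hypothetical; the abstract does not describe how AdapFusionNet computes or normalizes its scores.

```python
import math


def softmax(scores):
    """Convert raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_fusion(volumes, scores):
    """Fuse candidate reconstructions voxel by voxel.

    volumes: list of equally sized flat lists of voxel intensities.
    scores:  one quality score per candidate (higher is better).
    """
    weights = softmax(scores)
    size = len(volumes[0])
    return [sum(w * v[i] for w, v in zip(weights, volumes)) for i in range(size)]


# With equal scores the result reduces to plain averaging of the candidates.
fused = weighted_fusion([[0.0, 1.0], [1.0, 0.0]], [0.0, 0.0])
```

Equal scores reproduce simple averaging, which corresponds to the AvgFusionNet baseline named in the abstract only in spirit; how that baseline is actually built is not stated.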
Global Context Fusion Network for SAR Ship Detection
15
Authors: Boya Zhang, Yong Wang. Journal of Beijing Institute of Technology, 2025, No. 6, pp. 577-589 (13 pages).
Ship detection in synthetic aperture radar (SAR) images is crucial for marine surveillance and navigation. The application of detection networks based on deep learning has achieved promising results in SAR ship detection. However, the existing networks encounter challenges due to the complex backgrounds, diverse scales, and irregular distribution of ship targets. To address these issues, this article proposes a detection algorithm that integrates the global context of the images (GCF-Net). First, we construct a global feature extraction module in the backbone network of GCF-Net, which encodes features along different spatial directions. Then, we incorporate a bi-directional feature pyramid network (BiFPN) in the neck network to fuse the multi-scale features selectively. Finally, we design a convolution and transformer mixed (CTM) detection head to obtain contextual information of targets and concentrate network attention on the most informative regions of the images. Experimental results demonstrate that the proposed method achieves more accurate detection of ship targets in SAR images.
Keywords: synthetic aperture radar (SAR); ship detection; global context fusion; convolutional neural network; feature extraction
Rolling Bearing Fault Detection Based on Self-Adaptive Wasserstein Dual Generative Adversarial Networks and Feature Fusion under Small Sample Conditions
16
作者 Qiang Ma Zhuopei Wei +2 位作者 Kai Yang Long Tian Zepeng Li 《Structural Durability & Health Monitoring》 2025年第4期1011-1035,共25页
An intelligent diagnosis method based on self-adaptiveWasserstein dual generative adversarial networks and feature fusion is proposed due to problems such as insufficient sample size and incomplete fault feature extra... An intelligent diagnosis method based on self-adaptiveWasserstein dual generative adversarial networks and feature fusion is proposed due to problems such as insufficient sample size and incomplete fault feature extraction,which are commonly faced by rolling bearings and lead to low diagnostic accuracy.Initially,dual models of the Wasserstein deep convolutional generative adversarial network incorporating gradient penalty(1D-2DWDCGAN)are constructed to augment the original dataset.A self-adaptive loss threshold control training strategy is introduced,and establishing a self-adaptive balancing mechanism for stable model training.Subsequently,a diagnostic model based on multidimensional feature fusion is designed,wherein complex features from various dimensions are extracted,merging the original signal waveform features,structured features,and time-frequency features into a deep composite feature representation that encompasses multiple dimensions and scales;thus,efficient and accurate small sample fault diagnosis is facilitated.Finally,an experiment between the bearing fault dataset of CaseWestern ReserveUniversity and the fault simulation experimental platformdataset of this research group shows that this method effectively supplements the dataset and remarkably improves the diagnostic accuracy.The diagnostic accuracy after data augmentation reached 99.94%and 99.87%in two different experimental environments,respectively.In addition,robustness analysis is conducted on the diagnostic accuracy of the proposed method under different noise backgrounds,verifying its good generalization performance. 展开更多
关键词 Deep learning Wasserstein deep convolutional generative adversarial network small sample learning feature fusion multidimensional data enhancement small sample fault diagnosis
Multimodal Trajectory Generation for Robotic Motion Planning Using Transformer-Based Fusion and Adversarial Learning
17
Authors: Shtwai Alsubai, Ahmad Almadhor, Abdullah Al Hejaili, Najib Ben Aoun, Tahani Alsubait, Vincent Karovic. Computer Modeling in Engineering & Sciences, 2026, Issue 2, pp. 848-869 (22 pages)
In Human–Robot Interaction (HRI), generating robot trajectories that accurately reflect user intentions while ensuring physical realism remains challenging, especially in unstructured environments. In this study, we develop a multimodal framework that integrates symbolic task reasoning with continuous trajectory generation. The approach employs transformer models and adversarial training to map high-level intent to robotic motion. Information from multiple data sources, such as voice traits, hand and body keypoints, visual observations, and recorded paths, is integrated simultaneously. These signals are mapped into a shared representation that supports interpretable reasoning while enabling smooth and realistic motion generation. Based on this design, two different learning strategies are investigated. In the first strategy, grammar-constrained Linear Temporal Logic (LTL) expressions are created from multimodal human inputs. These expressions are subsequently decoded into robot trajectories. The second strategy generates trajectories directly from symbolic intent and linguistic data, bypassing an intermediate logical representation. Transformer encoders combine multiple types of information, and autoregressive transformer decoders generate motion sequences. Adding smoothness and speed limits during training increases the likelihood of physical feasibility. To improve the realism and stability of the generated trajectories during training, an adversarial discriminator is also included to guide them toward the distribution of actual robot motion. Tests on the NATSGLD dataset indicate that the complete system exhibits stable training behaviour and performance. In normalised coordinates, the logic-based pipeline has an Average Displacement Error (ADE) of 0.040 and a Final Displacement Error (FDE) of 0.036. The adversarial generator makes substantially more progress, reducing ADE to 0.021 and FDE to 0.018. Visual examination confirms that the generated trajectories closely align with observed motion patterns while preserving smooth temporal dynamics.
Keywords: Multimodal trajectory generation; robotic motion planning; transformer networks; sensor fusion; reinforcement learning; generative adversarial networks
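The ADE and FDE figures quoted in this abstract follow a standard definition: ADE is the mean Euclidean distance between predicted and reference waypoints, and FDE is the distance at the final waypoint. The sketch below illustrates that definition only. The function name `displacement_errors` and the 2-D waypoint format are assumptions for illustration and are not taken from the paper, which does not state its evaluation code.

```python
import math

def displacement_errors(predicted, reference):
    """Return (ADE, FDE) for two equal-length lists of waypoints.

    Hypothetical helper: each waypoint is a tuple of coordinates,
    e.g. (x, y) in normalised units.
    """
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("trajectories must be non-empty and equal in length")
    # Euclidean distance between matching waypoints at each time step.
    distances = [math.dist(p, r) for p, r in zip(predicted, reference)]
    ade = sum(distances) / len(distances)  # average over all time steps
    fde = distances[-1]                    # error at the final time step
    return ade, fde

if __name__ == "__main__":
    pred = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
    ref = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
    print(displacement_errors(pred, ref))
```

Because FDE looks at a single time step, it can be lower than ADE when a trajectory drifts in the middle but ends close to its target.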
A lightweight physics-conditioned diffusion multi-model for medical image reconstruction
18
Authors: Raja Vavekanand, Ganesh Kumar, Shakhlokhon Kurbanova. Biomedical Engineering Communications, 2026, Issue 2, pp. 50-59 (10 pages)
Background: Medical imaging advancements are constrained by fundamental trade-offs between acquisition speed, radiation dose, and image quality, forcing clinicians to work with noisy, incomplete data. Existing reconstruction methods either compromise on accuracy with iterative algorithms or suffer from limited generalizability with task-specific deep learning approaches. Methods: We present LDM-PIR, a lightweight physics-conditioned diffusion multi-model for medical image reconstruction that addresses key challenges in magnetic resonance imaging (MRI), CT, and low-photon imaging. Unlike traditional iterative methods, which are computationally expensive, or task-specific deep learning approaches lacking generalizability, LDM-PIR integrates three innovations: a physics-conditioned diffusion framework that embeds acquisition operators (Fourier/Radon transforms) and noise models directly into the reconstruction process; a multi-model architecture that unifies denoising, inpainting, and super-resolution via shared weight conditioning; and a lightweight design (2.1M parameters) enabling rapid inference (0.8 s/image on GPU). Through self-supervised fine-tuning with measurement consistency losses, the model adapts to new imaging modalities using fewer annotated samples. Results: LDM-PIR achieves state-of-the-art performance on fastMRI (peak signal-to-noise ratio (PSNR): 34.04 for single-coil/31.50 for multi-coil) and Lung Image Database Consortium and Image Database Resource Initiative (28.83 PSNR under Poisson noise). Clinical evaluations demonstrate superior preservation of anatomical structures, with SSIM improvements of 8.8% for single-coil and 4.36% for multi-coil MRI over uDPIR. Conclusion: LDM-PIR offers a flexible, efficient, and scalable solution for medical image reconstruction, addressing the challenges of noise, undersampling, and modality generalization. The model's lightweight design allows for rapid inference, while its self-supervised fine-tuning capability minimizes reliance on large annotated datasets, making it suitable for real-world clinical applications.
Keywords: medical image reconstruction; physics-conditioned diffusion; multi-task learning; self-supervised fine-tuning; multimodal fusion; lightweight neural networks
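The PSNR values reported in this abstract use the usual definition, 10·log10(MAX² / MSE). The sketch below shows that formula on flattened pixel lists. The function name `psnr`, the flat-list input, and the default intensity range of 1.0 are assumptions made for illustration; the abstract does not state the paper's normalisation or evaluation code.

```python
import math

def psnr(reference, reconstruction, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two flattened images.

    Hypothetical helper: both inputs are equal-length sequences of
    pixel intensities in the range [0, max_value].
    """
    if len(reference) != len(reconstruction) or not reference:
        raise ValueError("images must be non-empty and equal in size")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(reference, reconstruction)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / mse)

if __name__ == "__main__":
    clean = [0.0, 0.5, 1.0, 0.5]
    noisy = [0.1, 0.5, 0.9, 0.5]
    print(round(psnr(clean, noisy), 2))
```

Higher values indicate a reconstruction closer to the reference, and the result depends on the assumed `max_value`, so scores are only comparable under the same intensity scaling.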
A Dual-Stream Framework for Landslide Segmentation with Cross-Attention Enhancement and Gated Multimodal Fusion
19
Authors: Md Minhazul Islam, Yunfei Yin, Md Tanvir Islam, Zheng Yuan, Argho Dey. Computers, Materials & Continua, 2026, Issue 3, pp. 285-304 (20 pages)
Automatic segmentation of landslides from remote sensing imagery is challenging because traditional machine learning and early CNN-based models often fail to generalize across heterogeneous landscapes, where segmentation maps contain sparse and fragmented landslide regions under diverse geographical conditions. To address these issues, we propose a lightweight dual-stream siamese deep learning framework that integrates optical and topographical data fusion with an adaptive decoder, guided multimodal fusion, and deep supervision. The framework is built upon the synergistic combination of cross-attention, gated fusion, and sub-pixel upsampling within a unified dual-stream architecture specifically optimized for landslide segmentation, enabling efficient context modeling and robust feature exchange between modalities. The decoder captures long-range context at deeper levels using lightweight cross-attention and refines spatial details at shallower levels through attention-gated skip fusion, enabling precise boundary delineation and fewer false positives. The gated fusion further enhances multimodal integration of optical and topographical cues, and the deep supervision stabilizes training and improves generalization. Moreover, to mitigate checkerboard artifacts, a learnable sub-pixel upsampling is devised to replace the traditional transposed convolution. Despite its compact design with fewer parameters, the model consistently outperforms state-of-the-art baselines. Experiments on two benchmark datasets, Landslide4Sense and Bijie, confirm the effectiveness of the framework. On the Bijie dataset, it achieves an F1-score of 0.9110 and an intersection over union (IoU) of 0.8839. These results highlight its potential for accurate large-scale landslide inventory mapping and real-time disaster response. The implementation is publicly available at https://github.com/mishaown/DiGATe-UNet-LandSlide-Segmentation (accessed on 3 November 2025).
Keywords: Landslide segmentation; remote sensing; dual-stream; lightweight networks; digital elevation model (DEM); gated fusion
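The F1-score and IoU named in this abstract are both computed from true positives, false positives, and false negatives of a segmentation mask. The sketch below shows the per-class definitions on flattened binary masks. The function name `f1_and_iou` and the mask format are assumptions for illustration; how the paper aggregates scores across classes or images is not stated in the abstract.

```python
def f1_and_iou(predicted, target):
    """Return (F1, IoU) for two flattened binary masks.

    Hypothetical helper: each mask is an equal-length sequence of 0/1
    values, where 1 marks a landslide pixel.
    """
    if len(predicted) != len(target):
        raise ValueError("masks must be equal in size")
    pairs = list(zip(predicted, target))
    tp = sum(1 for p, t in pairs if p and t)      # landslide predicted and present
    fp = sum(1 for p, t in pairs if p and not t)  # landslide predicted but absent
    fn = sum(1 for p, t in pairs if not p and t)  # landslide missed
    if tp + fp + fn == 0:
        return 1.0, 1.0  # both masks empty: treated here as a perfect match
    f1 = 2 * tp / (2 * tp + fp + fn)
    iou = tp / (tp + fp + fn)
    return f1, iou

if __name__ == "__main__":
    print(f1_and_iou([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))
```

For a single class, F1 is never lower than IoU, since both share the same counts and F1 weights true positives twice.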
Multi-CNN Fusion Framework for Predictive Violence Detection in Animated Media
20
Authors: Tahira Khalil, Sadeeq Jan, Rania M. Ghoniem, Muhammad Imran Khan Khalil. Computers, Materials & Continua, 2026, Issue 2, pp. 2167-2186 (20 pages)
The contemporary era is characterized by rapid technological advancements, particularly in the fields of communication and multimedia. Digital media has significantly influenced the daily lives of individuals of all ages. One of the emerging domains in digital media is the creation of cartoons and animated videos. The accessibility of the internet has led to a surge in the consumption of cartoons among young children, presenting challenges in monitoring and controlling the content they view. The prevalence of cartoon videos containing potentially violent scenes has raised concerns regarding their impact, especially on young and impressionable minds. This article contributes to the growing body of work on the impact of animated media on children's mental health and offers solutions to help mitigate these effects. To address this issue, an intelligent multi-CNN fusion framework is proposed for detecting and predicting violent content in upcoming frames of animated videos. The framework integrates probabilistic and deep learning methodologies by leveraging a combination of visual and temporal features for violence prediction in future scenes. Two convolutional neural network classifiers, VGG16 and ResNet18, are employed to classify scenes from animated content as violent or non-violent. To enhance decision robustness, this study introduces a fusion strategy based on weighted averaging, combining the outputs of both Convolutional Neural Networks (CNNs) into a single decision stream. The resulting classifications are subsequently fed into a Naive Bayes classifier, which analyzes sequential patterns to forecast violence in future scenes. The experimental findings demonstrate that the proposed framework achieved a predictive accuracy of 92.84%, highlighting its effectiveness for intelligent content moderation. These results underscore the potential of intelligent data fusion techniques in enhancing the reliability and robustness of automated violence detection systems in animated content. This framework offers a promising solution for safeguarding young audiences by enabling proactive and accurate moderation of animated videos.
Keywords: Violence prediction; multi-model fusion; cartoon videos; residual network (ResNet); visual geometry group (VGG); CNN
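The weighted-averaging fusion described in this abstract can be sketched as a convex combination of the two classifiers' violence probabilities, followed by a threshold. Everything specific here is an assumption for illustration: the function names, the weight `w_vgg`, and the 0.5 threshold are hypothetical, since the abstract does not give the paper's weights or decision rule.

```python
def fuse_probabilities(p_vgg, p_resnet, w_vgg=0.5):
    """Weighted average of two violence probabilities.

    Hypothetical helper: p_vgg and p_resnet stand in for the outputs of
    the two CNN classifiers; w_vgg is an assumed weight in [0, 1], and
    the second model receives the remaining weight 1 - w_vgg.
    """
    if not 0.0 <= w_vgg <= 1.0:
        raise ValueError("w_vgg must lie in [0, 1]")
    return w_vgg * p_vgg + (1.0 - w_vgg) * p_resnet

def is_violent(p_vgg, p_resnet, w_vgg=0.5, threshold=0.5):
    """Label a scene as violent when the fused probability reaches the threshold."""
    return fuse_probabilities(p_vgg, p_resnet, w_vgg) >= threshold

if __name__ == "__main__":
    # Equal weights: one confident model can outvote a hesitant one.
    print(is_violent(0.9, 0.3))
    # Favouring the first model changes how disagreements resolve.
    print(is_violent(0.2, 0.4, w_vgg=0.75))
```

The per-scene labels produced this way form the sequence that a downstream sequential model, such as the Naive Bayes classifier mentioned in the abstract, would consume.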