期刊文献+
共找到1,585篇文章
< 1 2 80 >
每页显示 20 50 100
Self-reduction multi-head attention module for defect recognition of power equipment in substation
1
作者 Yifeng Han Donglian Qi Yunfeng Yan 《Global Energy Interconnection》 2025年第1期82-91,共10页
Safety maintenance of power equipment is of great importance in power grids,in which image-processing-based defect recognition is supposed to classify abnormal conditions during daily inspection.However,owing to the b... Safety maintenance of power equipment is of great importance in power grids,in which image-processing-based defect recognition is supposed to classify abnormal conditions during daily inspection.However,owing to the blurred features of defect images,the current defect recognition algorithm has poor fine-grained recognition ability.Visual attention can achieve fine-grained recognition with its abil-ity to model long-range dependencies while introducing extra computational complexity,especially for multi-head attention in vision transformer structures.Under these circumstances,this paper proposes a self-reduction multi-head attention module that can reduce computational complexity and be easily combined with a Convolutional Neural Network(CNN).In this manner,local and global fea-tures can be calculated simultaneously in our proposed structure,aiming to improve the defect recognition performance.Specifically,the proposed self-reduction multi-head attention can reduce redundant parameters,thereby solving the problem of limited computational resources.Experimental results were obtained based on the defect dataset collected from the substation.The results demonstrated the efficiency and superiority of the proposed method over other advanced algorithms. 展开更多
关键词 multi-head attention Defect recognition Power equipment Computational complexity
在线阅读 下载PDF
Super-Resolution Generative Adversarial Network with Pyramid Attention Module for Face Generation
2
作者 Parvathaneni Naga Srinivasu G.JayaLakshmi +4 位作者 Sujatha Canavoy Narahari Victor Hugo C.de Albuquerque Muhammad Attique Khan Hee-Chan Cho Byoungchol Chang 《Computers, Materials & Continua》 2025年第10期2117-2139,共23页
The generation of high-quality,realistic face generation has emerged as a key field of research in computer vision.This paper proposes a robust approach that combines a Super-Resolution Generative Adversarial Network(... The generation of high-quality,realistic face generation has emerged as a key field of research in computer vision.This paper proposes a robust approach that combines a Super-Resolution Generative Adversarial Network(SRGAN)with a Pyramid Attention Module(PAM)to enhance the quality of deep face generation.The SRGAN framework is designed to improve the resolution of generated images,addressing common challenges such as blurriness and a lack of intricate details.The Pyramid Attention Module further complements the process by focusing on multi-scale feature extraction,enabling the network to capture finer details and complex facial features more effectively.The proposed method was trained and evaluated over 100 epochs on the CelebA dataset,demonstrating consistent improvements in image quality and a marked decrease in generator and discriminator losses,reflecting the model’s capacity to learn and synthesize high-quality images effectively,given adequate computational resources.Experimental outcome demonstrates that the SRGAN model with PAM module has outperformed,yielding an aggregate discriminator loss of 0.055 for real,0.043 for fake,and a generator loss of 10.58 after training for 100 epochs.The model has yielded an structural similarity index measure of 0.923,that has outperformed the other models that are considered in the current study for analysis. 展开更多
关键词 Artificial intelligence generative adversarial network pyramid attention module face generation deep learning
在线阅读 下载PDF
Enhanced Cutaneous Melanoma Segmentation in Dermoscopic Images Using a Dual U-Net Framework with Multi-Path Convolution Block Attention Module and SE-Res-Conv
3
作者 Kun Lan Feiyang Gao +2 位作者 Xiaoliang Jiang Jianzhen Cheng Simon Fong 《Computers, Materials & Continua》 2025年第9期4805-4824,共20页
With the continuous development of artificial intelligence and machine learning techniques,there have been effective methods supporting the work of dermatologist in the field of skin cancer detection.However,object si... With the continuous development of artificial intelligence and machine learning techniques,there have been effective methods supporting the work of dermatologist in the field of skin cancer detection.However,object significant challenges have been presented in accurately segmenting melanomas in dermoscopic images due to the objects that could interfere human observations,such as bubbles and scales.To address these challenges,we propose a dual U-Net network framework for skin melanoma segmentation.In our proposed architecture,we introduce several innovative components that aim to enhance the performance and capabilities of the traditional U-Net.First,we establish a novel framework that links two simplified U-Nets,enabling more comprehensive information exchange and feature integration throughout the network.Second,after cascading the second U-Net,we introduce a skip connection between the decoder and encoder networks,and incorporate a modified receptive field block(MRFB),which is designed to capture multi-scale spatial information.Third,to further enhance the feature representation capabilities,we add a multi-path convolution block attention module(MCBAM)to the first two layers of the first U-Net encoding,and integrate a new squeeze-and-excitation(SE)mechanism with residual connections in the second U-Net.To illustrate the performance of our proposed model,we conducted comprehensive experiments on widely recognized skin datasets.On the ISIC-2017 dataset,the IoU value of our proposed model increased from 0.6406 to 0.6819 and the Dice coefficient increased from 0.7625 to 0.8023.On the ISIC-2018 dataset,the IoU value of proposed model also improved from 0.7138 to 0.7709,while the Dice coefficient increased from 0.8285 to 0.8665.Furthermore,the generalization experiments conducted on the jaw cyst dataset from Quzhou People’s Hospital further verified the outstanding segmentation performance of the proposed model.These findings collectively affirm the potential of our approach as a valuable tool in supporting clinical decision-making in the field of skin cancer detection,as well as advancing research in medical image analysis. 展开更多
关键词 Dual U-Net skin lesion segmentation squeeze-and-excitation modified receptive field block multi-path convolution block attention module
在线阅读 下载PDF
Event-Aware Sarcasm Detection in Chinese Social Media Using Multi-Head Attention and Contrastive Learning
4
作者 Kexuan Niu Xiameng Si +1 位作者 Xiaojie Qi Haiyan Kang 《Computers, Materials & Continua》 2025年第10期2051-2070,共20页
Sarcasm detection is a complex and challenging task,particularly in the context of Chinese social media,where it exhibits strong contextual dependencies and cultural specificity.To address the limitations of existing ... Sarcasm detection is a complex and challenging task,particularly in the context of Chinese social media,where it exhibits strong contextual dependencies and cultural specificity.To address the limitations of existing methods in capturing the implicit semantics and contextual associations in sarcastic expressions,this paper proposes an event-aware model for Chinese sarcasm detection,leveraging a multi-head attention(MHA)mechanism and contrastive learning(CL)strategies.The proposed model employs a dual-path Bidirectional Encoder Representations from Transformers(BERT)encoder to process comment text and event context separately and integrates an MHA mechanism to facilitate deep interactions between the two,thereby capturing multidimensional semantic associations.Additionally,a CL strategy is introduced to enhance feature representation capabilities,further improving the model’s performance in handling class imbalance and complex contextual scenarios.The model achieves state-of-the-art performance on the Chinese sarcasm dataset,with significant improvements in accuracy(79.55%),F1-score(84.22%),and an area under the curve(AUC,84.35%). 展开更多
关键词 Sarcasm detection event-aware multi-head attention contrastive learning NLP
在线阅读 下载PDF
Multi-Head Attention Enhanced Parallel Dilated Convolution and Residual Learning for Network Traffic Anomaly Detection
5
作者 Guorong Qi Jian Mao +2 位作者 Kai Huang Zhengxian You Jinliang Lin 《Computers, Materials & Continua》 2025年第2期2159-2176,共18页
Abnormal network traffic, as a frequent security risk, requires a series of techniques to categorize and detect it. Existing network traffic anomaly detection still faces challenges: the inability to fully extract loc... Abnormal network traffic, as a frequent security risk, requires a series of techniques to categorize and detect it. Existing network traffic anomaly detection still faces challenges: the inability to fully extract local and global features, as well as the lack of effective mechanisms to capture complex interactions between features;Additionally, when increasing the receptive field to obtain deeper feature representations, the reliance on increasing network depth leads to a significant increase in computational resource consumption, affecting the efficiency and performance of detection. Based on these issues, firstly, this paper proposes a network traffic anomaly detection model based on parallel dilated convolution and residual learning (Res-PDC). To better explore the interactive relationships between features, the traffic samples are converted into two-dimensional matrix. A module combining parallel dilated convolutions and residual learning (res-pdc) was designed to extract local and global features of traffic at different scales. By utilizing res-pdc modules with different dilation rates, we can effectively capture spatial features at different scales and explore feature dependencies spanning wider regions without increasing computational resources. Secondly, to focus and integrate the information in different feature subspaces, further enhance and extract the interactions among the features, multi-head attention is added to Res-PDC, resulting in the final model: multi-head attention enhanced parallel dilated convolution and residual learning (MHA-Res-PDC) for network traffic anomaly detection. Finally, comparisons with other machine learning and deep learning algorithms are conducted on the NSL-KDD and CIC-IDS-2018 datasets. The experimental results demonstrate that the proposed method in this paper can effectively improve the detection performance. 展开更多
关键词 Network traffic anomaly detection multi-head attention parallel dilated convolution residual learning
在线阅读 下载PDF
MAMGBR: Group-Buying Recommendation Model Based on Multi-Head Attention Mechanism and Multi-Task Learning
6
作者 Zongzhe Xu Ming Yu 《Computers, Materials & Continua》 2025年第8期2805-2826,共22页
As the group-buying model shows significant progress in attracting new users,enhancing user engagement,and increasing platform profitability,providing personalized recommendations for group-buying users has emerged as... As the group-buying model shows significant progress in attracting new users,enhancing user engagement,and increasing platform profitability,providing personalized recommendations for group-buying users has emerged as a new challenge in the field of recommendation systems.This paper introduces a group-buying recommendation model based on multi-head attention mechanisms and multi-task learning,termed the Multi-head Attention Mechanisms and Multi-task Learning Group-Buying Recommendation(MAMGBR)model,specifically designed to optimize group-buying recommendations on e-commerce platforms.The core dataset of this study comes from the Chinese maternal and infant e-commerce platform“Beibei,”encompassing approximately 430,000 successful groupbuying actions and over 120,000 users.Themodel focuses on twomain tasks:recommending items for group organizers(Task Ⅰ)and recommending participants for a given group-buying event(Task Ⅱ).In model evaluation,MAMGBR achieves an MRR@10 of 0.7696 for Task I,marking a 20.23%improvement over baseline models.Furthermore,in Task II,where complex interaction patterns prevail,MAMGBR utilizes auxiliary loss functions to effectively model the multifaceted roles of users,items,and participants,leading to a 24.08%increase in MRR@100 under a 1:99 sample ratio.Experimental results show that compared to benchmark models,such as NGCF and EATNN,MAMGBR’s integration ofmulti-head attentionmechanisms,expert networks,and gating mechanisms enables more accurate modeling of user preferences and social associations within group-buying scenarios,significantly enhancing recommendation accuracy and platform group-buying success rates. 展开更多
关键词 Group-buying recommendation multi-head attention mechanism multi-task learning
在线阅读 下载PDF
SSA-LSTM-Multi-Head Attention Modelling Approach for Prediction of Coal Dust Maximum Explosion Pressure Based on the Synergistic Effect of Particle Size and Concentration
7
作者 Yongli Liu Weihao Li +1 位作者 Haitao Wang Taoren Du 《Computer Modeling in Engineering & Sciences》 2025年第5期2261-2286,共26页
Coal dust explosions are severe safety accidents in coal mine production,posing significant threats to life and property.Predicting the maximum explosion pressure(Pm)of coal dust using deep learning models can effecti... Coal dust explosions are severe safety accidents in coal mine production,posing significant threats to life and property.Predicting the maximum explosion pressure(Pm)of coal dust using deep learning models can effectively assess potential risks and provide a scientific basis for preventing coal dust explosions.In this study,a 20-L explosion sphere apparatus was used to test the maximum explosion pressure of coal dust under seven different particle sizes and ten mass concentrations(Cdust),resulting in a dataset of 70 experimental groups.Through Spearman correlation analysis and random forest feature selection methods,particle size(D_(10),D_(20),D_(50))and mass concentration(Cdust)were identified as critical feature parameters from the ten initial parameters of the coal dust samples.Based on this,a hybrid Long Short-Term Memory(LSTM)network model incorporating a Multi-Head Attention Mechanism and the Sparrow Search Algorithm(SSA)was proposed to predict the maximum explosion pressure of coal dust.The results demonstrate that the SSA-LSTM-Multi-Head Attention model excels in predicting the maximum explosion pressure of coal dust.The four evaluation metrics indicate that the model achieved a coefficient of determination(R^(2)),root mean square error(RMSE),mean absolute percentage error(MAPE),and mean absolute error(MAE)of 0.9841,0.0030,0.0074,and 0.0049,respectively,in the training set.In the testing set,these values were 0.9743,0.0087,0.0108,and 0.0069,respectively.Compared to artificial neural networks(ANN),random forest(RF),support vector machines(SVM),particle swarm optimized-SVM(PSO-SVM)neural networks,and the traditional single-model LSTM,the SSA-LSTM-Multi-Head Attention model demonstrated superior generalization capability and prediction accuracy.The findings of this study not only advance the application of deep learning in coal dust explosion prediction but also provide robust technical support for the prevention and risk assessment of coal dust explosions. 展开更多
关键词 Coal dust explosion deep learning maximum explosion pressure predictive model SSA-LSTM multi-head attention mechanism
在线阅读 下载PDF
A local-global dynamic hypergraph convolution with multi-head flow attention for traffic flow forecasting
8
作者 ZHANG Hong LI Yang +3 位作者 LUO Shengjun ZHANG Pengcheng ZHANG Xijun YI Min 《High Technology Letters》 2025年第3期246-256,共11页
Traffic flow prediction is a crucial element of intelligent transportation systems.However,accu-rate traffic flow prediction is quite challenging because of its highly nonlinear,complex,and dynam-ic characteristics.To... Traffic flow prediction is a crucial element of intelligent transportation systems.However,accu-rate traffic flow prediction is quite challenging because of its highly nonlinear,complex,and dynam-ic characteristics.To address the difficulties in simultaneously capturing local and global dynamic spatiotemporal correlations in traffic flow,as well as the high time complexity of existing models,a multi-head flow attention-based local-global dynamic hypergraph convolution(MFA-LGDHC)pre-diction model is proposed.which consists of multi-head flow attention(MHFA)mechanism,graph convolution network(GCN),and local-global dynamic hypergraph convolution(LGHC).MHFA is utilized to extract the time dependency of traffic flow and reduce the time complexity of the model.GCN is employed to catch the spatial dependency of traffic flow.LGHC utilizes down-sampling con-volution and isometric convolution to capture the local and global spatial dependencies of traffic flow.And dynamic hypergraph convolution is used to model the dynamic higher-order relationships of the traffic road network.Experimental results indicate that the MFA-LGDHC model outperforms current popular baseline models and exhibits good prediction performance. 展开更多
关键词 traffic flow prediction multi-head flow attention graph convolution hypergraph learning dynamic spatio-temporal properties
在线阅读 下载PDF
Unsupervised multi-modal image translation based on the squeeze-and-excitation mechanism and feature attention module 被引量:1
9
作者 胡振涛 HU Chonghao +1 位作者 YANG Haoran SHUAI Weiwei 《High Technology Letters》 EI CAS 2024年第1期23-30,共8页
The unsupervised multi-modal image translation is an emerging domain of computer vision whose goal is to transform an image from the source domain into many diverse styles in the target domain.However,the multi-genera... The unsupervised multi-modal image translation is an emerging domain of computer vision whose goal is to transform an image from the source domain into many diverse styles in the target domain.However,the multi-generator mechanism is employed among the advanced approaches available to model different domain mappings,which results in inefficient training of neural networks and pattern collapse,leading to inefficient generation of image diversity.To address this issue,this paper introduces a multi-modal unsupervised image translation framework that uses a generator to perform multi-modal image translation.Specifically,firstly,the domain code is introduced in this paper to explicitly control the different generation tasks.Secondly,this paper brings in the squeeze-and-excitation(SE)mechanism and feature attention(FA)module.Finally,the model integrates multiple optimization objectives to ensure efficient multi-modal translation.This paper performs qualitative and quantitative experiments on multiple non-paired benchmark image translation datasets while demonstrating the benefits of the proposed method over existing technologies.Overall,experimental results have shown that the proposed method is versatile and scalable. 展开更多
关键词 multi-modal image translation generative adversarial network(GAN) squeezeand-excitation(SE)mechanism feature attention(FA)module
在线阅读 下载PDF
Application of the improved dung beetle optimizer,muti-head attention and hybrid deep learning algorithms to groundwater depth prediction in the Ningxia area,China 被引量:1
10
作者 Jiarui Cai Bo Sun +5 位作者 Huijun Wang Yi Zheng Siyu Zhou Huixin Li Yanyan Huang Peishu Zong 《Atmospheric and Oceanic Science Letters》 2025年第1期18-23,共6页
Due to the lack of accurate data and complex parameterization,the prediction of groundwater depth is a chal-lenge for numerical models.Machine learning can effectively solve this issue and has been proven useful in th... Due to the lack of accurate data and complex parameterization,the prediction of groundwater depth is a chal-lenge for numerical models.Machine learning can effectively solve this issue and has been proven useful in the prediction of groundwater depth in many areas.In this study,two new models are applied to the prediction of groundwater depth in the Ningxia area,China.The two models combine the improved dung beetle optimizer(DBO)algorithm with two deep learning models:The Multi-head Attention-Convolution Neural Network-Long Short Term Memory networks(MH-CNN-LSTM)and the Multi-head Attention-Convolution Neural Network-Gated Recurrent Unit(MH-CNN-GRU).The models with DBO show better prediction performance,with larger R(correlation coefficient),RPD(residual prediction deviation),and lower RMSE(root-mean-square error).Com-pared with the models with the original DBO,the R and RPD of models with the improved DBO increase by over 1.5%,and the RMSE decreases by over 1.8%,indicating better prediction results.In addition,compared with the multiple linear regression model,a traditional statistical model,deep learning models have better prediction performance. 展开更多
关键词 Groundwater depth multi-head attention Improved dung beetle optimizer CNN-LSTM CNN-GRU Ningxia
在线阅读 下载PDF
Residual Attention-BiConvLSTM:一种新的全球电离层TEC map预测模型
11
作者 王浩然 刘海军 +5 位作者 袁静 乐会军 李良超 陈羿 单维锋 袁国铭 《地球物理学报》 北大核心 2025年第2期413-430,共18页
电离层总电子含量(TEC)预测对提高全球卫星导航系统(GNSS)的精度具有重要意义.现有的TEC map预测模型主要通过顺序堆叠时空特征提取单元来实现.这种模型搭建方法会因多个卷积层顺序堆叠而损失细粒度的TEC map的空间特征,导致模型精度不... 电离层总电子含量(TEC)预测对提高全球卫星导航系统(GNSS)的精度具有重要意义.现有的TEC map预测模型主要通过顺序堆叠时空特征提取单元来实现.这种模型搭建方法会因多个卷积层顺序堆叠而损失细粒度的TEC map的空间特征,导致模型精度不够;还会由于多层堆叠导致梯度消失或梯度爆炸问题.本文借鉴残差注意力(Residual Attention)的思想,在TEC map预测模型中增加了残差注意力模块,提出了Residual Attention-BiConvLSTM模型.该模型中的残差注意力模块能同时提取粗、细粒度空间特征,并对其进行加权.本文在全球TEC map数据上与ConvLSTM、ConvGRU、ED-ConvLSTM和C1PG进行了对比实验.实验结果表明,本文所提出的Residual Attention-BiConvLSTM模型的RMSE、MAE、MAPE和R^(2)在太阳活动高年和年均优于对比模型.本文还在一次磁暴事件中对比了5种模型的预测效果.实验结果表明,大磁暴发生时,本文模型与C1PG相近,优于其他3种对比模型.本文的研究工作为电离层map预测模型搭建提供一个新思路. 展开更多
关键词 电离层TEC map预测 残差注意力模块 Residual attention-BiConvLSTM 时空预测模型
在线阅读 下载PDF
An Intelligent Framework for Resilience Recovery of FANETs with Spatio-Temporal Aggregation and Multi-Head Attention Mechanism 被引量:1
12
作者 Zhijun Guo Yun Sun +2 位作者 YingWang Chaoqi Fu Jilong Zhong 《Computers, Materials & Continua》 SCIE EI 2024年第5期2375-2398,共24页
Due to the time-varying topology and possible disturbances in a conflict environment,it is still challenging to maintain the mission performance of flying Ad hoc networks(FANET),which limits the application of Unmanne... Due to the time-varying topology and possible disturbances in a conflict environment,it is still challenging to maintain the mission performance of flying Ad hoc networks(FANET),which limits the application of Unmanned Aerial Vehicle(UAV)swarms in harsh environments.This paper proposes an intelligent framework to quickly recover the cooperative coveragemission by aggregating the historical spatio-temporal network with the attention mechanism.The mission resilience metric is introduced in conjunction with connectivity and coverage status information to simplify the optimization model.A spatio-temporal node pooling method is proposed to ensure all node location features can be updated after destruction by capturing the temporal network structure.Combined with the corresponding Laplacian matrix as the hyperparameter,a recovery algorithm based on the multi-head attention graph network is designed to achieve rapid recovery.Simulation results showed that the proposed framework can facilitate rapid recovery of the connectivity and coverage more effectively compared to the existing studies.The results demonstrate that the average connectivity and coverage results is improved by 17.92%and 16.96%,respectively compared with the state-of-the-art model.Furthermore,by the ablation study,the contributions of each different improvement are compared.The proposed model can be used to support resilient network design for real-time mission execution. 展开更多
关键词 RESILIENCE cooperative mission FANET spatio-temporal node pooling multi-head attention graph network
在线阅读 下载PDF
Pyramid–MixNet: Integrate Attention into Encoder-Decoder Transformer Framework for Automatic Railway Surface Damage Segmentation
13
作者 Hui Luo Wenqing Li Wei Zeng 《Computers, Materials & Continua》 2025年第7期1567-1580,共14页
Rail surface damage is a critical component of high-speed railway infrastructure,directly affecting train operational stability and safety.Existing methods face limitations in accuracy and speed for small-sample,multi... Rail surface damage is a critical component of high-speed railway infrastructure,directly affecting train operational stability and safety.Existing methods face limitations in accuracy and speed for small-sample,multi-category,and multi-scale target segmentation tasks.To address these challenges,this paper proposes Pyramid-MixNet,an intelligent segmentation model for high-speed rail surface damage,leveraging dataset construction and expansion alongside a feature pyramid-based encoder-decoder network with multi-attention mechanisms.The encoding net-work integrates Spatial Reduction Masked Multi-Head Attention(SRMMHA)to enhance global feature extraction while reducing trainable parameters.The decoding network incorporates Mix-Attention(MA),enabling multi-scale structural understanding and cross-scale token group correlation learning.Experimental results demonstrate that the proposed method achieves 62.17%average segmentation accuracy,80.28%Damage Dice Coefficient,and 56.83 FPS,meeting real-time detection requirements.The model’s high accuracy and scene adaptability significantly improve the detection of small-scale and complex multi-scale rail damage,offering practical value for real-time monitoring in high-speed railway maintenance systems. 展开更多
关键词 Pyramid vision transformer encoder–decoder architecture railway damage segmentation masked multi-head attention mix-attention
在线阅读 下载PDF
Double Self-Attention Based Fully Connected Feature Pyramid Network for Field Crop Pest Detection
14
作者 Zijun Gao Zheyi Li +2 位作者 Chunqi Zhang Ying Wang Jingwen Su 《Computers, Materials & Continua》 2025年第6期4353-4371,共19页
Pest detection techniques are helpful in reducing the frequency and scale of pest outbreaks;however,their application in the actual agricultural production process is still challenging owing to the problems of intersp... Pest detection techniques are helpful in reducing the frequency and scale of pest outbreaks;however,their application in the actual agricultural production process is still challenging owing to the problems of interspecies similarity,multi-scale,and background complexity of pests.To address these problems,this study proposes an FD-YOLO pest target detection model.The FD-YOLO model uses a Fully Connected Feature Pyramid Network(FC-FPN)instead of a PANet in the neck,which can adaptively fuse multi-scale information so that the model can retain small-scale target features in the deep layer,enhance large-scale target features in the shallow layer,and enhance the multiplexing of effective features.A dual self-attention module(DSA)is then embedded in the C3 module of the neck,which captures the dependencies between the information in both spatial and channel dimensions,effectively enhancing global features.We selected 16 types of pests that widely damage field crops in the IP102 pest dataset,which were used as our dataset after data supplementation and enhancement.The experimental results showed that FD-YOLO’s mAP@0.5 improved by 6.8%compared to YOLOv5,reaching 82.6%and 19.1%–5%better than other state-of-the-art models.This method provides an effective new approach for detecting similar or multiscale pests in field crops. 展开更多
关键词 Pest detection YOLOv5 feature pyramid network transformer attention module
在线阅读 下载PDF
Transmission Facility Detection with Feature-Attention Multi-Scale Robustness Network and Generative Adversarial Network
15
作者 Yunho Na Munsu Jeon +4 位作者 Seungmin Joo Junsoo Kim Ki-Yong Oh Min Ku Kim Joon-Young Park 《Computer Modeling in Engineering & Sciences》 2025年第7期1013-1044,共32页
This paper proposes an automated detection framework for transmission facilities using a featureattention multi-scale robustness network(FAMSR-Net)with high-fidelity virtual images.The proposed framework exhibits thre... This paper proposes an automated detection framework for transmission facilities using a featureattention multi-scale robustness network(FAMSR-Net)with high-fidelity virtual images.The proposed framework exhibits three key characteristics.First,virtual images of the transmission facilities generated using StyleGAN2-ADA are co-trained with real images.This enables the neural network to learn various features of transmission facilities to improve the detection performance.Second,the convolutional block attention module is deployed in FAMSR-Net to effectively extract features from images and construct multi-dimensional feature maps,enabling the neural network to perform precise object detection in various environments.Third,an effective bounding box optimization method called Scylla-IoU is deployed on FAMSR-Net,considering the intersection over union,center point distance,angle,and shape of the bounding box.This enables the detection of power facilities of various sizes accurately.Extensive experiments demonstrated that FAMSRNet outperforms other neural networks in detecting power facilities.FAMSR-Net also achieved the highest detection accuracy when virtual images of the transmission facilities were co-trained in the training phase.The proposed framework is effective for the scheduled operation and maintenance of transmission facilities because an optical camera is currently the most promising tool for unmanned aerial vehicles.This ultimately contributes to improved inspection efficiency,reduced maintenance risks,and more reliable power delivery across extensive transmission facilities. 展开更多
关键词 Object detection virtual image transmission facility convolutional block attention module Scylla-IoU
在线阅读 下载PDF
MMIF:Multimodal Medical Image Fusion Network Based on Multi-Scale Hybrid Attention
16
作者 Jianjun Liu Yang Li +2 位作者 Xiaoting Sun Xiaohui Wang Hanjiang Luo 《Computers, Materials & Continua》 2025年第11期3551-3568,共18页
Multimodal image fusion plays an important role in image analysis and applications.Multimodal medical image fusion helps to combine contrast features from two or more input imaging modalities to represent fused inform... Multimodal image fusion plays an important role in image analysis and applications.Multimodal medical image fusion helps to combine contrast features from two or more input imaging modalities to represent fused information in a single image.One of the critical clinical applications of medical image fusion is to fuse anatomical and functional modalities for rapid diagnosis of malignant tissues.This paper proposes a multimodal medical image fusion network(MMIF-Net)based on multiscale hybrid attention.The method first decomposes the original image to obtain the low-rank and significant parts.Then,to utilize the features at different scales,we add amultiscalemechanism that uses three filters of different sizes to extract the features in the encoded network.Also,a hybrid attention module is introduced to obtain more image details.Finally,the fused images are reconstructed by decoding the network.We conducted experiments with clinical images from brain computed tomography/magnetic resonance.The experimental results show that the multimodal medical image fusion network method based on multiscale hybrid attention works better than other advanced fusion methods. 展开更多
关键词 Medical image fusion multiscale mechanism hybrid attention module encoded network
在线阅读 下载PDF
AG-GCN: Vehicle Re-Identification Based on Attention-Guided Graph Convolutional Network
17
作者 Ya-Jie Sun Li-Wei Qiao Sai Ji 《Computers, Materials & Continua》 2025年第7期1769-1785,共17页
Vehicle re-identification involves matching images of vehicles across varying camera views.The diversity of camera locations along different roadways leads to significant intra-class variation and only minimal inter-c... Vehicle re-identification involves matching images of vehicles across varying camera views.The diversity of camera locations along different roadways leads to significant intra-class variation and only minimal inter-class similarity in the collected vehicle images,which increases the complexity of re-identification tasks.To tackle these challenges,this study proposes AG-GCN(Attention-Guided Graph Convolutional Network),a novel framework integrating several pivotal components.Initially,AG-GCN embeds a lightweight attention module within the ResNet-50 structure to learn feature weights automatically,thereby improving the representation of vehicle features globally by highlighting salient features and suppressing extraneous ones.Moreover,AG-GCN adopts a graph-based structure to encapsulate deep local features.A graph convolutional network then amalgamates these features to understand the relationships among vehicle-related characteristics.Subsequently,we amalgamate feature maps from both the attention and graph-based branches for a more comprehensive representation of vehicle features.The framework then gauges feature similarities and ranks them,thus enhancing the accuracy of vehicle re-identification.Comprehensive qualitative and quantitative analyses on two publicly available datasets verify the efficacy of AG-GCN in addressing intra-class and inter-class variability issues. 展开更多
关键词 Vehicle re-identification a lightweight attention module global features local features graph convolution network
在线阅读 下载PDF
Multimodal medical image fusion based on mask optimization and parallel attention mechanism
18
作者 DI Jing LIANG Chan +1 位作者 GUO Wenqing LIAN Jing 《Journal of Measurement Science and Instrumentation》 2025年第1期26-36,共11页
Medical image fusion technology is crucial for improving the detection accuracy and treatment efficiency of diseases,but existing fusion methods have problems such as blurred texture details,low contrast,and inability... Medical image fusion technology is crucial for improving the detection accuracy and treatment efficiency of diseases,but existing fusion methods have problems such as blurred texture details,low contrast,and inability to fully extract fused image information.Therefore,a multimodal medical image fusion method based on mask optimization and parallel attention mechanism was proposed to address the aforementioned issues.Firstly,it converted the entire image into a binary mask,and constructed a contour feature map to maximize the contour feature information of the image and a triple path network for image texture detail feature extraction and optimization.Secondly,a contrast enhancement module and a detail preservation module were proposed to enhance the overall brightness and texture details of the image.Afterwards,a parallel attention mechanism was constructed using channel features and spatial feature changes to fuse images and enhance the salient information of the fused images.Finally,a decoupling network composed of residual networks was set up to optimize the information between the fused image and the source image so as to reduce information loss in the fused image.Compared with nine high-level methods proposed in recent years,the seven objective evaluation indicators of our method have improved by 6%−31%,indicating that this method can obtain fusion results with clearer texture details,higher contrast,and smaller pixel differences between the fused image and the source image.It is superior to other comparison algorithms in both subjective and objective indicators. 展开更多
关键词 multimodal medical image fusion binary mask contrast enhancement module parallel attention mechanism decoupling network
在线阅读 下载PDF
Marine organism classification method based on hierarchical multi-scale attention mechanism
19
作者 XU Haotian CHENG Yuanzhi +1 位作者 ZHAO Dong XIE Peidong 《Optoelectronics Letters》 2025年第6期354-361,共8页
We propose a hierarchical multi-scale attention mechanism-based model in response to the low accuracy and inefficient manual classification of existing oceanic biological image classification methods. Firstly, the hie... We propose a hierarchical multi-scale attention mechanism-based model in response to the low accuracy and inefficient manual classification of existing oceanic biological image classification methods. Firstly, the hierarchical efficient multi-scale attention(H-EMA) module is designed for lightweight feature extraction, achieving outstanding performance at a relatively low cost. Secondly, an improved EfficientNetV2 block is used to integrate information from different scales better and enhance inter-layer message passing. Furthermore, introducing the convolutional block attention module(CBAM) enhances the model's perception of critical features, optimizing its generalization ability. Lastly, Focal Loss is introduced to adjust the weights of complex samples to address the issue of imbalanced categories in the dataset, further improving the model's performance. The model achieved 96.11% accuracy on the intertidal marine organism dataset of Nanji Islands and 84.78% accuracy on the CIFAR-100 dataset, demonstrating its strong generalization ability to meet the demands of oceanic biological image classification. 展开更多
关键词 integrate information different scales hierarchical multi scale attention lightweight feature extraction focal loss efficientnetv marine organism classification oceanic biological image classification methods convolutional block attention module
原文传递
Structured Multi-Head Attention Stock Index Prediction Method Based Adaptive Public Opinion Sentiment Vector
20
作者 Cheng Zhao Zhe Peng +2 位作者 Xuefeng Lan Yuefeng Cen Zuxin Wang 《Computers, Materials & Continua》 SCIE EI 2024年第1期1503-1523,共21页
The present study examines the impact of short-term public opinion sentiment on the secondary market,with a focus on the potential for such sentiment to cause dramatic stock price fluctuations and increase investment ... The present study examines the impact of short-term public opinion sentiment on the secondary market,with a focus on the potential for such sentiment to cause dramatic stock price fluctuations and increase investment risk.The quantification of investment sentiment indicators and the persistent analysis of their impact has been a complex and significant area of research.In this paper,a structured multi-head attention stock index prediction method based adaptive public opinion sentiment vector is proposed.The proposedmethod utilizes an innovative approach to transform numerous investor comments on social platforms over time into public opinion sentiment vectors expressing complex sentiments.It then analyzes the continuous impact of these vectors on the market through the use of aggregating techniques and public opinion data via a structured multi-head attention mechanism.The experimental results demonstrate that the public opinion sentiment vector can provide more comprehensive feedback on market sentiment than traditional sentiment polarity analysis.Furthermore,the multi-head attention mechanism is shown to improve prediction accuracy through attention convergence on each type of input information separately.Themean absolute percentage error(MAPE)of the proposedmethod is 0.463%,a reduction of 0.294% compared to the benchmark attention algorithm.Additionally,the market backtesting results indicate that the return was 24.560%,an improvement of 8.202% compared to the benchmark algorithm.These results suggest that themarket trading strategy based on thismethod has the potential to improve trading profits. 展开更多
关键词 Public opinion sentiment structured multi-head attention stock index prediction deep learning
在线阅读 下载PDF
上一页 1 2 80 下一页 到第
使用帮助 返回顶部