61 articles found
Deep neural network based on adversarial training for short-term high-resolution precipitation nowcasting from radar echo images
1
Authors: Ruikai YANG, Shuangjian JIAO, Nan YANG. Journal of Oceanology and Limnology, 2026, No. 1, pp. 85-98 (14 pages)
Precipitation nowcasting is of great importance for disaster prevention and mitigation. However, precipitation is a complex spatio-temporal phenomenon influenced by various underlying physical factors. Even slight changes in the initial precipitation field can have a significant impact on future precipitation patterns, making the nowcasting of short-term high-resolution precipitation a major challenge. Traditional deep learning methods often have difficulty capturing the long-term spatial dependence of precipitation and usually operate at a low resolution. To address these issues, based upon the Simpler yet Better Video Prediction (SimVP) framework, we proposed a deep generative neural network that incorporates the Simple Parameter-Free Attention Module (SimAM) and Generative Adversarial Networks (GANs) for short-term high-resolution precipitation event forecasting. Through an adversarial training strategy, critical precipitation features were extracted from complex radar echo images. During the adversarial learning process, the dynamic competition between the generator and the discriminator could continuously enhance the model in prediction accuracy and resolution for short-term precipitation. Experimental results demonstrate that the proposed method could effectively forecast short-term precipitation events on various scales and showed the best overall performance among existing methods.
Keywords: precipitation nowcasting; deep learning; Simple Parameter-Free Attention Module (SimAM); Generative Adversarial Networks (GANs)
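The SimAM module named in this entry weights each activation by how much it stands out from the rest of its channel, without adding learnable parameters. The sketch below follows the commonly published SimAM formulation for a single 2-D feature map. It is an illustration only, not the authors' code: the function names and the regularizer value `lam=1e-4` are assumptions, and the map must contain at least two values.

```python
import math


def simam_weights(channel, lam=1e-4):
    """Return SimAM-style attention weights for one 2-D feature map.

    `channel` is a list of rows of floats. `lam` is a small regularizer
    (its value here is an assumption made for illustration).
    """
    values = [v for row in channel for v in row]
    n = len(values) - 1                       # number of "other" activations
    mean = sum(values) / len(values)
    sq_dev = [[(v - mean) ** 2 for v in row] for row in channel]
    var = sum(d for row in sq_dev for d in row) / n
    weights = []
    for drow in sq_dev:
        w_row = []
        for d in drow:
            # Inverse energy: larger for activations far from the channel mean.
            inv_energy = d / (4.0 * (var + lam)) + 0.5
            w_row.append(1.0 / (1.0 + math.exp(-inv_energy)))  # sigmoid
        weights.append(w_row)
    return weights


def simam(channel, lam=1e-4):
    """Rescale every activation by its SimAM-style weight."""
    weights = simam_weights(channel, lam)
    return [[v * w for v, w in zip(row, w_row)]
            for row, w_row in zip(channel, weights)]
```

In a full network this would be applied independently to every channel of a feature tensor; the single-channel form is used here only to keep the sketch dependency-free.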
Research on YOLO algorithm for lightweight PCB defect detection based on MobileViT (cited by: 2)
2
Authors: LIU Yuchen, LIU Fuzheng, JIANG Mingshun. Optoelectronics Letters, 2025, No. 8, pp. 483-490 (8 pages)
Current you only look once (YOLO)-based algorithm models face the challenge of excessive parameters and computational complexity in the printed circuit board (PCB) defect detection application scenario. To solve this problem, we propose a new method that combines the lightweight network mobile vision transformer (MobileViT) with the convolutional block attention module (CBAM) mechanism and a new regression loss function. This method needs fewer computational resources, making it more suitable for embedded edge detection devices. Meanwhile, the new loss function improves the positioning accuracy of the bounding box and enhances the robustness of the model. In addition, experiments on public datasets demonstrate that the improved model achieves an average accuracy of 87.9% across six typical defect detection tasks, while reducing computational costs by nearly 90%. It significantly reduces the model's computational requirements while maintaining accuracy, ensuring reliable performance for edge deployment.
Keywords: YOLO; lightweight network; mobile vision transformer (MobileViT); convolutional block attention module (CBAM); PCB defect detection; regression loss function
YOLO-based lightweight traffic sign detection algorithm and mobile deployment (cited by: 1)
3
Authors: WU Yaqin, ZHANG Tao, NIU Jianjun, CHANG Yan, LIU Ganjun. Optoelectronics Letters, 2025, No. 4, pp. 249-256 (8 pages)
This paper proposes a lightweight traffic sign detection system based on you only look once (YOLO). Firstly, the classification to fusion (C2f) structure is integrated into the backbone network, employing deformable convolution and bi-directional feature pyramid network (BiFPN)_Concat to improve the adaptability of the network. Secondly, the simple attention module (SimAM) is embedded after the C2f layer at the end of the backbone network to prioritize key features and reduce the complexity of the model. Next, the focal efficient intersection over union (EIoU) is introduced to adjust the weights of challenging samples. Finally, we accomplish the design and deployment of the mobile app. The results demonstrate improvements, with an F1 score of 0.8987, mean average precision (mAP)@0.5 of 98.8%, mAP@0.5:0.95 of 75.6%, and a detection speed of 50 frames per second (FPS).
Keywords: C2f layer; simple attention module (SimAM); reduce complexity; traffic sign detection; prioritize key features; backbone network; classification
Malicious Document Detection Based on GGE Visualization
4
Authors: Youhe Wang, Yi Sun, Yujie Li, Chuanqi Zhou. Computers, Materials & Continua (SCIE, EI), 2025, No. 1, pp. 1233-1254 (22 pages)
With the development of anti-virus technology, malicious documents have gradually become the main pathway of Advanced Persistent Threat (APT) attacks; therefore, the development of effective malicious document classifiers has become particularly urgent. Currently, detection methods based on document structure and behavioral features encounter challenges in feature engineering: these methods not only have limited accuracy but also consume large resources, and can usually detect only documents in specific formats, so they lack versatility and adaptability. To address such problems, this paper proposes a novel malicious document detection method: visualizing documents as GGE images (Grayscale, Grayscale matrix, Entropy). The GGE method visualizes the original byte sequence of the malicious document as a grayscale image and the information entropy sequence of the document as an entropy image; at the same time, the gray-level co-occurrence matrix and the texture and spatial information stored in it are converted into a grayscale matrix image. The three types of images are then fused to obtain the GGE color image. The Convolutional Block Attention Module-EfficientNet-B0 (CBAM-EfficientNet-B0) model is then used for classification, combining transfer learning and applying the model pre-trained on the ImageNet dataset to the feature extraction process of GGE images. As shown in the experimental results, the GGE method has superior performance compared with other methods and is suitable for detecting malicious documents in different formats. It achieves an accuracy of 99.44% and 97.39% on Portable Document Format (PDF) and office datasets, respectively, and consumes less time during the detection process, so it can be effectively applied to the task of detecting malicious documents in real time.
Keywords: malicious document; visualization; EfficientNet-B0; convolutional block attention module; GGE image
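Two of the three GGE channels described in this entry start from simple byte-level statistics: the raw bytes laid out as pixels, and the Shannon entropy of consecutive byte blocks. The sketch below shows only those two steps under assumed settings. The row width, block size, and function names are illustrative choices, not values from the paper, and the co-occurrence-matrix channel and the image fusion are not shown.

```python
import math
from collections import Counter


def bytes_to_grayscale(data, width=16):
    """Lay out a byte string as rows of `width` pixels with values 0-255.

    The last row is zero-padded. `width` is an illustrative choice.
    """
    padded = data + bytes((-len(data)) % width)
    return [list(padded[i:i + width]) for i in range(0, len(padded), width)]


def block_entropy(data, block_size=256):
    """Shannon entropy in bits per byte (0 to 8) of each consecutive block."""
    entropies = []
    for start in range(0, len(data), block_size):
        block = data[start:start + block_size]
        counts = Counter(block)
        h = -sum((c / len(block)) * math.log2(c / len(block))
                 for c in counts.values())
        entropies.append(h)
    return entropies
```

A block of identical bytes scores 0 bits and a block containing every byte value equally often scores 8 bits, which is why packed or encrypted regions show up as bright bands in an entropy image.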
F-Net: breast cancerous lesion region segmentation based on improved U-Net
5
Authors: DENG Xiangyu, PAN Lihao, DANG Zhiyan. Optoelectronics Letters, 2025, No. 12, pp. 761-768 (8 pages)
In order to solve the challenge of breast cancer region segmentation, we improved the U-Net. The convolutional block attention module with prioritized attention (CBAM-PA) and dilated transformer (Dformer) modules were designed to replace the convolutional layers at the encoding side in the base U-Net, the input logic of the U-Net was improved by dynamically adjusting the input size of each layer, and the short connections in the U-Net were replaced with cross-layer connections to enhance the image restoration capability at the decoding side. On the breast ultrasound images (BUSI) dataset, we obtain a Dice coefficient of 0.8031 and an intersection-over-union (IoU) value of 0.7362. The experimental results show that the proposed enhancement method effectively improves the accuracy and quality of breast cancer lesion region segmentation.
Keywords: input logic; cross-layer connections; short connections; breast cancer; image restoration capability; convolutional block attention module; breast cancer region segmentation; convolutional layers
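The Dice coefficient and IoU reported in this entry are both overlap ratios between a predicted mask and a ground-truth mask. The sketch below computes them for binary masks given as nested lists. The function name and the convention of scoring two empty masks as 1.0 are assumptions made for illustration, not details from the paper.

```python
def dice_and_iou(pred, target):
    """Return (Dice, IoU) for two binary masks given as lists of rows."""
    p = [bool(v) for row in pred for v in row]
    t = [bool(v) for row in target for v in row]
    inter = sum(a and b for a, b in zip(p, t))   # pixels set in both masks
    p_sum, t_sum = sum(p), sum(t)
    union = p_sum + t_sum - inter
    if union == 0:
        # Both masks empty: scored as perfect agreement (a convention).
        return 1.0, 1.0
    dice = 2.0 * inter / (p_sum + t_sum)
    iou = inter / union
    return dice, iou
```

For a single pair of masks the two scores are tied by Dice = 2·IoU / (1 + IoU); dataset-level figures are usually averages over images, so reported values need not satisfy that identity exactly.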
Marine organism classification method based on hierarchical multi-scale attention mechanism
6
Authors: XU Haotian, CHENG Yuanzhi, ZHAO Dong, XIE Peidong. Optoelectronics Letters, 2025, No. 6, pp. 354-361 (8 pages)
We propose a hierarchical multi-scale attention mechanism-based model in response to the low accuracy of existing oceanic biological image classification methods and the inefficiency of manual classification. Firstly, the hierarchical efficient multi-scale attention (H-EMA) module is designed for lightweight feature extraction, achieving outstanding performance at a relatively low cost. Secondly, an improved EfficientNetV2 block is used to better integrate information from different scales and enhance inter-layer message passing. Furthermore, introducing the convolutional block attention module (CBAM) enhances the model's perception of critical features, optimizing its generalization ability. Lastly, Focal Loss is introduced to adjust the weights of complex samples to address the issue of imbalanced categories in the dataset, further improving the model's performance. The model achieved 96.11% accuracy on the intertidal marine organism dataset of Nanji Islands and 84.78% accuracy on the CIFAR-100 dataset, demonstrating its strong generalization ability to meet the demands of oceanic biological image classification.
Keywords: integrate information; different scales; hierarchical multi-scale attention; lightweight feature extraction; focal loss; EfficientNetV2; marine organism classification; oceanic biological image classification methods; convolutional block attention module
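Focal Loss, which this entry uses against class imbalance, scales the cross-entropy of each sample by (1 - p_t)^gamma so that well-classified samples contribute little. The sketch below is the standard single-sample form. The defaults `gamma=2.0` and `alpha=1.0` and the clamping constant are assumptions for illustration, not settings taken from the paper.

```python
import math


def focal_loss(probs, label, gamma=2.0, alpha=1.0, eps=1e-12):
    """Focal loss for one sample.

    `probs` holds the predicted class probabilities and `label` is the
    index of the true class. With gamma=0 and alpha=1 this reduces to
    ordinary cross-entropy.
    """
    p_t = min(max(probs[label], eps), 1.0)    # clamp to avoid log(0)
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)
```

With these assumed defaults, a confident correct prediction (p_t = 0.9) is down-weighted by a factor of 0.01 relative to cross-entropy, while a poor one (p_t = 0.1) keeps 81% of its cross-entropy loss.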
Transmission Facility Detection with Feature-Attention Multi-Scale Robustness Network and Generative Adversarial Network
7
Authors: Yunho Na, Munsu Jeon, Seungmin Joo, Junsoo Kim, Ki-Yong Oh, Min Ku Kim, Joon-Young Park. Computer Modeling in Engineering & Sciences, 2025, No. 7, pp. 1013-1044 (32 pages)
This paper proposes an automated detection framework for transmission facilities using a feature-attention multi-scale robustness network (FAMSR-Net) with high-fidelity virtual images. The proposed framework exhibits three key characteristics. First, virtual images of the transmission facilities generated using StyleGAN2-ADA are co-trained with real images. This enables the neural network to learn various features of transmission facilities to improve the detection performance. Second, the convolutional block attention module is deployed in FAMSR-Net to effectively extract features from images and construct multi-dimensional feature maps, enabling the neural network to perform precise object detection in various environments. Third, an effective bounding box optimization method called Scylla-IoU is deployed on FAMSR-Net, considering the intersection over union, center point distance, angle, and shape of the bounding box. This enables the accurate detection of power facilities of various sizes. Extensive experiments demonstrated that FAMSR-Net outperforms other neural networks in detecting power facilities. FAMSR-Net also achieved the highest detection accuracy when virtual images of the transmission facilities were co-trained in the training phase. The proposed framework is effective for the scheduled operation and maintenance of transmission facilities because an optical camera is currently the most promising tool for unmanned aerial vehicles. This ultimately contributes to improved inspection efficiency, reduced maintenance risks, and more reliable power delivery across extensive transmission facilities.
Keywords: object detection; virtual image; transmission facility; convolutional block attention module; Scylla-IoU
Learned distributed image compression with decoder side information
8
Authors: Yankai Yin, Zhe Sun, Peiying Ruan, Ruidong Li, Feng Duan. Digital Communications and Networks, 2025, No. 2, pp. 349-358 (10 pages)
With the rapid development of digital communication and the widespread use of the Internet of Things, multi-view image compression has attracted increasing attention as a fundamental technology for image data communication. Multi-view image compression aims to improve compression efficiency by leveraging correlations between images. However, the requirement of synchronization and inter-image communication at the encoder side poses significant challenges, especially for constrained devices. In this study, we introduce a novel distributed image compression model based on the attention mechanism to address the challenges associated with the availability of side information only during decoding. Our model integrates an encoder network, a quantization module, and a decoder network to ensure both high compression performance and high-quality image reconstruction. The encoder uses a deep Convolutional Neural Network (CNN) to extract high-level features from the input image, which then pass through the quantization module for further compression before undergoing lossless entropy coding. The decoder of our model consists of three main components that allow us to fully exploit the information within and between images on the decoder side. Specifically, we first introduce a channel-spatial attention module to capture and refine information within individual image feature maps. Second, we employ a semi-coupled convolution module to extract both shared and specific information in images. Finally, a cross-attention module is employed to fuse mutual information extracted from side information. The effectiveness of our model is validated on various datasets, including KITTI Stereo and Cityscapes. The results highlight the superior compression capabilities of our method, surpassing state-of-the-art techniques.
Keywords: digital communication; image compression; side information; channel-spatial attention module; cross-attention module
AG-GCN: Vehicle Re-Identification Based on Attention-Guided Graph Convolutional Network
9
Authors: Ya-Jie Sun, Li-Wei Qiao, Sai Ji. Computers, Materials & Continua, 2025, No. 7, pp. 1769-1785 (17 pages)
Vehicle re-identification involves matching images of vehicles across varying camera views. The diversity of camera locations along different roadways leads to significant intra-class variation and only minimal inter-class similarity in the collected vehicle images, which increases the complexity of re-identification tasks. To tackle these challenges, this study proposes AG-GCN (Attention-Guided Graph Convolutional Network), a novel framework integrating several pivotal components. Initially, AG-GCN embeds a lightweight attention module within the ResNet-50 structure to learn feature weights automatically, thereby improving the representation of vehicle features globally by highlighting salient features and suppressing extraneous ones. Moreover, AG-GCN adopts a graph-based structure to encapsulate deep local features. A graph convolutional network then amalgamates these features to understand the relationships among vehicle-related characteristics. Subsequently, we amalgamate feature maps from both the attention and graph-based branches for a more comprehensive representation of vehicle features. The framework then gauges feature similarities and ranks them, thus enhancing the accuracy of vehicle re-identification. Comprehensive qualitative and quantitative analyses on two publicly available datasets verify the efficacy of AG-GCN in addressing intra-class and inter-class variability issues.
Keywords: vehicle re-identification; lightweight attention module; global features; local features; graph convolution network
Rolling Bearing Fault Diagnosis Based on MTF Encoding and CBAM-LCNN Mechanism
10
Authors: Wei Liu, Sen Liu, Yinchao He, Jiaojiao Wang, Yu Gu. Computers, Materials & Continua, 2025, No. 3, pp. 4863-4880 (18 pages)
To address the issues of slow diagnostic speed, low accuracy, and poor generalization performance in traditional rolling bearing fault diagnosis methods, we propose a rolling bearing fault diagnosis method based on Markov Transition Field (MTF) image encoding combined with a lightweight convolutional neural network that integrates a Convolutional Block Attention Module (CBAM-LCNN). Specifically, we first use the Markov Transition Field to convert the original one-dimensional vibration signals of rolling bearings into two-dimensional images. Then, we construct a lightweight convolutional neural network incorporating the convolutional attention module (CBAM-LCNN). Finally, the two-dimensional images obtained from MTF mapping are fed into the CBAM-LCNN network for image feature extraction and fault diagnosis. We validate the effectiveness of the proposed method on the bearing fault datasets from Guangdong University of Petrochemical Technology’s multi-stage centrifugal fan and Case Western Reserve University. Experimental results show that, compared to other advanced baseline methods, the proposed rolling bearing fault diagnosis method offers faster diagnostic speed and higher diagnostic accuracy. In addition, we conducted experiments on the Xi’an Jiaotong University rolling bearing dataset, achieving excellent results in bearing fault diagnosis. These results validate the strong generalization performance of the proposed method. The method presented in this paper not only effectively diagnoses faults in rolling bearings but also serves as a reference for fault diagnosis in other equipment.
Keywords: rolling bearing fault diagnosis; Markov transition field; lightweight convolutional neural network; convolutional block attention module
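The first step in this entry, turning a one-dimensional vibration signal into a two-dimensional image with a Markov Transition Field, can be sketched as follows under the usual MTF definition: quantize the samples into bins, estimate the bin-to-bin transition probabilities from consecutive samples, and set pixel (i, j) to the probability of moving from the bin of sample i to the bin of sample j. The rank-based binning, the default `n_bins=4`, and the function name are assumptions for illustration and are not taken from the paper.

```python
def markov_transition_field(signal, n_bins=4):
    """Encode a 1-D signal as an n x n Markov Transition Field.

    Samples are assigned to equal-population bins by rank, so tied
    values may fall into neighbouring bins.
    """
    n = len(signal)
    order = sorted(range(n), key=lambda i: signal[i])
    bins = [0] * n
    for rank, idx in enumerate(order):
        bins[idx] = rank * n_bins // n
    # Count transitions between the bins of consecutive samples.
    counts = [[0] * n_bins for _ in range(n_bins)]
    for a, b in zip(bins, bins[1:]):
        counts[a][b] += 1
    # Row-normalise the counts into transition probabilities.
    trans = []
    for row in counts:
        total = sum(row)
        trans.append([c / total if total else 0.0 for c in row])
    # Pixel (i, j): probability of moving from bin(i) to bin(j).
    return [[trans[bins[i]][bins[j]] for j in range(n)] for i in range(n)]
```

The resulting n x n matrix grows quadratically with signal length, so long recordings are usually cut into windows or downsampled before being passed to a CNN.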
Enhanced Cutaneous Melanoma Segmentation in Dermoscopic Images Using a Dual U-Net Framework with Multi-Path Convolution Block Attention Module and SE-Res-Conv
11
Authors: Kun Lan, Feiyang Gao, Xiaoliang Jiang, Jianzhen Cheng, Simon Fong. Computers, Materials & Continua, 2025, No. 9, pp. 4805-4824 (20 pages)
With the continuous development of artificial intelligence and machine learning techniques, effective methods have emerged to support the work of dermatologists in the field of skin cancer detection. However, significant challenges remain in accurately segmenting melanomas in dermoscopic images due to objects that can interfere with human observation, such as bubbles and scales. To address these challenges, we propose a dual U-Net network framework for skin melanoma segmentation. In our proposed architecture, we introduce several innovative components that aim to enhance the performance and capabilities of the traditional U-Net. First, we establish a novel framework that links two simplified U-Nets, enabling more comprehensive information exchange and feature integration throughout the network. Second, after cascading the second U-Net, we introduce a skip connection between the decoder and encoder networks, and incorporate a modified receptive field block (MRFB), which is designed to capture multi-scale spatial information. Third, to further enhance the feature representation capabilities, we add a multi-path convolution block attention module (MCBAM) to the first two layers of the first U-Net encoding, and integrate a new squeeze-and-excitation (SE) mechanism with residual connections in the second U-Net. To illustrate the performance of our proposed model, we conducted comprehensive experiments on widely recognized skin datasets. On the ISIC-2017 dataset, the IoU value of our proposed model increased from 0.6406 to 0.6819 and the Dice coefficient increased from 0.7625 to 0.8023. On the ISIC-2018 dataset, the IoU value of the proposed model also improved from 0.7138 to 0.7709, while the Dice coefficient increased from 0.8285 to 0.8665. Furthermore, the generalization experiments conducted on the jaw cyst dataset from Quzhou People’s Hospital further verified the outstanding segmentation performance of the proposed model. These findings collectively affirm the potential of our approach as a valuable tool in supporting clinical decision-making in the field of skin cancer detection, as well as advancing research in medical image analysis.
Keywords: dual U-Net; skin lesion segmentation; squeeze-and-excitation; modified receptive field block; multi-path convolution block attention module
CGMISeg: Context-Guided Multi-Scale Interactive for Efficient Semantic Segmentation
12
Authors: Ze Wang, Jin Qin, Chuhua Huang, Yongjun Zhang. Computers, Materials & Continua, 2025, No. 9, pp. 5811-5829 (19 pages)
Semantic segmentation has made significant breakthroughs in various application fields, but achieving both accurate and efficient segmentation with limited computational resources remains a major challenge. To this end, we propose CGMISeg, an efficient semantic segmentation architecture based on a context-guided multi-scale interaction strategy, aiming to significantly reduce computational overhead while maintaining segmentation accuracy. CGMISeg consists of three core components: context-aware attention modulation, feature reconstruction, and cross-information fusion. Context-aware attention modulation is carefully designed to capture key contextual information through channel and spatial attention mechanisms. The feature reconstruction module reconstructs contextual information from different scales, modeling key rectangular areas by capturing critical contextual information in both horizontal and vertical directions, thereby enhancing the focus on foreground features. The cross-information fusion module aims to fuse the reconstructed high-level features with the original low-level features during upsampling, promoting multi-scale interaction and enhancing the model’s ability to handle objects at different scales. We extensively evaluated CGMISeg on ADE20K, Cityscapes, and COCO-Stuff, three widely used dataset benchmarks, and the experimental results show that CGMISeg exhibits significant advantages in segmentation performance, computational efficiency, and inference speed, clearly outperforming several mainstream methods, including SegFormer, Feedformer, and SegNeXt. Specifically, CGMISeg achieves 42.9% mIoU (Mean Intersection over Union) and 15.7 FPS (Frames Per Second) on the ADE20K dataset with 3.8 GFLOPs (Giga Floating-point Operations Per Second), outperforming Feedformer and SegNeXt by 3.7% and 1.8% in mIoU, respectively, while also offering reduced computational complexity and faster inference. CGMISeg strikes an excellent balance between accuracy and efficiency, significantly enhancing both computational and inference performance while maintaining high precision, showcasing exceptional practical value and strong potential for widespread applications.
Keywords: semantic segmentation; context-aware attention modulation; feature reconstruction; cross-information fusion
MMIF: Multimodal Medical Image Fusion Network Based on Multi-Scale Hybrid Attention
13
Authors: Jianjun Liu, Yang Li, Xiaoting Sun, Xiaohui Wang, Hanjiang Luo. Computers, Materials & Continua, 2025, No. 11, pp. 3551-3568 (18 pages)
Multimodal image fusion plays an important role in image analysis and applications. Multimodal medical image fusion helps to combine contrast features from two or more input imaging modalities to represent fused information in a single image. One of the critical clinical applications of medical image fusion is to fuse anatomical and functional modalities for rapid diagnosis of malignant tissues. This paper proposes a multimodal medical image fusion network (MMIF-Net) based on multiscale hybrid attention. The method first decomposes the original image to obtain the low-rank and significant parts. Then, to utilize the features at different scales, we add a multiscale mechanism that uses three filters of different sizes to extract the features in the encoded network. Also, a hybrid attention module is introduced to obtain more image details. Finally, the fused images are reconstructed by decoding the network. We conducted experiments with clinical images from brain computed tomography/magnetic resonance. The experimental results show that the multimodal medical image fusion network method based on multiscale hybrid attention works better than other advanced fusion methods.
Keywords: medical image fusion; multiscale mechanism; hybrid attention module; encoded network
Double Self-Attention Based Fully Connected Feature Pyramid Network for Field Crop Pest Detection
14
Authors: Zijun Gao, Zheyi Li, Chunqi Zhang, Ying Wang, Jingwen Su. Computers, Materials & Continua, 2025, No. 6, pp. 4353-4371 (19 pages)
Pest detection techniques are helpful in reducing the frequency and scale of pest outbreaks; however, their application in the actual agricultural production process is still challenging owing to the problems of interspecies similarity, multi-scale, and background complexity of pests. To address these problems, this study proposes an FD-YOLO pest target detection model. The FD-YOLO model uses a Fully Connected Feature Pyramid Network (FC-FPN) instead of a PANet in the neck, which can adaptively fuse multi-scale information so that the model can retain small-scale target features in the deep layer, enhance large-scale target features in the shallow layer, and enhance the multiplexing of effective features. A dual self-attention module (DSA) is then embedded in the C3 module of the neck, which captures the dependencies between the information in both spatial and channel dimensions, effectively enhancing global features. We selected 16 types of pests that widely damage field crops in the IP102 pest dataset, which were used as our dataset after data supplementation and enhancement. The experimental results showed that FD-YOLO’s mAP@0.5 improved by 6.8% compared to YOLOv5, reaching 82.6% and 19.1%–5% better than other state-of-the-art models. This method provides an effective new approach for detecting similar or multiscale pests in field crops.
Keywords: pest detection; YOLOv5; feature pyramid network; transformer; attention module
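Research on Improving the Performance of Image Object Detection Algorithms Based on Attention Mechanisms
15
Author: SU Ziqing (苏子晴). 《中国科技纵横》, 2025, No. 22, pp. 67-69 (3 pages)
To address the insufficient performance of YOLOv5 in small-object detection and in handling complex backgrounds, this paper proposes an improved method that fuses multiple attention mechanisms. By introducing CBAM, Coordinate Attention, and a lightweight spatial attention module into the Backbone, Neck, and Head stages, respectively, a hybrid attention module (HAM) is designed that effectively improves the model's feature representation and localization capabilities. The results show that the improved model outperforms traditional methods on metrics such as mAP and Recall, and performs especially well on small-object detection tasks.
Keywords: YOLOv5; attention mechanism; object detection; CBAM; Hybrid Attention Module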
Super-Resolution Generative Adversarial Network with Pyramid Attention Module for Face Generation
16
Authors: Parvathaneni Naga Srinivasu, G. JayaLakshmi, Sujatha Canavoy Narahari, Victor Hugo C. de Albuquerque, Muhammad Attique Khan, Hee-Chan Cho, Byoungchol Chang. Computers, Materials & Continua, 2025, No. 10, pp. 2117-2139 (23 pages)
The generation of high-quality, realistic faces has emerged as a key field of research in computer vision. This paper proposes a robust approach that combines a Super-Resolution Generative Adversarial Network (SRGAN) with a Pyramid Attention Module (PAM) to enhance the quality of deep face generation. The SRGAN framework is designed to improve the resolution of generated images, addressing common challenges such as blurriness and a lack of intricate details. The Pyramid Attention Module further complements the process by focusing on multi-scale feature extraction, enabling the network to capture finer details and complex facial features more effectively. The proposed method was trained and evaluated over 100 epochs on the CelebA dataset, demonstrating consistent improvements in image quality and a marked decrease in generator and discriminator losses, reflecting the model’s capacity to learn and synthesize high-quality images effectively, given adequate computational resources. Experimental outcomes demonstrate that the SRGAN model with the PAM module outperformed the compared models, yielding an aggregate discriminator loss of 0.055 for real, 0.043 for fake, and a generator loss of 10.58 after training for 100 epochs. The model yielded a structural similarity index measure of 0.923, outperforming the other models considered in the current study.
Keywords: artificial intelligence; generative adversarial network; pyramid attention module; face generation; deep learning
Data augmentation method for light guide plate based on improved CycleGAN
17
作者 GONG Yefei YAN Chao +2 位作者 XIAO Ming LU Mingli GAO Hua 《Optoelectronics Letters》 2025年第9期555-561,共7页
An improved cycle-consistent generative adversarial network(CycleGAN) method for defect data augmentation based on feature fusion and self attention residual module is proposed to address the insufficiency of defect s... An improved cycle-consistent generative adversarial network(CycleGAN) method for defect data augmentation based on feature fusion and self attention residual module is proposed to address the insufficiency of defect sample data for light guide plate(LGP) in production,as well as the problem of minor defects.Two optimizations are made to the generator of CycleGAN:fusion of low resolution features obtained from partial up-sampling and down-sampling with high-resolution features,combination of self attention mechanism with residual network structure to replace the original residual module.Qualitative and quantitative experiments were conducted to compare different data augmentation methods,and the results show that the defect images of the LGP generated by the improved network were more realistic,and the accuracy of the you only look once version 5(YOLOv5) detection network for the LGP was improved by 5.6%,proving the effectiveness and accuracy of the proposed method. 展开更多
Keywords: feature fusion; self attention mec; data augmentation; light guide plate; LGP; CycleGAN; fusion; low resolution features; defect data augmentation; self attention residual module; minor defects
CSC-YOLO:An Image Recognition Model for Surface Defect Detection of Copper Strip and Plates
18
Authors: ZHANG Guo, CHEN Tao, WANG Jianping. Journal of Shanghai Jiaotong University (Science), 2025, No. 5, pp. 1037-1049 (13 pages)
To meet the requirement of accurately identifying surface defects on copper strip in industrial production, a machine-vision-based surface defect detection model, CSC-YOLO, is proposed. The model uses YOLOv4-tiny as the benchmark network. First, K-means clustering is introduced into the benchmark network to obtain anchor boxes that match the self-built dataset. Second, a cross-region fusion module is introduced in the backbone network to address difficult target recognition by fusing contextual semantic information. Third, the spatial pyramid pooling-efficient channel attention network (SPP-E) module is introduced in the path aggregation network (PANet) to enhance feature extraction. Fourth, to prevent the loss of channel information, a lightweight attention mechanism is introduced to improve the performance of the network. Finally, the performance of the model is improved by adding adjustment factors that correct the loss function for the dimensional characteristics of the surface defects. CSC-YOLO was tested on the self-built dataset of copper strip surface defects. The experimental results show that the mAP of the model reaches 93.58%, a 3.37% improvement over the benchmark network, and that the FPS, although lower than that of the benchmark network, reaches 104, so CSC-YOLO satisfies the real-time requirements of copper strip production. Comparison experiments with Faster RCNN, SSD300, YOLOv3, YOLOv4, Resnet50-YOLOv4, YOLOv5s, YOLOv7, and other algorithms show that the algorithm achieves a faster computation speed while maintaining higher detection accuracy.
Keywords: copper strip; surface defect detection; K-means clustering; cross-region fusion module; spatial pyramid pooling-efficient channel attention network (SPP-E) module; YOLOv4-tiny
SqueezeNet Fine-Grained Image Classification Model Based on Attention Feature Fusion (Cited by: 8)
19
Authors: 李明悦, 何乐生, 雷晨, 龚友梅. Journal of Yunnan University (Natural Sciences Edition) (《云南大学学报(自然科学版)》), indexed in CAS, CSCD, PKU Core, 2021, No. 5, pp. 868-876 (9 pages)
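To address the complex model structures, large parameter counts, and low classification accuracy common to existing fine-grained image classification algorithms, a SqueezeNet fine-grained image classification model with attention feature fusion is proposed. Based on an analysis of existing fine-grained image classification algorithms and lightweight convolutional neural networks, three typical pre-trained lightweight convolutional neural networks were first fine-tuned and validated on public fine-grained image datasets; after comparison, SqueezeNet, which performed best, was selected as the image feature extractor. Two convolutional modules with attention mechanisms were then embedded into each Fire module of the SqueezeNet network. Next, intermediate-layer features of the improved SqueezeNet were extracted and bilinearly fused to form new attention feature maps, which were fused again with the network's global features before classification. Finally, experimental comparison and visualization analysis show that embedding the Convolution Block Attention Module (CBAM) raised classification accuracy on the bird, car, and aircraft datasets by 8.96%, 4.89%, and 5.85%, respectively, and that embedding the Squeeze-and-Excitation (SE) module raised it by 9.81%, 4.52%, and 2.30%, respectively. The new model also has advantages over existing algorithms in parameter count and running efficiency.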
Keywords: fine-grained image classification; lightweight convolutional neural network; SqueezeNet; attention mechanism; Convolution Block Attention Module (CBAM); Squeeze-and-Excitation (SE); feature fusion
ANC: Attention Network for COVID-19 Explainable Diagnosis Based on Convolutional Block Attention Module (Cited by: 10)
20
Authors: Yudong Zhang, Xin Zhang, Weiguo Zhu. Computer Modeling in Engineering & Sciences, indexed in SCIE, EI, 2021, No. 6, pp. 1037-1058 (22 pages)
Aim: To diagnose COVID-19 more efficiently and more correctly, this study proposed a novel attention network for COVID-19 (ANC). Methods: Two datasets were used in this study. An 18-way data augmentation was proposed to avoid overfitting. Then, the convolutional block attention module (CBAM) was integrated into our model, the structure of which was fine-tuned. Finally, Grad-CAM was used to provide an explainable diagnosis. Results: The accuracies of our ANC method on the two datasets are 96.32% ± 1.06% and 96.00% ± 1.03%, respectively. Conclusions: The proposed ANC method is superior to 9 state-of-the-art approaches.
Keywords: deep learning; convolutional block attention module; attention mechanism; COVID-19; explainable diagnosis