24 articles found
M2ANet: Multi-branch and multi-scale attention network for medical image segmentation (Cited by: 1)
1
Authors: Wei Xue, Chuanghui Chen, Xuan Qi, Jian Qin, Zhen Tang, Yongsheng He 《Chinese Physics B》 2025, No. 8, pp. 547-559 (13 pages)
Convolutional neural network (CNN)-based technologies have been widely used in medical image segmentation because of their strong representation and generalization abilities. However, because CNNs cannot effectively capture global information from images, they can easily lose contours and textures in segmentation results. The transformer model can effectively capture long-range dependencies in an image, and combining a CNN with a transformer can extract both local details and global contextual features. Motivated by this, we propose a multi-branch and multi-scale attention network (M2ANet) for medical image segmentation, whose architecture consists of three components. In the first component, we construct an adaptive multi-branch patch module for parallel extraction of image features to reduce the information loss caused by downsampling. In the second component, we apply a residual block to the well-known convolutional block attention module to enhance the network’s ability to recognize important image features and to alleviate gradient vanishing. In the third component, we design a multi-scale feature fusion module, in which we adopt adaptive average pooling and position encoding to enhance contextual features, and then introduce multi-head attention to further enrich the feature representation. Finally, we validate the effectiveness and feasibility of the proposed M2ANet through comparative experiments on four benchmark medical image segmentation datasets, particularly with respect to preserving contours and textures.
Keywords: medical image segmentation; convolutional neural network; multi-branch attention; multi-scale feature fusion
EHDC-YOLO: Enhancing Object Detection for UAV Imagery via Multi-Scale Edge and Detail Capture
2
Authors: Zhiyong Deng, Yanchen Ye, Jiangling Guo 《Computers, Materials & Continua》 2026, No. 1, pp. 1665-1682 (18 pages)
With the rapid expansion of drone applications, accurate detection of objects in aerial imagery has become crucial for intelligent transportation, urban management, and emergency rescue missions. However, existing methods face numerous challenges in practical deployment, including scale variation handling, feature degradation, and complex backgrounds. To address these issues, we propose Edge-enhanced and Detail-Capturing You Only Look Once (EHDC-YOLO), a novel framework for object detection in Unmanned Aerial Vehicle (UAV) imagery. Based on the You Only Look Once version 11 nano (YOLOv11n) baseline, EHDC-YOLO systematically introduces several architectural enhancements: (1) a Multi-Scale Edge Enhancement (MSEE) module that leverages multi-scale pooling and edge information to enhance boundary feature extraction; (2) an Enhanced Feature Pyramid Network (EFPN) that integrates P2-level features with Cross Stage Partial (CSP) structures and OmniKernel convolutions for better fine-grained representation; and (3) Dynamic Head (DyHead) with multi-dimensional attention mechanisms for enhanced cross-scale modeling and perspective adaptability. Comprehensive experiments on the Vision meets Drones for Detection (VisDrone-DET) 2019 dataset demonstrate that EHDC-YOLO achieves significant improvements, increasing mean Average Precision (mAP)@0.5 from 33.2% to 46.1% (an absolute improvement of 12.9 percentage points) and mAP@0.5:0.95 from 19.5% to 28.0% (an absolute improvement of 8.5 percentage points) compared with the YOLOv11n baseline, while maintaining a reasonable parameter count (2.81 M vs. the baseline’s 2.58 M). Further ablation studies confirm the effectiveness of each proposed component, while visualization results highlight EHDC-YOLO’s superior performance in detecting objects and handling occlusions in complex drone scenarios.
Keywords: UAV imagery; object detection; multi-scale feature fusion; edge enhancement; detail preservation; YOLO; feature pyramid network; attention mechanism
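The percentage-point figures quoted in the EHDC-YOLO abstract can be reproduced with simple arithmetic. The sketch below uses only the numbers given in the abstract; the helper names `absolute_gain` and `relative_gain` are hypothetical, and the relative changes it computes are derived here for illustration, not reported by the authors.

```python
def absolute_gain(baseline_pct, new_pct):
    """Difference between two percentages, in percentage points."""
    return new_pct - baseline_pct


def relative_gain(baseline, new):
    """Change of `new` relative to `baseline`, in percent."""
    return (new - baseline) / baseline * 100.0


# Values quoted in the abstract (YOLOv11n baseline vs. EHDC-YOLO).
map50_abs = absolute_gain(33.2, 46.1)      # mAP@0.5, percentage points
map5095_abs = absolute_gain(19.5, 28.0)    # mAP@0.5:0.95, percentage points
map50_rel = relative_gain(33.2, 46.1)      # derived here, not in the abstract
param_growth = relative_gain(2.58, 2.81)   # parameter count in millions

print(f"mAP@0.5 gain: {map50_abs:.1f} points ({map50_rel:.1f}% relative)")
print(f"mAP@0.5:0.95 gain: {map5095_abs:.1f} points")
print(f"Parameter growth: {param_growth:.1f}%")
```

Keeping percentage points and relative percentages apart avoids a common misreading of accuracy tables.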
MSFResNet: A ResNeXt50 model based on multi-scale feature fusion for wild mushroom identification
3
Authors: YANG Yang, JU Tao, YANG Wenjie, ZHAO Yuyang 《Journal of Measurement Science and Instrumentation》 2025, No. 1, pp. 66-74 (9 pages)
To solve the problems of redundant feature information, insignificant differences in feature representation, and low recognition accuracy for fine-grained images, an MSFResNet network model based on ResNeXt50 is proposed that fuses multi-scale feature information. First, a multi-scale feature extraction module is designed to obtain multi-scale information from feature maps by using convolution kernels of different scales. Meanwhile, a channel attention mechanism is used to increase the network’s acquisition of global information. Second, the feature maps processed by the multi-scale feature extraction module are fused with the deep feature maps through shortcut connections to guide the full learning of the network, thus reducing the loss of texture details in deep feature maps and improving the network’s generalization ability and recognition accuracy. Finally, the validity of the MSFResNet model is verified on public datasets, and the model is applied to wild mushroom identification. Experimental results show that, compared with the ResNeXt50 model, the accuracy of MSFResNet is improved by 6.01% on the public FGVC-Aircraft dataset. It achieves 99.13% classification accuracy on the wild mushroom dataset, which is 0.47% higher than ResNeXt50. Furthermore, heat map results show that MSFResNet significantly reduces the interference of background information, making the network focus on the main body of the wild mushroom, which effectively improves the accuracy of wild mushroom identification.
Keywords: multi-scale feature fusion; attention mechanism; ResNeXt50; wild mushroom identification; deep learning
AMSFuse: Adaptive Multi-Scale Feature Fusion Network for Diabetic Retinopathy Classification
4
Authors: Chengzhang Zhu, Ahmed Alasri, Tao Xu, Yalong Xiao, Abdulrahman Noman, Raeed Alsabri, Xuanchu Duan, Monir Abdullah 《Computers, Materials & Continua》 2025, No. 3, pp. 5153-5167 (15 pages)
Diabetic retinopathy (DR) is the primary cause of blindness, affecting millions of people worldwide. This widespread impact underscores the critical need for reliable and precise diagnostic techniques to ensure prompt diagnosis and effective treatment. Deep learning-based automated diagnosis of diabetic retinopathy can facilitate early detection and treatment. However, traditional deep learning models that focus on local views often learn feature representations that are less discriminative at the semantic level. On the other hand, models that focus on global semantic-level information might overlook critical, subtle local pathological features. To address this issue, we propose an adaptive multi-scale feature fusion network called AMSFuse, which can adaptively combine multi-scale global and local features without compromising their individual representations. Specifically, our model incorporates global features for extracting high-level contextual information from retinal images. Concurrently, local features capture fine-grained details, such as microaneurysms, hemorrhages, and exudates, which are critical for DR diagnosis. These global and local features are adaptively fused using a fusion block, followed by an Integrated Attention Mechanism (IAM) that refines the fused features by emphasizing relevant regions, thereby enhancing DR classification accuracy. Our model achieves 86.3% accuracy on the APTOS dataset and 96.6% on RFMiD, both of which are comparable to state-of-the-art methods.
Keywords: Diabetic retinopathy; multi-scale feature fusion; global features; local features; integrated attention mechanism; retinal images
Lightweight Human Pose Estimation Based on Multi-Attention Mechanism
5
Authors: LIN Xiao, LU Meichen, GAO Mufeng, LI Yan 《Journal of Shanghai Jiaotong University (Science)》 2025, No. 5, pp. 899-910 (12 pages)
Human pose estimation has received much attention from the research community because of its wide range of applications. However, current pose estimation networks are usually complex and computationally intensive, and they suffer from feature loss during feature fusion. To address these problems, we propose a lightweight human pose estimation network based on a multi-attention mechanism (LMANet). In our method, the number of network parameters is significantly reduced by lightweighting the bottleneck blocks of the high-resolution network with depth-wise separable convolutions. We also introduce a multi-attention mechanism to improve prediction accuracy: a channel attention module is added in the initial stage of the network to enhance local cross-channel information interaction. More importantly, we inject a spatial cross-awareness module into the multi-scale feature fusion stage to reduce the loss of spatial information during feature extraction. Extensive experiments on the COCO2017 and MPII datasets show that LMANet delivers higher prediction accuracy with fewer network parameters and less computation. Compared with the high-resolution network HRNet, the number of parameters and the computational complexity are reduced by 67% and 73%, respectively.
Keywords: human pose estimation; attention mechanisms; multi-scale feature fusion; high-resolution networks
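To illustrate why depth-wise separable convolution shrinks a network, the sketch below counts the weights of a single layer. It is a generic calculation under stated assumptions (square kernels, no bias terms, hypothetical channel sizes of 64); it does not reproduce LMANet and says nothing about the network-level 67% reduction quoted in the abstract.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes chosen only for illustration.
c_in, c_out, k = 64, 64, 3
std = standard_conv_params(c_in, c_out, k)
dws = depthwise_separable_params(c_in, c_out, k)
ratio = dws / std

print(f"standard: {std} weights, separable: {dws} weights, ratio: {ratio:.3f}")
```

The ratio equals 1/c_out + 1/k², so for 3×3 kernels the per-layer saving approaches a factor of nine as the number of output channels grows.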
Attention Guided Multi Scale Feature Fusion Network for Automatic Prostate Segmentation
6
Authors: Yuchun Li, Mengxing Huang, Yu Zhang, Zhiming Bai 《Computers, Materials & Continua》 SCIE EI 2024, No. 2, pp. 1649-1668 (20 pages)
The precise and automatic segmentation of prostate magnetic resonance imaging (MRI) images is vital for assisting doctors in diagnosing prostate diseases. In recent years, many advanced methods have been applied to prostate segmentation, but the variability caused by prostate diseases still makes automatic segmentation challenging. In this paper, we propose an attention-guided multi-scale feature fusion network (AGMSF-Net) to segment prostate MRI images. We propose an attention mechanism for extracting multi-scale features and introduce a 3D transformer module, placed at the transition from encoder to decoder, to enhance global feature representation. In the decoder stage, a feature fusion module is proposed to obtain global context information. We evaluate our model on prostate MRI images acquired from a local hospital. The relative volume difference (RVD) and Dice similarity coefficient (DSC) between the automatic segmentation results and the ground truth were 1.21% and 93.68%, respectively. To quantitatively evaluate prostate volume on MRI, which is of great clinical significance, we propose AGMSF-Net. The performance evaluation and validation experiments demonstrate the effectiveness of our method for automatic prostate segmentation.
Keywords: Prostate segmentation; multi-scale attention; 3D Transformer; feature fusion; MRI
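The two metrics reported in the prostate segmentation abstract can be sketched on toy binary masks. The abstract does not give formulas, so the definitions here are assumptions: DSC as 2|A∩B| / (|A| + |B|) and RVD as the absolute volume difference divided by the ground-truth volume. The function names and the eight-voxel masks are hypothetical.

```python
def dice_coefficient(pred, truth):
    """Dice similarity coefficient between two binary masks (flat 0/1 sequences)."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")
    overlap = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Two empty masks are treated as a perfect match by convention.
    return 1.0 if total == 0 else 2.0 * overlap / total


def relative_volume_difference(pred, truth):
    """Absolute volume difference as a percentage of the ground-truth volume."""
    v_pred = sum(1 for p in pred if p)
    v_truth = sum(1 for t in truth if t)
    if v_truth == 0:
        raise ValueError("ground-truth mask is empty")
    return abs(v_pred - v_truth) / v_truth * 100.0


# Toy masks: 5 predicted voxels, 4 true voxels, 3 shared.
pred_mask = [0, 1, 1, 1, 1, 0, 1, 0]
true_mask = [0, 1, 1, 0, 0, 0, 1, 1]
dsc = dice_coefficient(pred_mask, true_mask)
rvd = relative_volume_difference(pred_mask, true_mask)

print(f"DSC = {dsc:.4f}, RVD = {rvd:.1f}%")
```

Note that RVD only compares volumes, so a low RVD can coexist with poor overlap; this is why it is usually reported together with DSC.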
A hierarchical framework for cervical cell classification using attention-based multi-scale local binary convolutional neural networks
7
Authors: Tao Wan, Lei Cao, Yulan Jin, Dong Chen, Zengchang Qin 《Medicine in Novel Technology and Devices》 2025, No. 3, pp. 213-228 (16 pages)
Traditional classification methods for cervical cells rely heavily on manual feature extraction, which constrains their versatility given the intricacies of cytology images. Although deep learning approaches offer remarkable potential, they often sacrifice domain-specific knowledge during automated feature extraction, particularly the morphological patterns that characterize various cell subtypes. To bridge this gap, we introduce a novel hierarchical framework that integrates robust color, texture, and morphology features with latent representations discovered by an improved attention-based multi-scale local binary convolutional neural network (MS-LBCNN), designed to provide a powerful feature extraction mechanism. We enhance the standard 6-class Bethesda system (TBS) classification by incorporating a coarse-to-refine fusion strategy, which optimizes the classification process. The proposed method is equipped to manage the complexities present in both individual and clustered cell images. In a rigorous evaluation across three independent data cohorts, our method consistently surpassed existing state-of-the-art techniques. The experimental results indicate the potential of our method to advance automation-aided diagnostic systems and to improve both the accuracy and efficiency of cytology screening procedures.
Keywords: Cervical cell classification; multi-scale local binary convolutional neural networks; attention mechanism; The Bethesda system; Feature fusion
Image Inpainting Technique Incorporating Edge Prior and Attention Mechanism (Cited by: 1)
8
Authors: Jinxian Bai, Yao Fan, Zhiwei Zhao, Lizhi Zheng 《Computers, Materials & Continua》 SCIE EI 2024, No. 1, pp. 999-1025 (27 pages)
Recently, deep learning-based image inpainting methods have made great strides in reconstructing damaged regions. However, these methods often struggle to produce satisfactory results on images with large holes, leading to structural distortions and blurred textures. To address these problems, we combine the advantages of transformers and convolutions to propose an image inpainting method that incorporates edge priors and attention mechanisms. The proposed method aims to improve the inpainting of large holes by enhancing the accuracy of structure restoration and the ability to recover texture details. The method divides the inpainting task into two phases: edge prediction and image inpainting. In the edge prediction phase, a transformer architecture is designed that combines axial attention with standard self-attention. This design enhances the extraction of global structural features and location awareness. It also balances the complexity of self-attention operations, resulting in accurate prediction of the edge structure in the defective region. In the image inpainting phase, a multi-scale fusion attention module is introduced. This module makes full use of multi-level distant features and enhances local pixel continuity, thereby significantly improving the quality of image inpainting. To evaluate the performance of our method, comparative experiments are conducted on several datasets, including CelebA, Places2, and Facade. Quantitative experiments show that our method outperforms other mainstream methods. Specifically, it improves Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index Measure (SSIM) by 1.141 to 3.234 dB and 0.083 to 0.235, respectively. Moreover, it reduces Learned Perceptual Image Patch Similarity (LPIPS) and Mean Absolute Error (MAE) by 0.0347 to 0.1753 and 0.0104 to 0.0402, respectively. Qualitative experiments reveal that our method excels at reconstructing images with complete structural information and clear texture details. Furthermore, our model exhibits impressive performance in terms of the number of parameters, memory cost, and testing time.
Keywords: Image inpainting; Transformer; edge prior; axial attention; multi-scale fusion attention
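For readers unfamiliar with the PSNR figures quoted in the inpainting abstract, the sketch below computes PSNR from the mean squared error and shows how a gain in dB maps to a reduction in MSE. It assumes 8-bit pixel values (peak 255) and uses a made-up four-pixel example; the function names are hypothetical and the code is not taken from the paper.

```python
import math


def mean_squared_error(reference, restored):
    """Mean squared error between two equally long pixel sequences."""
    if not reference or len(reference) != len(restored):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((r - s) ** 2 for r, s in zip(reference, restored)) / len(reference)


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical inputs."""
    mse = mean_squared_error(reference, restored)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """Factor by which MSE shrinks for a PSNR gain of gain_db (same peak value)."""
    return 10.0 ** (gain_db / 10.0)


# Made-up pixel values for illustration only.
reference = [52, 55, 61, 59]
restored = [54, 55, 60, 57]
value = psnr(reference, restored)

print(f"PSNR = {value:.2f} dB")
print(f"A 3 dB gain divides MSE by about {mse_ratio_for_gain(3.0):.2f}")
```

Because the scale is logarithmic, a gain of a few dB corresponds to a substantial multiplicative drop in squared error.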
A Lightweight Multiscale Feature Fusion Network for Solar Cell Defect Detection
9
Authors: Xiaoyun Chen, Lanyao Zhang, Xiaoling Chen, Yigang Cen, Linna Zhang, Fugui Zhang 《Computers, Materials & Continua》 SCIE EI 2025, No. 1, pp. 521-542 (22 pages)
Solar cell defect detection is crucial for quality inspection of photovoltaic power generation modules. In the production process, defects occur infrequently and exhibit random shapes and sizes, which makes it challenging to collect defective samples. Additionally, the complex surface background of polysilicon cell wafers complicates the accurate identification and localization of defective regions. This paper proposes a novel Lightweight Multiscale Feature Fusion network (LMFF) to address these challenges. The network comprises a feature extraction network, a multi-scale feature fusion module (MFF), and a segmentation network. Specifically, the feature extraction network produces multi-scale feature outputs, and the MFF module fuses the multi-scale feature information effectively. To capture finer-grained multi-scale information from the fused features, we propose a multi-scale attention module (MSA) in the segmentation network to enhance the network’s ability to detect small targets. Moreover, depthwise separable convolutions are used to construct depthwise separable residual blocks (DSR) to reduce the number of model parameters. Finally, to validate the defect segmentation and localization performance of the proposed method, we constructed three solar cell defect detection datasets: SolarCells, SolarCells-S, and PVEL-S. SolarCells and SolarCells-S are monocrystalline silicon datasets, and PVEL-S is a polycrystalline silicon dataset. Experimental results show that the IoU of our method on these three datasets reaches 68.5%, 51.0%, and 92.7%, respectively, and the F1-score reaches 81.3%, 67.5%, and 96.2%, respectively, which surpasses other commonly used methods and verifies the effectiveness of our LMFF network.
Keywords: Defect segmentation; multi-scale feature fusion; multi-scale attention; depthwise separable residual block
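IoU and F1-score are tied by a simple identity when both are computed from the same true-positive, false-positive, and false-negative counts: F1 = 2·IoU / (1 + IoU). The sketch below applies that identity to the IoU values quoted in the abstract. How the authors aggregated their metrics is not stated, so this is only a consistency check under that assumption, and the helper names are hypothetical.

```python
def f1_from_iou(iou):
    """F1 (Dice) implied by an IoU value, both as fractions in [0, 1]."""
    return 2.0 * iou / (1.0 + iou)


def iou_from_f1(f1):
    """Inverse mapping: IoU implied by an F1 value."""
    return f1 / (2.0 - f1)


# (IoU %, F1 %) pairs as quoted in the abstract.
reported = {
    "SolarCells": (68.5, 81.3),
    "SolarCells-S": (51.0, 67.5),
    "PVEL-S": (92.7, 96.2),
}

implied = {}
for name, (iou_pct, f1_pct) in reported.items():
    implied[name] = 100.0 * f1_from_iou(iou_pct / 100.0)
    print(f"{name}: reported F1 {f1_pct:.1f}%, implied by IoU {implied[name]:.2f}%")
```

Under this assumption the implied and reported F1 values agree to within about 0.1 percentage point on all three datasets.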
DDFNet: real-time salient object detection with dual-branch decoding fusion for steel plate surface defects
10
Authors: Tao Wang, Wang-zhe Du, Xu-wei Li, Hua-xin Liu, Yuan-ming Liu, Xiao-miao Niu, Ya-xing Liu, Tao Wang 《Journal of Iron and Steel Research International》 2025, No. 8, pp. 2421-2433 (13 pages)
A novel dual-branch decoding fusion convolutional neural network model (DDFNet), specifically designed for real-time salient object detection (SOD) on steel surfaces, is proposed. DDFNet is based on a standard encoder-decoder architecture and integrates three key innovations. First, we introduce a novel, lightweight multi-scale progressive aggregation residual network that effectively suppresses background interference and refines defect details, enabling efficient salient feature extraction. Second, we propose an innovative dual-branch decoding fusion structure, comprising a refined defect representation branch and an enhanced defect representation branch, which enhances accuracy in defect region identification and feature representation. Third, to further improve the detection of small and complex defects, we incorporate a multi-scale attention fusion module. Experimental results on the public ESDIs-SOD dataset show that DDFNet, with only 3.69 million parameters, achieves detection performance comparable to current state-of-the-art models, demonstrating its potential for real-time industrial applications. Furthermore, our DDFNet-L variant consistently outperforms leading methods in detection performance. The code is available at https://github.com/13140W/DDFNet.
Keywords: Steel plate surface defect; Real-time detection; Salient object detection; Dual-branch decoder; multi-scale attention fusion; multi-scale residual fusion
Multi-Label Image Classification Model Based on Multiscale Fusion and Adaptive Label Correlation
11
Authors: YE Jihua, JIANG Lu, XIAO Shunjie, ZONG Yi, JIANG Aiwen 《Journal of Shanghai Jiaotong University (Science)》 2025, No. 5, pp. 889-898 (10 pages)
At present, research on multi-label image classification mainly focuses on exploring the correlation between labels to improve classification accuracy. However, in existing methods, label correlation is calculated from the statistical information of the data. Such label correlation is global and dataset-dependent, and is not suitable for all samples. In addition, during image feature extraction, the feature information of small objects is easily lost, resulting in low classification accuracy for small objects. To this end, this paper proposes a multi-label image classification model based on multiscale fusion and adaptive label correlation. The main idea is as follows. First, feature maps of multiple scales are fused to enhance the feature information of small objects. Then, semantic guidance decomposes the fused feature map into feature vectors for each category, and the correlation between categories in the image is adaptively mined through the self-attention mechanism of a graph attention network, yielding feature vectors that contain category-related information for the final classification. The mean average precision of the model on the two public datasets VOC 2007 and MS COCO 2014 reaches 95.6% and 83.6%, respectively, and most of the indicators are better than those of the latest existing methods.
Keywords: image classification; label correlation; graph attention network; small object; multi-scale fusion
MA2Net: Multi-Scale Adaptive Mixed Attention Network for Image Demoiréing
12
Authors: Ji-Wei Wang, Li-Yong Shen, Hao-Nan Zhao 《Computational Visual Media》 2025, No. 3, pp. 619-634 (16 pages)
Image demoiréing is a complex image-restoration task because of the color and shape variations of moiré patterns. With the development of mobile devices, mobile phones can now be used to capture images at multiple resolutions. The difficulty increases when attempting to remove moiré from both low- and high-resolution images, as different resolutions make it challenging for existing methods to match the scales and textures of moiré. To solve these problems, we built a mixed attention residual module (MARM) by combining multi-scale feature extraction and mixed attention methods. Based on MARM, we propose a multi-scale adaptive mixed attention network (MA2Net) that can adapt to input images of different sizes and remove moiré of various shapes. Our model achieved the best results on four public datasets with resolutions ranging from 256×256 to 4K. Extensive experiments demonstrated the effectiveness of our model, which outperformed state-of-the-art methods by a large margin. We also conducted experiments on image deraining to validate the effectiveness of our model in other image-restoration tasks, and MA2Net achieved state-of-the-art performance on the Rain200H dataset.
Keywords: image demoiréing; mixed attention; multi-scale fusion; deep learning
An improved multiscale fusion dense network with efficient multiscale attention mechanism for apple leaf disease identification (Cited by: 1)
13
Authors: Dandan DAI, Hui LIU 《Frontiers of Agricultural Science and Engineering》 2025, No. 2, pp. 173-189 (17 pages)
With the development of smart agriculture, accurately identifying crop diseases through visual recognition techniques instead of by eye has been a significant challenge. This study focused on apple leaf disease, which is closely related to the final yield of apples. A multiscale fusion dense network combined with an efficient multiscale attention (EMA) mechanism, called Incept_EMA_DenseNet, was developed to better identify images of eight complex apple leaf diseases. Incept_EMA_DenseNet consists of three crucial parts: the inception module, which replaces the convolution layer with multiscale fusion methods in the shallow feature extraction layer; the EMA mechanism, which is used to obtain appropriate weights for different dense blocks; and the improved DenseNet based on DenseNet_121. First, to find appropriate multiscale fusion methods, the residual module and inception module were compared to determine the performance of each technique, and Incept_EMA_DenseNet achieved an accuracy of 95.38%. Second, this work used three attention mechanisms, and the efficient multiscale attention mechanism obtained the best performance. Third, the convolution layers and bottlenecks were modified without performance degradation, halving the computational load compared with the original models. The proposed Incept_EMA_DenseNet has an accuracy of 96.76%, which is 2.93%, 3.44%, and 4.16% better than ResNet50, DenseNet_121, and GoogLeNet, respectively. It proved to be reliable and beneficial, and can effectively and conveniently assist apple growers with leaf disease identification in the field.
Keywords: Incept_EMA_DenseNet; multi-scale fusion module; efficient multiscale attention mechanism; apple leaf disease identification
Improved YOLOv8 network using multi-scale feature fusion for detecting small tea shoots in complex environments
14
Authors: Yatao Li, Liuhuan Tan, Zhenghao Zhong, Leiying He, Jianneng Chen, Chuanyu Wu, Zhengmin Wu 《International Journal of Agricultural and Biological Engineering》 2025, No. 5, pp. 223-233 (11 pages)
Tea shoot segmentation is crucial for the automation of high-quality tea plucking. However, accurate segmentation of tea shoots in unstructured and complex environments presents significant challenges due to the small size of the targets and the similarity in color between the shoots and their background. To address these challenges and achieve accurate recognition of tea shoots in complex settings, an advanced tea shoot segmentation network model is proposed based on the You Only Look Once version 8 segmentation (YOLOv8-seg) network model. First, to enhance the model’s segmentation capability for small targets, this study designed a feature fusion network that incorporates shallow, large-scale features extracted by the backbone network. Subsequently, the features extracted at different scales by the backbone network are fused to obtain both global and local features, thereby enhancing the overall representation capability of the features. Furthermore, the Efficient Channel Attention mechanism was integrated into the feature fusion process and combined with a reparameterization technique to refine the fusion process and improve its efficiency. Finally, Wise-IoU with a dynamic non-monotonic aggregation mechanism was employed to assign varying gradient gains to anchor boxes of differing qualities. Experimental results demonstrate that the improved network model increases the AP50 of box and mask by 4.33% and 4.55%, respectively, while maintaining a smaller parameter count and reduced computational demand. Compared with other classical segmentation models, the proposed model excels in tea shoot segmentation. Overall, the advancements proposed in this study effectively segment tea shoots in complex environments, offering significant theoretical and practical contributions to the automated plucking of high-quality tea.
Keywords: tea shoot segmentation; multi-scale fusion; attention mechanism; reparameterization technique; YOLOv8-seg
Dual-Channel Noise Suppression Network and Its Application in Shadow Removal
15
Authors: Huang Pu, Su Chang, Yang Zhangjing, Yang Guowei 《小型微型计算机系统》 (Journal of Chinese Computer Systems) PKU Core 2025, No. 10, pp. 2431-2439 (9 pages)
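To address the loss of information caused by shadows covering parts of an image, a Transformer-based shadow removal method, the dual-channel noise suppression network (DNSNet), is proposed. Building on the Transformer, the method integrates a global dual-channel attention module that combines channel attention and spatial attention mechanisms to capture comprehensive global context, thereby achieving precise shadow removal and markedly improving the clarity and accuracy of shadow regions. In the shadow processing stage, DNSNet further introduces a noise suppression attention aggregation module that selectively highlights key features, effectively improving the processing of shadow regions. Experimental results on the ISTD, ISTD+, and SRD datasets show that DNSNet performs well on shadow removal compared with existing methods: it not only effectively reduces the impact of shadows on image quality but also preserves key image details and natural textures.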
Keywords: shadow removal; Transformer; multi-scale hybrid attention framework; noise suppression attention aggregation module; global dual-channel
CE-CDNet: A Transformer-Based Channel Optimization Approach for Change Detection in Remote Sensing
16
Authors: Jia Liu, Hang Gu, Fangmei Liu, Hao Chen, Zuhe Li, Gang Xu, Qidong Liu, Wei Wang 《Computers, Materials & Continua》 2025, No. 4, pp. 803-822 (20 pages)
In recent years, convolutional neural networks (CNNs) and Transformer architectures have made significant progress in the field of remote sensing (RS) change detection (CD). Most existing methods directly stack multiple layers of Transformer blocks, which achieves considerable improvement in capturing variations, but at a rather high computational cost. We propose a Channel-Efficient Change Detection Network (CE-CDNet) to address the problems of high computational cost and imbalanced detection accuracy in remote sensing building change detection. An adaptive multi-scale feature fusion module (CAMSF) and a lightweight Transformer decoder (LTD) are introduced to improve change detection. The CAMSF module adaptively fuses multi-scale features to improve the model’s ability to detect building changes in complex scenes. In addition, the LTD module reduces computational cost while maintaining high detection accuracy through an optimized self-attention mechanism and a dimensionality reduction operation. Experimental results on three commonly used remote sensing building change detection datasets show that CE-CDNet reduces computational overhead while maintaining detection accuracy comparable to existing mainstream models.
Keywords: Remote sensing; change detection; attention mechanism; channel optimization; multi-scale feature fusion
Grasp Detection with Hierarchical Multi-Scale Feature Fusion and Inverted Shuffle Residual
17
Authors: Wenjie Geng, Zhiqiang Cao, Peiyu Guan, Fengshui Jing, Min Tan, Junzhi Yu. Tsinghua Science and Technology (SCIE, EI, CAS, CSCD), 2024, No. 1, pp. 244-256 (13 pages)
Grasp detection plays a critical role in robot manipulation. Mainstream pixel-wise grasp detection networks with an encoder-decoder structure receive much attention due to their good accuracy and efficiency. However, they usually transmit only the high-level feature in the encoder to the decoder, and low-level features are neglected. Low-level features contain abundant detail information, and how to fully exploit them remains unsolved. Meanwhile, the channel information in the high-level feature is also not well mined. As a result, the performance of grasp detection is degraded. To solve these problems, we propose a grasp detection network with hierarchical multi-scale feature fusion and inverted shuffle residual. Both low-level and high-level features in the encoder are first fused by the designed skip connections with an attention module, and the fused information is then propagated to the corresponding layers of the decoder for in-depth feature fusion. Such hierarchical fusion supports the quality of grasp prediction. Furthermore, an inverted shuffle residual module is created, in which the high-level feature from the encoder is split along the channel dimension and the resulting split features are processed in their respective branches. With this differentiated processing, more high-dimensional channel information is kept, which enhances the representation ability of the network. In addition, an information enhancement module is added before the encoder to reinforce the input information. The proposed method attains 98.9% and 97.8% in image-wise and object-wise accuracy on the Cornell grasping dataset, respectively, and the experimental results verify the effectiveness of the method.
Keywords: grasp detection; hierarchical multi-scale feature fusion; skip connections with attention; inverted shuffle residual
Learning multi-scale attention network for fine-grained visual classification
18
Authors: Peipei Zhao, Siyan Yang, Wei Ding, Ruyi Liu, Wentian Xin, Xiangzeng Liu, Qiguang Miao. Journal of Information and Intelligence, 2025, No. 6, pp. 492-503 (12 pages)
Fine-grained visual classification (FGVC) is a very challenging task because it requires distinguishing subcategories under the same super-category. Recent works mainly localize discriminative image regions and capture subtle inter-class differences using attention-based methods. However, at the same layer, most attention-based works only consider large-scale attention blocks with the same size as the feature maps, and they ignore small-scale attention blocks that are smaller than the feature maps. To distinguish subcategories, it is important to exploit small local regions. In this work, a novel multi-scale attention network (MSANet) is proposed to capture large and small regions at the same layer in fine-grained visual classification. Specifically, a novel multi-scale attention layer (MSAL) is proposed, which generates multiple groups in each feature map to capture discriminative regions at different scales. The groups based on large-scale regions exploit global features, and the groups based on small-scale regions extract subtle local features. Then, a simple feature fusion strategy is used to integrate global features and subtle local features and mine information that is more conducive to FGVC. Comprehensive experiments on the Caltech-UCSD Birds-200-2011 (CUB), FGVC-Aircraft (AIR), and Stanford Cars (Cars) datasets show that the method achieves competitive performance, which demonstrates its effectiveness.
Keywords: Fine-grained visual classification; multi-scale attention network; multi-scale attention module; Feature fusion strategy
Text Matching Model Fusing Multi-Angle Features (cited by: 2)
19
Authors: 李广, 刘新, 马中昊, 黄浩钰, 张远明. 《计算机系统应用》 (Computer Systems & Applications), 2022, No. 7, pp. 158-164 (7 pages)
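Text matching is a core research area of natural language processing. Deep text matching models can be broadly divided into representation-based and interaction-based types. Representation-based models tend to lose semantic focus and struggle to measure the contextual importance of words, while interaction-based models lack global information such as sentence structure and inter-sentence relations. To address these problems, a text matching model that fuses multi-angle features is proposed. The model uses a Siamese network as its basic architecture. It generates word vectors with BERT and fuses word similarity to strengthen semantic features, and it encodes the sentence-structure features of the text with a Bi-LSTM, that is, it fuses sentence-structure information from the text's part-of-speech sequence. A Transformer encoder then performs multi-level interaction between the sentence-structure features and the text features, and finally the vectors are concatenated to infer the similarity between the two texts. Experiments on a subset of the Quora dataset show that the model outperforms classic deep matching models.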
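Keywords: text matching; sentence structure; Transformer framework; Siamese network; Bi-LSTM; feature fusion; attention mechanism; natural language processing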
Tea Leaf Disease Diagnosis Based on Improved Lightweight U-Net3+
20
Authors: HU Yumeng, GUAN Feifan, XIE Dongchen, MA Ping, YU Youben, ZHOU Jie, NIE Yanming, HUANG Lüwen. 《智慧农业(中英文)》 (Smart Agriculture), 2026, No. 1, pp. 15-27 (13 pages)
[Objective] Leaf diseases significantly affect both the yield and quality of tea throughout the year. To address the inadequate segmentation fineness of current tea spot segmentation models, a novel model for diagnosing the severity of tea spots, designated MDC-U-Net3+, was proposed in this research to enhance segmentation accuracy on the base framework of U-Net3+. [Methods] A multi-scale feature fusion module (MSFFM) was incorporated into the backbone network of U-Net3+ to obtain feature information across multiple receptive fields of diseased spots, thereby reducing the loss of features within the encoder. Dual multi-scale attention (DMSA) was incorporated into the skip connection process to mitigate segmentation boundary ambiguity. This integration facilitates the comprehensive fusion of fine-grained and coarse-grained semantic information at full scale. Furthermore, conditional random fields (CRF) were applied to the segmented mask image to further refine the segmentation results. [Results and Discussions] The improved model MDC-U-Net3+ achieved a mean pixel accuracy (mPA) of 94.92% and a mean Intersection over Union (mIoU) of 90.9%. Compared with U-Net3+, MDC-U-Net3+ improved mPA and mIoU by 1.85 and 2.12 percentage points, respectively. These results showed more effective segmentation performance than that achieved by other classical semantic segmentation models. [Conclusions] The methodology presented herein could provide data support for automated disease detection and precise medication, consequently reducing the losses associated with tea diseases.
Keywords: disease diagnosis; semantic segmentation; U-Net3+; multi-scale feature fusion; attention mechanism; conditional random fields
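Note on the metrics: the abstract above reports mean pixel accuracy (mPA) and mean Intersection over Union (mIoU) without defining them. The sketch below shows how these two metrics are commonly computed from a confusion matrix. It assumes the usual definitions (per-class recall averaged for mPA, per-class TP / (TP + FP + FN) averaged for mIoU); the paper's exact evaluation protocol is not given in the listing, and the function names are illustrative only.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are ground-truth classes, columns are predictions."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def mean_pixel_accuracy(cm):
    """Average over classes of (correct pixels of class c) / (all pixels of class c)."""
    per_class = []
    for c, row in enumerate(cm):
        total = sum(row)
        if total:  # skip classes absent from the ground truth
            per_class.append(row[c] / total)
    return sum(per_class) / len(per_class)


def mean_iou(cm):
    """Average over classes of TP / (TP + FP + FN)."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp
        fp = sum(cm[r][c] for r in range(n)) - tp
        denom = tp + fp + fn
        if denom:  # skip classes that never appear in labels or predictions
            ious.append(tp / denom)
    return sum(ious) / len(ious)


# Toy example with flattened label maps (hypothetical data, two classes).
y_true = [0, 0, 1, 1]
y_pred = [0, 1, 1, 1]
cm = confusion_matrix(y_true, y_pred, num_classes=2)
print(mean_pixel_accuracy(cm), mean_iou(cm))
```

In the toy example one of the two class-0 pixels is mislabeled, so class 0 has pixel accuracy 1/2 and IoU 1/2, while class 1 has pixel accuracy 1 and IoU 2/3. mIoU penalizes the false positive on class 1, which mPA does not, and this is why the two figures reported for a model usually differ.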