Journal articles
40,718 articles found
SIM-Net: A Multi-Scale Attention-Guided Deep Learning Framework for High-Precision PCB Defect Detection
1
Authors: Ping Fang, Mengjun Tong. Source: Computers, Materials & Continua, 2026, Issue 4, pp. 1754-1770 (17 pages).
Defect detection in printed circuit boards (PCB) remains challenging due to the difficulty of identifying small-scale defects, the inefficiency of conventional approaches, and the interference from complex backgrounds. To address these issues, this paper proposes SIM-Net, an enhanced detection framework derived from YOLOv11. The model integrates SPDConv to preserve fine-grained features for small object detection, introduces a novel convolutional partial attention module (C2PAM) to suppress redundant background information and highlight salient regions, and employs a multi-scale fusion network (MFN) with a multi-grain contextual module (MGCT) to strengthen contextual representation and accelerate inference. Experimental evaluations demonstrate that SIM-Net achieves 92.4% mAP, 92% accuracy, and 89.4% recall with an inference speed of 75.1 FPS, outperforming existing state-of-the-art methods. These results confirm the robustness and real-time applicability of SIM-Net for PCB defect inspection.
Keywords: deep learning; small object detection; PCB defect detection; attention mechanism; multi-scale fusion network
M2ANet: Multi-branch and multi-scale attention network for medical image segmentation (Cited by: 1)
2
Authors: Wei Xue, Chuanghui Chen, Xuan Qi, Jian Qin, Zhen Tang, Yongsheng He. Source: Chinese Physics B, 2025, Issue 8, pp. 547-559 (13 pages).
Convolutional neural network (CNN)-based technologies have been widely used in medical image segmentation because of their strong representation and generalization abilities. However, because CNNs cannot effectively capture global information from images, they can easily lose contours and textures in segmentation results. Note that the transformer model can effectively capture long-range dependencies in the image, and furthermore, combining the CNN and the transformer can effectively extract local details and global contextual features of the image. Motivated by this, we propose a multi-branch and multi-scale attention network (M2ANet) for medical image segmentation, whose architecture consists of three components. Specifically, in the first component, we construct an adaptive multi-branch patch module for parallel extraction of image features to reduce information loss caused by downsampling. In the second component, we apply a residual block to the well-known convolutional block attention module to enhance the network’s ability to recognize important features of images and alleviate the phenomenon of gradient vanishing. In the third component, we design a multi-scale feature fusion module, in which we adopt adaptive average pooling and position encoding to enhance contextual features, and then multi-head attention is introduced to further enrich feature representation. Finally, we validate the effectiveness and feasibility of the proposed M2ANet method through comparative experiments on four benchmark medical image segmentation datasets, particularly in the context of preserving contours and textures.
Keywords: medical image segmentation; convolutional neural network; multi-branch attention; multi-scale feature fusion
MA-VoxelMorph: Multi-scale attention-based VoxelMorph for nonrigid registration of thoracoabdominal CT images
3
Authors: Qing Huang, Lei Ren, Tingwei Quan, Minglei Yang, Hongmei Yuan, Kai Cao. Source: Journal of Innovative Optical Health Sciences, 2025, Issue 1, pp. 135-151 (17 pages).
This paper aims to develop a nonrigid registration method of preoperative and intraoperative thoracoabdominal CT images in computer-assisted interventional surgeries for accurate tumor localization and tissue visualization enhancement. However, fine structure registration of complex thoracoabdominal organs and large deformation registration caused by respiratory motion are challenging. To deal with this problem, we propose a 3D multi-scale attention VoxelMorph (MA-VoxelMorph) registration network. To alleviate the large deformation problem, a multi-scale axial attention mechanism is utilized, using a residual dilated pyramid pooling for multi-scale feature extraction and position-aware axial attention to capture long-distance dependencies between pixels. To further improve the large deformation and fine structure registration results, a multi-scale context channel attention mechanism is employed, utilizing content information via adjacent encoding layers. Our method was evaluated on four public lung datasets (DIR-Lab dataset, Creatis dataset, Learn2Reg dataset, OASIS dataset) and a local dataset. Results proved that the proposed method achieved better registration performance than current state-of-the-art methods, especially in handling the registration of large deformations and fine structures. It also proved fast in 3D image registration, taking about 1.5 s, and was faster than most methods. Qualitative and quantitative assessments proved that the proposed MA-VoxelMorph has the potential to realize precise and fast tumor localization in clinical interventional surgeries.
Keywords: thoracoabdominal CT image registration; large deformation; fine structure; multi-scale attention mechanism
Marine organism classification method based on hierarchical multi-scale attention mechanism
4
Authors: XU Haotian, CHENG Yuanzhi, ZHAO Dong, XIE Peidong. Source: Optoelectronics Letters, 2025, Issue 6, pp. 354-361 (8 pages).
We propose a hierarchical multi-scale attention mechanism-based model in response to the low accuracy and inefficient manual classification of existing oceanic biological image classification methods. Firstly, the hierarchical efficient multi-scale attention (H-EMA) module is designed for lightweight feature extraction, achieving outstanding performance at a relatively low cost. Secondly, an improved EfficientNetV2 block is used to better integrate information from different scales and enhance inter-layer message passing. Furthermore, introducing the convolutional block attention module (CBAM) enhances the model's perception of critical features, optimizing its generalization ability. Lastly, Focal Loss is introduced to adjust the weights of complex samples to address the issue of imbalanced categories in the dataset, further improving the model's performance. The model achieved 96.11% accuracy on the intertidal marine organism dataset of Nanji Islands and 84.78% accuracy on the CIFAR-100 dataset, demonstrating its strong generalization ability to meet the demands of oceanic biological image classification.
Keywords: integrate information; different scales; hierarchical multi-scale attention; lightweight feature extraction; focal loss; EfficientNetV2; marine organism classification; oceanic biological image classification methods; convolutional block attention module
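The abstract above mentions Focal Loss as the remedy for class imbalance but gives no formula or hyperparameters. As a rough illustration only, the sketch below uses the commonly cited form FL(p_t) = -alpha * (1 - p_t)^gamma * log(p_t). The function names and the default values of `gamma` and `alpha` are assumptions made for this example and are not taken from the paper.

```python
import math


def cross_entropy(p_true_class: float) -> float:
    """Plain cross-entropy for one sample, given the probability of its true class."""
    p = min(max(p_true_class, 1e-12), 1.0)  # clamp to avoid log(0)
    return -math.log(p)


def focal_loss(p_true_class: float, gamma: float = 2.0, alpha: float = 1.0) -> float:
    """Focal loss for one sample (gamma and alpha are illustrative defaults).

    The factor (1 - p)^gamma shrinks the loss of well-classified samples,
    so hard or rare samples dominate the total.
    """
    p = min(max(p_true_class, 1e-12), 1.0)
    return -alpha * (1.0 - p) ** gamma * math.log(p)


if __name__ == "__main__":
    for p in (0.9, 0.5, 0.1):
        print(f"p={p}: CE={cross_entropy(p):.4f}  FL={focal_loss(p):.4f}")
```

With `gamma = 0` and `alpha = 1` the expression reduces to ordinary cross-entropy, which is a convenient sanity check when tuning the two parameters.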
Multi-Scale Attention-Based Deep Neural Network for Brain Disease Diagnosis (Cited by: 1)
5
Authors: Yin Liang, Gaoxu Xu, Sadaqat ur Rehman. Source: Computers, Materials & Continua (SCIE, EI), 2022, Issue 9, pp. 4645-4661 (17 pages).
Whole-brain functional connectivity (FC) patterns obtained from resting-state functional magnetic resonance imaging (rs-fMRI) have been widely used in the diagnosis of brain disorders such as autism spectrum disorder (ASD). Recently, an increasing number of studies have focused on employing deep learning techniques to analyze FC patterns for brain disease classification. However, the high dimensionality of the FC features and the interpretation of deep learning results are issues that need to be addressed in FC-based brain disease classification. In this paper, we propose a multi-scale attention-based deep neural network (MSA-DNN) model to classify FC patterns for ASD diagnosis. The model was implemented by adding a flexible multi-scale attention (MSA) module to the auto-encoder based backbone DNN, which can extract multi-scale features of the FC patterns and change the level of attention for different FCs by continuous learning. Our model reinforces the weights of important FC features while suppressing the unimportant FCs to ensure the sparsity of the model weights and enhance the model interpretability. We performed systematic experiments on a large multi-site ASD dataset with both ten-fold and leave-one-site-out cross-validations. Results showed that our model outperformed classical methods in brain disease classification and revealed robust inter-site prediction performance. We also localized important FC features and brain regions associated with ASD classification. Overall, our study further promotes biomarker detection and computer-aided classification for ASD diagnosis, and the proposed MSA module is flexible and easy to implement in other classification networks.
Keywords: autism spectrum disorder diagnosis; resting-state fMRI; deep neural network; functional connectivity; multi-scale attention module
Multi-scale attention encoder for street-to-aerial image geo-localization (Cited by: 4)
6
Authors: Songlian Li, Zhigang Tu, Yujin Chen, Tan Yu. Source: CAAI Transactions on Intelligence Technology (SCIE, EI), 2023, Issue 1, pp. 166-176 (11 pages).
The goal of street-to-aerial cross-view image geo-localization is to determine the location of the query street-view image by retrieving the aerial-view image from the same place. The drastic viewpoint and appearance gap between the aerial-view and the street-view images poses a major challenge for this task. In this paper, we propose a novel multiscale attention encoder to capture the multiscale contextual information of the aerial/street-view images. To bridge the domain gap between these two view images, we first use an inverse polar transform to make the street-view images approximately aligned with the aerial-view images. Then, the explored multiscale attention encoder is applied to convert the image into a feature representation with the guidance of the learnt multiscale information. Finally, we propose a novel global mining strategy to enable the network to pay more attention to hard negative exemplars. Experiments on standard benchmark datasets show that our approach obtains an 81.39% top-1 recall rate on the CVUSA dataset and 71.52% on the CVACT dataset, achieving state-of-the-art performance and outperforming most of the existing methods significantly.
Keywords: global mining strategy; image geo-localization; multiscale attention encoder; street-to-aerial cross-view
MSADCN: Multi-Scale Attentional Densely Connected Network for Automated Bone Age Assessment (Cited by: 1)
7
Authors: Yanjun Yu, Lei Yu, Huiqi Wang, Haodong Zheng, Yi Deng. Source: Computers, Materials & Continua (SCIE, EI), 2024, Issue 2, pp. 2225-2243 (19 pages).
Bone age assessment (BAA) helps doctors determine how a child’s bones grow and develop in clinical medicine. Traditional BAA methods rely on clinician expertise, leading to time-consuming predictions and inaccurate results. Most deep learning-based BAA methods feed the extracted critical points of images into the network by providing additional annotations. This operation is costly and subjective. To address these problems, we propose a multi-scale attentional densely connected network (MSADCN) in this paper. MSADCN constructs a multi-scale dense connectivity mechanism, which can avoid overfitting, obtain the local features effectively, and prevent gradient vanishing even with limited training data. First, MSADCN designs multi-scale structures in the densely connected network to extract fine-grained features at different scales. Then, coordinate attention is embedded to focus on critical features and automatically locate the regions of interest (ROI) without additional annotation. In addition, to improve the model’s generalization, transfer learning is applied to train the proposed MSADCN on the public dataset IMDB-WIKI, and the obtained pre-trained weights are loaded onto the Radiological Society of North America (RSNA) dataset. Finally, label distribution learning (LDL) and expectation regression techniques are introduced into our model to exploit the correlation between hand bone images of different ages, which can obtain stable age estimates. Extensive experiments confirm that our model can converge more efficiently and obtain a mean absolute error (MAE) of 4.64 months, outperforming some state-of-the-art BAA methods.
Keywords: bone age assessment; deep learning; attentional densely connected network; multi-scale
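The abstract above names label distribution learning (LDL) and expectation regression without giving their formulas. The sketch below shows one common reading of those two ideas: the scalar age label is softened into a Gaussian distribution over age bins, and the predicted age is the probability-weighted mean of the bins. The bin layout, the `sigma` value, and all function names are assumptions made for illustration and do not come from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def gaussian_label_distribution(true_age, age_bins, sigma=2.0):
    """Soft target: a normalized Gaussian centered on the true age (sigma is assumed)."""
    weights = [math.exp(-((a - true_age) ** 2) / (2.0 * sigma ** 2)) for a in age_bins]
    total = sum(weights)
    return [w / total for w in weights]


def expected_age(logits, age_bins):
    """Expectation regression: probability-weighted mean of the age bins."""
    probs = softmax(logits)
    return sum(p * a for p, a in zip(probs, age_bins))


def mean_absolute_error(predictions, targets):
    """MAE in the same unit as the age bins (for example, months)."""
    return sum(abs(p - t) for p, t in zip(predictions, targets)) / len(targets)


if __name__ == "__main__":
    bins = list(range(0, 229))  # hypothetical monthly bins
    target = gaussian_label_distribution(120, bins)
    print(max(range(len(bins)), key=lambda i: target[i]))
    print(expected_age([0.0, 0.0, 0.0], [0, 1, 2]))
```

Taking the expectation instead of the arg-max bin yields a continuous estimate, which is one plausible reason such a scheme gives more stable age predictions.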
YOLOv12-enhanced: multi-scale attention and edge information fusion for industrial valve nozzle detection
8
Authors: Bo Liu, Jian Zhang. Source: Advances in Engineering Innovation, 2026, Issue 3, pp. 80-91 (12 pages).
Accurate valve nozzle detection is an important component of industrial visual inspection systems; however, structural complexity, scale variation, illumination fluctuation, and partial occlusion remain challenging factors that affect detection stability. This study presents YOLOv12-Enhanced, a refined single-stage detection framework developed for industrial valve nozzle scenarios. The proposed approach incorporates three architectural modifications: a RepViT backbone to enhance hierarchical feature representation through structural re-parameterization and global-local modeling, a Spatial Pyramid Pooling Fast (SPPF) module combined with C2PSA attention to strengthen multi-scale contextual feature extraction, and a Global Edge Information Fusion (GEIF) module to integrate shallow edge information with deep semantic features for improved boundary alignment. Experimental evaluation on the Pascal Visual Object Classes (VOC) dataset shows that the proposed model achieves 71.0% mAP50 and 54.4% mAP50-95 under identical training conditions, exceeding the baseline YOLOv12n. Ablation experiments further demonstrate that each module contributes incremental performance gains. Evaluation on a self-constructed valve nozzle dataset consisting of 500 real industrial images indicates stable detection behavior under varying illumination and partial occlusion conditions. The experimental findings suggest that the proposed structural refinements provide a balanced enhancement in feature representation and localization precision while maintaining comparable computational complexity.
Keywords: YOLOv12-enhanced; valve nozzle detection; multi-scale attention; edge information fusion; industrial inspection
YOLO-SPDNet: Multi-Scale Sequence and Attention-Based Tomato Leaf Disease Detection Model
9
Authors: Meng Wang, Jinghan Cai, Wenzheng Liu, Xue Yang, Jingjing Zhang, Qiangmin Zhou, Fanzhen Wang, Hang Zhang, Tonghai Liu. Source: Phyton-International Journal of Experimental Botany, 2026, Issue 1, pp. 290-308 (19 pages).
Tomato is a major economic crop worldwide, and diseases on tomato leaves can significantly reduce both yield and quality. Traditional manual inspection is inefficient and highly subjective, making it difficult to meet the requirements of early disease identification in complex natural environments. To address this issue, this study proposes an improved YOLO11-based model, YOLO-SPDNet (Scale Sequence Fusion, Position-Channel Attention, and Dual Enhancement Network). The model integrates the SEAM (Self-Ensembling Attention Mechanism) semantic enhancement module, the MLCA (Mixed Local Channel Attention) lightweight attention mechanism, and the SPA (Scale-Position-Detail Awareness) module composed of SSFF (Scale Sequence Feature Fusion), TFE (Triple Feature Encoding), and CPAM (Channel and Position Attention Mechanism). These enhancements strengthen fine-grained lesion detection while keeping the model lightweight. Experimental results show that YOLO-SPDNet achieves an accuracy of 91.8%, a recall of 86.5%, and an mAP@0.5 of 90.6% on the test set, with a computational complexity of 12.5 GFLOPs. Furthermore, the model reaches a real-time inference speed of 987 FPS, making it suitable for deployment on mobile agricultural terminals and online monitoring systems. Comparative analysis and ablation studies further validate the reliability and practical applicability of the proposed model in complex natural scenes.
Keywords: tomato disease detection; YOLO; multi-scale feature fusion; attention mechanism; lightweight model
M2ATNet: Multi-Scale Multi-Attention Denoising and Feature Fusion Transformer for Low-Light Image Enhancement
10
Authors: Zhongliang Wei, Jianlong An, Chang Su. Source: Computers, Materials & Continua, 2026, Issue 1, pp. 1819-1838 (20 pages).
Images taken in dim environments frequently exhibit issues like insufficient brightness, noise, color shifts, and loss of detail. These problems pose significant challenges to dark image enhancement tasks. Current approaches, while effective in global illumination modeling, often struggle to simultaneously suppress noise and preserve structural details, especially under heterogeneous lighting. Furthermore, misalignment between luminance and color channels introduces additional challenges to accurate enhancement. In response to these difficulties, we introduce a single-stage framework, M2ATNet, using multi-scale multi-attention and a Transformer architecture. First, to address the problems of texture blurring and residual noise, we design a multi-scale multi-attention denoising module (MMAD), which is applied separately to the luminance and color channels to enhance the structural and texture modeling capabilities. Second, to solve the non-alignment problem of the luminance and color channels, we introduce the multi-channel feature fusion Transformer (CFFT) module, which effectively recovers the dark details and corrects the color shifts through cross-channel alignment and deep feature interaction. To guide the model to learn more stably and efficiently, we also fuse multiple types of loss functions to form a hybrid loss term. We extensively evaluate the proposed method on various standard datasets, including LOL-v1, LOL-v2, DICM, LIME, and NPE. Evaluations in terms of numerical metrics and visual quality demonstrate that M2ATNet consistently outperforms existing advanced approaches. Ablation studies further confirm the critical roles played by the MMAD and CFFT modules in detail preservation and visual fidelity under challenging illumination-deficient environments.
Keywords: low-light image enhancement; multi-scale multi-attention; Transformer
Learning multi-scale attention network for fine-grained visual classification
11
Authors: Peipei Zhao, Siyan Yang, Wei Ding, Ruyi Liu, Wentian Xin, Xiangzeng Liu, Qiguang Miao. Source: Journal of Information and Intelligence, 2025, Issue 6, pp. 492-503 (12 pages).
Fine-grained visual classification (FGVC) is a very challenging task because it requires distinguishing subcategories under the same super-category. Recent works mainly localize discriminative image regions and capture subtle inter-class differences by utilizing attention-based methods. However, at the same layer, most attention-based works only consider large-scale attention blocks with the same size as feature maps, and they ignore small-scale attention blocks that are smaller than feature maps. To distinguish subcategories, it is important to exploit small local regions. In this work, a novel multi-scale attention network (MSANet) is proposed to capture large and small regions at the same layer in fine-grained visual classification. Specifically, a novel multi-scale attention layer (MSAL) is proposed, which generates multiple groups in each feature map to capture different-scale discriminative regions. The groups based on large-scale regions can exploit global features, and the groups based on small-scale regions can extract local subtle features. Then, a simple feature fusion strategy is utilized to fully integrate global features and local subtle features to mine information that is more conducive to FGVC. Comprehensive experiments on the Caltech-UCSD Birds-200-2011 (CUB), FGVC-Aircraft (AIR), and Stanford Cars (Cars) datasets show that our method achieves competitive performance, which demonstrates its effectiveness.
Keywords: fine-grained visual classification; multi-scale attention network; multi-scale attention module; feature fusion strategy
Seasonal-spatial distribution and prevailing wind analysis of Martian dust devils in Amazonis Planitia using a multi-scale attention network
12
Authors: Linlin SHI, Gan LIU, Jialong LAI, Xu MA, Feifei CUI, Xiaoping ZHANG, Yi XU, Qiquan YANG. Source: Science China Earth Sciences, 2026, Issue 3, pp. 1032-1053 (22 pages).
The detection of dust devils on Mars poses significant challenges, primarily due to the substantial variability in target scales, the susceptibility of small-scale features to loss or distortion during feature extraction and fusion, and the interference from complex Martian backgrounds. To tackle these issues, we propose the Dynamic Triplet Fusion Attentive Net (DTFA-Net), a framework tailored for Martian dust devil detection. Within DTFA-Net, we design a Multi-Dimensional Dynamic Feature Pyramid Network (MDFPN), which is based on the Bi-directional Feature Pyramid Network (BiFPN) and enhances multi-level feature fusion by incorporating shallow-layer features and employing a cross-scale connection strategy. Additionally, we propose three innovative lightweight and plug-and-play modules: the Local-Channel Cross-Stage Module (LCCS) to boost feature diversity, the Progressive Feature Enhancement Module (PFE) to increase the focus on critical features, and the Triplet-Aware Cross-Stage Module (TACS) for capturing interactions across spatial and channel dimensions. Furthermore, the framework incorporates the Dynamic Head (DyHead), which uses multi-dimensional attention mechanisms to dynamically adjust to various scales, spatial positions, and detection challenges. Experimental results show that DTFA-Net achieved a detection Precision of 94.3%, a Recall of 92.8%, and mAP50 of 96.6% on the Amazonis Planitia dust devil dataset. Its overall performance significantly surpasses that of existing mainstream methods, while also demonstrating strong generalization capability on cross-regional datasets. Beyond detection, this framework was further applied to analyze the seasonal activity and spatial distribution patterns of Martian dust devils in Amazonis Planitia, and the prevailing winds of each season were examined to explore the mechanisms underlying the formation of activity hotspots. In addition, the model was extended to the ten core candidate landing sites of the Tianwen-3 mission to systematically assess dust devil distribution across these regions. Based on the detection results and previous studies, we suggest that four sites (Kasei Valles, Oxia Planum, McLaughlin Crater, and Mawrth Vallis) offer a more balanced trade-off among scientific value, dust-cleaning, and engineering safety, making them relatively ideal landing sites for the Tianwen-3 mission. Overall, this study provides important insights into the spatiotemporal distribution, activity patterns, and potential environmental risks of Martian dust devils, thereby offering valuable references for Mars exploration missions. Furthermore, it provides guidance for the optimized design and safe operation of spacecraft, and contributes scientific support for the planning and implementation of future Mars exploration endeavors.
Keywords: deep learning; feature fusion; object detection; dust devils; attention mechanisms
Distributed Multi-scale Attention and Predictor-based Control for AC Microgrids with Time Delays and Cyber Failures
13
Authors: Yutong Li, Ningxuan Guo, Lili Wang, Jian Hou, Yinan Wang, Gangfeng Yan. Source: Journal of Modern Power Systems and Clean Energy, 2025, Issue 5, pp. 1800-1812 (13 pages).
Distributed secondary control has been proposed to maintain frequency/voltage synchronization and power sharing for distributed energy sources in AC microgrids (MGs). The cyber layer is susceptible to time delays and cyber failures, and thus a distributed resilient secondary control should be investigated. This paper proposes a distributed multi-scale attention and predictor-based control (DMAPC) strategy to address false data injection attacks and packet loss failures with time delays. The multi-scale attention mechanism enables the system to selectively focus on neighbors' states with higher confidence evaluated in different time scales, while the data-driven predictor compensates for lost neighbors' states in the nonlinear controller. The DMAPC does not impose strict limitations on the number of false communication links or an upper bound on false data. Besides, the DMAPC is formulated as an uncertain system with time delays and is proven to be uniformly ultimately bounded. Extensive experiments on a hardware-in-the-loop MG testbed have validated the effectiveness of DMAPC, which successfully relaxes restrictions on cyber failures compared to existing strategies.
Keywords: microgrid (MG); secondary control; time delay; cyber failure; packet loss (PL); false data injection (FDI) attack; denial-of-service (DoS) attack; distributed multi-scale attention; predictor
DDSR-Net: Direct Document Shadow Removal Leveraging Multi-scale Attention
14
Authors: Bingshu Wang, Ze Wang, Wenjie Liu, Xiaoshui Huang, C.L.Philip Chen, Yue Zhao. Source: Machine Intelligence Research, 2025, Issue 3, pp. 452-465 (14 pages).
Shadows in document images are undesirable yet inevitable. They can decrease the clarity and readability of the images. The existing methods for removing shadows from documents still face some challenges: for example, traditional heuristics lack universality, and the optimization goals of the subnetworks in multistage deep learning methods are not consistent. In this paper, we introduce an end-to-end direct document shadow removal network (DDSR-Net), where we employ a 3-layer UNet++ as the backbone to extract features from diverse scales. To further improve the performance of DDSR-Net, we integrate multi-scale attention (MSA) blocks into each node. The MSA block allocates different weights to feature vectors at different positions, achieving automatic feature alignment and significantly enhancing the end-to-end network's ability to handle shadow processing. To verify the effectiveness of the proposed DDSR-Net, qualitative and quantitative experiments are conducted on multiple open-source document shadow removal datasets. The experimental results demonstrate that our method outperforms the existing state-of-the-art methods on these datasets. Our code and models will be released to the public.
Keywords: deep learning; end-to-end; multi-scale attention; document shadow removal; U-Net
Lightweight Underwater Target Detection Using YOLOv8 with Multi-Scale Cross-Channel Attention
15
Authors: Xueyan Ding, Xiyu Chen, Jiaxin Wang, Jianxin Zhang. Source: Computers, Materials & Continua (SCIE, EI), 2025, Issue 1, pp. 713-727 (15 pages).
Underwater target detection is extensively applied in domains such as underwater search and rescue, environmental monitoring, and marine resource surveys. It is crucial in enabling autonomous underwater robot operations and promoting ocean exploration. Nevertheless, low imaging quality, harsh underwater environments, and obscured objects considerably increase the difficulty of detecting underwater targets, making it difficult for current detection methods to achieve optimal performance. In order to enhance underwater object perception and improve target detection precision, we propose a lightweight underwater target detection method using You Only Look Once (YOLO) v8 with multi-scale cross-channel attention (MSCCA), named YOLOv8-UOD. In the proposed multi-scale cross-channel attention module, multi-scale attention (MSA) augments the variety of attentional perception by extracting information from innately diverse sensory fields. The cross-channel strategy utilizes RepVGG-based channel shuffling (RCS) and one-shot aggregation (OSA) to rearrange feature map channels according to specific rules. It aggregates all features only once in the final feature mapping, resulting in the extraction of more comprehensive and valuable feature information. The experimental results show that the proposed YOLOv8-UOD achieves a mAP50 of 95.67% and FLOPs of 23.8 G on the Underwater Robot Picking Contest 2017 (URPC2017) dataset, outperforming other methods in terms of detection precision and computational cost-efficiency.
Keywords: deep learning; underwater target detection; attention mechanism
Magnetic Resonance Image Super-Resolution Based on GAN and Multi-Scale Residual Dense Attention Network
16
作者 GUAN Chunling YU Suping +1 位作者 XU Wujun FAN Hong 《Journal of Donghua University(English Edition)》 2025年第4期435-441,共7页
The application of image super-resolution(SR)has brought significant assistance in the medical field,aiding doctors to make more precise diagnoses.However,solely relying on a convolutional neural network(CNN)for image... The application of image super-resolution(SR)has brought significant assistance in the medical field,aiding doctors to make more precise diagnoses.However,solely relying on a convolutional neural network(CNN)for image SR may lead to issues such as blurry details and excessive smoothness.To address the limitations,we proposed an algorithm based on the generative adversarial network(GAN)framework.In the generator network,three different sizes of convolutions connected by a residual dense structure were used to extract detailed features,and an attention mechanism combined with dual channel and spatial information was applied to concentrate the computing power on crucial areas.In the discriminator network,using InstanceNorm to normalize tensors sped up the training process while retaining feature information.The experimental results demonstrate that our algorithm achieves higher peak signal-to-noise ratio(PSNR)and structural similarity index measure(SSIM)compared to other methods,resulting in an improved visual quality. 展开更多
Keywords: magnetic resonance (MR); image super-resolution (SR); attention mechanism; generative adversarial network (GAN); multi-scale convolution
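The super-resolution abstract reports results in terms of PSNR. For readers unfamiliar with the metric, the following is a minimal sketch of the standard definition, PSNR = 10·log10(MAX² / MSE). It assumes 8-bit intensities (peak value 255) and images supplied as flat lists of pixel values. The function name `psnr` is illustrative and is not taken from the paper.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Both images are flat sequences of pixel intensities. `max_value` is
    the peak intensity (255 assumed here for 8-bit data).
    """
    if len(reference) != len(estimate) or not reference:
        raise ValueError("images must be non-empty and the same size")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Every pixel is off by 10, so MSE = 100.
score = psnr([100, 100, 100, 100], [110, 90, 110, 90])
```

A higher PSNR means the reconstructed image is numerically closer to the reference. It does not always track perceived quality, which is one reason SSIM is commonly reported alongside it.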
MSSTGCN: Multi-Head Self-Attention and Spatial-Temporal Graph Convolutional Network for Multi-Scale Traffic Flow Prediction
17
Authors: Xinlu Zong, Fan Yu, Zhen Chen, Xue Xia. Computers, Materials & Continua, 2025, Issue 2, pp. 3517-3537 (21 pages)
Accurate traffic flow prediction has a profound impact on modern traffic management. Traffic flow has complex spatial-temporal correlations and periodicity, which poses difficulties for precise prediction. To address this problem, a Multi-head Self-attention and Spatial-Temporal Graph Convolutional Network (MSSTGCN) for multiscale traffic flow prediction is proposed. First, to capture the hidden periodicity of traffic flow, the flow is divided into three kinds of periods: hourly, daily, and weekly data. Second, a graph attention residual layer is constructed to learn the global spatial features across regions, while local spatial-temporal dependence is captured by a T-GCN module. Third, a transformer layer is introduced to learn long-term temporal dependence. A position embedding mechanism labels position information for all traffic sequences, so the multi-head self-attention mechanism can recognize the sequence order and allocate weights to different time nodes. Experimental results on four real-world datasets show that the MSSTGCN performs better than the baseline methods and can be successfully adapted to traffic prediction tasks.
Keywords: Graph convolutional network; traffic flow prediction; multi-scale traffic flow; spatial-temporal model
MMIF:Multimodal Medical Image Fusion Network Based on Multi-Scale Hybrid Attention
18
Authors: Jianjun Liu, Yang Li, Xiaoting Sun, Xiaohui Wang, Hanjiang Luo. Computers, Materials & Continua, 2025, Issue 11, pp. 3551-3568 (18 pages)
Multimodal image fusion plays an important role in image analysis and applications. Multimodal medical image fusion helps to combine contrast features from two or more input imaging modalities to represent fused information in a single image. One of the critical clinical applications of medical image fusion is to fuse anatomical and functional modalities for rapid diagnosis of malignant tissues. This paper proposes a multimodal medical image fusion network (MMIF-Net) based on multiscale hybrid attention. The method first decomposes the original image to obtain the low-rank and significant parts. Then, to utilize the features at different scales, we add a multiscale mechanism that uses three filters of different sizes to extract the features in the encoded network. Also, a hybrid attention module is introduced to obtain more image details. Finally, the fused images are reconstructed by decoding the network. We conducted experiments with clinical images from brain computed tomography/magnetic resonance. The experimental results show that the multimodal medical image fusion network method based on multiscale hybrid attention works better than other advanced fusion methods.
Keywords: Medical image fusion; multiscale mechanism; hybrid attention module; encoded network
Transmission Facility Detection with Feature-Attention Multi-Scale Robustness Network and Generative Adversarial Network
19
Authors: Yunho Na, Munsu Jeon, Seungmin Joo, Junsoo Kim, Ki-Yong Oh, Min Ku Kim, Joon-Young Park. Computer Modeling in Engineering & Sciences, 2025, Issue 7, pp. 1013-1044 (32 pages)
This paper proposes an automated detection framework for transmission facilities using a feature-attention multi-scale robustness network (FAMSR-Net) with high-fidelity virtual images. The proposed framework exhibits three key characteristics. First, virtual images of the transmission facilities generated using StyleGAN2-ADA are co-trained with real images. This enables the neural network to learn various features of transmission facilities to improve the detection performance. Second, the convolutional block attention module is deployed in FAMSR-Net to effectively extract features from images and construct multi-dimensional feature maps, enabling the neural network to perform precise object detection in various environments. Third, an effective bounding box optimization method called Scylla-IoU is deployed on FAMSR-Net, considering the intersection over union, center point distance, angle, and shape of the bounding box. This enables the detection of power facilities of various sizes accurately. Extensive experiments demonstrated that FAMSR-Net outperforms other neural networks in detecting power facilities. FAMSR-Net also achieved the highest detection accuracy when virtual images of the transmission facilities were co-trained in the training phase. The proposed framework is effective for the scheduled operation and maintenance of transmission facilities because an optical camera is currently the most promising tool for unmanned aerial vehicles. This ultimately contributes to improved inspection efficiency, reduced maintenance risks, and more reliable power delivery across extensive transmission facilities.
Keywords: Object detection; virtual image; transmission facility; convolutional block attention module; Scylla-IoU
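The FAMSR-Net abstract lists intersection over union (IoU) as one of the quantities that Scylla-IoU considers. The sketch below computes only the plain IoU term for two axis-aligned boxes. The center-distance, angle, and shape terms of Scylla-IoU are not reproduced here. The `(x_min, y_min, x_max, y_max)` box layout and the function name `iou` are assumptions made for this illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Each box is (x_min, y_min, x_max, y_max). This is the plain IoU
    term only, not the full Scylla-IoU loss.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlap extents, clamped to zero when the boxes do not intersect.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


# Two 2x2 boxes overlapping in a 1x1 square: IoU = 1 / (4 + 4 - 1).
overlap = iou((0, 0, 2, 2), (1, 1, 3, 3))
```

Plain IoU is zero for any pair of non-overlapping boxes regardless of how far apart they are, which is the usual motivation for adding distance and shape terms in IoU-based losses.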
Deep Multi-Scale and Attention-Based Architectures for Semantic Segmentation in Biomedical Imaging
20
Authors: Majid Harouni, Vishakha Goyal, Gabrielle Feldman, Sam Michael, Ty C. Voss. Computers, Materials & Continua, 2025, Issue 10, pp. 331-366 (36 pages)
Semantic segmentation plays a foundational role in biomedical image analysis, providing precise information about cellular, tissue, and organ structures in both biological and medical imaging modalities. Traditional approaches often fail in the face of challenges such as low contrast, morphological variability, and densely packed structures. Recent advancements in deep learning have transformed segmentation capabilities through the integration of fine-scale detail preservation, coarse-scale contextual modeling, and multi-scale feature fusion. This work provides a comprehensive analysis of state-of-the-art deep learning models, including U-Net variants, attention-based frameworks, and Transformer-integrated networks, highlighting innovations that improve accuracy, generalizability, and computational efficiency. Key architectural components such as convolution operations, shallow and deep blocks, skip connections, and hybrid encoders are examined for their roles in enhancing spatial representation and semantic consistency. We further discuss the importance of hierarchical and instance-aware segmentation and annotation in interpreting complex biological scenes and multiplexed medical images. By bridging methodological developments with diverse application domains, this paper outlines current trends and future directions for semantic segmentation, emphasizing its critical role in facilitating annotation, diagnosis, and discovery in biomedical research.
Keywords: Biomedical semantic segmentation; multi-scale feature fusion; fine- and coarse-scale features; convolution operations; shallow and deep blocks; skip connections