515 articles found
Infrared road object detection algorithm based on spatial depth channel attention network and improved YOLOv8
1
Authors: LI Song, SHI Tao, JING Fangke, CUI Jie. 《Optoelectronics Letters》, 2025, No. 8, pp. 491-498 (8 pages).
To address the low detection accuracy and large model size of existing object detection algorithms applied to complex road scenes, an improved You Only Look Once version 8 (YOLOv8) object detection algorithm for infrared images, F-YOLOv8, is proposed. First, a spatial-to-depth network replaces the strided convolution or pooling layers of the traditional backbone network. It is combined with a channel attention mechanism so that the neural network focuses on the channels with large weight values and extracts feature information from low-resolution images more effectively. Next, an improved feature pyramid network, the lightweight bidirectional feature pyramid network (L-BiFPN), is proposed to fuse features of different scales efficiently. In addition, an intersection over union loss based on the minimum point distance (MPDIoU) is introduced for bounding box regression, giving faster convergence and more accurate regression results. Experimental results on the FLIR dataset show that the improved algorithm can accurately detect infrared road targets in real time, with gains of 3% and 2.2% in mean average precision at 50% IoU (mAP50) and mean average precision at 50% to 95% IoU (mAP50-95), respectively, and reductions of 38.1%, 37.3%, and 16.9% in the number of model parameters, the model weight, and floating-point operations (FLOPs), respectively. To further demonstrate the detection capability of the improved algorithm, it is tested on the public PASCAL VOC dataset, and the results show that F-YOLOv8 has excellent generalized detection performance.
Keywords: feature pyramid network; infrared road object detection; infrared images; F-YOLOv8; backbone networks; channel attention mechanism; spatial depth channel attention network; object detection; improved YOLOv8
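The abstract above replaces strided convolution and pooling with a spatial-to-depth step. A minimal sketch of that kind of rearrangement follows, assuming it is the usual space-to-depth operation with a block size of 2; the function name `space_to_depth`, the nested-list layout, and the output channel order are illustrative assumptions, not the authors' implementation.

```python
def space_to_depth(feature_map, block=2):
    """Rearrange a C x H x W nested list into (C*block*block) x (H/block) x (W/block).

    Every input value is kept: spatial resolution is traded for channels
    rather than being discarded, as a strided convolution or pooling would do.
    The output channel order used here is an assumption made for illustration.
    """
    channels = len(feature_map)
    height = len(feature_map[0])
    width = len(feature_map[0][0])
    if height % block or width % block:
        raise ValueError("height and width must be divisible by block")
    out = []
    for dy in range(block):          # row offset inside each block
        for dx in range(block):      # column offset inside each block
            for c in range(channels):
                out.append([
                    [feature_map[c][y][x] for x in range(dx, width, block)]
                    for y in range(dy, height, block)
                ])
    return out


# One 4 x 4 channel becomes four 2 x 2 channels.
x = [[[0, 1, 2, 3],
      [4, 5, 6, 7],
      [8, 9, 10, 11],
      [12, 13, 14, 15]]]
y = space_to_depth(x)
```

In a network, a non-strided convolution would typically follow to mix the enlarged channel dimension; that part is omitted here.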
SA-ResNet:An Intrusion Detection Method Based on Spatial Attention Mechanism and Residual Neural Network Fusion
2
Authors: Zengyu Cai, Yuming Dai, Jianwei Zhang, Yuan Feng. 《Computers, Materials & Continua》, 2025, No. 5, pp. 3335-3350 (16 pages).
The rapid development and widespread adoption of Internet technology have significantly increased Internet traffic, highlighting the growing importance of network security. Intrusion Detection Systems (IDS) are essential for safeguarding network integrity. To address the low accuracy of existing intrusion detection models in identifying network attacks, this paper proposes an intrusion detection method based on the fusion of a Spatial Attention mechanism and a Residual Neural Network (SA-ResNet). Residual connections effectively capture local features in the data, while the spatial attention mechanism extracts the global dependency relationships of intrusion features, strengthening the model's focus on the global features of intrusions and improving the accuracy of intrusion recognition. The proposed model was experimentally verified on the NSL-KDD dataset. The experimental results show that the intrusion recognition accuracy of the SA-ResNet-based method reaches 99.86%, and its overall accuracy is 0.41% higher than that of traditional Convolutional Neural Network (CNN) models.
Keywords: intrusion detection; deep learning; residual neural network; spatial attention mechanism
Feature pyramid attention network for audio-visual scene classification (Cited by: 1)
3
Authors: Liguang Zhou, Yuhongze Zhou, Xiaonan Qi, Junjie Hu, Tin Lun Lam, Yangsheng Xu. 《CAAI Transactions on Intelligence Technology》, 2025, No. 2, pp. 359-374 (16 pages).
Audio-visual scene classification (AVSC) poses a formidable challenge owing to the intricate spatial-temporal relationships exhibited by audio-visual signals, coupled with the complex spatial patterns of objects and textures found in visual images. Recent studies have predominantly focused on extracting features from diverse neural network structures, inadvertently neglecting the acquisition of semantically meaningful regions and crucial components within audio-visual data. The authors present a feature pyramid attention network (FPANet) for audio-visual scene understanding, which extracts semantically significant characteristics from audio-visual data. The authors' approach builds multi-scale hierarchical features of sound spectrograms and visual images using a feature pyramid representation and localises the semantically relevant regions with a feature pyramid attention module (FPAM). A dimension alignment (DA) strategy is employed to align feature maps from multiple layers, a pyramid spatial attention (PSA) to spatially locate essential regions, and a pyramid channel attention (PCA) to pinpoint significant temporal frames. Experiments on visual scene classification (VSC), audio scene classification (ASC), and AVSC tasks demonstrate that FPANet achieves performance on par with state-of-the-art (SOTA) approaches, with a 95.9 F1-score on the ADVANCE dataset and a relative improvement of 28.8%. Visualisation results show that FPANet can prioritise semantically meaningful areas in audio-visual signals.
Keywords: dimension alignment; feature pyramid attention network; pyramid channel attention; pyramid spatial attention; semantic relevant regions
Integrating multi-modal information to detect spatial domains of spatial transcriptomics by graph attention network (Cited by: 1)
4
Authors: Yuying Huo, Yilang Guo, Jiakang Wang, Huijie Xue, Yujuan Feng, Weizheng Chen, Xiangyu Li. 《Journal of Genetics and Genomics》 (SCIE, CAS, CSCD), 2023, No. 9, pp. 720-733 (14 pages).
Recent advances in spatially resolved transcriptomic technologies have enabled unprecedented opportunities to elucidate tissue architecture and function in situ. Spatial transcriptomics can provide multimodal and complementary information simultaneously, including gene expression profiles, spatial locations, and histology images. However, most existing methods have limitations in efficiently utilizing spatial information and matched high-resolution histology images. To fully leverage the multi-modal information, we propose a SPAtially embedded Deep Attentional graph Clustering (SpaDAC) method to identify spatial domains while reconstructing denoised gene expression profiles. This method can efficiently learn the low-dimensional embeddings for spatial transcriptomics data by constructing multi-view graph modules to capture both spatial location connectives and morphological connectives. Benchmark results demonstrate that SpaDAC outperforms other algorithms on several recent spatial transcriptomics datasets. SpaDAC is a valuable tool for spatial domain detection, facilitating the comprehension of tissue architecture and cellular microenvironment. The source code of SpaDAC is freely available on GitHub (https://github.com/huoyuying/SpaDAC.git).
Keywords: spatial transcriptomics; spatial domain detection; multi-modal integration; graph attention network
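An MSCNN-LSTM-Attention visibility prediction model considering spatial correlation
5
Authors: 王小建, 苏彤, 马飞, 林智婕, 白元旦, 郭庆元, 魏俊涛, 黄凯, 徐玉凤. 《安全与环境学报》 (PKU Core), 2025, No. 4, pp. 1622-1632 (11 pages).
Accurate visibility prediction is important for transportation safety. Existing methods give too little consideration to the spatial correlation of influencing factors, which lowers prediction accuracy; to address this, a visibility prediction model that accounts for spatial correlation is constructed. A one-dimensional Multi-Scale Convolutional Neural Network (MSCNN) extracts spatial features at different granularities for each factor that influences visibility prediction, and these are linearly fused into multi-factor spatial features, achieving spatial feature extraction for the influencing factors. The attention mechanism's strength in focusing on key information is used to improve the Long Short-Term Memory (LSTM) neural network, which strengthens the model's attention to important temporal information and its prediction accuracy, so that visibility is predicted with the spatial correlation of influencing factors taken into account. Hourly meteorological and pollutant data for Xi'an from 2021 to 2023 serve as the experimental data, and the model is evaluated with root mean square error (RMSE), mean absolute error (MAE), and R². The results show that the model's MAE decreases by 26.3% to 39.1%, its RMSE decreases by 25% to 40%, and its R² improves by 3.7% to 16.4%, indicating high visibility prediction accuracy.
Keywords: basic disciplines of environmental science and technology; visibility prediction; spatial correlation; one-dimensional multi-scale convolutional neural network; long short-term memory neural network; attention mechanism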
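The visibility model above is evaluated with RMSE, MAE, and R². As a reference for how these three metrics are commonly defined, here is a small sketch; the function name `regression_metrics` and the sample values are made up for illustration and are not data from the paper.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (RMSE, MAE, R^2) for two equal-length sequences of numbers."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)            # residual sum of squares
    rmse = math.sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return rmse, mae, r2


# Hypothetical observed and predicted visibility values (arbitrary units).
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [3.0, 4.0, 5.0, 8.0]
rmse, mae, r2 = regression_metrics(observed, predicted)
```

Lower RMSE and MAE and a higher R² indicate a better fit, which is the direction of the changes reported in the abstract.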
An attention-based prototypical network for forest fire smoke few-shot detection (Cited by: 3)
6
Authors: Tingting Li, Haowei Zhu, Chunhe Hu, Junguo Zhang. 《Journal of Forestry Research》 (SCIE, CAS, CSCD), 2022, No. 5, pp. 1493-1504 (12 pages).
Almost all existing deep learning methods rely on a large amount of annotated data, so they are inappropriate for forest fire smoke detection with limited data. In this paper, a novel hybrid attention-based few-shot learning method, named Attention-Based Prototypical Network, is proposed for forest fire smoke detection. Specifically, the feature extraction network, which includes a convolutional block attention module, extracts high-level and discriminative features and further decreases the false alarm rate caused by suspected smoke areas. Moreover, we design a meta-learning module to alleviate the overfitting caused by limited smoke images; the meta-learning network achieves effective detection by comparing the distance between the class prototypes of the support images and the features of the query images. A series of experiments on forest fire smoke datasets and the miniImageNet dataset shows that the proposed method is superior to state-of-the-art few-shot learning approaches.
Keywords: forest fire smoke detection; few-shot learning; channel attention module; spatial attention module; prototypical network
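The abstract above classifies query images by their distance to class prototypes built from support images. The sketch below illustrates that general prototypical-network idea on hand-written 2-D embeddings; the helper names, the squared Euclidean distance, and the toy vectors are assumptions for illustration, and the attention-based feature extractor from the paper is not modelled.

```python
def class_prototypes(support):
    """Average the support embeddings of each class into one prototype vector."""
    prototypes = {}
    for label, embeddings in support.items():
        dim = len(embeddings[0])
        prototypes[label] = [
            sum(e[i] for e in embeddings) / len(embeddings) for i in range(dim)
        ]
    return prototypes


def squared_distance(a, b):
    """Squared Euclidean distance between two vectors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify(query, prototypes):
    """Return the label whose prototype is nearest to the query embedding."""
    return min(prototypes, key=lambda label: squared_distance(query, prototypes[label]))


# Hypothetical 2-D embeddings; a real system would get these from a trained encoder.
support = {
    "smoke": [[0.9, 0.1], [0.7, 0.3]],
    "no_smoke": [[0.1, 0.9], [0.3, 0.7]],
}
prototypes = class_prototypes(support)
prediction = classify([0.6, 0.2], prototypes)
```

Because only a mean per class is stored, a new class can be added from a handful of support images without retraining a classifier head, which is what makes the approach suit limited smoke data.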
DCA-YOLO: Detection Algorithm for YOLOv8 Pulmonary Nodules Based on Attention Mechanism Optimization (Cited by: 1)
7
Authors: SONG Yongsheng, LIU Guohua. 《Journal of Donghua University(English Edition)》, 2025, No. 1, pp. 78-87 (10 pages).
Pulmonary nodules represent an early manifestation of lung cancer. However, pulmonary nodules constitute only a small portion of the overall image, posing challenges for physicians in image interpretation and potentially leading to false positives or missed detections. To solve these problems, the YOLOv8 network is enhanced by adding deformable convolution and atrous spatial pyramid pooling (ASPP), along with the integration of a coordinate attention (CA) mechanism. This allows the network to focus on small targets while expanding the receptive field without losing resolution. At the same time, context information on the target is gathered and feature expression is enhanced by attention modules in different directions. The method effectively improves positioning accuracy and achieves good results on the LUNA16 dataset. Compared with other detection algorithms, it improves the accuracy of pulmonary nodule detection to a certain extent.
Keywords: pulmonary nodule; YOLOv8 network; object detection; deformable convolution; atrous spatial pyramid pooling (ASPP); coordinate attention (CA) mechanism
Image Inpainting Detection Based on High-Pass Filter Attention Network
8
Authors: Can Xiao, Feng Li, Dengyong Zhang, Pu Huang, Xiangling Ding, Victor S. Sheng. 《Computer Systems Science & Engineering》 (SCIE, EI), 2022, No. 12, pp. 1145-1154 (10 pages).
Image inpainting based on deep learning has improved greatly. The original purpose of image inpainting was to repair damaged photos, such as inpainting artifacts. However, it may also be used for malicious operations, such as destroying evidence. Therefore, detection and localization of image inpainting operations are essential. Recent research has applied a high-pass filtering full convolutional network (HPFCN) to image inpainting detection with good results. However, that method does not consider the spatial location and channel information of the feature map. To address these shortcomings, we introduce squeeze-and-excitation (SE) blocks and propose a high-pass filter attention full convolutional network (HPACN). In feature extraction, we apply concurrent spatial and channel attention (scSE) to enhance feature extraction and obtain more information. Channel attention (cSE) is introduced in upsampling to enhance detection and localization. The experimental results show that the proposed method achieves an improvement on ImageNet.
Keywords: image inpainting detection; spatial attention; channel attention; full convolutional network; high-pass filter
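The channel attention (cSE) mentioned above follows the squeeze-and-excitation pattern: pool each channel to a single number, pass the result through a small bottleneck, and rescale the channels by the resulting gates. A minimal sketch under simplifying assumptions (no bias terms, hand-picked weights `w1` and `w2`, nested lists instead of tensors) is shown below; it is not the HPACN implementation.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def channel_attention(feature_map, w1, w2):
    """Squeeze-and-excitation style gating on a C x H x W nested list.

    w1 has shape (C_reduced x C) and w2 has shape (C x C_reduced); both are
    assumed to be already-trained weights, and biases are left out for brevity.
    """
    # Squeeze: global average pooling, one scalar per channel.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Excitation: linear -> ReLU -> linear -> sigmoid.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    gates = [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w2]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[gate * value for value in row] for row in channel]
        for gate, channel in zip(gates, feature_map)
    ]
    return gates, scaled


# Two 2 x 2 channels with hypothetical weights chosen by hand.
features = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[3.0, 3.0], [3.0, 3.0]],
]
w1 = [[0.5, 0.5]]        # reduce 2 channels to 1 hidden unit
w2 = [[1.0], [-1.0]]     # expand back to 2 channel gates
gates, scaled = channel_attention(features, w1, w2)
```

The spatial counterpart in scSE instead produces one gate per pixel position, and the two gated outputs are then combined.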
Lightweight Cross-Modal Multispectral Pedestrian Detection Based on Spatial Reweighted Attention Mechanism
9
Authors: Lujuan Deng, Ruochong Fu, Zuhe Li, Boyi Liu, Mengze Xue, Yuhao Cui. 《Computers, Materials & Continua》 (SCIE, EI), 2024, No. 3, pp. 4071-4089 (19 pages).
Multispectral pedestrian detection technology leverages infrared images to provide reliable information for visible light images, demonstrating significant advantages in low-light conditions and background occlusion scenarios. However, while cross-modal feature extraction and fusion continue to improve, maintaining the model's detection speed remains a challenge. We have devised a deep learning network model for cross-modal pedestrian detection based on ResNet50, aiming to focus on more reliable features and enhance the model's detection efficiency. This model employs a spatial attention mechanism to reweight the input visible light and infrared image data, enhancing the model's focus on different spatial positions and sharing the weighted feature data across modalities, thereby reducing the interference of multi-modal features. Subsequently, lightweight modules with depthwise separable convolution are incorporated to reduce the model's parameter count and computational load through channel-wise and point-wise convolutions. The proposed network model was experimentally validated on the publicly available KAIST dataset and compared with other existing methods. The experimental results demonstrate that our approach achieves favorable performance in various complex environments, affirming the effectiveness of the proposed multispectral pedestrian detection technology.
Keywords: multispectral pedestrian detection; convolutional neural networks; depth separable convolution; spatially reweighted attention mechanism
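The abstract above credits depthwise separable convolution with reducing the parameter count. The arithmetic behind that claim can be sketched as follows; the layer sizes (64 input channels, 128 output channels, 3 x 3 kernel) are arbitrary illustrative values, not figures from the paper, and bias terms are ignored.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise one."""
    depthwise = k * k * c_in      # one k x k filter per input channel
    pointwise = c_in * c_out      # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


standard = standard_conv_params(64, 128, 3)
separable = depthwise_separable_params(64, 128, 3)
ratio = separable / standard      # equals 1/c_out + 1/k**2
```

For these sizes the separable layer needs 8,768 weights against 73,728 for the standard layer, roughly 12% of the original.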
Channel attention based wavelet cascaded network for image super-resolution
10
Authors: CHEN Jian, HUANG Detian, HUANG Weiqin. 《High Technology Letters》 (EI, CAS), 2022, No. 2, pp. 197-207 (11 pages).
Convolutional neural networks (CNNs) have shown great potential for image super-resolution (SR). However, most existing CNNs only reconstruct images in the spatial domain, resulting in insufficient high-frequency details in the reconstructed images. To address this issue, a channel attention based wavelet cascaded network for image super-resolution (CWSR) is proposed. Specifically, a second-order channel attention (SOCA) mechanism is incorporated into the network, and covariance matrix normalization is used to explore interdependencies between channel-wise features. Then, to boost the quality of residual features, a non-local module is adopted to further improve the network's ability to integrate global information. Finally, taking the image loss in both the spatial and wavelet domains into account, a dual-constrained loss function is proposed to optimize the network. Experimental results illustrate that CWSR outperforms several state-of-the-art methods in terms of both visual quality and quantitative metrics.
Keywords: image super-resolution (SR); wavelet transform; convolutional neural network (CNN); second-order channel attention (SOCA); non-local self-similarity
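A spatio-temporal Graph-CoordAttention network for traffic flow prediction (Cited by: 2)
11
Authors: 刘建松, 康雁, 李浩, 王韬, 王海宁. 《计算机科学》 (CSCD, PKU Core), 2023, No. S01, pp. 558-564 (7 pages).
Traffic prediction is an important research component of urban intelligent transportation systems and makes travel more efficient and safer. Because of complex temporal and spatial dependencies, accurately predicting traffic flow remains a major challenge. In recent years, graph convolutional networks (GCNs) have shown great potential for traffic prediction, but GCN-based models tend to capture temporal and spatial dependencies separately, overlooking the dynamic correlation between the two and failing to fuse them well. In addition, previous methods build the spatial adjacency matrix from the static real-world traffic network, which may miss dynamic spatial dependencies. To overcome these limitations and improve model performance, a novel spatio-temporal Graph-CoordAttention network (STGCA) is proposed. Specifically, a spatio-temporal synchronization module is proposed to model how spatio-temporal dependencies interact across different time steps. Then, a dynamic graph learning scheme is proposed that mines latent graph information from the correlations among traffic flow data. In comparative experiments against existing baseline models on four public datasets, STGCA shows excellent performance.
Keywords: traffic flow prediction; spatio-temporal prediction; graph convolutional network; attention mechanism; spatio-temporal dependency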
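Research on a deep-learning-based Attention U-Net semantic segmentation model (Cited by: 2)
12
Authors: 薛泽民, 邹连旭, 黄志威, 冉杰, 余若岩, 郑国勋. 《长春工程学院学报(自然科学版)》, 2023, No. 4, pp. 97-101 (5 pages).
To address the long processing time, poor real-time performance, and low segmentation accuracy that are common when current deep neural networks perform image segmentation, a model is proposed in which a U-Net network incorporating an attention mechanism is trained on a GAN-augmented dataset. Experimental results show that, compared with conventional models such as U-Net++, SegNet, and DeepLabV1, the proposed model has an average loss of about 129%, close to that of the U-Net++ and DeepLabV1 models, and an average accuracy of about 95.4%, which is 1.7% higher than U-Net++, 6% higher than SegNet, and 1.7% higher than DeepLabV1.
Keywords: data augmentation; semantic segmentation; spatial attention mechanism; generative adversarial network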
Person Re-Identification Based on Spatial Feature Learning and Multi-Granularity Feature Fusion
13
Authors: DIAO Zijian, CAO Shuai, LI Wenwei, LIANG Jianan, WEN Guilin, HUANG Weici, ZHANG Shouming. 《Journal of Shanghai Jiaotong University(Science)》, 2025, No. 2, pp. 363-374 (12 pages).
Convolutional neural networks have a weak ability to learn spatial invariance explicitly, and occlusion and background interference can cause discriminative features to be lost in pedestrian re-identification tasks. To address both issues, a person re-identification method combining spatial feature learning and multi-granularity feature fusion is proposed. First, an attention spatial transformation network (A-STN) is proposed to learn spatial features and solve the misalignment of pedestrian spatial features. The network is then divided into a global branch, a local coarse-grained fusion branch, and a local fine-grained fusion branch, which extract pedestrian global features, coarse-grained fusion features, and fine-grained fusion features, respectively. The global branch enriches the global features by fusing different pooling features. The local coarse-grained fusion branch uses an overlay pooling to enhance each local feature while learning the correlation between multi-granularity features. The local fine-grained fusion branch uses a differential pooling to obtain differential features, which are fused with global features to learn the relationship between pedestrian local features and pedestrian global features. Finally, the proposed method is compared with other methods on three public datasets: Market1501, DukeMTMC-ReID, and CUHK03. Its experimental results are better than those of the comparative methods, which verifies the effectiveness of the proposed method.
Keywords: pedestrian re-identification; spatial features; attention spatial transformation network; multi-branch network; relation features
Exploring the Influence of Tourism Network Attention on the Development of Tourism in the Yangtze River Delta:A Spatial Analysis
14
Authors: WANG Yuewei, DI Jiao, CHEN Hang, AN Lidan. 《Journal of Resources and Ecology》, 2025, No. 4, pp. 1103-1115 (13 pages).
This study incorporates both positive and negative tourism network attention into a comprehensive framework to examine their distinct effects on tourism development in the Yangtze River Delta (YRD). In particular, this study uses a spatial econometric model to examine the relationship between positive and negative tourism network attention and regional tourism development, including the impact of tourism network attention on local and neighboring areas. In addition, the framework uses fuzzy set qualitative comparative analysis (fsQCA) to explore the path combinations of network attention and other factors that affect different stages of tourism development in each city of the YRD, and expounds the driving mechanism. The findings reveal: (1) Positive tourism network attention has a "U-shaped" influence on regional tourism development. (2) Positive tourism network attention significantly promotes tourism development in both local and neighboring areas, while negative tourism network attention both hinders local tourism development and adversely affects neighboring areas via spillover effects. (3) Multiple paths for tourism development exist in the region, including four modes: demand-facility driven, demand-resource-facility-transportation driven, word of mouth-transportation driven, and traffic-resource driven. Using the YRD as a case study, this research offers empirical evidence and theoretical insights into how positive and negative tourism network attention influence tourism development in the region.
Keywords: spatial effect; network attention; regional tourism; fsQCA
SpaGRA: Graph augmentation facilitates domain identification for spatially resolved transcriptomics
15
Authors: Xue Sun, Wei Zhang, Wenrui Li, Na Yu, Daoliang Zhang, Qi Zou, Qiongye Dong, Xianglin Zhang, Zhiping Liu, Zhiyuan Yuan, Rui Gao. 《Journal of Genetics and Genomics》, 2025, No. 1, pp. 93-104 (12 pages).
Recent advances in spatially resolved transcriptomics (SRT) have provided new opportunities for characterizing spatial structures of various tissues. Graph-based geometric deep learning has gained widespread adoption for spatial domain identification tasks. Currently, most methods define the adjacency relation between cells or spots by their spatial distance in SRT data, which overlooks key biological interactions such as gene expression similarities and leads to inaccuracies in spatial domain identification. To tackle this challenge, we propose a novel method, SpaGRA (https://github.com/sunxue-yy/SpaGRA), for automatic multi-relationship construction based on graph augmentation. SpaGRA uses spatial distance as prior knowledge and dynamically adjusts edge weights with multi-head graph attention networks (GATs). This helps SpaGRA to uncover diverse node relationships and enhance message passing in geometric contrastive learning. Additionally, SpaGRA uses these multi-view relationships to construct negative samples, addressing the sampling bias posed by random selection. Experimental results show that SpaGRA presents superior domain identification performance on multiple datasets generated from different protocols. Using SpaGRA, we analyze the functional regions in the mouse hypothalamus, identify key genes related to heart development in mouse embryos, and observe cancer-associated fibroblasts enveloping cancer cells in the latest Visium HD data. Overall, SpaGRA can effectively characterize spatial structures across diverse SRT datasets.
Keywords: spatial domain identification; spatially resolved transcriptomics; multi-head graph attention networks; graph augmentation; geometric contrastive learning
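A two-stream hyperspectral image classification network based on multiple convolutions and a spatial-spectral attention Transformer
16
Authors: 王素玉, 吴世国. 《北京工业大学学报》 (PKU Core), 2026, No. 1, pp. 75-83 (9 pages).
Existing convolutional neural network (CNN) methods for hyperspectral image classification make insufficient use of joint spatial-spectral features and pay too little attention to global features. To address this, a two-stream hyperspectral image classification network based on multiple convolutions and a spatial-spectral attention Transformer is designed; its two-stream structure combines a CNN with a Transformer to make full use of both local and global features. First, in the CNN branch, a spatial-spectral feature fusion structure based on multiple convolutions is designed to fully mine and fuse features along the spatial and spectral dimensions. Second, the Transformer branch uses a spatial-spectral attention mechanism to capture global information across the whole image. Finally, the two branches are combined by decision-level fusion to achieve high classification performance. Tests on four typical datasets show that the algorithm's classification results improve to varying degrees over current mainstream algorithms.
Keywords: two-stream network; multiple convolutions; spatial-spectral attention mechanism; hyperspectral image; land-cover classification; feature fusion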
Concurrent channel and spatial attention in Fully Convolutional Network for individual pig image segmentation (Cited by: 3)
17
Authors: Zhiwei Hu, Hua Yang, Tiantian Lou, Hongwen Yan. 《International Journal of Agricultural and Biological Engineering》 (SCIE, CAS), 2023, No. 1, pp. 232-242 (11 pages).
The separation of individual pigs from pigpen scenes is crucial for precision farming, and technology based on convolutional neural networks can provide a low-cost, non-contact, non-invasive method of pig image segmentation. However, two factors limit the development of this field. On the one hand, individual pigs tend to stick together, and occlusion by debris and pigpen structures can easily cause the model to misjudge. On the other hand, manual labeling of group-raised pig data is time-consuming and labor-intensive and is prone to labeling errors. Therefore, there is an urgent need for an individual pig image segmentation model that performs well in individual scenarios and can be easily migrated to a group-raised environment. To solve the above problems, taking individual pigs as research objects, an individual pig image segmentation dataset containing 2066 images was constructed, and a series of algorithms based on fully convolutional networks were proposed to solve the pig image segmentation problem. To capture long-range dependencies and weaken background information such as pigpens while enhancing the information of individual parts of pigs, channel and spatial attention blocks were introduced into the best-performing decoders, UNet and LinkNet. Experiments show that, with ResNeXt50 as the encoder and UNet as the decoder as the basic model, adding the two attention blocks at the same time achieves 98.30% and 96.71% on the F1 and IOU metrics, respectively. Compared with the model adding the channel attention block alone, the two metrics are improved by 0.13% and 0.22%, respectively. The experiment of introducing channel and spatial attention alone shows that spatial attention is more effective than channel attention. Taking VGG16-LinkNet as an example, compared with channel attention, spatial attention improves the F1 and IOU metrics by 0.16% and 0.30%, respectively. Furthermore, heatmaps of the features of different decoder layers after adding different attention information show that, as the number of layers increases, the boundary of the pig image segmentation becomes clearer. To verify the effectiveness of the individual pig image segmentation model in group-raised scenes, the transfer performance of the model is verified in three scenarios: high separation, deep adhesion, and pigpen occlusion. The experiments show that the segmentation results obtained by adding attention information, especially the simultaneous fusion of channel and spatial attention blocks, are more refined and complete. The attention-based individual pig image segmentation model can be effectively transferred to group-raised pigs and can provide a reference for their pre-segmentation.
Keywords: pig image segmentation; Fully Convolutional Network (FCN); attention mechanism; channel and spatial attention
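The segmentation results above are reported as F1 and IOU scores. For binary masks, both can be computed from pixel-level true positives, false positives, and false negatives, as in this sketch; the function name, the tiny masks, and the convention of returning 1.0 when both masks are empty are illustrative assumptions.

```python
def f1_and_iou(pred_mask, true_mask):
    """Return (F1, IoU) for two binary masks given as equal-sized nested lists."""
    tp = fp = fn = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            if p and t:
                tp += 1          # pixel correctly marked as foreground
            elif p and not t:
                fp += 1          # background predicted as foreground
            elif t and not p:
                fn += 1          # foreground missed by the prediction
    if tp + fp + fn == 0:
        # Both masks are empty; treating that as a perfect match is an assumption.
        return 1.0, 1.0
    f1 = 2 * tp / (2 * tp + fp + fn)
    iou = tp / (tp + fp + fn)
    return f1, iou


# Hypothetical 2 x 3 masks: 1 marks a pig pixel, 0 marks background.
predicted = [[1, 1, 0],
             [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
f1, iou = f1_and_iou(predicted, target)
```

The two metrics are tied by F1 = 2·IoU / (1 + IoU), so F1 is never lower than IoU, consistent with the 98.30% and 96.71% pair in the abstract.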
Single-Image Dehazing Based on Two-Stream Convolutional Neural Network (Cited by: 3)
18
Authors: Meng Jun, Li Yuanyuan, Liang HuaHua, Ma You. 《Journal of Artificial Intelligence and Technology》, 2022, No. 3, pp. 100-110 (11 pages).
Hazy weather degrades the visual quality of images and makes advanced vision tasks difficult to carry out. Therefore, dehazing the hazy image is an important step before an advanced vision task is executed. Traditional dehazing algorithms dehaze images by improving image brightness and contrast or by constructing artificial priors such as color attenuation priors and dark channel priors. However, their results are unstable in complex scenes. Among methods based on convolutional neural networks, image dehazing networks with an encoder-decoder structure do not consider the difference between the image before and after dehazing, and image spatial information is lost in the encoding stage. To overcome these problems, this paper proposes a novel end-to-end two-stream convolutional neural network for single-image dehazing. The network model is composed of a spatial information feature stream and a high-level semantic feature stream. The spatial information feature stream retains the detailed information of the dehazed image, and the high-level semantic feature stream extracts the multi-scale structural features of the dehazed image. A spatial information auxiliary module is designed and placed between the feature streams. This module uses the attention mechanism to construct a unified expression of different types of information and realizes the gradual restoration of the clear image, with semantic information assisting spatial information in the dehazing network. A parallel residual twicing module is proposed, which performs dehazing on the difference information of features at different stages to improve the model's ability to discriminate haze images. The peak signal-to-noise ratio (PSNR) and structural similarity are used to quantitatively evaluate the similarity between the dehazing results of each algorithm and the original image. The structural similarity and PSNR of the proposed method reached 0.852 and 17.557 dB on the HazeRD dataset, higher than those of the existing comparison algorithms. On the SOTS dataset, the indicators are 0.955 and 27.348 dB, which are sub-optimal results. In experiments with real haze images, the method also achieves excellent visual restoration. The experimental results show that the proposed model can restore haze-free images with the desired visual quality, and it also generalizes well to real haze scenes.
Keywords: attention mechanism; image dehazing; semantic feature; spatial information; two-stream network
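The dehazing results above are scored with the peak signal-to-noise ratio (PSNR). A short sketch of the standard definition is given below, assuming 8-bit grayscale images stored as nested lists; the function name and pixel values are illustrative, and structural similarity is not covered.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-sized grayscale images."""
    squared_errors = [
        (r - s) ** 2
        for ref_row, res_row in zip(reference, restored)
        for r, s in zip(ref_row, res_row)
    ]
    mse = sum(squared_errors) / len(squared_errors)
    if mse == 0:
        return math.inf          # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 2 x 2 images: two pixels are off by 10 grey levels, so MSE = 50.
clear = [[100, 100], [100, 100]]
dehazed = [[110, 90], [100, 100]]
score = psnr(clear, dehazed)
```

Because the scale is logarithmic, halving the mean squared error raises PSNR by about 3 dB, which helps when comparing figures such as 17.557 dB and 27.348 dB.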
Multi Attention Generative Adversarial Network for Pedestrian Trajectory Prediction Based on Spatial Gridding
19
Authors: Huihui An, Miao Liu, Xiaolan Wang, Weiwei Zhang, Jun Gong. 《Automotive Innovation》 (CSCD), 2024, No. 3, pp. 443-455 (13 pages).
Accurate and efficient pedestrian trajectory prediction is one of the key capabilities for the safe operation of self-driving vehicles. Therefore, it is of great significance to study pedestrian trajectory prediction algorithms applicable to complex interaction scenarios. In this study, a spatial gridding-based multi-attention generative adversarial network (SGMA-GAN) is proposed, with a generative adversarial network as its main framework. First, the map information is gridded to better represent pedestrian state information in tensor form and to improve the stability of the state space and network structure. Second, temporal and spatial attention mechanisms are introduced to account for the effects of historical trajectories and spatial interaction features. Finally, the model is evaluated on both the Eidgenössische Technische Hochschule (ETH) and University of Cyprus (UCY) datasets. The results showed that as the prediction step size gradually increased, compared with the relatively new SGANv2, the mean average displacement error (ADE) and final displacement error (FDE) of SGMA-GAN in five scenarios increased by 10.61% and 4.65%, respectively.
Keywords: prediction of pedestrian trajectory; generative adversarial network; multi-attention mechanism; spatial gridding
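The trajectory model above is evaluated with the average displacement error (ADE) and final displacement error (FDE). For a single predicted trajectory, the usual definitions can be sketched as follows; the function name and the coordinates are illustrative, and averaging over many pedestrians or scenes is left out.

```python
import math


def ade_fde(predicted, ground_truth):
    """Return (ADE, FDE) for two equal-length lists of (x, y) positions.

    ADE is the mean Euclidean distance over all predicted time steps;
    FDE is the Euclidean distance at the last time step only.
    """
    distances = [math.dist(p, g) for p, g in zip(predicted, ground_truth)]
    return sum(distances) / len(distances), distances[-1]


# Hypothetical three-step trajectory in metres.
predicted_path = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
true_path = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
ade, fde = ade_fde(predicted_path, true_path)
```

Both are error measures, so lower values mean better predictions; FDE is typically the larger of the two because errors accumulate over the prediction horizon.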
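CASL-YOLO: An improved fall detection algorithm based on YOLOv8 (Cited by: 1)
20
Authors: 徐慧英, 赵蕊, 朱信忠, 黄晓. 《浙江师范大学学报(自然科学版)》 (CAS), 2025, No. 1, pp. 36-44 (9 pages).
Falls are extremely harmful to older adults and are the leading cause of disability and injury-related death among people over 65 in China. However, current mainstream fall detection techniques are strongly affected by the environment: detection accuracy is low in complex scenes such as object occlusion and lighting changes, and the models' parameter counts and computational cost are high, which keeps costs high and hinders deployment in real-life settings. To address these problems, CASL-YOLO, a lightweight fall detection algorithm for complex environments based on an improved YOLOv8 model, is proposed. First, the model introduces a space-to-depth convolution (SPD-Conv) module in place of the conventional convolution module; by applying a convolution to each feature map while retaining all information in the channel dimension, it improves performance on low-resolution images and small objects. Second, a position-information-based attention mechanism is introduced to capture cross-channel, direction-aware, and position-aware information, so that human targets are located and recognized more accurately. Finally, a large selective kernel network (LSKNet) is introduced into the feature extraction module to adjust the receptive field dynamically, which handles the complex environmental information in fall detection scenes effectively and improves the network's perception and detection accuracy. Experimental results show that on the public Human Fall dataset, CASL-YOLO reaches an mAP@0.5 of 96.8%, outperforming the baseline YOLOv8n, with only 3.4×MiB of parameters and 11.7×10^(9) operations. Compared with other detection algorithms, CASL-YOLO achieves higher accuracy and performance with only a small increase in parameters and computation, while meeting the deployment requirements of real scenarios.
Keywords: fall detection; YOLOv8; attention mechanism; space-to-depth convolution; large selective kernel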