35 articles found
A Lightweight Convolutional Neural Network with Hierarchical Multi-Scale Feature Fusion for Image Classification (Cited by: 2)
1
Authors: Adama Dembele, Ronald Waweru Mwangi, Ananda Omutokoh Kube. Journal of Computer and Communications, 2024, No. 2, pp. 173-200 (28 pages)
Convolutional neural networks (CNNs) are widely used in image classification tasks, but their increasing model size and computation make them challenging to implement on embedded systems with constrained hardware resources. To address this issue, the MobileNetV1 network was developed, which employs depthwise convolution to reduce network complexity. MobileNetV1 employs a stride of 2 in several convolutional layers to decrease the spatial resolution of feature maps, thereby lowering computational costs. However, this stride setting can lead to a loss of spatial information, particularly affecting the detection and representation of smaller objects or finer details in images. To balance complexity against model performance, a lightweight convolutional neural network with hierarchical multi-scale feature fusion based on the MobileNetV1 network is proposed. The network consists of two main subnetworks. The first subnetwork uses a depthwise dilated separable convolution (DDSC) layer to learn imaging features with fewer parameters, which results in a lightweight and computationally inexpensive network. Furthermore, the depthwise dilated convolution in the DDSC layer effectively expands the field of view of the filters, allowing them to incorporate a larger context. The second subnetwork is a hierarchical multi-scale feature fusion (HMFF) module that uses a parallel multi-resolution branch architecture to process the input feature map and extract multi-scale feature information from the input image. Experimental results on the CIFAR-10, Malaria, and KvasirV1 datasets demonstrate that the proposed method is efficient, reducing the network parameters and computational cost by 65.02% and 39.78%, respectively, while maintaining the network performance compared to the MobileNetV1 baseline.
Keywords: MobileNet; image classification; lightweight convolutional neural network; depthwise dilated separable convolution; hierarchical multi-scale feature fusion
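The parameter savings that this abstract attributes to depthwise separable convolution, and the wider field of view it attributes to dilation, can be illustrated with simple arithmetic. The sketch below is not the paper's DDSC layer: the function names and the layer sizes (3x3 kernel, 64 input channels, 128 output channels) are hypothetical, and bias terms are ignored.

```python
# Illustrative arithmetic only; names and sizes are assumptions, not taken from the paper.

def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Weights in a k x k depthwise convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


def effective_kernel_size(k, dilation):
    """Span covered by a k-tap kernel whose taps are spaced `dilation` apart."""
    return k + (k - 1) * (dilation - 1)


if __name__ == "__main__":
    std = standard_conv_params(3, 64, 128)
    sep = depthwise_separable_params(3, 64, 128)
    print(f"standard: {std}, separable: {sep}, ratio: {sep / std:.3f}")
    for d in (1, 2, 4):
        print(f"dilation {d}: effective kernel size {effective_kernel_size(3, d)}")
```

Dilation adds no weights: the kernel keeps the same number of taps and only their spacing changes, which is why it can enlarge the field of view at no parameter cost.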
Deep Multi-Scale and Attention-Based Architectures for Semantic Segmentation in Biomedical Imaging
2
Authors: Majid Harouni, Vishakha Goyal, Gabrielle Feldman, Sam Michael, Ty C. Voss. Computers, Materials & Continua, 2025, No. 10, pp. 331-366 (36 pages)
Semantic segmentation plays a foundational role in biomedical image analysis, providing precise information about cellular, tissue, and organ structures in both biological and medical imaging modalities. Traditional approaches often fail in the face of challenges such as low contrast, morphological variability, and densely packed structures. Recent advancements in deep learning have transformed segmentation capabilities through the integration of fine-scale detail preservation, coarse-scale contextual modeling, and multi-scale feature fusion. This work provides a comprehensive analysis of state-of-the-art deep learning models, including U-Net variants, attention-based frameworks, and Transformer-integrated networks, highlighting innovations that improve accuracy, generalizability, and computational efficiency. Key architectural components such as convolution operations, shallow and deep blocks, skip connections, and hybrid encoders are examined for their roles in enhancing spatial representation and semantic consistency. We further discuss the importance of hierarchical and instance-aware segmentation and annotation in interpreting complex biological scenes and multiplexed medical images. By bridging methodological developments with diverse application domains, this paper outlines current trends and future directions for semantic segmentation, emphasizing its critical role in facilitating annotation, diagnosis, and discovery in biomedical research.
Keywords: biomedical semantic segmentation; multi-scale feature fusion; fine- and coarse-scale features; convolution operations; shallow and deep blocks; skip connections
Chinese named entity recognition with multi-network fusion of multi-scale lexical information (Cited by: 1)
3
Authors: Yan Guo, Hong-Chen Liu, Fu-Jiang Liu, Wei-Hua Lin, Quan-Sen Shao, Jun-Shun Su. Journal of Electronic Science and Technology (EI, CAS, CSCD), 2024, No. 4, pp. 53-80 (28 pages)
Named entity recognition (NER) is an important part of knowledge extraction and one of the main tasks in constructing knowledge graphs. In today’s Chinese named entity recognition (CNER) task, the BERT-BiLSTM-CRF model is widely used and often yields notable results. However, recognizing each entity with high accuracy remains challenging. Many entities do not appear as single words but as part of complex phrases, making it difficult to achieve accurate recognition using word embedding information alone because the intricate lexical structure often impacts the performance. To address this issue, we propose an improved Bidirectional Encoder Representations from Transformers (BERT) character word conditional random field (CRF) (BCWC) model. It incorporates a pre-trained word embedding model using the skip-gram with negative sampling (SGNS) method, alongside traditional BERT embeddings. By comparing datasets with different word segmentation tools, we obtain enhanced word embedding features for segmented data. These features are then processed using multi-scale convolution and iterated dilated convolutional neural networks (IDCNNs) with varying expansion rates to capture features at multiple scales and extract diverse contextual information. Additionally, a multi-attention mechanism is employed to fuse word and character embeddings. Finally, CRFs are applied to learn sequence constraints and optimize entity label annotations. A series of experiments are conducted on three public datasets, demonstrating that the proposed method outperforms recent advanced baselines. BCWC is capable of addressing the challenge of recognizing complex entities by combining character-level and word-level embedding information, thereby improving the accuracy of CNER. Such a model has potential for applications requiring more precise knowledge extraction, such as knowledge graph construction and information retrieval, particularly in domain-specific natural language processing tasks that require high entity recognition precision.
Keywords: bi-directional long short-term memory (BiLSTM); Chinese named entity recognition (CNER); iterated dilated convolutional neural network (IDCNN); multi-network integration; multi-scale lexical features
Two Stages Segmentation Algorithm of Breast Tumor in DCE-MRI Based on Multi-Scale Feature and Boundary Attention Mechanism
4
Authors: Bing Li, Liangyu Wang, Xia Liu, Hongbin Fan, Bo Wang, Shoudi Tong. Computers, Materials & Continua (SCIE, EI), 2024, No. 7, pp. 1543-1561 (19 pages)
Nuclear magnetic resonance imaging of breasts often presents complex backgrounds. Breast tumors exhibit varying sizes, uneven intensity, and indistinct boundaries. These characteristics can lead to challenges such as low accuracy and incorrect segmentation during tumor segmentation. Thus, we propose a two-stage breast tumor segmentation method leveraging multi-scale features and boundary attention mechanisms. Initially, the breast region of interest is extracted to isolate the breast area from surrounding tissues and organs. Subsequently, we devise a fusion network incorporating multi-scale features and boundary attention mechanisms for breast tumor segmentation. We incorporate multi-scale parallel dilated convolution modules into the network, enhancing its capability to segment tumors of various sizes through multi-scale convolution and novel fusion techniques. Additionally, attention and boundary detection modules are included to augment the network’s capacity to locate tumors by capturing nonlocal dependencies in both spatial and channel domains. Furthermore, a hybrid loss function with boundary weight is employed to address sample class imbalance issues and enhance the network’s boundary maintenance capability through additional loss. The method was evaluated using breast data from 207 patients at Ruijin Hospital, resulting in a 6.64% increase in Dice similarity coefficient compared to the benchmark U-Net. Experimental results demonstrate the superiority of the method over other segmentation techniques, with fewer model parameters.
Keywords: dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI); breast tumor segmentation; multi-scale dilated convolution; boundary attention; hybrid loss function with boundary weight
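The Dice similarity coefficient reported in this abstract measures the overlap between a predicted mask and a reference mask as 2|A∩B| / (|A| + |B|). A minimal sketch follows, assuming flattened binary masks given as lists of 0/1 values; the function name, the toy masks, and the convention of returning 1.0 for two empty masks are illustrative choices, not details from the paper.

```python
# Minimal sketch of the Dice similarity coefficient on flattened binary masks.
# The empty-mask convention (return 1.0) is an assumption made for this example.

def dice_coefficient(pred, target):
    """Return 2 * |pred AND target| / (|pred| + |target|) for 0/1 sequences."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    if total == 0:
        return 1.0  # both masks empty: treated here as perfect agreement
    return 2.0 * intersection / total


if __name__ == "__main__":
    predicted = [1, 1, 0, 0]   # hypothetical prediction
    reference = [1, 0, 0, 0]   # hypothetical ground truth
    print(dice_coefficient(predicted, reference))
```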
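Deep Networks for Chest Pathology Classification
5
Authors: 张娜, 邓广宏, 荆文龙, 李勇. 《传感器与微系统》 (Transducer and Microsystem Technologies; PKU Core), 2025, No. 7, pp. 58-62 (5 pages)
Chest pathology prediction is extremely valuable for the early diagnosis of chest diseases, and predicting chest pathology with deep convolutional networks has become an active research topic. However, the unpredictability of chest pathologies in shape, size, and growth location remains a difficulty for many studies. This paper therefore proposes a chest pathology prediction model based on a deep convolutional network named the dense dilated network (DenDnet). Dilated convolution is introduced to improve the conventional dense block: many of the feature-map concatenations are removed and, to enlarge the receptive field, regular convolutions are replaced with dilated convolutions, yielding an effective dense dilated convolution block that is stacked repeatedly in the deep network for feature extraction. The dataset used is the COVID-19 Radiography Database, on which four models of different depths, DenDnetn, DenDnets, DenDnetm, and DenDnetl, are evaluated. DenDnets achieves a validation accuracy of 84.63% and a test accuracy of 84.88%, with AUC values of 0.952, 0.955, 0.962, and 0.994, respectively, outperforming the other three models.
Keywords: dense block; dilated convolution; deep learning; chest imaging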
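SD-BASNet Network for Building Extraction from High-Resolution Remote Sensing Imagery
6
Authors: 朱娟娟, 黄亮, 朱莎莎. 《自然资源遥感》 (Remote Sensing for Natural Resources; PKU Core), 2025, No. 5, pp. 122-130 (9 pages)
To address the large number of model parameters and the loss of building detail during downsampling, and inspired by lightweight networks, a building extraction network incorporating depthwise separable residual blocks and dilated convolution (SD-BASNet) is designed. First, a depthwise separable residual block is designed in the deeply supervised encoder prediction module, introducing depthwise separable convolution into the ResNet backbone to avoid overly large convolution kernels and reduce the number of network parameters. Second, to prevent the accuracy loss caused by making the network lightweight, dilated convolution is incorporated into the encoding layers of the post-processing refinement module, enlarging the receptive field of the feature maps to capture broader contextual information and improve the accuracy of building feature extraction. In experiments on the WHU building dataset, the network performs well on buildings of different scales, with a mean intersection over union and mean pixel accuracy of 92.25% and 96.59%, and recall, precision, and F1 score of 96.50%, 93.79%, and 92.61%, respectively. Compared with semantic segmentation networks such as PSPNet, SegNet, DeepLabV3, SE-UNet, and UNet++, SD-BASNet achieves markedly higher extraction accuracy and more complete extracted buildings; compared with the base network BASNet, its parameter count and running time are also reduced, confirming the effectiveness of the proposed SD-BASNet.
Keywords: building extraction; high-resolution remote sensing imagery; BASNet; depthwise separable residual block; dilated convolution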
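Short-Term Load Forecasting Model Based on a Temporal Convolutional Network with Efficient Convolutional Attention
7
Authors: 孙东磊, 李文升, 梁露, 张智晟. 《山东科技大学学报(自然科学版)》 (Journal of Shandong University of Science and Technology, Natural Science; PKU Core), 2025, No. 5, pp. 83-90 (8 pages)
To avoid the discontinuity of load information caused by the dilated convolution structure in temporal convolutional networks, and to further improve the model's ability to extract important load features, this study proposes a short-term load forecasting model based on a hybrid dilated convolution improved temporal convolutional network with an efficient convolutional block attention module (ECBAM-HTCN). The model uses a temporal convolutional network, which supports parallel computation, as the basis for learning load data features. Hybrid dilated convolution layers are constructed to improve the residual blocks of the temporal convolutional network, using convolutions with different dilation rates to adaptively capture all load data at different distances and avoid information discontinuity. In addition, a one-dimensional convolution that adaptively adjusts its kernel size is introduced to improve the conventional convolutional attention module, efficiently capturing important information in both the spatial and channel dimensions of the load data. Simulation experiments on real power grid load data show that the proposed ECBAM-HTCN model achieves high forecasting accuracy and good stability in short-term load forecasting.
Keywords: short-term load forecasting; temporal convolutional network; hybrid dilated convolution; efficient convolutional attention module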
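The information discontinuity that this abstract attributes to stacked dilated convolutions can be checked by enumerating which past time steps a stack of causal dilated layers can actually reach. The sketch below is an illustration under stated assumptions, not the ECBAM-HTCN design: the helper names, the kernel size of 3, and the two dilation schedules [2, 2, 2] and [1, 2, 5] are hypothetical choices made for the example.

```python
# Sketch: which past offsets does a stack of causal dilated 1-D convolutions see?
# Kernel size and dilation schedules are assumptions chosen for illustration.

def covered_offsets(dilations, kernel_size=3):
    """Return the set of time offsets (<= 0) reachable through the stacked layers."""
    offsets = {0}
    for d in dilations:
        taps = [-(i * d) for i in range(kernel_size)]  # causal taps: 0, -d, -2d, ...
        offsets = {o + t for o in offsets for t in taps}
    return offsets


def has_gaps(dilations, kernel_size=3):
    """True if some step between the oldest reachable offset and 0 is never seen."""
    offsets = covered_offsets(dilations, kernel_size)
    return any(step not in offsets for step in range(min(offsets), 1))


if __name__ == "__main__":
    for schedule in ([2, 2, 2], [1, 2, 5]):
        seen = sorted(covered_offsets(schedule))
        print(schedule, "gaps" if has_gaps(schedule) else "contiguous", seen)
```

With a constant dilation of 2 only even offsets are reached, so every second time step is skipped, whereas the mixed schedule reaches every step in its window. This is the general motivation for mixing dilation rates; how the paper chooses its rates is not stated in the abstract.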
Deep Learning-Based Algorithm for Robust Object Detection in Flooded and Rainy Environments
8
Authors: Pengfei Wang, Jiwu Sun, Lu Lu, Hongchen Li, Hongzhe Liu, Cheng Xu, Yongqiang Liu. Computers, Materials & Continua, 2025, No. 8, pp. 2883-2903 (21 pages)
Flooding and heavy rainfall under extreme weather conditions pose significant challenges to target detection algorithms. Traditional methods often struggle to address issues such as image blurring, dynamic noise interference, and variations in target scale. Convolutional neural network (CNN)-based target detection approaches face notable limitations in such adverse weather scenarios, primarily due to the fixed geometric sampling structures that hinder adaptability to complex backgrounds and dynamically changing object appearances. To address these challenges, this paper proposes an optimized YOLOv9 model incorporating an improved deformable convolutional network (DCN) enhanced with a multi-scale dilated attention (MSDA) mechanism. Specifically, the DCN module enhances the model’s adaptability to target deformation and noise interference by adaptively adjusting the sampling grid positions, while also integrating feature amplitude modulation to further improve robustness. Additionally, the MSDA module is introduced to capture contextual features across multiple scales, effectively addressing issues related to target occlusion and scale variation commonly encountered in flood-affected environments. Experimental evaluations are conducted on the ISE-UFDS and UA-DETRAC datasets. The results demonstrate that the proposed model significantly outperforms state-of-the-art methods in key evaluation metrics, including precision, recall, F1-score, and mAP (mean average precision). Notably, the model exhibits superior robustness and generalization performance under simulated severe weather conditions, offering reliable technical support for disaster emergency response systems. This study contributes to enhancing the accuracy and real-time capabilities of flood early warning systems, thereby supporting more effective disaster mitigation strategies.
Keywords: YOLO; vehicle detection; flood; deformable convolutional networks; multi-scale dilated attention
A multi-scale convolutional neural network for bearing compound fault diagnosis under various noise conditions (Cited by: 10)
9
Authors: JIN YanRui, QIN ChengJin, ZHANG ZhiNan, TAO JianFeng, LIU ChengLiang. Science China (Technological Sciences) (SCIE, EI, CAS, CSCD), 2022, No. 11, pp. 2551-2563 (13 pages)
Recently, with the urgent demand for data-driven approaches in practical industrial scenarios, deep learning diagnosis models for noisy environments have attracted increasing attention. However, the existing research has two limitations: (1) the complex and changeable environmental noise, which cannot ensure the high-performance diagnosis of the model in different noise domains, and (2) the possibility of multiple faults occurring simultaneously, which brings challenges to the model diagnosis. This paper presents a novel anti-noise multi-scale convolutional neural network (AM-CNN) for solving the issue of compound fault diagnosis under noise of different intensities. First, we propose a residual pre-processing block according to the principle of noise superposition to process the input information and present the residual loss to construct a new loss function. Additionally, considering the strong coupling of input information, we design a multi-scale convolution block to realize multi-scale feature extraction for enhancing the proposed model’s robustness and effectiveness. Finally, a multi-label classifier is utilized to simultaneously distinguish multiple bearing faults. The proposed AM-CNN is verified on our collected compound fault dataset. On average, compared with the existing methods, AM-CNN improves accuracy by 39.93% and F1-macro by 25.84% under the no-noise working condition, and accuracy by 45.67% and F1-macro by 27.72% under working conditions with noise of different intensities. Furthermore, the experimental results show that AM-CNN can achieve good cross-domain performance with 100% accuracy and 100% F1-macro. Thus, AM-CNN has the potential to be an accurate and stable fault diagnosis tool.
Keywords: anti-noise; residual pre-processing block; bearing compound fault; multi-label classifier; multi-scale convolution feature extraction
Hard-rock tunnel lithology identification using multiscale dilated convolutional attention network based on tunnel face images (Cited by: 1)
10
Authors: Wenjun ZHANG, Wuqi ZHANG, Gaole ZHANG, Jun HUANG, Minggeng LI, Xiaohui WANG, Fei YE, Xiaoming GUAN. Frontiers of Structural and Civil Engineering (SCIE, EI, CSCD), 2023, No. 12, pp. 1796-1812 (17 pages)
For real-time classification of rock masses in hard-rock tunnels, quick determination of the rock lithology on the tunnel face during construction is essential. Motivated by current breakthroughs in artificial intelligence technology in machine vision, a new automatic detection approach for classifying tunnel lithology based on tunnel face images was developed. The method benefits from residual learning for training a deep convolutional neural network (DCNN), and a multi-scale dilated convolutional attention block is proposed. The block with different dilation rates can provide various receptive fields, and thus it can extract multi-scale features. Moreover, the attention mechanism is utilized to select the salient features adaptively and further improve the performance of the model. In this study, an initial image data set made up of photographs of tunnel faces consisting of basalt, granite, siltstone, and tuff was first collected. After classifying and enhancing the training, validation, and testing data sets, a new image data set was generated. A comparison of the experimental findings demonstrated that the suggested approach outperforms previous classifiers in terms of various indicators, including accuracy, precision, recall, F1-score, and computing time. Finally, a visualization analysis was performed to explain the process of the network in the classification of tunnel lithology through feature extraction. Overall, this study demonstrates the potential of using artificial intelligence methods for in situ rock lithology classification utilizing geological images of the tunnel face.
Keywords: hard-rock tunnel face; intelligent lithology identification; multi-scale dilated convolutional attention network; image classification; deep learning
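Residual Feed Density Estimation Based on a Hybrid Dilated Convolution and Attention Multi-Scale Network (Cited by: 1)
11
Authors: 张丽珍, 李延天, 李志坚, 孟雄栋, 张永琪, 吴迪. 《农业工程学报》 (Transactions of the Chinese Society of Agricultural Engineering; EI, CAS, CSCD, PKU Core), 2024, No. 14, pp. 137-145 (9 pages)
Timely and accurate estimation of the amount of feed remaining in the feeding tray is an important measure for improving aquaculture profitability. To address the high model complexity and low counting accuracy of residual feed detection in shrimp farming, a residual feed density estimation method based on a hybrid dilated convolution and attention multi-scale network (HAMNet) is proposed. First, drawing on the multi-column architecture of the multi-column convolutional neural network (MCNN), a parallel convolution block (PCB) is designed so that the network extracts residual feed features at multiple scales within a single-column architecture, simplifying the network structure and reducing computation. To compensate for the slightly weaker feature representation caused by this simplification, a hybrid dilated convolution block (HDCB) is introduced to avoid information loss and enlarge the receptive field, strengthening the model's ability to mine multi-scale residual feed information. Second, a channel attention mechanism (CAM) is embedded in the network, using the interdependence between channels to recalibrate the weights of useful feature information and highlight the difference between targets and background. Finally, to address the poor density map quality caused by downsampling, learnable transposed convolution is applied to restore feature map detail and thereby improve counting performance. Validation on residual feed images collected under feeding tray conditions shows that, compared with the baseline MCNN, HAMNet reduces the mean absolute error, root mean square error, and computation by 44.4%, 40.8%, and 13.7%, respectively, with a parameter size of only 0.52 MB. Compared with the classical density estimation models CMTL (cascaded multi-task learning), SANet (scale aggregation network), and CSRNet (congested scene recognition network), the model maintains the best balance across all performance metrics and holds a clear advantage. The study can serve as a reference for using artificial intelligence to quantify residual feed rapidly in aquaculture.
Keywords: aquaculture; model; residual feed; density estimation; parallel convolution block; hybrid dilated convolution; channel attention mechanism; transposed convolution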
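Intrusion Detection Model Based on Improved CBAM and BiGRU (Cited by: 1)
12
Authors: 许东园, 曹争光, 黄春麟. 《计算机技术与发展》 (Computer Technology and Development), 2024, No. 9, pp. 88-93 (6 pages)
Existing methods suffer from imbalanced network traffic data, insufficient detection accuracy, and rising false alarm rates. This paper proposes a network intrusion detection model based on an improved convolutional block attention module (CBAM), dilated convolution, and a bidirectional gated recurrent unit (BiGRU) to address these problems. Specifically, to handle the imbalanced data distribution, the ADASYN (adaptive oversampling) algorithm is used to oversample adaptively and balance the dataset. To address insufficient detection accuracy and rising false alarm rates, three layers of dilated convolution are first introduced in the feature extraction stage to expand the receptive field and capture network traffic features comprehensively. Next, the improved CBAM is used to strengthen the ability of the dilated convolutions to extract high-level features. Finally, BiGRU is introduced to capture long-term dependencies between features more deeply, further improving model performance. Experimental results on the NSL-KDD dataset show that the method achieves higher accuracy (99.51%) and a lower false detection rate (2.90%) than other methods, indicating that the model is a feasible and effective approach to network intrusion detection.
Keywords: network intrusion detection; dilated convolution; convolutional block attention module; bidirectional gated recurrent unit; ADASYN oversampling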
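SAR Ship Target Detection Algorithm Based on Improved YOLOv8 (Cited by: 2)
13
Authors: 谷岳, 邓松峰, 沈霁, 穆文涛, 赵恩棋. 《计算机与现代化》 (Computer and Modernization), 2024, No. 12, pp. 78-83 (6 pages)
To improve the accuracy of ship target detection in SAR images, particularly when targets vary in size, are densely distributed, and appear against complex backgrounds, a YOLOv8-based improved ship detection algorithm, YOLO-3M, is proposed. First, a multiscale dilated convolution block (MSDB) is introduced into the backbone network, using several convolutions with different dilation rates to extract multi-scale features and enlarge the receptive field without increasing computational cost. Second, multidimensional collaborative attention (MCA) is introduced into the neck network to capture key features along the channel, height, and width dimensions, enabling interaction among information from different dimensions and helping the network focus on the key parts of complex backgrounds. Finally, the MPDIoU loss function is introduced in the detection head to address the inability of existing loss functions to work effectively when the predicted and ground-truth bounding boxes have the same aspect ratio but entirely different width and height values. Experimental results on the SSDD dataset show that the proposed algorithm achieves higher accuracy and average precision while effectively reducing the parameter count and computation, making the model lighter and better suited to resource-constrained environments, and markedly reducing false and missed detections for complex ship targets.
Keywords: ship detection; SAR images; YOLOv8; multiscale dilated convolution block; multidimensional collaborative attention; MPDIoU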
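Gaze Estimation Method Combining High-Frequency Information Enhancement and an Attention Mechanism
14
Authors: 印洁, 沈文忠, 邵洁. 《上海电力大学学报》 (Journal of Shanghai University of Electric Power; CAS), 2024, No. 3, pp. 279-284 (6 pages)
Accurate estimation of gaze direction is a key technology in applications such as human-computer interaction and virtual reality. Appearance-based gaze estimation is currently the mainstream approach; however, because of the diversity of eye appearance, lighting conditions, and head pose, gaze estimation in unconstrained environments remains challenging. A high-frequency information gaze estimation network (HFA-Net) is proposed. First, a high-frequency information extraction module and a convolutional block attention module (CBAM) are added to the neural network to help it reduce the influence of redundant information. Second, gaze is split into two angles that are regressed separately and optimized with independent loss functions. Finally, the network is trained and tested on the public MPIIGaze dataset. Experimental results show that the method achieves a best angular estimation error of 4.17° on MPIIGaze, surpassing current mainstream algorithms.
Keywords: gaze estimation; high-frequency information extraction; dilated convolution; convolutional block attention module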
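Multi-Class Defect Target Detection Method for Transmission Lines Based on TR-YOLOv5 (Cited by: 19)
15
Authors: 郝帅, 赵新生, 马旭, 张旭, 何田, 侯李祥. 《图学学报》 (Journal of Graphics; CSCD, PKU Core), 2023, No. 4, pp. 667-676 (10 pages)
To address multi-scale detection of multiple classes of defect targets on transmission lines in complex environments, a YOLOv5 detection algorithm based on a Transformer and a receptive field module, abbreviated TR-YOLOv5, is proposed. First, a YOLOv5 network is built. Because complex backgrounds lower the saliency of defect targets and thereby reduce detection accuracy, a Transformer module is introduced into the backbone; its multi-head attention structure captures the correlations between feature-map pixels and global information, strengthening the feature representation of defect targets and improving detection accuracy. Second, because the targets to be detected vary in scale, a receptive field module is introduced into the neck to extract features at different scales; dilated convolution enlarges the receptive field, preserving finer features for the subsequent PANet structure, strengthening feature fusion in the neck, and improving detection accuracy for multi-scale defect targets. Then, to improve bounding-box regression accuracy, the CIOU function is introduced, further improving detection accuracy. Finally, the algorithm is validated on nearly three years of data from a power inspection department. Experimental results show that, compared with seven other algorithms, the proposed algorithm offers both high detection accuracy and good real-time performance: its mean detection accuracy reaches 95.6%, and its detection speed on 1280×720 inspection images is 125 frames per second.
Keywords: YOLOv5; transmission line defect detection; dilated convolution; Transformer; receptive field module; loss function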
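Random Noise Suppression Method for Seismic Data Based on a Dense Dilated Convolution Residual Network (Cited by: 5)
16
Authors: 高磊, 沈侯森, 闵帆. 《石油物探》 (Geophysical Prospecting for Petroleum; CSCD, PKU Core), 2023, No. 4, pp. 655-668 (14 pages)
Suppressing random noise is one of the key steps for improving seismic data quality during processing; the crux is to suppress noise effectively while preserving as much of the useful signal as possible. To address the limited local feature extraction of deep learning methods in seismic data denoising, a denoising method based on a dense dilated convolution residual network (DDCRN) is proposed. DDCRN consists mainly of several dense dilated convolution feature fusion blocks (DDCFFBs). Within each DDCFFB, the dense block and the multi-scale dilated convolutions extract features in parallel, the fusion structure fuses the features, and the residual structure applies skip connections over the channels. The dense block connects different convolutional layers to learn features, focuses on the propagation and reuse of local features, and extracts complex information efficiently; the multi-scale dilated convolutions enlarge the receptive field and widen the range of feature extraction; and residual learning speeds up the convergence of network training. K-singular value decomposition (KSVD), frequency-space deconvolution (f-x decon), the denoising convolutional neural network (DnCNN), the U-shaped network (Unet), and DDCRN are each applied to denoise synthetic and field seismic data. The results show that DDCRN not only suppresses random noise more effectively but also better preserves the continuity of seismic events.
Keywords: seismic data denoising; feature fusion; convolutional neural network; dense block; dilated convolution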
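JPEG Compression Artifact Removal Method Based on a Multi-Scale Dense Residual Network (Cited by: 3)
17
Authors: 陈书贞, 张祎俊, 练秋生. 《电子与信息学报》 (Journal of Electronics & Information Technology; EI, CSCD, PKU Core), 2019, No. 10, pp. 2479-2486 (8 pages)
At high compression ratios, JPEG-decompressed images exhibit blocking artifacts, ringing artifacts at edges, and blurring, which seriously degrade visual quality. To remove JPEG compression artifacts, this paper proposes a multi-scale dense residual network. First, dilated convolution is introduced into the dense blocks of the residual network, with different dilation factors used to form multi-scale dense blocks. Then, four multi-scale dense blocks are used to design the network as a two-branch structure, in which the latter branch supplements the features not extracted by the former. Finally, residual learning is used to improve network performance. To improve generality, the network is trained jointly on data with different compression quality factors, producing a single general model for all of them. Experiments show that the method not only removes JPEG compression artifacts effectively but also generalizes well.
Keywords: JPEG compression; compression artifacts; multi-scale dense block; dilated convolution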
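Dehazing Algorithm for Non-Homogeneous Hazy Images Based on a Transfer Attention Mechanism (Cited by: 1)
18
Authors: 王科平, 段雨朦, 杨艺, 费树岷. 《模式识别与人工智能》 (Pattern Recognition and Artificial Intelligence; EI, CSCD, PKU Core), 2022, No. 7, pp. 575-588 (14 pages)
Non-homogeneous hazy images are hard to model and tend to retain residual haze after dehazing. To address this, a dehazing algorithm for non-homogeneous hazy images based on a transfer attention mechanism is proposed. To handle the non-uniform haze distribution, a transfer attention mechanism is built into the network so that the weight information in the attention feature maps flows between attention blocks and the haze noise in non-homogeneous hazy images is handled in a targeted way. To reduce the loss of detail in restored images caused by ordinary deep convolution, a sparse-structure smoothed dilated convolution is constructed for image feature extraction, retaining more detail while keeping a large receptive field. Finally, a lightweight residual block structure is connected in parallel to supplement the color and detail of the reconstructed image. Experiments show that the algorithm achieves good results on both a non-homogeneous hazy image dataset and a synthetic hazy image dataset, with clear advantages in both subjective quality and objective metrics.
Keywords: image dehazing; deep learning; sparse block; smoothed dilated convolution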
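Face Sketch Synthesis Based on a Cycle Generative Adversarial Network (Cited by: 4)
19
Authors: 葛延良, 孙笑笑, 张乔, 王冬梅, 王肖肖. 《吉林大学学报(理学版)》 (Journal of Jilin University, Science Edition; CAS, PKU Core), 2022, No. 4, pp. 897-905 (9 pages)
Current convolutional neural networks usually obtain multi-scale image features at the cost of a reduced receptive field and struggle to capture the important relationships among feature channels. To address these problems, a new cycle generative adversarial network with a multi-scale self-attention mechanism is proposed, building on the characteristics of the cycle generative adversarial network structure. First, VGG16 modules are used in the generator to form a U-Net structured network that strengthens the extraction of image feature information, and the downsampling and upsampling in the network are improved to raise feature resolution and capture more detail. Second, a multi-scale feature aggregation module is designed that uses several parallel dilated convolutions with different sampling rates to integrate spatial information at different scales, capturing image information at multiple scales while maintaining a large receptive field. Finally, to capture feature dependencies in the spatial and channel dimensions, a pixel self-attention module is designed to model semantic dependencies in both dimensions, strengthening the representational power of image features and improving the quality of the generated sketches.
Keywords: deep learning; cycle generative adversarial network; dilated convolution; multi-scale feature aggregation module; pixel self-attention module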
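Road Extraction Model Integrating an Attention Mechanism and Dilated Convolution (Cited by: 9)
20
Authors: 王勇, 曾祥强. 《中国图象图形学报》 (Journal of Image and Graphics; CSCD, PKU Core), 2022, No. 10, pp. 3102-3115 (14 pages)
Objective: To address the low degree of automation, limited extraction accuracy, and unstable model training caused by imbalanced sample counts that are common in current road extraction methods for remote sensing imagery, this paper proposes a road extraction model integrating an attention mechanism and dilated convolution (attention and dilated convolutional U-Net, A&D-UNet). Method: The A&D-UNet aggregated network model is based on the classical U-Net structure. A residual learning unit (RLU) is introduced in the encoder to reduce the complexity of training the deep convolutional neural network; a convolutional block attention module (CBAM) is applied to optimize the weight allocation along the channel and spatial dimensions and highlight road feature information; and a dilated convolutional unit (DCU) is used to perceive a larger feature region and integrate road context. The model is trained with a composite loss function combining binary cross entropy (BCE) and Dice, mitigating the model instability caused by imbalanced sample counts in remote sensing imagery. Results: Model validation experiments are carried out on the public Massachusetts (USA) and Deep Globe road datasets, with comparisons against the conventional U-Net, Link-Net, and D-LinkNet image segmentation models. On the Massachusetts road test set, the overall accuracy, F1 score, and intersection over union of A&D-UNet are 95.27%, 77.96%, and 79.89%, respectively, all better than the comparison algorithms, and the model better recognizes road regions with pronounced linear features, roads missing from the labels, and roads occluded by trees. On the Deep Globe road test set, the overall accuracy, F1 score, and intersection over union of A&D-UNet are 94.01%, 77.06%, and 78.44%, respectively, and the model extracts trunk roads with pronounced linear features, unlabeled narrow roads, and shadow-occluded urban roads well. Conclusion: The proposed A&D-UNet road extraction model combines the strengths of residual learning, attention mechanisms, and dilated convolution, effectively improves target segmentation performance, and is an aggregated network model with good extraction results that merits wider adoption.
Keywords: road information; residual learning unit (RLU); convolutional block attention module (CBAM); dilated convolutional unit (DCU); loss function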
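A composite loss that combines binary cross entropy with Dice, as this abstract describes, can be sketched as follows. The abstract does not say how the two terms are weighted or smoothed, so the unweighted sum, the `smooth` constant, the clamping epsilon, and all function names below are assumptions made for illustration, not the authors' formulation.

```python
import math

# Sketch of a BCE + Dice composite loss on flattened per-pixel probabilities.
# Weights, smoothing, and clamping are assumptions; the paper's settings are not given.

def bce_loss(probs, targets, eps=1e-7):
    """Mean binary cross entropy; probabilities are clamped to avoid log(0)."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(probs)


def soft_dice_loss(probs, targets, smooth=1.0):
    """One minus the soft Dice coefficient, with additive smoothing."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    denominator = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + smooth) / (denominator + smooth)


def bce_dice_loss(probs, targets, bce_weight=1.0, dice_weight=1.0):
    """Weighted sum of the two terms; equal weights are an assumption."""
    return (bce_weight * bce_loss(probs, targets)
            + dice_weight * soft_dice_loss(probs, targets))


if __name__ == "__main__":
    labels = [1, 0, 0, 0]                             # sparse road pixels (toy example)
    print(bce_dice_loss([0.9, 0.1, 0.1, 0.1], labels))  # mostly correct prediction
    print(bce_dice_loss([0.1, 0.1, 0.1, 0.1], labels))  # misses the road pixel
```

The Dice term depends on overlap with the foreground rather than on per-pixel averages, which is one reason such combinations are used when road pixels are a small fraction of the image.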