24 articles found
A Lightweight Convolutional Neural Network with Hierarchical Multi-Scale Feature Fusion for Image Classification (Cited by: 2)
1
Authors: Adama Dembele, Ronald Waweru Mwangi, Ananda Omutokoh Kube. Journal of Computer and Communications, 2024, No. 2, pp. 173-200 (28 pages)
Convolutional neural networks (CNNs) are widely used in image classification tasks, but their increasing model size and computation make them challenging to implement on embedded systems with constrained hardware resources. To address this issue, the MobileNetV1 network was developed, which employs depthwise convolution to reduce network complexity. MobileNetV1 employs a stride of 2 in several convolutional layers to decrease the spatial resolution of feature maps, thereby lowering computational costs. However, this stride setting can lead to a loss of spatial information, particularly affecting the detection and representation of smaller objects or finer details in images. To balance complexity against model performance, a lightweight convolutional neural network with hierarchical multi-scale feature fusion based on the MobileNetV1 network is proposed. The network consists of two main subnetworks. The first subnetwork uses a depthwise dilated separable convolution (DDSC) layer to learn imaging features with fewer parameters, which results in a lightweight and computationally inexpensive network. Furthermore, the depthwise dilated convolution in the DDSC layer effectively expands the field of view of the filters, allowing them to incorporate a larger context. The second subnetwork is a hierarchical multi-scale feature fusion (HMFF) module that uses a parallel multi-resolution branch architecture to process the input feature map and extract multi-scale feature information from the input image. Experimental results on the CIFAR-10, Malaria, and KvasirV1 datasets demonstrate that the proposed method is efficient, reducing the network parameters and computational cost by 65.02% and 39.78%, respectively, while maintaining the network performance compared to the MobileNetV1 baseline.
Keywords: MobileNet; Image Classification; Lightweight Convolutional Neural Network; Depthwise Dilated Separable Convolution; Hierarchical Multi-Scale Feature Fusion
Chinese named entity recognition with multi-network fusion of multi-scale lexical information (Cited by: 3)
2
Authors: Yan Guo, Hong-Chen Liu, Fu-Jiang Liu, Wei-Hua Lin, Quan-Sen Shao, Jun-Shun Su. Journal of Electronic Science and Technology (EI, CAS, CSCD), 2024, No. 4, pp. 53-80 (28 pages)
Named entity recognition (NER) is an important part of knowledge extraction and one of the main tasks in constructing knowledge graphs. In today's Chinese named entity recognition (CNER) task, the BERT-BiLSTM-CRF model is widely used and often yields notable results. However, recognizing each entity with high accuracy remains challenging. Many entities do not appear as single words but as part of complex phrases, making it difficult to achieve accurate recognition using word embedding information alone because the intricate lexical structure often impacts performance. To address this issue, we propose an improved Bidirectional Encoder Representations from Transformers (BERT) character word conditional random field (CRF) (BCWC) model. It incorporates a pre-trained word embedding model using the skip-gram with negative sampling (SGNS) method, alongside traditional BERT embeddings. By comparing datasets with different word segmentation tools, we obtain enhanced word embedding features for segmented data. These features are then processed using multi-scale convolution and iterated dilated convolutional neural networks (IDCNNs) with varying expansion rates to capture features at multiple scales and extract diverse contextual information. Additionally, a multi-attention mechanism is employed to fuse word and character embeddings. Finally, CRFs are applied to learn sequence constraints and optimize entity label annotations. A series of experiments conducted on three public datasets demonstrates that the proposed method outperforms recent advanced baselines. BCWC can address the challenge of recognizing complex entities by combining character-level and word-level embedding information, thereby improving the accuracy of CNER. Such a model has potential for applications requiring more precise knowledge extraction, such as knowledge graph construction and information retrieval, particularly in domain-specific natural language processing tasks that require high entity recognition precision.
Keywords: Bi-directional long short-term memory (BiLSTM); Chinese named entity recognition (CNER); Iterated dilated convolutional neural network (IDCNN); Multi-network integration; Multi-scale lexical features
Two Stages Segmentation Algorithm of Breast Tumor in DCE-MRI Based on Multi-Scale Feature and Boundary Attention Mechanism
3
Authors: Bing Li, Liangyu Wang, Xia Liu, Hongbin Fan, Bo Wang, Shoudi Tong. Computers, Materials & Continua (SCIE, EI), 2024, No. 7, pp. 1543-1561 (19 pages)
Nuclear magnetic resonance imaging of breasts often presents complex backgrounds. Breast tumors exhibit varying sizes, uneven intensity, and indistinct boundaries. These characteristics can lead to challenges such as low accuracy and incorrect segmentation during tumor segmentation. Thus, we propose a two-stage breast tumor segmentation method leveraging multi-scale features and boundary attention mechanisms. Initially, the breast region of interest is extracted to isolate the breast area from surrounding tissues and organs. Subsequently, we devise a fusion network incorporating multi-scale features and boundary attention mechanisms for breast tumor segmentation. We incorporate multi-scale parallel dilated convolution modules into the network, enhancing its capability to segment tumors of various sizes through multi-scale convolution and novel fusion techniques. Additionally, attention and boundary detection modules are included to augment the network's capacity to locate tumors by capturing nonlocal dependencies in both spatial and channel domains. Furthermore, a hybrid loss function with boundary weight is employed to address sample class imbalance issues and enhance the network's boundary maintenance capability through additional loss. The method was evaluated using breast data from 207 patients at Ruijin Hospital, resulting in a 6.64% increase in Dice similarity coefficient compared to the benchmark U-Net. Experimental results demonstrate the superiority of the method over other segmentation techniques, with fewer model parameters.
Keywords: Dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI); breast tumor segmentation; multi-scale dilated convolution; boundary attention; the hybrid loss function with boundary weight
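Adaptive Graph Attention Network for Micro-Expression Detection Oriented to Facial Action Units
4
Authors: 马飞, 安佳祺, 杨飞霞, 徐光宪. 《计算机科学与探索》 (Journal of Frontiers of Computer Science and Technology; PKU Core), 2026, No. 4, pp. 1193-1206 (14 pages)
Micro-expression detection aims to locate expression intervals of weak amplitude and short duration in videos. The difficulty lies in effectively extracting dynamic correlation features between facial regions and multi-scale temporal features, so as to accurately capture the correlations among subtle movements of different facial regions. To address these problems, a micro-expression detection network (AG-DDNet) that fuses adaptive graph attention and multi-scale variable dilated convolution is proposed. A learnable parameter matrix is introduced to transform key-value pair features; a dynamic adjacency matrix is obtained by computing the similarity between feature vectors of facial regions, and a graph attention mechanism is used to compute weight coefficients between regions, achieving dynamic feature fusion. A multi-scale variable dilated convolution module is adopted, in which a predictor combining adaptive pooling and convolution generates dynamic receptive fields, enabling multi-scale feature extraction. A natural gradient optimization mechanism based on the Fisher information matrix is introduced; the Fisher Adam optimizer captures the geometric structure of the parameter space and adjusts the learning rate adaptively and precisely, which significantly enhances the model's ability to jointly detect micro-expressions and macro-expressions. In the micro-expression detection task, compared with representative algorithms of the same kind, the algorithm improves performance by 54.20% and 20.11% on the CAS(ME)2 dataset and the SAMM Long Videos dataset, respectively. Compared with the latest algorithms, the improvements on the two datasets are 38.43% and 6.81%, respectively, demonstrating the method's superior performance on long-video micro-expression detection.
Keywords: micro-expression detection; adaptive graph attention; multi-scale variable dilated convolution; facial action units; long-video analysis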
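Pavement Crack Detection Based on Deformable Convolution and Attention Mechanism
5
Authors: 谢永华, 方育才, 彭银佳. 《计算机工程与设计》 (Computer Engineering and Design; PKU Core), 2026, No. 1, pp. 279-285 (7 pages)
To address the difficulty of learning image edge features and the interference of background noise in pavement crack detection, an end-to-end trainable pavement crack detection network based on deformable convolution and an attention mechanism is proposed. The network is designed on the U-Net structure. An edge-aware module is added in the feature fusion part to strengthen the detection of crack edges; a dilated residual module is used in the encoder to enlarge the receptive field and retain more detail; and an attention mechanism is added in the decoder to increase the focus on crack features and suppress background noise. Experimental results show that the network outperforms the other compared networks on three metrics (MPA, mIoU, and F1 score), verifying its effectiveness.
Keywords: crack detection; semantic segmentation; encoder-decoder; deformable convolution; dilated convolution; residual connection; attention mechanism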
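Mining Steel Cable Damage Detection Network Based on Multi-Scale Feature Fusion
6
Authors: 徐永恒, 裴晓芳. 《计算机工程与设计》 (Computer Engineering and Design; PKU Core), 2026, No. 2, pp. 576-583 (8 pages)
The safety of mining steel cables directly affects the protection of workers' lives and the operation of equipment, and their damage detection faces the challenges of variable scale and irregular shape. To address this, a lightweight network that fuses deformable and multi-scale features is proposed. A DRConv convolution module is designed to improve the network's accuracy in complex environments. Based on the MSDA attention mechanism, a dilated spatial pyramid, SPDA, is proposed to improve the original SPPF module and enhance upsampling. Based on the ideas of DCN and D-LKA, an SLNK module is designed and combined with the CCFM network from the RT-DETR decoder to form a new neck network, RTSLNK, that fuses multi-scale and deformable convolution, making the model lighter while improving accuracy. Experimental results show that, compared with the original YOLOv8n model, average precision increases by 5.2% and the number of parameters decreases by 10.5%, and the model performs well on mining steel cable damage detection.
Keywords: mining steel wire rope; surface damage detection; deformable convolution; dilated spatial pyramid; deformable large kernel attention; cross-scale feature fusion; lightweight network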
Attention-enhanced multi-time scale LSTM for soft sensor modeling of corn starch liquefaction
7
Authors: Yu Zhuang, Zhongyi Zhang, Jin Tao, Yi Li, Fan Li, Yu Wang, Lei Zhang, Jian Du. Chinese Journal of Chemical Engineering, 2026, No. 1, pp. 132-144 (13 pages)
Data-driven deep learning modeling has been increasingly applied to quality prediction in complex chemical processes. However, the data show complex temporal features due to different residence times and strong coupling relationships among chemical entities. This study proposes a multi-scale temporal feature extraction module to extract local dynamic temporal features across different time scales and combines it with long short-term memory (LSTM) networks to capture global temporal patterns, thereby taking full advantage of available data. In addition, variable-wise channel attention is integrated into the model to enhance attention on the essential parts of the feature maps and improve predictive performance. Furthermore, by analyzing the attention weights, the model quickly identifies the key variables that significantly affect the predictions. Finally, the model is applied to a real corn starch liquefaction process and achieves an accurate product quality prediction with an R² value of 0.9392, which represents a 4% to 9% improvement over traditional models and demonstrates the superiority of the proposed approach.
Keywords: Multi-scale dilated causal convolution; Neural networks; Soft sensor; Systems engineering; Attention mechanism; Biochemical engineering
Hard-rock tunnel lithology identification using multiscale dilated convolutional attention network based on tunnel face images (Cited by: 1)
8
Authors: Wenjun ZHANG, Wuqi ZHANG, Gaole ZHANG, Jun HUANG, Minggeng LI, Xiaohui WANG, Fei YE, Xiaoming GUAN. Frontiers of Structural and Civil Engineering (SCIE, EI, CSCD), 2023, No. 12, pp. 1796-1812 (17 pages)
For real-time classification of rock masses in hard-rock tunnels, quick determination of the rock lithology on the tunnel face during construction is essential. Motivated by current breakthroughs in artificial intelligence technology in machine vision, a new automatic detection approach for classifying tunnel lithology based on tunnel face images was developed. The method benefits from residual learning for training a deep convolutional neural network (DCNN), and a multi-scale dilated convolutional attention block is proposed. The block with different dilation rates can provide various receptive fields, and thus it can extract multi-scale features. Moreover, the attention mechanism is utilized to select the salient features adaptively and further improve the performance of the model. In this study, an initial image data set made up of photographs of tunnel faces consisting of basalt, granite, siltstone, and tuff was first collected. After classifying and enhancing the training, validation, and testing data sets, a new image data set was generated. A comparison of the experimental findings demonstrated that the suggested approach outperforms previous classifiers in terms of various indicators, including accuracy, precision, recall, F1-score, and computing time. Finally, a visualization analysis was performed to explain the process of the network in the classification of tunnel lithology through feature extraction. Overall, this study demonstrates the potential of using artificial intelligence methods for in situ rock lithology classification utilizing geological images of the tunnel face.
Keywords: hard-rock tunnel face; intelligent lithology identification; multi-scale dilated convolutional attention network; image classification; deep learning
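Photovoltaic Cell Defect Detection Based on Improved YOLOv8
9
Authors: 刘闯闯, 袁金丽, 郑美曼, 吴晨曦, 郭志涛. 《电子测量技术》 (Electronic Measurement Technology; PKU Core), 2025, No. 21, pp. 139-147 (9 pages)
To address the low recognition accuracy for small-target defects and the insufficient ability to capture features at different scales that traditional vision methods face in solar cell inspection, this paper proposes an improved YOLOv8 algorithm based on cross-scale feature enhancement and dynamic parameter optimization. First, with a multi-branch residual structure at its core, a dilated reparameterized residual module is designed by combining reparameterization with adjustable dilated convolution; it strengthens context awareness of target defects through cross-level feature interaction and improves detection accuracy. Second, deformable convolution is embedded in the C2f module and combined with an auxiliary detection module to build a dynamic feature adaptation network, improving the extraction of geometric features of tiny defects. Finally, a loss function with a dynamic focusing mechanism is introduced to optimize bounding-box matching and improve regression accuracy. Experimental results show that the improved model achieves an average precision of 91.8% with 3.03 M parameters, 4% higher than the baseline model, improving detection performance while keeping the parameter count low.
Keywords: photovoltaic defects; YOLOv8; dilated residual; reparameterization; deformable convolution; auxiliary detection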
MFF-YOLO: An Improved YOLO Algorithm Based on Multi-Scale Semantic Feature Fusion (Cited by: 1)
10
Authors: Junsan Zhang, Chenyang Xu, Shigen Shen, Jie Zhu, Peiying Zhang. Tsinghua Science and Technology, 2025, No. 5, pp. 2097-2113 (17 pages)
The YOLOv5 algorithm is widely used in edge computing systems for object detection. However, the limited computing resources of embedded devices and the large model size of existing deep learning based methods increase the difficulty of real-time object detection on edge devices. To address this issue, we propose a smaller, less computationally intensive, and more accurate algorithm for object detection. Multi-scale Feature Fusion-YOLO (MFF-YOLO) is built on top of the YOLOv5s framework, but it contains substantial improvements to YOLOv5s. First, we design the MFF module to improve the feature propagation path in the feature pyramid, which further integrates the semantic information from different paths of feature layers. Then, a large convolution-kernel module is used in the bottleneck. The structure enlarges the receptive field and preserves shallow semantic information, which overcomes the performance limitation arising from uneven propagation in Feature Pyramid Networks (FPN). In addition, a multi-branch downsampling method based on depthwise separable convolutions and a bottleneck structure with deformable convolutions are designed to reduce the complexity of the backbone network and minimize the real-time performance loss caused by the increased model complexity. The experimental results on the PASCAL VOC and MS COCO datasets show that, compared with YOLOv5s, MFF-YOLO reduces the number of parameters by 7% and the number of floating-point operations (FLOPs) by 11.8%. The mAP@0.5 improves by 3.7% and 5.5%, and the mAP@0.5:0.95 improves by 6.5% and 6.2%, respectively. Furthermore, compared with YOLOv7-tiny, PP-YOLO-tiny, and other mainstream methods, MFF-YOLO achieves better results on multiple indicators.
Keywords: object detection; YOLOv5; Feature Pyramid Networks (FPN); feature fusion; Deformable Convolutional Networks (DCN); Multi-scale Feature Fusion (MFF)
Deep Learning-Based Algorithm for Robust Object Detection in Flooded and Rainy Environments
11
Authors: Pengfei Wang, Jiwu Sun, Lu Lu, Hongchen Li, Hongzhe Liu, Cheng Xu, Yongqiang Liu. Computers, Materials & Continua, 2025, No. 8, pp. 2883-2903 (21 pages)
Flooding and heavy rainfall under extreme weather conditions pose significant challenges to target detection algorithms. Traditional methods often struggle to address issues such as image blurring, dynamic noise interference, and variations in target scale. Convolutional neural network (CNN)-based target detection approaches face notable limitations in such adverse weather scenarios, primarily due to the fixed geometric sampling structures that hinder adaptability to complex backgrounds and dynamically changing object appearances. To address these challenges, this paper proposes an optimized YOLOv9 model incorporating an improved deformable convolutional network (DCN) enhanced with a multi-scale dilated attention (MSDA) mechanism. Specifically, the DCN module enhances the model's adaptability to target deformation and noise interference by adaptively adjusting the sampling grid positions, while also integrating feature amplitude modulation to further improve robustness. Additionally, the MSDA module is introduced to capture contextual features across multiple scales, effectively addressing issues related to target occlusion and scale variation commonly encountered in flood-affected environments. Experimental evaluations are conducted on the ISE-UFDS and UA-DETRAC datasets. The results demonstrate that the proposed model significantly outperforms state-of-the-art methods in key evaluation metrics, including precision, recall, F1-score, and mAP (Mean Average Precision). Notably, the model exhibits superior robustness and generalization performance under simulated severe weather conditions, offering reliable technical support for disaster emergency response systems. This study contributes to enhancing the accuracy and real-time capabilities of flood early warning systems, thereby supporting more effective disaster mitigation strategies.
Keywords: YOLO; vehicle detection; flood; deformable convolutional networks; multi-scale dilated attention
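Ship Detection in Remote Sensing Images Based on Feature Reuse and Dilated Convolution (Cited by: 3)
12
Authors: 曲海成, 李瑞柯, 王蒙, 单以盟. 《智能系统学报》 (CAAI Transactions on Intelligent Systems; CSCD, PKU Core), 2024, No. 5, pp. 1298-1308 (11 pages)
In optical remote sensing images, ship targets in ports are usually located within dense clusters of vessels and are disturbed and occluded by surrounding objects such as containers and vehicles. To further improve the accuracy and generalization of existing ship detection algorithms, a ship detection algorithm for remote sensing images based on feature reuse and dilated convolution is proposed. First, a residual block based on grouped convolution and split attention is built to extract features, with deformable convolution embedded to extract features that better match the scale variation of ships. Next, a multi-scale receptive field module is constructed, which extracts multi-scale features in parallel and then fuses them to reduce information loss. Finally, a bottom-up feature reuse aggregation path is built on top of the original feature pyramid to improve feature representation. Experiments on the large-scale remote sensing dataset DOTA and the ship dataset HRSC2016 show that the proposed method effectively alleviates missed and false detections of ship targets and improves ship detection accuracy in remote sensing images.
Keywords: remote sensing images; ship detection; feature reuse; dilated convolution; split attention; grouped convolution; feature pyramid; deformable convolution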
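Weld Defect Detection with an Improved Mask RCNN (Cited by: 15)
13
Authors: 杨彬, 亚森江·木沙, 安波. 《机械设计与制造》 (Machinery Design & Manufacture; PKU Core), 2023, No. 6, pp. 157-161 (5 pages)
Weld defect detection is an important task in the welding industry, and defect detection using X-ray weld images is an important means of non-destructive weld testing. To identify and locate defects automatically, an improved Mask RCNN instance segmentation network, tailored to the specific characteristics of the defects, is proposed to detect and segment defects in images. Building on the original network, the method uses deformable convolution to better extract feature information from irregularly shaped defects, introduces dilated convolution to enlarge the receptive field of high-level features, fuses global image information into local images so that they gain contextual information, and uses transfer learning and data augmentation to reduce the demand for training data and improve detection and segmentation accuracy. Finally, experiments on a weld X-ray dataset objectively compare the improved Mask RCNN model with the original Mask RCNN model, Faster RCNN, and other models, and the feasibility of the results is analyzed; the proposed model shows higher detection accuracy and better performance. The results indicate that the improved Mask RCNN model is better suited to weld defect detection.
Keywords: Mask RCNN; deformable convolution; dilated convolution; transfer learning; data augmentation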
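Prohibited Item Detection in X-ray Security Inspection Based on ScoreCAM (Cited by: 6)
14
Authors: 赵晴, 张海刚, 汤圣涛, 毛亮, 孙红星, 杨金锋. 《计算机工程与设计》 (Computer Engineering and Design; PKU Core), 2022, No. 12, pp. 3483-3492 (10 pages)
To address the problems that prohibited item detection in X-ray security inspection depends on data annotation, that items appear in many poses, and that small targets are hard to detect, the ResNet50 network is improved and a prohibited item detection model for X-ray security inspection based on a weakly supervised mechanism is proposed. Using the ScoreCAM algorithm, visualization and prohibited item localization are achieved with image-level class labels only. Combining deformable convolution and dilated convolution, four network structures are designed that add a deformable dilated convolution module at different positions of the backbone, so that the shape of the convolution kernel fits the contours of prohibited items more closely and the extraction of their features improves. Simulation results show that the model is more accurate and copes effectively with multi-pose variation, occlusion, and missed detection of small targets.
Keywords: X-ray security inspection; weak supervision; prohibited item detection; small targets; dilated convolution; deformable convolution; occlusion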
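Miner Behavior Detection Algorithm Based on S3DD-YOLOv8n (Cited by: 4)
15
Authors: 李海川, 贺星亮, 贾仟国, 李利. 《矿业安全与环保》 (Mining Safety & Environmental Protection; CAS, PKU Core), 2024, No. 5, pp. 96-104 (9 pages)
To guard against potential hazards and ensure safe coal mine production, detecting the behavior of underground workers has become an important way to raise the level of mine safety management. Because commonly used intelligent detection methods generally have low accuracy, a miner behavior detection algorithm based on S3DD-YOLOv8n is proposed. To extract temporal information from video data and preserve continuity, 3D dilated convolution is introduced into the YOLOv8n backbone and the data augmentation algorithm is improved; the squeeze-and-excitation (SE) attention mechanism is introduced to increase the network's focus on key information; and deformable convolution is introduced to improve the model's fit to miner behavior. Experiments on the DsLMF+ dataset show that the algorithm's mean average precision (mAP50) reaches 97.0%, 4.0% higher than YOLOv8n, while precision P and recall R increase by 12.9% and 7.0% to 92.5% and 90.4%, respectively. The algorithm detects miner behavior efficiently and accurately.
Keywords: miner behavior detection; YOLOv8n; 3D dilated convolution; SE attention mechanism; deformable convolution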
DT-Net: Joint Dual-Input Transformer and CNN for Retinal Vessel Segmentation (Cited by: 1)
16
Authors: Wenran Jia, Simin Ma, Peng Geng, Yan Sun. Computers, Materials & Continua (SCIE, EI), 2023, No. 9, pp. 3393-3411 (19 pages)
Retinal vessel segmentation in fundus images plays an essential role in the screening, diagnosis, and treatment of many diseases. The acquired fundus images generally have the following problems: uneven illumination, high noise, and complex structure. This makes vessel segmentation very challenging. Previous methods of retinal vascular segmentation mainly use convolutional neural networks on U Network (U-Net) models, and they have many limitations and shortcomings, such as the loss of microvascular details at the end of the vessels. We address the limitations of convolution by introducing the transformer into retinal vessel segmentation. Therefore, we propose a hybrid method for retinal vessel segmentation based on modulated deformable convolution and the transformer, named DT-Net. Firstly, multi-scale image features are extracted by deformable convolution and multi-head self-attention (MHSA). Secondly, image information is recovered, and vessel morphology is refined by the proposed transformer decoder block. Finally, the local prediction results are obtained by the side output layer. The accuracy of the vessel segmentation is improved by the hybrid loss function. Experimental results show that our method obtains good segmentation performance on Specificity (SP), Sensitivity (SE), Accuracy (ACC), Area Under the Curve (AUC), and F1-score on three publicly available fundus datasets: DRIVE, STARE, and CHASE_DB1.
Keywords: Retinal vessel segmentation; deformable convolution; multi-scale; transformer; hybrid loss function
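Deep Encoder-Decoder Network for Human Sperm Head Segmentation Based on Residual Hybrid Dilated Convolution
17
Authors: 吕琪贤, 范朝刚, 詹曙. 《计算机工程与科学》 (Computer Engineering & Science; CSCD, PKU Core), 2021, No. 4, pp. 721-728 (8 pages)
Sperm head shape is an important indicator in sperm morphology analysis and is very important for diagnosing male infertility, so segmenting the sperm head accurately and efficiently is essential. To this end, a new encoder-decoder segmentation network is built by combining dilated convolution with a stacked residual structure on the basis of the residual network. A dataset for segmenting human sperm heads, containing 1207 images, is established and used to train and test the network. The proposed network obtains good segmentation results on unstained original images containing multiple sperm, reaching a Dice coefficient of 96.06% on the validation set. Experimental results show that the stacked residual module and the residual hybrid dilated convolution module markedly improve segmentation. In addition, the network processes images that show the true morphology of sperm, and its accurate segmentation results are helpful for clinical diagnosis.
Keywords: human sperm head segmentation; sperm malformation; deep learning; residual structure; hybrid dilated convolution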
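Slender Object Detection Algorithm Based on Improved FCOS (Cited by: 2)
18
Authors: 王梅, 胡晓杰. 《沈阳理工大学学报》 (Journal of Shenyang Ligong University; CAS), 2022, No. 4, pp. 8-13, 19 (7 pages)
Object detection, one of the important branches of computer vision, is widely applied, but the detection of slender objects has seen little research and has low recognition accuracy. Compared with anchor-based detection algorithms, anchor-free methods are more flexible in localizing objects of arbitrary geometry and adapt better to the shape of slender objects; among them, the fully convolutional one-stage detection algorithm (FCOS) can better localize slender targets through its centerness-based mechanism for suppressing predicted boxes. On this basis, a slender object detection algorithm based on improved FCOS is proposed. The convolution operations in the FCOS backbone are replaced with deformable convolution, and an enhanced feature pyramid network feature fusion module (EFPN) is designed; EFPN makes full use of a channel attention mechanism and dilated convolution to reduce the loss of semantic information while fusing features effectively. To better localize slender targets, a centerness term that incorporates slenderness is used to suppress low-quality detection boxes. Experimental results show that the improved algorithm raises average precision by 3.3% compared with FCOS and by 6.9% compared with Faster R-CNN, verifying its effectiveness.
Keywords: slender object detection; deformable convolution; attention mechanism; dilated convolution; centerness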
Pre-training transformer with dual-branch context content module for table detection in document images
19
Authors: Yongzhi LI, Pengle ZHANG, Meng SUN, Jin HUANG, Ruhan HE. Virtual Reality & Intelligent Hardware (EI), 2024, No. 5, pp. 408-420 (13 pages)
Background: Document images such as statistical reports and scientific journals are widely used in information technology. Accurate detection of table areas in document images is an essential prerequisite for tasks such as information extraction. However, because of the diversity in the shapes and sizes of tables, existing table detection methods adapted from general object detection algorithms have not yet achieved satisfactory results. Incorrect detection results might lead to the loss of critical information. Methods: Therefore, we propose a novel end-to-end trainable deep network combined with a self-supervised pretraining transformer for feature extraction to minimize incorrect detections. To better deal with table areas of different shapes and sizes, we added a dual-branch context content attention module (DCCAM) to high-dimensional features to extract context content information, thereby enhancing the network's ability to learn shape features. For feature fusion at different scales, we replaced the original 3×3 convolution with a multilayer residual module, which contains enhanced gradient flow information to improve the feature representation and extraction capability. Results: We evaluated our method on public document datasets and compared it with previous methods; it achieved state-of-the-art results in terms of evaluation metrics such as recall and F1-score. https://github.com/Yong Z-Lee/TD-DCCAM.
Keywords: Table detection; Document image analysis; Transformer; Dilated convolution; Deformable convolution; Feature fusion
RealFuVSR: Feature enhanced real-world video super-resolution
20
Authors: Zhi LI, Xiongwen PANG, Yiyue JIANG, Yujie WANG. Virtual Reality & Intelligent Hardware (EI), 2023, No. 6, pp. 523-537 (15 pages)
Background: Recurrent recovery is a common method for video super-resolution (VSR) that models the correlation between frames via hidden states. However, the application of this structure in real-world scenarios can lead to unsatisfactory artifacts. We found that in real-world VSR training, the use of unknown and complex degradation can better simulate the degradation process in the real world. Methods: Based on this, we propose the RealFuVSR model, which simulates real-world degradation and mitigates artifacts caused by the VSR. Specifically, we propose a multiscale feature extraction (MSF) module that extracts and fuses features from multiple scales, thereby facilitating the elimination of hidden state artifacts. To improve the accuracy of the hidden state alignment information, RealFuVSR uses an advanced optical flow-guided deformable convolution. Moreover, a cascaded residual upsampling module was used to eliminate noise caused by the upsampling process. Results: Experiments demonstrate that the RealFuVSR model not only recovers high-quality videos but also outperforms the state-of-the-art RealBasicVSR and RealESRGAN models.
Keywords: Video super-resolution; Deformable convolution; Cascade residual upsampling; Second-order degradation; Multi-scale feature extraction