569 articles found
A multivariate grey incidence model for different scale data based on spatial pyramid pooling (cited by: 7)
1
Authors: ZHANG Ke, CUI Le, YIN Yao 《Journal of Systems Engineering and Electronics》 SCIE EI CSCD 2020, No. 4, pp. 770-779 (10 pages)
To solve the problem that existing multivariate grey incidence models cannot be applied to time series on different scales, a new model is proposed based on spatial pyramid pooling. Firstly, local features of multivariate time series on different scales are pooled and aggregated by spatial pyramid pooling to construct n-level feature pooling matrices on the same scale. Secondly, Deng's multivariate grey incidence model is introduced to measure the degree of incidence between feature pooling matrices at each level. Thirdly, the grey incidence degrees at each level are integrated into a global incidence degree. Finally, the performance of the proposed model is verified on two data sets against a variety of algorithms. The results illustrate that the proposed model is more effective and efficient than other similarity measure algorithms.
Keywords: grey system; spatial pyramid pooling; grey incidence; multivariate time series
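The three steps in this abstract (pool to a common scale, measure incidence per level, integrate across levels) can be illustrated with a minimal sketch. It is not the authors' model: it assumes univariate series, mean pooling, pyramid levels of 2, 4, and 8 bins, a distinguishing coefficient of 0.5, no prior normalization, and a plain average across levels. The function names are hypothetical.

```python
import math

def pyramid_pool(series, levels=(2, 4, 8)):
    """Pool a series of any length (>= max(levels)) into one fixed-length list per level."""
    n = len(series)
    pooled = []
    for bins in levels:
        level = []
        for b in range(bins):
            # Adaptive-pooling style bin edges: every bin is non-empty when n >= bins.
            start = (b * n) // bins
            end = math.ceil((b + 1) * n / bins)
            window = series[start:end]
            level.append(sum(window) / len(window))
        pooled.append(level)
    return pooled

def deng_incidence(x, y, rho=0.5):
    """Deng's grey incidence degree between two equal-length sequences."""
    deltas = [abs(a - b) for a, b in zip(x, y)]
    d_min, d_max = min(deltas), max(deltas)
    if d_max == 0:
        return 1.0  # identical sequences
    coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in deltas]
    return sum(coeffs) / len(coeffs)

def global_incidence(a, b, levels=(2, 4, 8)):
    """Average the per-level incidence degrees (the averaging rule is an assumption)."""
    pooled_a, pooled_b = pyramid_pool(a, levels), pyramid_pool(b, levels)
    degrees = [deng_incidence(u, v) for u, v in zip(pooled_a, pooled_b)]
    return sum(degrees) / len(degrees)

short = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
longer = [math.sqrt(i + 1) for i in range(13)]
print(global_incidence(short, longer))
```

Because pooling fixes the length at each level, the two series can be compared even though one has 8 samples and the other 13.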
Automatic Segmentation Method for Cone-Beam Computed Tomography Image of the Bone Graft Region within Maxillary Sinus Based on the Atrous Spatial Pyramid Convolution Network (cited by: 1)
2
Authors: XU Jiangchang, HE Shamin, YU Dedong, WU Yiqun, CHEN Xiaojun 《Journal of Shanghai Jiaotong University (Science)》 EI 2021, No. 3, pp. 298-305 (8 pages)
Sinus floor elevation with a lateral window approach requires bone graft (BG) to ensure sufficient bone mass, and it is necessary to measure and analyse the BG region for follow-up of postoperative patients. However, the BG region in cone-beam computed tomography (CBCT) images is connected to the margin of the maxillary sinus, and its boundary is blurred. Common segmentation methods are usually performed manually by experienced doctors and suffer from low efficiency and low precision. In this study, an auto-segmentation approach was applied to the BG region within the maxillary sinus based on an atrous spatial pyramid convolution (ASPC) network. The ASPC module uses residual connections to compose multiple atrous convolutions, which can extract more features on multiple scales. Subsequently, a segmentation network for the BG region with multiple ASPC modules was established, which effectively improved segmentation performance. Although the training data were insufficient, our networks still achieved good auto-segmentation results, with a Dice coefficient (Dice) of 87.13%, an Intersection over Union (IoU) of 78.01%, and a sensitivity of 95.02%. Compared with other methods, our method achieved a better segmentation effect and effectively reduced segmentation misjudgement. Our method can thus be used to implement automatic segmentation of the BG region and improve doctors' work efficiency, which is of great importance for developing preliminary studies on the measurement of postoperative BG within the maxillary sinus.
Keywords: atrous spatial pyramid convolution (ASPC); bone graft (BG) region; medical image segmentation; residual connection
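The three metrics reported in this abstract have standard definitions on binary masks. The sketch below shows how they are typically computed from true-positive, false-positive, and false-negative pixel counts. The mask values are made up for illustration, and the convention of returning 1.0 when both masks are empty is an assumption.

```python
def segmentation_scores(pred, truth):
    """Return (dice, iou, sensitivity) for two binary masks given as nested lists of 0/1."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1      # predicted foreground, truly foreground
            elif p:
                fp += 1      # predicted foreground, truly background
            elif t:
                fn += 1      # missed foreground pixel
    if tp + fp + fn == 0:
        return 1.0, 1.0, 1.0  # assumed convention: two empty masks agree perfectly
    dice = 2 * tp / (2 * tp + fp + fn)
    iou = tp / (tp + fp + fn)
    sensitivity = tp / (tp + fn) if tp + fn else 1.0
    return dice, iou, sensitivity

# Toy masks, not data from the paper.
pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]
print(segmentation_scores(pred, truth))
```

For a single pair of masks the two overlap scores are linked by Dice = 2·IoU / (1 + IoU). Scores averaged over many images need not satisfy that identity exactly.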
Local-Tetra-Patterns for Face Recognition Encoded on Spatial Pyramid Matching
3
Authors: Khuram Nawaz Khayam, Zahid Mehmood, Hassan Nazeer Chaudhry, Muhammad Usman Ashraf, Usman Tariq, Mohammed Nawaf Altouri, Khalid Alsubhi 《Computers, Materials & Continua》 SCIE EI 2022, No. 3, pp. 5039-5058 (20 pages)
Face recognition is a challenging research problem because of misalignment, illumination changes, pose variations, occlusion, and expressions. Providing a single solution to all these problems at once is difficult. We address these issues by introducing a face recognition model based on local tetra patterns and spatial pyramid matching. The input image is passed through an algorithm that extracts local features using spatial pyramid matching and max-pooling. Finally, the input image is recognized from the extracted features using a robust kernel representation method. Qualitative and quantitative analysis of the proposed method is carried out on benchmark image datasets. Experimental results show that the proposed method performs better in terms of standard performance evaluation parameters than state-of-the-art methods on the AR, ORL, LFW, and FERET face recognition datasets.
Keywords: face recognition; local tetra patterns; spatial pyramid matching; robust kernel representation; max-pooling
EYE-YOLO: a multi-spatial pyramid pooling and Focal-EIOU loss inspired tiny YOLOv7 for fundus eye disease detection
4
Authors: Akhil Kumar, R. Dhanalakshmi 《International Journal of Intelligent Computing and Cybernetics》 2024, No. 3, pp. 503-522 (20 pages)
Purpose: The purpose of this work is to present an approach for autonomous detection of eye disease in fundus images. Furthermore, this work presents an improved variant of the Tiny YOLOv7 model developed specifically for eye disease detection. The proposed model is a useful tool for developing applications for autonomous detection of eye diseases in fundus images that can assist ophthalmologists. Design/methodology/approach: The approach is twofold. Firstly, a richly annotated dataset consisting of four eye disease classes, namely cataract, glaucoma, retinal disease and normal eye, was created. Secondly, an improved variant of the Tiny YOLOv7 model was developed and proposed as EYE-YOLO. EYE-YOLO integrates multi-spatial pyramid pooling in the feature extraction network and Focal-EIOU loss in the detection network of the Tiny YOLOv7 model. Moreover, at run time, the mosaic augmentation strategy is used with the proposed model to achieve benchmark results. Evaluations were carried out for the performance metrics precision, recall, F1 score, average precision (AP) and mean average precision (mAP). Findings: The proposed EYE-YOLO achieved 28% higher precision, 18% higher recall, 24% higher F1 score and 30.81% higher mAP than the Tiny YOLOv7 model. In terms of AP for each class of the dataset, it achieved 9.74% higher AP for cataract, 27.73% higher AP for glaucoma, 72.50% higher AP for retina disease and 13.26% higher AP for normal eye. In comparison to the state-of-the-art Tiny YOLOv5, Tiny YOLOv6 and Tiny YOLOv8 models, the proposed EYE-YOLO achieved 6:23.32% higher mAP. Originality/value: This work addresses eye disease recognition as a bounding box regression and detection problem, whereas related research is largely based on eye disease classification. Another highlight of this work is a richly annotated dataset for different eye diseases that is useful for training deep learning-based object detectors. The major highlight is an improved variant of the Tiny YOLOv7 model focused on eye disease detection. The proposed modifications to Tiny YOLOv7 helped the model achieve better results than the state-of-the-art Tiny YOLOv8 and YOLOv8 Nano.
Keywords: Tiny YOLOv7; spatial pyramid pooling; Focal-EIOU loss; eye disease detection
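The abstract names Focal-EIOU loss without giving its formula. As a rough illustration, the sketch below uses the commonly cited form: EIoU adds a centre-distance penalty and separate width and height penalties to 1 − IoU, and the focal variant reweights the result by IoU raised to a power gamma. The box format (x1, y1, x2, y2), the value gamma = 0.5, and the function name are assumptions, not details taken from this paper.

```python
def eiou_losses(box, gt, gamma=0.5):
    """Return (eiou_loss, focal_eiou_loss) for two boxes with positive area."""
    x1, y1, x2, y2 = box
    gx1, gy1, gx2, gy2 = gt

    # Intersection over union.
    inter_w = max(0.0, min(x2, gx2) - max(x1, gx1))
    inter_h = max(0.0, min(y2, gy2) - max(y1, gy1))
    inter = inter_w * inter_h
    union = (x2 - x1) * (y2 - y1) + (gx2 - gx1) * (gy2 - gy1) - inter
    iou = inter / union

    # Width and height of the smallest box enclosing both boxes.
    enc_w = max(x2, gx2) - min(x1, gx1)
    enc_h = max(y2, gy2) - min(y1, gy1)

    # Squared distance between box centres, and squared width/height differences.
    centre_d2 = ((x1 + x2) / 2 - (gx1 + gx2) / 2) ** 2 + ((y1 + y2) / 2 - (gy1 + gy2) / 2) ** 2
    w_d2 = ((x2 - x1) - (gx2 - gx1)) ** 2
    h_d2 = ((y2 - y1) - (gy2 - gy1)) ** 2

    eiou = (1.0 - iou
            + centre_d2 / (enc_w ** 2 + enc_h ** 2)
            + w_d2 / enc_w ** 2
            + h_d2 / enc_h ** 2)
    focal = (iou ** gamma) * eiou  # down-weights boxes that barely overlap the target
    return eiou, focal

print(eiou_losses((0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 3.0, 3.0)))
```

A perfectly matched box gives zero loss under both forms. Penalising width and height separately, rather than through an aspect-ratio term, is what distinguishes EIoU from CIoU.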
HSPOG: An Optimized Target Recognition Method Based on Histogram of Spatial Pyramid Oriented Gradients (cited by: 4)
5
Authors: Shaojun Guo, Feng Liu, Xiaohu Yuan, Chunrong Zou, Li Chen, Tongsheng Shen 《Tsinghua Science and Technology》 SCIE EI CAS CSCD 2021, No. 4, pp. 475-483 (9 pages)
The Histograms of Oriented Gradients (HOG) can produce good results in image target recognition, but it requires all target images fed to the classifier to be the same size. In response to this shortcoming, this paper performs spatial pyramid segmentation on target images of any size, obtains the pixel size of each image block dynamically, and then calculates and normalizes the oriented gradient feature of each block region in each image layer. The new feature is called the Histogram of Spatial Pyramid Oriented Gradients (HSPOG). This approach can obtain stable vectors for images of any size and significantly increases the target detection rate in image recognition. Finally, the article verifies the algorithm using VOC2012 image data and compares its effect with that of HOG.
Keywords: Histograms of Oriented Gradients (HOG); Histogram of Spatial Pyramid Oriented Gradients (HSPOG); object recognition; spatial pyramid segmentation
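The core idea described here, deriving block sizes from the image size so that every image yields a vector of the same length, can be sketched as follows. This is an illustrative approximation rather than the published HSPOG algorithm: the pyramid levels (1×1 and 2×2), the 8 unsigned orientation bins, the central-difference gradients, and the per-block L2 normalization are all assumptions.

```python
import math

def pyramid_gradient_histogram(image, levels=(1, 2), bins=8):
    """image: 2-D list of grey values. Returns bins * sum(n * n for n in levels) features."""
    h, w = len(image), len(image[0])
    mags = [[0.0] * w for _ in range(h)]
    oris = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Central differences with clamped borders.
            gx = image[y][min(x + 1, w - 1)] - image[y][max(x - 1, 0)]
            gy = image[min(y + 1, h - 1)][x] - image[max(y - 1, 0)][x]
            mags[y][x] = math.hypot(gx, gy)
            angle = math.atan2(gy, gx) % math.pi  # unsigned orientation in [0, pi)
            oris[y][x] = min(int(angle / math.pi * bins), bins - 1)

    feature = []
    for cells in levels:
        for cy in range(cells):
            for cx in range(cells):
                # Block bounds come from the image size, so any size is accepted.
                y0, y1 = cy * h // cells, (cy + 1) * h // cells
                x0, x1 = cx * w // cells, (cx + 1) * w // cells
                hist = [0.0] * bins
                for y in range(y0, y1):
                    for x in range(x0, x1):
                        hist[oris[y][x]] += mags[y][x]
                norm = math.sqrt(sum(v * v for v in hist)) or 1.0
                feature.extend(v / norm for v in hist)
    return feature

small = [[(x * y) % 7 for x in range(8)] for y in range(6)]
large = [[(x + 2 * y) % 5 for x in range(15)] for y in range(10)]
print(len(pyramid_gradient_histogram(small)), len(pyramid_gradient_histogram(large)))
```

Both toy images produce 40 features (8 bins × 5 blocks), which is the property that lets a fixed-input classifier accept images of different sizes.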
Feature pyramid attention network for audio-visual scene classification
6
Authors: Liguang Zhou, Yuhongze Zhou, Xiaonan Qi, Junjie Hu, Tin Lun Lam, Yangsheng Xu 《CAAI Transactions on Intelligence Technology》 2025, No. 2, pp. 359-374 (16 pages)
Audio-visual scene classification (AVSC) poses a formidable challenge owing to the intricate spatial-temporal relationships exhibited by audio-visual signals, coupled with the complex spatial patterns of objects and textures found in visual images. Recent studies have focused predominantly on extracting features from diverse neural network structures, neglecting the acquisition of semantically meaningful regions and crucial components within audio-visual data. The authors present a feature pyramid attention network (FPANet) for audio-visual scene understanding, which extracts semantically significant characteristics from audio-visual data. The authors' approach builds multi-scale hierarchical features of sound spectrograms and visual images using a feature pyramid representation and localises the semantically relevant regions with a feature pyramid attention module (FPAM). A dimension alignment (DA) strategy is employed to align feature maps from multiple layers, a pyramid spatial attention (PSA) to spatially locate essential regions, and a pyramid channel attention (PCA) to pinpoint significant temporal frames. Experiments on visual scene classification (VSC), audio scene classification (ASC), and AVSC tasks demonstrate that FPANet achieves performance on par with state-of-the-art (SOTA) approaches, with a 95.9 F1-score on the ADVANCE dataset and a relative improvement of 28.8%. Visualisation results show that FPANet can prioritise semantically meaningful areas in audio-visual signals.
Keywords: dimension alignment; feature pyramid attention network; pyramid channel attention; pyramid spatial attention; semantic relevant regions
DCA-YOLO: Detection Algorithm for YOLOv8 Pulmonary Nodules Based on Attention Mechanism Optimization (cited by: 1)
7
Authors: SONG Yongsheng, LIU Guohua 《Journal of Donghua University (English Edition)》 2025, No. 1, pp. 78-87 (10 pages)
Pulmonary nodules represent an early manifestation of lung cancer. However, pulmonary nodules constitute only a small portion of the overall image, posing challenges for physicians in image interpretation and potentially leading to false positives or missed detections. To solve these problems, the YOLOv8 network is enhanced by adding deformable convolution and atrous spatial pyramid pooling (ASPP), along with a coordinate attention (CA) mechanism. This allows the network to focus on small targets while expanding the receptive field without losing resolution. At the same time, context information on the target is gathered and feature expression is enhanced by attention modules in different directions. The method effectively improves positioning accuracy and achieves good results on the LUNA16 dataset. Compared with other detection algorithms, it improves the accuracy of pulmonary nodule detection to a certain extent.
Keywords: pulmonary nodule; YOLOv8 network; object detection; deformable convolution; atrous spatial pyramid pooling (ASPP); coordinate attention (CA) mechanism
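The claim that atrous convolution expands the receptive field without losing resolution can be seen in a one-dimensional toy version. The sketch below is not the DCA-YOLO code: it assumes an odd-length kernel, zero padding, dilation rates of 1, 2, and 4, and fuses branches by averaging, whereas ASPP in practice concatenates channels and applies a 1×1 convolution.

```python
def dilated_conv1d(signal, kernel, dilation):
    """Same-length 1-D atrous convolution (cross-correlation form, zero padding)."""
    half = (len(kernel) // 2) * dilation
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, weight in enumerate(kernel):
            idx = i - half + j * dilation  # taps are spaced `dilation` samples apart
            if 0 <= idx < len(signal):
                acc += weight * signal[idx]
        out.append(acc)
    return out

def receptive_field(kernel_size, dilation):
    """Span of input samples seen by one output sample."""
    return (kernel_size - 1) * dilation + 1

def aspp1d(signal, kernel, rates=(1, 2, 4)):
    """Run parallel branches at several dilation rates and average them position-wise."""
    branches = [dilated_conv1d(signal, kernel, r) for r in rates]
    return [sum(values) / len(values) for values in zip(*branches)]

impulse = [0, 0, 0, 0, 1, 0, 0, 0, 0]
print(dilated_conv1d(impulse, [1, 1, 1], 2))
print([receptive_field(3, r) for r in (1, 2, 4)])
```

The output has the same length as the input at every rate, while a 3-tap kernel covers 3, 5, and 9 input samples at rates 1, 2, and 4 with no extra weights.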
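Building damage assessment model based on UCTransNet
8
Authors: 谢国波, 张文亮, 何林, 林志毅 《计算机工程与设计》 PKU Core 2025, No. 1, pp. 44-51 (8 pages)
To improve the accuracy of building damage assessment, a two-stage post-disaster building damage assessment model based on UCTransNet (MGDLNet) is proposed. Stage one uses UCTransNet for building segmentation. Stage two uses an improved DM-UCTransNet for building damage assessment: a difference feature extraction module fully fuses multi-scale building damage features, an embedded spatial pyramid better captures small buildings and edge features, and a deep supervision mechanism together with an improved loss function strengthens shallow feature learning and balances the samples. Experimental results show that MGDLNet has a clear advantage on the target dataset, with a weighted F1 score higher than those of SegNet, UNet, DeeplabV3+, TransUNet, and UCTransNet by 8.6%, 1.9%, 5.0%, 2.7%, and 1.4%, respectively.
Keywords: building damage assessment; UCTransNet; two-stage; difference features; spatial pyramid; deep supervision; loss function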
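Semantic segmentation of tunnel face images with an improved U-Net model
9
Authors: 陈登峰, 程静, 赵蕾, 何拓航 《防灾减灾工程学报》 PKU Core 2025, No. 4, pp. 776-783 (8 pages)
The rock mass structure of a tunnel face is the direct basis for judging geotechnical engineering geological conditions, formulating construction and support schemes, and preventing accidents such as collapse and water inrush. When the U-Net model is applied to segmenting and recognizing images of tunnel face rock mass structure, shrinking the image during downsampling loses some rock mass detail, and the skip connections that pass low-level features to higher levels during upsampling make the feature maps too large. Therefore, an improved U-Net model is proposed that adds an atrous spatial pyramid pooling (ASPP) module and a convolutional block attention module (CBAM). ASPP is added to the skip connections of U-Net, using atrous convolutions with different dilation rates to capture context at different scales and fuse information from different receptive fields, giving a more complete understanding of image content. CBAM is added to the downsampling path of U-Net so that the network attends more to useful features, which strengthens feature representation. Experimental results show that the improved network segments and recognizes significantly better than the original U-Net, reaching a Precision of 93.04%, an mIoU of 74.98%, and an mPA of 78.89% on a tunnel face rock mass image dataset from a tunnel project.
Keywords: tunnel face; image semantic segmentation; convolutional block attention module; atrous spatial pyramid pooling module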
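Liver tumor image segmentation method based on multi-scale feature fusion and reconstructed convolution
10
Authors: 马金林, 酒志青, 马自萍, 夏明格, 张凯, 程叶霞, 马瑞士 《华南理工大学学报(自然科学版)》 PKU Core 2025, No. 5, pp. 94-108 (15 pages)
To address insufficient feature representation of liver tumor images and limited transfer of global context information, this paper proposes a liver tumor image segmentation method based on an improved U-Net. First, a low-rank reconstructed convolution is designed to reduce the large parameter count of conventional convolution, and it is used to build a convolution kernel reconstruction module that improves the encoder and decoder with residual structures, so that the encoder retains more detail and the decoder recovers information more effectively, improving the representation of liver tumor image features. Then, to enrich the transfer of global context information, a three-branch spatial pyramid pooling module is designed to optimize information transfer in the bottleneck, breaking the limitation of a single path. Next, a multi-scale feature fusion module is designed to optimize the reuse of encoder information, strengthening the model's ability to model global context and improving its effectiveness in extracting liver tumor image features at different scales. Finally, the method is tested on the LiTS2017 and 3DIRCADb datasets. Experimental results show that on LiTS2017 the method reaches a Dice coefficient of 97.56% and an IoU of 95.25% for liver segmentation, and 89.71% and 81.58% for liver tumor segmentation; on 3DIRCADb it reaches 97.63% and 95.39% for liver segmentation, and 89.62% and 81.63% for liver tumor segmentation.
Keywords: liver tumor image segmentation; convolution kernel reconstruction; spatial pyramid pooling; multi-scale feature fusion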
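Remote sensing image object detection combining an inverted residual self-attention mechanism
11
Authors: 赵文清, 赵振寰, 巩佳潇 《智能系统学报》 PKU Core 2025, No. 1, pp. 64-72 (9 pages)
To address severe background interference and large differences in target size in remote sensing image object detection, an object detection method combining an inverted residual self-attention mechanism is proposed. First, a backbone with an inverted residual self-attention mechanism and strong feature extraction capability is used to fully extract target features and reduce interference from complex backgrounds. Second, a multi-scale spatial pyramid pooling module is constructed to provide multi-scale receptive fields and strengthen the capture of targets of different sizes. Finally, a lightweight feature fusion module is proposed to fuse the feature maps extracted by the backbone, fully combining low-level and high-level features and improving detection of targets of different sizes. In comparisons with conventional networks and other improved object detection algorithms, the method's detection accuracy is clearly better. In addition, ablation experiments on the DIOR and RSOD datasets show that the method's mean average precision exceeds that of YOLOv8 by 4.6 and 4.2 percentage points on DIOR and RSOD, respectively, a clear improvement in remote sensing object detection accuracy.
Keywords: remote sensing image; object detection; inverted residual; self-attention mechanism; multi-scale; spatial pyramid; feature extraction; feature fusion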
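Solar cell defect detection network based on a multi-scale asymptotic pyramid
12
Authors: 朱磊, 耿萃萃, 李博涛, 潘杨, 张博, 姚丽娜 《太阳能学报》 PKU Core 2025, No. 5, pp. 267-274 (8 pages)
A multi-scale asymptotic pyramid network, MSANet, is proposed based on YOLOv8. First, conventional convolution layers are replaced with M-Block, a feature extraction block with a hierarchical feature fusion structure, to strengthen feature extraction for multi-scale targets. Second, a spatial attention mechanism (SRU) is introduced to suppress feature redundancy in background regions, letting the network focus on key regions while limiting the number of added parameters. Finally, an improved asymptotic feature pyramid network structure, AFPNa, is proposed to alleviate the loss or degradation of information during feature fusion and improve defect detection accuracy. Experimental results show that, compared with the original YOLOv8 model and seven advanced detection networks including RTMDET, MSANet achieves higher detection accuracy, with a mean average precision 5.7 percentage points above the original model.
Keywords: defect detection; deep learning; solar cell; hierarchical feature fusion structure; multi-scale asymptotic pyramid; spatial attention mechanism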
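Foreign object detection method for coal mine conveyor belts based on improved Hyper-YOLO
13
Authors: 李刚, 朱宇, 杨庆贺, 邹军鹏, 才天, 贺鹏, 张亚兵, 赵艺鸣, 田鑫浩 《工矿自动化》 PKU Core 2025, No. 7, pp. 114-121 (8 pages)
YOLO-based foreign object detection for conveyor belts has produced many research results, but the YOLO neck network cannot let distant feature layers exchange feature information directly, which causes missed small targets and duplicate detections. Hyper-YOLO achieves cross-layer, cross-position high-order correlation among feature layers in the neck, but it increases computation and reduces sensitivity to high-frequency feature information, so feature extraction degrades in noise-sensitive regions and predicted bounding boxes shift. To address these problems, a foreign object detection method for coal mine conveyor belts based on improved Hyper-YOLO is proposed. In image preprocessing, a dynamic contrast-limited adaptive histogram equalization (Dy-CLAHE) method is used: the Laplacian operator is introduced into the contrast-limited adaptive histogram equalization (CLAHE) framework to establish a dynamic mapping between noise level and the contrast-limit threshold, which effectively addresses loss of image detail and noise amplification in dusty environments. Hyper-YOLO is improved in three ways. The efficient intersection over union (EIoU) loss optimizes bounding box regression and improves localization accuracy. Efficient channel attention (ECA) modules are embedded in the deep and shallow layers of the mixed aggregation network (MANet), dynamically adjusting channel weights through local cross-channel interaction to balance sensitivity to high- and low-frequency features and reduce missed small foreign objects. A simplified fast spatial pyramid pooling (SimSPPF) module reduces redundant computation, raising inference speed while preserving accuracy. Experimental results show that the improved Hyper-YOLO reaches an accuracy of 94.2% and an mAP@0.5 of 93.4%, which are 5.0% and 3.5% higher than Hyper-YOLO, with 3.26×10^6 parameters, a recall of 87.7%, and a detection speed of 158 frames/s, meeting the need for real-time foreign object detection in underground coal mines. Across different conveyor belt detection scenarios there are no missed or duplicate detections, and the predicted bounding boxes fit the foreign objects more closely.
Keywords: coal mine conveyor belt; foreign object detection; Hyper-YOLO; dynamic contrast-limited adaptive histogram equalization; EIoU; efficient channel attention mechanism; simplified fast spatial pyramid pooling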
Steel surface defect detection based on lightweight YOLOv7
14
Authors: SHI Tao, WU Rongxin, ZHU Wenxu, MA Qingliang 《Optoelectronics Letters》 2025, No. 5, pp. 306-313 (8 pages)
To address the low detection efficiency and difficult localization of traditional steel surface defect detection methods, a lightweight steel surface defect detection model based on you only look once version 7 (YOLOv7) is proposed. First, a cascading style sheets (CSS) block module is proposed, which uses more lightweight operations to obtain redundant information in the feature map, reduces the amount of computation, and effectively improves detection speed. Second, an improved spatial pyramid pooling with cross stage partial convolutions (SPPCSPC) structure is adopted so that the model attends to defect location information while predicting defect category information, obtaining richer defect features. In addition, the convolution operation in the original model is simplified, which significantly reduces model size and helps improve detection speed. Finally, efficient intersection over union (EIOU) loss is used to focus on high-quality anchors, speed up convergence, and improve positioning accuracy. Experiments were carried out on the Northeastern University-defect (NEU-DET) steel surface defect dataset. Compared with the original YOLOv7 model, the number of parameters was reduced by 40%, the frames per second (FPS) reached 112, and the average accuracy reached 79.1%. Both detection accuracy and speed improved, which can meet the needs of steel surface defect detection.
Keywords: obtain redundant information; defect detection; steel surface; cascading style sheets block module; lightweight yolov; lightweight operations; spatial pyramid pooling; steel surface defect detection
DrownACB-YOLO: an Improved YOLO for Drowning Detection in Swimming Pools
15
Authors: ZENG Xiaoya, XU Wujun, ZHANG Xiunian 《Journal of Donghua University (English Edition)》 2025, No. 4, pp. 417-424 (8 pages)
With the rise in drowning accidents in swimming pools, the precision and speed of artificial intelligence (AI) drowning detection methods have become increasingly crucial. Here, an improved YOLO-based method, named DrownACB-YOLO, for drowning detection in swimming pools is proposed. Since existing methods focus on the drowned state, a transition label is added to the original dataset to provide timely alerts. On this expanded dataset, two improvements are made to the original YOLOv5. Firstly, the spatial pyramid pooling (SPP) module and the default upsampling operator are replaced by the atrous spatial pyramid pooling (ASPP) module and the content-aware reassembly of feature (CARAFE) module, respectively. Secondly, the cross stage partial bottleneck with three convolutions (C3) module at the end of the backbone is replaced with the bottleneck transformer (BotNet) module. The results of comparison experiments demonstrate that DrownACB-YOLO performs better than other models.
Keywords: drowning detection; YOLO; atrous spatial pyramid pooling (ASPP); content-aware reassembly of feature (CARAFE)
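Research on a small ship target detection method based on YOLOv5s
16
Authors: 师红宇, 蔡自桂, 杜文, 张哲于 《舰船电子工程》 2025, No. 2, pp. 34-38, 73 (6 pages)
Ship detection on the sea surface is easily disturbed by backgrounds such as land and waves. To address the low accuracy and poor robustness of small ship target detection, an improved ship detection model, CWMA-YOLOv5s, is proposed. First, a C2f module with multi-branch cross-layer connections is designed to enrich gradient flow information for multiple ship targets. Then, a residual multi-head self-attention fusion module is designed and implemented to improve feature fusion. Next, the Prediction network is improved with an SCP structure that raises the saliency of ship targets. Finally, an improved WIOU loss function is introduced to solve the gradient explosion and premature model degradation caused by the CIOU loss function. Experimental results show that, compared with the YOLOv5s model, the model improves precision by 13.1%, recall by 12.8%, and mAP@50 by 6.8% on the MASATI-v2 dataset. Compared with other detection algorithms of the same type, the algorithm learns better, reaches an overall detection accuracy of 82.3%, and is robust.
Keywords: ship detection; multi-head self-attention mechanism; spatial context pyramid; WIOU loss function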
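PPO algorithm based on a convolutional pyramid network for solving the job-shop scheduling problem (cited by: 1)
17
Authors: 徐帅, 李艳武, 谢辉, 牛晓伟 《现代制造工程》 PKU Core 2025, No. 3, pp. 19-30 (12 pages)
The job-shop scheduling problem is a classic NP-hard combinatorial optimization problem, and the quality of its schedule directly affects the operating efficiency of a manufacturing system. To obtain better scheduling policies, with minimizing the makespan as the objective, a deep reinforcement learning (DRL) scheduling method based on proximal policy optimization (PPO) and a convolutional neural network (CNN) is proposed. A three-channel state representation is designed, 16 heuristic dispatching rules are selected as the action space, and the reward function is made equivalent to minimizing total machine idle time. So that the trained policy can handle scheduling instances of different sizes, spatial pyramid pooling (SPP) is used in the CNN to convert feature matrices of different dimensions into fixed-length feature vectors. Computational experiments were run on 42 job-shop scheduling problem (JSSP) instances from the public OR-Library. The simulation results show that the algorithm outperforms single heuristic dispatching rules and a genetic algorithm, obtains better results than existing DRL algorithms on most instances, and achieves the smallest average makespan.
Keywords: deep reinforcement learning; job-shop scheduling; convolutional neural network; proximal policy optimization; spatial pyramid pooling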
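The role SPP plays in this abstract, turning feature matrices of different dimensions into a fixed-length vector, can be shown with a small two-dimensional sketch. The pyramid levels (1×1, 2×2, 4×4), max pooling, and the toy job-by-machine matrices are assumptions for illustration; the paper's network and state encoding are not reproduced here.

```python
import math

def spatial_pyramid_pool(matrix, levels=(1, 2, 4)):
    """Max-pool a rows x cols matrix into sum(n * n for n in levels) values.

    Requires rows and cols to be at least max(levels) so that every bin is non-empty.
    """
    rows, cols = len(matrix), len(matrix[0])
    vector = []
    for n in levels:
        for i in range(n):
            r0, r1 = (i * rows) // n, math.ceil((i + 1) * rows / n)
            for j in range(n):
                c0, c1 = (j * cols) // n, math.ceil((j + 1) * cols / n)
                vector.append(max(matrix[r][c]
                                  for r in range(r0, r1)
                                  for c in range(c0, c1)))
    return vector

# Toy "processing time" matrices for a 6 x 6 and a 10 x 5 instance (made-up values).
inst_a = [[(3 * j + 5 * m) % 11 for m in range(6)] for j in range(6)]
inst_b = [[(7 * j + 2 * m) % 13 for m in range(5)] for j in range(10)]
print(len(spatial_pyramid_pool(inst_a)), len(spatial_pyramid_pool(inst_b)))
```

Both instances map to 21 values, so one fully connected policy head can follow the pooling layer regardless of how many jobs and machines an instance has.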
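DPSC-YOLO: a marine animal object detection algorithm based on improved YOLOv8n (cited by: 1)
18
Authors: 梁佳杰, 徐慧英, 朱信忠, 王舒梦, 刘子洋, 李琛 《计算机工程与科学》 PKU Core 2025, No. 4, pp. 695-705 (11 pages)
In complex marine environments, blurred images and complex backgrounds make feature extraction difficult and cause missed targets for deep-learning-based object detectors, so marine object detection algorithms need to be more efficient and perform better. To this end, DPSC-YOLO, a marine animal object detection algorithm based on an improved YOLOv8n, is proposed. A DCNv2 module is introduced in the backbone to adapt to geometric variation of objects through stronger spatial modeling. The spatial pyramid pooling module SPPFCSPC is introduced at the end of the backbone to reduce computation while keeping the model's receptive field unchanged. An F2 detection head for very small targets is added to the neck; combined with the other 3 scales, 4 detection layers with different receptive fields improve small-target detection accuracy. The CoTAttention mechanism is combined with the C2f modules of the neck to make better use of context among neighboring keys and to adjust attention dynamically according to the data. Experimental results show that, compared with YOLOv8n, DPSC-YOLO improves mAP@0.5 by 1.1% and mAP@0.5:0.95 by 4.6% with only a small increase in parameters and computation, indicating that DPSC-YOLO is better suited to object detection in complex marine environments.
Keywords: YOLOv8; DCNv2; SPPFCSPC; contextual attention mechanism; small-target detection head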
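Semi-supervised vehicle detection algorithm based on improved YOLOv5 (cited by: 1)
19
Authors: 高睿, 安国成, 邹丹平, 裴凌 《计算机工程》 PKU Core 2025, No. 3, pp. 300-309 (10 pages)
Vehicle detection in traffic scenes currently suffers from large differences in target scale and severe occlusion and overlap, and fully annotating large-scale data is costly. To address this, a semi-supervised vehicle detection algorithm based on improved YOLOv5 is proposed. The SimOTA sample matching method is introduced to reduce suboptimal matching and ease the detection difficulty caused by changes in target scale and shape. A new spatial pyramid pooling network, SPPFA, is proposed; by introducing LSKA it enlarges the receptive field while achieving spatial and channel adaptivity, mitigating the effects of large-scale targets and occlusion. CIoU is replaced with SIoU to optimize the regression loss. On this basis, an improved semi-supervised deep learning algorithm is proposed; through an optimized loss function design it strengthens the algorithm's ability to learn useful information from unlabeled samples and effectively improves vehicle detection accuracy. Experimental results show that the improved algorithm reaches an mAP@0.5 of 58.2% on a self-built vehicle dataset, 11.1 percentage points above the YOLOv5n baseline, with a model size far smaller than mainstream object detection algorithms, giving it good prospects for engineering applications.
Keywords: YOLOv5; vehicle detection; sample matching; spatial pyramid pooling; semi-supervised learning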
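Research on a human pose estimation method for drilling sites based on large models (cited by: 1)
20
Authors: 刘兆年, 连远锋, 师印亮, 王宁, 姜彬 《钻采工艺》 PKU Core 2025, No. 1, pp. 104-112 (9 pages)
Accurate human pose estimation is essential for monitoring worker behavior and issuing safety warnings at drilling sites. To address strong reflections, heavy blur, and occlusion in surveillance video from drilling platforms, a human pose estimation model based on bidirectional feature fusion is proposed. It builds an efficient bidirectional feature fusion mechanism that, on top of a pretrained ViT model, brings in multi-scale spatial image features captured by atrous pyramid pooling. The mechanism attends simultaneously to the internal features of the pretrained ViT model, the multi-scale spatial features, and the interaction features between the two, integrating multiple kinds of features efficiently. Experimental results show that, compared with the baseline model HRNet, the method improves KAP and KAR by 3.6% and 4.1%, respectively. In an application test in the intelligent monitoring system of a platform in the South China Sea, the model still showed high accuracy, providing an accurate action estimation basis for later in-depth intelligent analysis of unsafe worker behavior.
Keywords: human pose estimation; pretrained large model; atrous pyramid pooling; bidirectional feature fusion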