701 articles found
Dynamic Multi-Graph Spatio-Temporal Graph Traffic Flow Prediction in Bangkok: An Application of a Continuous Convolutional Neural Network
1
Authors: Pongsakon Promsawat, Weerapan Sae-dan, Marisa Kaewsuwan, Weerawat Sudsutad, Aphirak Aphithana. 《Computer Modeling in Engineering & Sciences》 (SCIE, EI), 2025, No. 1, pp. 579-607 (29 pages)
The ability to accurately predict urban traffic flows is crucial for optimising city operations. Consequently, various methods for forecasting urban traffic have been developed, focusing on analysing historical data to understand complex mobility patterns. Deep learning techniques, such as graph neural networks (GNNs), are popular for their ability to capture spatio-temporal dependencies. However, these models often become overly complex due to the large number of hyper-parameters involved. In this study, we introduce Dynamic Multi-Graph Spatial-Temporal Graph Neural Ordinary Differential Equation Networks (DMST-GNODE), a framework based on ordinary differential equations (ODEs) that autonomously discovers effective spatial-temporal graph neural network (STGNN) architectures for traffic prediction tasks. A comparative analysis against baseline models indicates that DMST-GNODE performs best across multiple datasets, consistently achieving the lowest Root Mean Square Error (RMSE) and Mean Absolute Error (MAE) values alongside the highest accuracy. On the BKK (Bangkok) dataset, it outperformed other models with an RMSE of 3.3165 and an accuracy of 0.9367 for a 20-min interval, maintaining this trend across 40 and 60 min. Similarly, on the PeMS08 dataset, DMST-GNODE achieved the best performance with an RMSE of 19.4863 and an accuracy of 0.9377 at 20 min, demonstrating its effectiveness over longer periods. The Los_Loop dataset results further emphasise this model's advantage, with an RMSE of 3.3422 and an accuracy of 0.7643 at 20 min, consistently maintaining superiority across all time intervals. These results indicate that DMST-GNODE not only outperforms baseline models but also achieves higher accuracy and lower errors across different time intervals and datasets.
Keywords: Graph neural networks; convolutional neural network; deep learning; dynamic multi-graph spatio-temporal
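The abstract above reports RMSE and MAE. As a reminder of how these two error measures are computed, here is a minimal sketch in plain Python; the function names and the sample values are hypothetical and are not taken from the paper, and the paper's "accuracy" measure is not reproduced because the abstract does not define it.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    n = len(y_true)
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n

# Hypothetical traffic-flow readings and forecasts (illustration only).
observed = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 12.0, 13.0, 18.0]

print(rmse(observed, predicted))
print(mae(observed, predicted))
```

RMSE penalises large individual errors more heavily than MAE, which is why forecasting papers commonly report both.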
A Remote Sensing Image Semantic Segmentation Method by Combining Deformable Convolution with Conditional Random Fields (Cited by: 13)
2
Authors: Zongcheng ZUO, Wen ZHANG, Dongying ZHANG. 《Journal of Geodesy and Geoinformation Science》, 2020, No. 3, pp. 39-49 (11 pages)
Deep convolutional neural networks have made great progress in semantic segmentation. Because the geometry of the convolution kernel is fixed, standard convolutional neural networks are limited in their ability to model geometric transformations. Therefore, a deformable convolution is introduced to enhance the adaptability of convolutional networks to spatial transformation. In addition, deep convolutional neural networks cannot adequately segment local objects at the output layer because of the pooling layers in the network architecture. To overcome this shortcoming, the coarse segmentation predicted at the output layer is processed by fully connected conditional random fields to improve segmentation quality. The proposed method can easily be trained end-to-end using standard backpropagation algorithms. Finally, the proposed method is tested on the ISPRS dataset. The results show that it can effectively overcome the influence of the complex structure of the segmentation object and obtain state-of-the-art accuracy on the ISPRS Vaihingen 2D semantic labeling dataset.
Keywords: high-resolution remote sensing image; semantic segmentation; deformable convolution network; conditional random fields
Multi-Layer Feature Extraction with Deformable Convolution for Fabric Defect Detection (Cited by: 1)
3
Authors: Jielin Jiang, Chao Cui, Xiaolong Xu, Yan Cui. 《Intelligent Automation & Soft Computing》, 2024, No. 4, pp. 725-744 (20 pages)
In the textile industry, the presence of defects on the fabric surface is an essential factor in determining fabric quality. Therefore, identifying fabric defects forms a crucial part of the fabric production process. Traditional fabric defect detection algorithms can only detect specific materials and specific fabric defect types; in addition, their detection efficiency is low, and their detection results are relatively poor. Deep learning-based methods have many advantages in fabric defect detection; however, they are less effective at identifying multi-scale fabric defects and defects with complex shapes. Therefore, we propose an effective algorithm, namely multi-layer feature extraction combined with deformable convolution (MFDC), for fabric defect detection. In MFDC, multi-layer feature extraction fuses the underlying location features with high-level classification features through a horizontally connected top-down architecture to improve the detection of multi-scale fabric defects. On this basis, a deformable convolution is added to address the algorithm's weak detection of irregularly shaped fabric defects. RoI Align and Cascade-RCNN are also integrated to improve the algorithm's adaptability to materials with complex patterned backgrounds. The experimental results show that the MFDC algorithm achieves good detection results for both multi-scale fabric defects and defects with complex shapes, at the expense of a small increase in detection time.
Keywords: Fabric defect detection; multi-layer features; deformable convolution
An Arrhythmia Intelligent Recognition Method Based on a Multimodal Information and Spatio-Temporal Hybrid Neural Network Model
4
Authors: Xinchao Han, Aojun Zhang, Runchuan Li, Shengya Shen, Di Zhang, Bo Jin, Longfei Mao, Linqi Yang, Shuqin Zhang. 《Computers, Materials & Continua》, 2025, No. 2, pp. 3443-3465 (23 pages)
Electrocardiogram (ECG) analysis is critical for detecting arrhythmias, but traditional methods struggle with large-scale ECG data and rare arrhythmia events in imbalanced datasets. These methods neither perform multi-perspective learning of temporal signals and ECG images nor fully extract the latent information within the data, falling short of the accuracy required by clinicians. Therefore, this paper proposes a hybrid multimodal spatiotemporal neural network to address these challenges. The model employs a multimodal data augmentation framework integrating visual and signal-based features to enhance the classification performance of rare arrhythmias in imbalanced datasets. Additionally, the spatiotemporal fusion module incorporates a spatiotemporal graph convolutional network to jointly model temporal and spatial features, uncovering complex dependencies within the ECG data and improving the model's ability to represent complex patterns. In experiments conducted on the MIT-BIH arrhythmia dataset, the model achieved 99.95% accuracy, 99.80% recall, and a 99.78% F1 score. The model was further validated for generalization using the clinical INCART arrhythmia dataset, and the results demonstrated its effectiveness in terms of both generalization and robustness.
Keywords: Multimodal learning; spatio-temporal hybrid graph convolutional network; data imbalance; ECG classification
A local-global dynamic hypergraph convolution with multi-head flow attention for traffic flow forecasting
5
Authors: ZHANG Hong, LI Yang, LUO Shengjun, ZHANG Pengcheng, ZHANG Xijun, YI Min. 《High Technology Letters》, 2025, No. 3, pp. 246-256 (11 pages)
Traffic flow prediction is a crucial element of intelligent transportation systems. However, accurate traffic flow prediction is quite challenging because of its highly nonlinear, complex, and dynamic characteristics. To address the difficulty of simultaneously capturing local and global dynamic spatiotemporal correlations in traffic flow, as well as the high time complexity of existing models, a multi-head flow attention-based local-global dynamic hypergraph convolution (MFA-LGDHC) prediction model is proposed. It consists of a multi-head flow attention (MHFA) mechanism, a graph convolution network (GCN), and local-global dynamic hypergraph convolution (LGHC). MHFA extracts the temporal dependency of traffic flow and reduces the time complexity of the model. GCN captures the spatial dependency of traffic flow. LGHC uses down-sampling convolution and isometric convolution to capture the local and global spatial dependencies of traffic flow, and dynamic hypergraph convolution models the dynamic higher-order relationships of the traffic road network. Experimental results indicate that the MFA-LGDHC model outperforms current popular baseline models and exhibits good prediction performance.
Keywords: traffic flow prediction; multi-head flow attention; graph convolution; hypergraph learning; dynamic spatio-temporal properties
CW-HRNet: Constrained Deformable Sampling and Wavelet-Guided Enhancement for Lightweight Crack Segmentation
6
Authors: Dewang Ma. 《Journal of Electronic Research and Application》, 2025, No. 5, pp. 269-280 (12 pages)
This paper presents CW-HRNet, a high-resolution, lightweight crack segmentation network designed to address challenges in complex scenes with slender, deformable, and blurred crack structures. The model incorporates two key modules: Constrained Deformable Convolution (CDC), which stabilizes geometric alignment by applying a tanh limiter and a learnable scaling factor to the predicted offsets, and the Wavelet Frequency Enhancement Module (WFEM), which decomposes features using Haar wavelets to preserve low-frequency structures while enhancing high-frequency boundaries and textures. Evaluations on the CrackSeg9k benchmark demonstrate CW-HRNet's superior performance, achieving 82.39% mIoU with only 7.49M parameters and 10.34 GFLOPs and outperforming HrSegNet-B48 by 1.83% in segmentation accuracy with minimal complexity overhead. The model also shows strong cross-dataset generalization, achieving 60.01% mIoU and 66.22% F1 on Asphalt3k without fine-tuning. These results highlight CW-HRNet's favorable accuracy-efficiency trade-off for real-world crack segmentation tasks.
Keywords: Crack segmentation; Lightweight semantic segmentation; deformable convolution; Wavelet transform; Road infrastructure
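The abstract above describes bounding predicted deformable-convolution offsets with a tanh limiter and a learnable scaling factor. The sketch below illustrates one plausible reading, offset = scale * tanh(raw); this exact formula, the function name, and the fixed `scale` value are assumptions for illustration, since the abstract does not give the paper's formulation, and in a trained network `scale` would be a learned parameter rather than a constant.

```python
import math

def constrain_offsets(raw_offsets, scale):
    """Bound each raw offset to [-scale, scale] via scale * tanh(raw).

    Assumed form only: the abstract names a tanh limiter and a scaling
    factor but does not state how they are combined.
    """
    return [scale * math.tanh(r) for r in raw_offsets]

# Hypothetical raw offsets from an offset-prediction branch.
raw = [-50.0, -0.5, 0.0, 0.5, 50.0]
bounded = constrain_offsets(raw, scale=2.0)
print(bounded)
```

Small raw offsets pass through almost linearly, while extreme values saturate at the scale, which keeps sampling locations near the regular grid.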
Attention-Augmented YOLOv8 with Ghost Convolution for Real-Time Vehicle Detection in Intelligent Transportation Systems
7
Authors: Syed Sajid Ullah, Muhammad Zunair Zamir, Ahsan Ishfaq, Salman Khan. 《Journal on Artificial Intelligence》, 2025, No. 1, pp. 255-274 (20 pages)
Accurate vehicle detection is essential for autonomous driving, traffic monitoring, and intelligent transportation systems. This paper presents an enhanced YOLOv8n model that incorporates the Ghost Module, Convolutional Block Attention Module (CBAM), and Deformable Convolutional Networks v2 (DCNv2). The Ghost Module streamlines feature generation to reduce redundancy, CBAM applies channel and spatial attention to improve feature focus, and DCNv2 enables adaptability to geometric variations in vehicle shapes. These components work together to improve both accuracy and computational efficiency. Evaluated on the KITTI dataset, the proposed model achieves 95.4% mAP@0.5 (an 8.97% gain over standard YOLOv8n), along with 96.2% precision, 93.7% recall, and a 94.93% F1-score. Comparative analysis with seven state-of-the-art detectors demonstrates consistent superiority in key performance metrics. An ablation study is also conducted to quantify the individual and combined contributions of the Ghost Module, CBAM, and DCNv2, highlighting their effectiveness in improving detection performance. By addressing feature redundancy, attention refinement, and spatial adaptability, the proposed model offers a robust and scalable solution for vehicle detection across diverse traffic scenarios.
Keywords: YOLOv8n; vehicle detection; deformable convolutional networks (DCNv2); ghost module; convolutional block attention module (CBAM); attention mechanisms
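The precision, recall, and F1 figures quoted in the abstract above can be cross-checked, since F1 is the harmonic mean of precision and recall. A short sketch of that arithmetic follows; the function name is illustrative, and only the three percentages come from the abstract.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Precision 96.2% and recall 93.7%, as reported in the abstract.
f1 = f1_score(0.962, 0.937)
print(round(f1 * 100, 2))
```

The result rounds to the 94.93% F1-score stated in the abstract, so the three reported figures are mutually consistent.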
Improved spatio-temporal alignment measurement method for hull deformation
8
Authors: XU Dongsheng, YU Yuanjin, ZHANG Xiaoli, PENG Xiafu. 《Journal of Systems Engineering and Electronics》 (SCIE, CSCD), 2024, No. 2, pp. 485-494 (10 pages)
In this paper, an improved spatio-temporal alignment measurement method is presented to address the inertial matching measurement of hull deformation when time delay and a large misalignment angle coexist. A large misalignment angle and time delay often occur simultaneously and pose great challenges to the accurate measurement of hull deformation in space and time. The proposed method combines coarse alignment under a large misalignment angle with time delay estimation from inertial measurement unit modeling to establish a new spatio-temporally aligned hull deformation measurement model. In addition, a two-step loop control is designed so that the spatio-temporal alignment method accurately describes both the dynamic and the static deformation angles. The experiments show that the proposed method can effectively measure the hull deformation angle when time delay and a large misalignment angle coexist.
Keywords: inertial measurement; spatio-temporal alignment; hull deformation
Land cover classification from remote sensing images based on multi-scale fully convolutional network (Cited by: 18)
9
Authors: Rui Li, Shunyi Zheng, Chenxi Duan, Libo Wang, Ce Zhang. 《Geo-Spatial Information Science》 (SCIE, EI, CSCD), 2022, No. 2, pp. 278-294 (17 pages)
Although the Convolutional Neural Network (CNN) has shown great potential for land cover classification, the frequently used single-scale convolution kernel limits the scope of information extraction. Therefore, we propose a Multi-Scale Fully Convolutional Network (MSFCN) with a multi-scale convolutional kernel, a Channel Attention Block (CAB), and a Global Pooling Module (GPM) to exploit discriminative representations from two-dimensional (2D) satellite images. To explore the ability of the proposed MSFCN on spatio-temporal images, we extend it to three dimensions using a three-dimensional (3D) CNN, which can harness each land cover category's time series interaction from the reshaped spatio-temporal remote sensing images. To verify the effectiveness of the proposed MSFCN, we conduct experiments on two spatial datasets and two spatio-temporal datasets. The proposed MSFCN achieves an mIoU of 60.366% on the WHDLD dataset and 75.127% on the GID dataset, while the figures for the two spatio-temporal datasets are 87.753% and 77.156%. Extensive comparative experiments and ablation studies demonstrate the effectiveness of the proposed MSFCN.
Keywords: spatio-temporal remote sensing images; Multi-Scale Fully Convolutional Network; land cover classification
An attention graph stacked autoencoder for anomaly detection of electro-mechanical actuator using spatio-temporal multivariate signals (Cited by: 1)
10
Authors: Jianyu WANG, Heng ZHANG, Qiang MIAO. 《Chinese Journal of Aeronautics》 (SCIE, EI, CAS, CSCD), 2024, No. 9, pp. 506-520 (15 pages)
Health monitoring of the electro-mechanical actuator (EMA) is critical to ensuring the safety of airplanes. It is difficult or even impossible to collect enough labeled failure or degradation data from an actual EMA. The autoencoder based on reconstruction loss is a popular model that can perform anomaly detection using only normal training data, but it fails to capture spatio-temporal information from the multivariate time series signals of multiple monitoring sensors. To mine this spatio-temporal information, this paper proposes an attention graph stacked autoencoder for EMA anomaly detection. First, attention graph convolution is introduced into the autoencoder to convolve temporal information from neighbor features into current features according to different attention weights. Second, a stacked autoencoder is applied to mine spatial information from the newly aggregated temporal features. Finally, based on the benchmark reconstruction loss of normal training data, health thresholds calculated from several statistical indicators are used to detect anomalies in new testing data. Compared with the traditional stacked autoencoder, the proposed model obtains a higher fault detection rate and a lower false alarm rate in the EMA anomaly detection experiment.
Keywords: Anomaly detection; spatio-temporal information; Multivariate time series signals; Attention graph convolution; Stacked autoencoder
Optical Flow with Learning Feature for Deformable Medical Image Registration (Cited by: 1)
11
Authors: Jinrong Hu, Lujin Li, Ying Fu, Maoyang Zou, Jiliu Zhou, Shanhui Sun. 《Computers, Materials & Continua》 (SCIE, EI), 2022, No. 5, pp. 2773-2788 (16 pages)
Deformable medical image registration plays a vital role in medical image applications, such as placing different temporal images at the same time point or different modality images into the same coordinate system. Various strategies have been developed to satisfy the increasing needs of deformable medical image registration. One popular registration method estimates the displacement field by computing the optical flow between two images. The motion field (flow field) is computed from either gray values or handcrafted descriptors such as the scale-invariant feature transform (SIFT). These methods assume that illumination is constant between images. However, medical images may not always satisfy this assumption. In this study, we propose a metric learning-based motion estimation method called Siamese Flow for deformable medical image registration. We train metric learners using a Siamese network, which produces an image patch descriptor that guarantees a smaller feature distance between two similar anatomical structures and a larger feature distance between two dissimilar anatomical structures. In the proposed registration framework, the flow field is computed from these features and is close to the real deformation field owing to the excellent feature representation ability of the Siamese network. Experimental results demonstrate that the proposed method outperforms Demons, SIFT Flow, Elastix, and VoxelMorph in registration accuracy and robustness, particularly with large deformations.
Keywords: deformation registration; feature extraction; optical flow; convolutional neural network
A Deformable Network with Attention Mechanism for Retinal Vessel Segmentation
12
Authors: Xiaolong Zhu, Wenjian Li, Weihang Zhang, Dongwei Li, Huiqi Li. 《Journal of Beijing Institute of Technology》 (EI, CAS), 2024, No. 3, pp. 186-193 (8 pages)
The intensive application of deep learning in medical image processing has advanced research on automatic retinal vessel segmentation. To overcome the limitation that traditional U-shaped vessel segmentation networks fail to extract features from fundus images sufficiently, we propose a novel network (DSeU-net) based on deformable convolution and a squeeze-excitation residual module. The deformable convolution dynamically adjusts the receptive field for feature extraction of retinal vessels. The squeeze-excitation residual module scales the weights of the low-level features so that the network efficiently learns the complex relationships among different feature layers. We validate DSeU-net on three public retinal vessel segmentation datasets, DRIVE, CHASEDB1, and STARE, and the experimental results demonstrate the satisfactory segmentation performance of the network.
Keywords: retinal vessel segmentation; deformable convolution; attention mechanism; deep learning
DSD-MatchingNet: Deformable sparse-to-dense feature matching for learning accurate correspondences
13
Authors: Yicheng ZHAO, Han ZHANG, Ping LU, Ping LI, Enhua WU, Bin SHENG. 《Virtual Reality & Intelligent Hardware》, 2022, No. 5, pp. 432-443 (12 pages)
Background: Exploring correspondences across multiview images is the basis of various computer vision tasks. However, most existing methods have limited accuracy under challenging conditions. Method: To learn more robust and accurate correspondences, we propose DSD-MatchingNet for local feature matching. First, we develop a deformable feature extraction module to obtain multilevel feature maps, which harvest contextual information from dynamic receptive fields. The dynamic receptive fields provided by the deformable convolution network ensure that our method obtains dense and robust correspondences. Second, we use sparse-to-dense matching with symmetry of correspondence to implement accurate pixel-level matching, which enables our method to produce more accurate correspondences. Result: Experiments show that the proposed DSD-MatchingNet achieves better performance on the image matching benchmark as well as on the visual localization benchmark. Specifically, our method achieved 91.3% mean matching accuracy on the HPatches dataset and 99.3% visual localization recall on the Aachen Day-Night dataset.
Keywords: Image matching; deformable convolution network; Sparse-to-dense matching
Deep Bi-Directional Adaptive Gating Graph Convolutional Networks for Spatio-Temporal Traffic Forecasting
14
Authors: Xin Wang, Jianhui Lv, Madini O. Alassafi, Fawaz E. Alsaadi, B.D. Parameshachari, Longhao Zou, Gang Feng, Zhonghua Liu. 《Tsinghua Science and Technology》, 2025, No. 5, pp. 2060-2080 (21 pages)
With the advent of deep learning, various deep neural network architectures have been proposed to capture the complex spatio-temporal dependencies in traffic data. This paper introduces a novel Deep Bi-directional Adaptive Gating Graph Convolutional Network (DBAG-GCN) model for spatio-temporal traffic forecasting. The proposed model uses graph convolutional networks to capture the spatial dependencies in the road network topology and incorporates bi-directional gating mechanisms to control the information flow adaptively. Furthermore, we introduce a multi-scale temporal convolution module to capture multi-scale temporal dynamics and a contextual attention mechanism to integrate external factors such as weather conditions and event information. Extensive experiments on real-world traffic datasets demonstrate the superior performance of DBAG-GCN compared to state-of-the-art baselines, achieving significant improvements in prediction accuracy and computational efficiency. The DBAG-GCN model provides a powerful and flexible framework for spatio-temporal traffic forecasting, paving the way for intelligent transportation management and urban planning.
Keywords: traffic forecasting; spatio-temporal modeling; Graph Convolutional Networks (GCNs); adaptive gating
Pore network modeling of gas-water two-phase flow in deformed multi-scale fracture-porous media
15
Authors: Dai-Gang Wang, Yu-Shan Ma, Zhe Hu, Tong Wu, Ji-Rui Hou, Zhen-Chang Jiang, Xin-Xuan Qi, Kao-Ping Song, Fang-zhou Liu. 《Petroleum Science》, 2025, No. 5, pp. 2096-2108 (13 pages)
Two actual rocks drilled from a typical ultra-deep hydrocarbon reservoir in the Tarim Basin are selected for in-situ stress-loading micro-focus CT scanning experiments. Gray images of the rock microstructure at different stress loading stages are obtained. The U-Net fully convolutional neural network is used to achieve fine semantic segmentation of rock skeleton, pore space, and micro-fractures based on CT slice images of the deep rocks. Three-dimensional digital rock models of the deformed multi-scale fractured-porous media at different stress loading stages are then reconstructed, and the equivalent fracture-pore network models are finally extracted to explore the underlying mechanisms of gas-water two-phase flow at the pore scale. Results indicate that, during in-situ stress loading, both deep rocks experience three stages: linear elastic deformation, nonlinear plastic deformation, and shear failure. The micro-mechanical behavior greatly affects the dynamic deformation of rock microstructure and gas-water two-phase flow. In the linear elastic deformation stage, as in-situ stress increases, both deep rocks are gradually compacted, leading to decreases in average pore radius, pore throat ratio, tortuosity, and water-phase relative permeability, while the coordination number remains nearly unchanged. In the plastic deformation stage, the combined influence of rock compaction and the presence of micro-fractures strongly affects pore-throat topological properties and gas-water relative permeability. In the shear failure stage, owing to the generation and propagation of micro-fractures inside the deep rock, topological connectivity improves, fluid flow paths increase, and flow conductivity is promoted, leading to sharp increases in average pore radius and coordination number, rapid decreases in pore throat ratio and tortuosity, and a remarkable improvement in the relative permeability of the gas and water phases.
Keywords: Ultra-deep reservoir; In-situ stress loading; U-Net fully convolutional neural network; CT scanning; Microstructure deformation; Pore-scale fluid flow
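Text-to-image generation method based on multimodal semantic information
16
Authors: 杨冰, 周家辉, 姚金良, 向学勤. 《浙江大学学报(工学版)》 (PKU Core), 2026, No. 2, pp. 360-369 (10 pages)
To address inconsistency between text semantics and image semantics and insufficient image detail, a new text-to-image generation method is proposed. The discrimination criterion is built on multimodal semantic information: real-image semantics are introduced alongside the text semantics to compensate for the low information density of text descriptions, which effectively alleviates missing or distorted details in generated images. Deformable convolution and star-module convolution are integrated into the generator to strengthen its expressive power and improve the detail and overall quality of generated images. To verify the effectiveness of the proposed method, the model is trained and evaluated on the CUB and COCO datasets. Compared with the generative adversarial contrastive language-image pre-training model (GALIP), the proposed method retains efficient generation while showing significant advantages in detail, semantic consistency, and overall quality.
Keywords: text-to-image generation; multimodal semantics; deformable convolution; star-module convolution; semantic alignment discriminator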
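Adaptive graph attention network for micro-expression detection oriented to facial action units
17
Authors: 马飞, 安佳祺, 杨飞霞, 徐光宪. 《计算机科学与探索》 (PKU Core), 2026, No. 4, pp. 1193-1206 (14 pages)
Micro-expression detection aims to locate expression intervals of weak amplitude and short duration in video. The difficulty lies in effectively extracting dynamic correlation features between facial regions and multi-scale temporal features, so as to accurately capture the relationships among subtle movements in different facial regions. To address these problems, a micro-expression detection network that fuses adaptive graph attention with multi-scale variable dilated convolution (AG-DDNet) is proposed. Learnable parameter matrices are introduced to transform key-value pair features; a dynamic adjacency matrix is obtained by computing the similarity between facial-region feature vectors, and weight coefficients between regions are computed with a graph attention mechanism to fuse features dynamically. A multi-scale variable dilated convolution module generates dynamic receptive fields through a predictor that combines adaptive pooling and convolution, enabling multi-scale feature extraction. A natural-gradient optimization mechanism based on the Fisher information matrix is introduced: a Fisher Adam optimizer captures the geometric structure of the parameter space and adapts the learning rate precisely, which significantly strengthens the model's joint detection of micro-expressions and macro-expressions. On the micro-expression detection task, the algorithm improves performance over comparable representative algorithms by 54.20% on the CAS(ME)2 dataset and 20.11% on the SAMM Long Videos dataset. Compared with the latest algorithms, the improvements on the two datasets are 38.43% and 6.81%, respectively, demonstrating the method's superior performance on long-video micro-expression detection.
Keywords: micro-expression detection; adaptive graph attention; multi-scale variable dilated convolution; facial action units; long-video analysis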
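Pavement crack detection based on deformable convolution and attention mechanism
18
Authors: 谢永华, 方育才, 彭银佳. 《计算机工程与设计》 (PKU Core), 2026, No. 1, pp. 279-285 (7 pages)
To address the difficulty of learning image edge features and the interference of background noise in pavement crack detection, an end-to-end trainable pavement crack detection network based on deformable convolution and an attention mechanism is proposed. The network is designed on the U-Net structure: an edge-aware module is added in the feature fusion part to strengthen detection of crack edges; dilated residual modules are used in the encoder to enlarge the receptive field and retain more detail; and an attention mechanism is added in the decoder to increase focus on crack features and suppress background noise. Experimental results show that the network outperforms the other compared networks on all three metrics (MPA, mIoU, and F1), verifying its effectiveness.
Keywords: crack detection; semantic segmentation; encoder-decoder; deformable convolution; dilated convolution; residual connection; attention mechanism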
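Remote sensing image object detection method based on improved YOLOv7
19
Authors: 陈辉, 田博, 赵永红, 瞿海平, 梁建虎. 《兰州理工大学学报》 (PKU Core), 2026, No. 1, pp. 93-100 (8 pages)
To address the problems that remote sensing images contain large numbers of small targets, that targets are densely distributed, and that missed and false detections occur easily, a remote sensing image object detection method based on an improved YOLOv7 model is proposed. First, the DCNv2 structure and a residual structure are introduced into the YOLOv7 model to rebuild the backbone network, strengthening extraction of shallow target features and improving network accuracy. Second, a new feature fusion module is adopted in the neck network, and the SimAM attention mechanism adaptively adjusts the fusion weights between the texture information of shallow features and deep semantic information, suppressing in a more targeted way the noise introduced when extracting shallow features. Finally, the normalized Gaussian Wasserstein distance loss is used as the regression loss function in place of the traditional IoU, improving detection of multi-scale targets. The algorithm achieves a small-target average precision of 20.1% on the DOTAv1.0 dataset and 29.0% on the DIOR dataset. Compared with methods such as YOLOv7 and YOLOv6, it is also strongly competitive.
Keywords: remote sensing image; object detection; deformable convolutional network; SimAM attention mechanism; Gaussian Wasserstein distance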
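The abstract above replaces IoU with a normalized Gaussian Wasserstein distance (NWD) regression loss. The sketch below follows the commonly cited NWD formulation, in which each box (cx, cy, w, h) is modelled as a 2D Gaussian and the squared 2-Wasserstein distance reduces to a squared Euclidean distance between (cx, cy, w/2, h/2) vectors. Whether the listed paper uses exactly this form is an assumption, and the normalising constant `c` here is an arbitrary placeholder that would normally be chosen per dataset.

```python
import math

def nwd(box_a, box_b, c=12.8):
    """Normalized Gaussian Wasserstein distance between (cx, cy, w, h) boxes.

    c is a dataset-dependent constant; 12.8 is only a placeholder here.
    """
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    # Squared 2-Wasserstein distance between the two box Gaussians.
    w2_sq = ((cxa - cxb) ** 2 + (cya - cyb) ** 2
             + ((wa - wb) / 2) ** 2 + ((ha - hb) / 2) ** 2)
    return math.exp(-math.sqrt(w2_sq) / c)

def nwd_loss(box_a, box_b, c=12.8):
    """Regression loss: 0 for identical boxes, approaching 1 as they diverge."""
    return 1.0 - nwd(box_a, box_b, c)

# Two hypothetical small boxes that do not overlap much.
pred = (10.0, 10.0, 4.0, 4.0)
target = (13.0, 14.0, 4.0, 4.0)
print(nwd(pred, target), nwd_loss(pred, target))
```

Unlike IoU, this measure stays smooth and non-zero when two small boxes do not overlap at all, which is the usual motivation for using it on small targets.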
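Series arc fault detection in complex branch circuits based on MTF-DSGT
20
Authors: 余琼芳, 谭文新, 吴琼, 张宇海. 《兵器装备工程学报》 (PKU Core), 2026, No. 3, pp. 235-242, 291 (9 pages)
To address the challenge that series arc faults in complex branch circuits of low-voltage AC distribution systems are hard to detect and can easily cause electrical fires, a detection scheme based on the Markov transition field and deformable convolutional self-guided transformer (MTF-DSGT) is proposed. The Markov transition field converts the one-dimensional current signal into an image; a deformable convolutional network (DCN) extracts local features and a self-guided Transformer captures global information, improving fault recognition accuracy. Experimental results show a detection accuracy of 99.88% in complex branch circuits and a processing time of only 7.78 ms when tested on the Jetson Orin Nano platform. The scheme identifies series arc faults efficiently, supports real-time processing, and is suitable for deployment on edge devices.
Keywords: series arc fault; deformable convolution; self-guided Transformer; Markov transition field; complex branch circuit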
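The abstract above converts a one-dimensional current signal into an image with a Markov transition field (MTF). A minimal plain-Python sketch of the general MTF idea follows: quantise the samples into bins, estimate first-order transition probabilities between bins, then fill entry (i, j) with the probability of moving from the bin of sample i to the bin of sample j. Equal-width binning, the bin count, and the sample signal are simplifying assumptions for illustration (quantile bins are more usual), and nothing here is taken from the paper's implementation.

```python
def markov_transition_field(series, n_bins=4):
    """Return an N x N Markov transition field for a 1-D sequence."""
    lo, hi = min(series), max(series)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 for a constant series
    # Assign each sample to an equal-width bin (the maximum goes in the last bin).
    bins = [min(int((x - lo) / width), n_bins - 1) for x in series]
    # Count transitions between the bins of consecutive samples.
    counts = [[0] * n_bins for _ in range(n_bins)]
    for a, b in zip(bins, bins[1:]):
        counts[a][b] += 1
    # Normalise each row into transition probabilities.
    probs = []
    for row in counts:
        total = sum(row)
        probs.append([c / total if total else 0.0 for c in row])
    # Entry (i, j): probability of moving from sample i's bin to sample j's bin.
    return [[probs[bi][bj] for bj in bins] for bi in bins]

# Hypothetical alternating signal, quantised into two bins.
signal = [0.0, 1.0, 0.0, 1.0, 0.0]
field = markov_transition_field(signal, n_bins=2)
for row in field:
    print(row)
```

The resulting square matrix can be treated as a single-channel image, which is what allows image-oriented networks to be applied to a time series.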