540 articles found
DMHFR: Decoder with Multi-Head Feature Receptors for Tract Image Segmentation
1
Authors: Jianuo Huang, Bohan Lai, Weiye Qiu, Caixu Xu, Jie He. Computers, Materials & Continua, 2025, No. 3, pp. 4841-4862 (22 pages)
The self-attention mechanism of Transformers, which captures long-range contextual information, has demonstrated significant potential in image segmentation. However, their ability to learn local, contextual relationships between pixels requires further improvement. Previous methods face challenges in efficiently managing multi-scale features of different granularities from the encoder backbone, leaving room for improvement in their global representation and feature extraction capabilities. To address these challenges, we propose a novel Decoder with Multi-Head Feature Receptors (DMHFR), which receives multi-scale features from the encoder backbone and organizes them into three feature groups with different granularities: coarse, fine-grained, and full set. These groups are subsequently processed by Multi-Head Feature Receptors (MHFRs) after feature capture and modeling operations. The MHFRs include two Three-Head Feature Receptors (THFRs) and one Four-Head Feature Receptor (FHFR). Each group of features is passed through these MHFRs and then fed into axial transformers, which help the model capture long-range dependencies within the features. The three MHFRs produce three distinct feature outputs. The output from the FHFR serves as auxiliary features in the prediction head, and the prediction outputs and their losses are eventually aggregated. Experimental results show that the Transformer using DMHFR outperforms 15 state-of-the-art (SOTA) methods on five public datasets. Specifically, it achieved significant improvements in mean Dice scores over the classic Parallel Reverse Attention Network (PraNet) method, with gains of 4.1%, 2.2%, 1.4%, 8.9%, and 16.3% on the CVC-ClinicDB, Kvasir-SEG, CVC-T, CVC-ColonDB, and ETIS-LaribPolypDB datasets, respectively.
Keywords: Medical image segmentation; feature exploration; feature aggregation; deep learning; multi-head feature receptor
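The DMHFR abstract reports its gains as mean Dice scores. As a reminder of what that metric measures, here is a minimal sketch of the standard Dice coefficient for binary masks. The function names and the flat 0/1 list representation are assumptions made for illustration; the paper's own evaluation code is not shown in the listing.

```python
def dice_score(pred, target):
    """Dice coefficient between two binary masks given as flat 0/1 sequences."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    # Pixels that are foreground in both masks.
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # Two empty masks agree perfectly; this also avoids dividing by zero.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


def mean_dice(pairs):
    """Average Dice over an iterable of (pred, target) mask pairs."""
    pairs = list(pairs)
    return sum(dice_score(p, t) for p, t in pairs) / len(pairs)
```

A score of 1.0 means the predicted and reference masks coincide, and 0.0 means they share no foreground pixel.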
Implicit Feature Contrastive Learning for Few-Shot Object Detection
2
Authors: Gang Li, Zheng Zhou, Yang Zhang, Chuanyun Xu, Zihan Ruan, Pengfei Lv, Ru Wang, Xinyu Fan, Wei Tan. Computers, Materials & Continua, 2025, No. 7, pp. 1615-1632 (18 pages)
Although conventional object detection methods achieve high accuracy through extensively annotated datasets, acquiring such large-scale labeled data remains challenging and cost-prohibitive in numerous real-world applications. Few-shot object detection presents a new research idea that aims to localize and classify objects in images using only limited annotated examples. However, the inherent challenge in few-shot object detection lies in the insufficient sample diversity to fully characterize the sample feature distribution, which consequently impacts model performance. Inspired by contrastive learning principles, we propose an Implicit Feature Contrastive Learning (IFCL) module to address this limitation and augment feature diversity for more robust representational learning. This module generates augmented support sample features in a mixed feature space and implicitly contrasts them with query Region of Interest (RoI) features. This approach facilitates more comprehensive learning of both intra-class feature similarity and inter-class feature diversity, thereby enhancing the model's object classification and localization capabilities. Extensive experiments on PASCAL VOC show that our method achieves improvements of 3.2%, 1.8%, and 2.3% in the 10-shot setting on the three Novel Sets, respectively, compared to the baseline model FPD.
Keywords: Few-shot learning; object detection; implicit contrastive learning; feature mixing; feature aggregation
Enhancing Classroom Behavior Recognition with Lightweight Multi-Scale Feature Fusion
3
Authors: Chuanchuan Wang, Ahmad Sufril Azlan Mohamed, Xiao Yang, Hao Zhang, Xiang Li, Mohd Halim Bin Mohd Noor. Computers, Materials & Continua, 2025, No. 10, pp. 855-874 (20 pages)
Classroom behavior recognition is an active research topic that plays a vital role in assessing and improving the quality of classroom teaching. However, existing classroom behavior recognition methods struggle to reach high recognition accuracy on datasets with problems such as blurred images and inconsistent objects. To address this challenge, we propose an effective, lightweight object detector called the RFNet model (YOLO-FR). Specifically, for efficient multi-scale feature extraction, an effective feature pyramid shared convolutional (FPSC) module is designed to improve feature extraction from the input image in the backbone by leveraging convolutional layers with varying dilation rates. Secondly, to address the problem of multi-scale variability in the scene, we design the Rep Ghost fusion Cross Stage Partial and Efficient Layer Aggregation Network (RGCSPELAN) to further improve network performance and reduce the amount of computation and the number of parameters. Experimental evaluation was conducted on the SCB dataset3 and STBD-08 datasets. The results indicate that, compared to the baseline model, the RFNet model increases mean average precision (mAP@50) from 69.6% to 71.0% on the SCB dataset3 and from 91.8% to 93.1% on the STBD-08 dataset. RFNet reaches a precision of 68.6%, surpassing the baseline method (YOLOv11) by 3.3%, and achieves the smallest model size (4.9 M) on the SCB dataset3. Finally, comparison with other algorithms shows that it accurately detects student behavior in complex classroom environments, confirming that RFNet is well suited to recognizing classroom behaviors efficiently and in real time.
Keywords: Classroom action recognition; YOLO-FR; feature pyramid shared convolutional; rep ghost cross stage partial efficient layer aggregation network (RGCSPELAN)
Feature-Based Aggregation and Deep Reinforcement Learning: A Survey and Some New Implementations (Cited by: 15)
4
Author: Dimitri P. Bertsekas. IEEE/CAA Journal of Automatica Sinica (EI, CSCD), 2019, No. 1, pp. 1-31 (31 pages)
In this paper we discuss policy iteration methods for approximate solution of a finite-state discounted Markov decision problem, with a focus on feature-based aggregation methods and their connection with deep reinforcement learning schemes. We introduce features of the states of the original problem, and we formulate a smaller "aggregate" Markov decision problem, whose states relate to the features. We discuss properties and possible implementations of this type of aggregation, including a new approach to approximate policy iteration. In this approach the policy improvement operation combines feature-based aggregation with feature construction using deep neural networks or other calculations. We argue that the cost function of a policy may be approximated much more accurately by the nonlinear function of the features provided by aggregation than by the linear function of the features provided by neural network-based reinforcement learning, thereby potentially leading to more effective policy improvement.
Keywords: reinforcement learning; dynamic programming; Markovian decision problems; aggregation; feature-based architectures; policy iteration; deep neural networks; rollout algorithms
Point Cloud Classification Using Content-Based Transformer via Clustering in Feature Space (Cited by: 6)
5
Authors: Yahui Liu, Bin Tian, Yisheng Lv, Lingxi Li, Fei-Yue Wang. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2024, No. 1, pp. 231-239 (9 pages)
Recently, there have been some attempts to apply Transformers to 3D point cloud classification. In order to reduce computation, most existing methods focus on local spatial attention, but ignore the content of the points and fail to establish relationships between distant but relevant points. To overcome the limitation of local spatial attention, we propose a point content-based Transformer architecture, called PointConT for short. It exploits the locality of points in the feature space (content-based): it clusters the sampled points with similar features into the same class and computes self-attention within each class, thus enabling an effective trade-off between capturing long-range dependencies and computational complexity. We further introduce an inception feature aggregator for point cloud classification, which uses parallel structures to aggregate high-frequency and low-frequency information in each branch separately. Extensive experiments show that our PointConT model achieves remarkable performance on point cloud shape classification. In particular, our method achieves 90.3% Top-1 accuracy on the hardest setting of ScanObjectNN. Source code of this paper is available at https://github.com/yahuiliu99/PointConT.
Keywords: Content-based Transformer; deep learning; feature aggregator; local attention; point cloud classification
Multi-view feature fusion for rolling bearing fault diagnosis using random forest and autoencoder (Cited by: 8)
6
Authors: Sun Wenqing, Deng Aidong, Deng Minqiang, Zhu Jing, Zhai Yimeng, Cheng Qiang, Liu Yang. Journal of Southeast University (English Edition) (EI, CAS), 2019, No. 3, pp. 302-309 (8 pages)
To improve the accuracy and robustness of rolling bearing fault diagnosis under complex conditions, a novel method based on multi-view feature fusion is proposed. Firstly, multi-view features from the perspectives of the time domain, frequency domain and time-frequency domain are extracted through the Fourier transform, Hilbert transform and empirical mode decomposition (EMD). Then, the random forest (RF) model is applied to select features which are highly correlated with the bearing operating state. Subsequently, the selected features are fused via the autoencoder (AE) to further reduce the redundancy. Finally, the effectiveness of the fused features is evaluated by the support vector machine (SVM). The experimental results indicate that the proposed method based on multi-view feature fusion can effectively reflect the difference in the state of the rolling bearing and improve the accuracy of fault diagnosis.
Keywords: multi-view features; feature fusion; fault diagnosis; rolling bearing; machine learning
3D Surface Reconstruction of Coarse Aggregate Particles from Occlusion-Free Multi-View Images
7
Authors: GAO Rong, SUN Zhaoyun, GUO Jianxing, LI Wei, YANG Ming, HAO Xueli, YAO Bobin, WANG Huifeng. Wuhan University Journal of Natural Sciences (CAS, CSCD), 2024, No. 4, pp. 301-314 (14 pages)
Rapidly and accurately assessing the geometric characteristics of coarse aggregate particles is crucial for ensuring pavement performance in highway engineering. This article introduces an innovative system for the three-dimensional (3D) surface reconstruction of coarse aggregate particles using occlusion-free multi-view imaging. The system captures synchronized images of particles in free fall, employing a matte sphere and a nonlinear optimization approach to estimate the camera projection matrices. A pre-trained segmentation model is utilized to eliminate the background of the images. The Shape from Silhouettes (SfS) algorithm is then applied to generate 3D voxel data, followed by the Marching Cubes algorithm to construct the 3D surface contour. Validation against standard parts and diverse coarse aggregate particles confirms the method's high accuracy, with an average measurement precision of 0.434 mm and a significant increase in scanning and reconstruction efficiency.
Keywords: 3D shape reconstruction; multi-view imaging; coarse aggregate particles; shape from silhouettes; multi-camera calibration
Modelling the temporal-varied nonlinear velocity profile of debris flow using a stratification aggregation algorithm in 3D-HBP-SPH framework
8
Authors: HAN Zheng, XIE Wendu, ZENG Chuicheng, LI Yange, CHEN Guangqi, CHEN Ningsheng, HU Guisheng, WANG Weidong. Journal of Mountain Science (SCIE, CSCD), 2024, No. 12, pp. 3945-3960 (16 pages)
Estimation of the velocity profile within the mud depth is a long-standing and essential problem in debris flow dynamics. Until now, various velocity profiles have been proposed based on the fitting analysis of experimental measurements, but these are often limited by the observation conditions, such as the number of configured sensors. Therefore, the resulting linear velocity profiles usually exhibit limitations in reproducing the temporal-varied and nonlinear behavior during the debris flow process. In this study, we present a novel approach to explore the debris flow velocity profile in detail, building on our previous 3D-HBP-SPH numerical model, i.e., the three-dimensional Smoothed Particle Hydrodynamics model incorporating the Herschel-Bulkley-Papanastasiou rheology. Specifically, we propose a stratification aggregation algorithm for interpreting the details of SPH particles, which enables the recording of temporal velocities of debris flow at different mud depths. To analyze the velocity profile, we introduce a logarithmic-based nonlinear model with two key parameters: a, which controls the shape of the velocity profile, and b, which concerns its temporal evolution. We verify the proposed velocity profile and explore its sensitivity using 34 sets of velocity data from three individual flume experiments in previous literature. Our results demonstrate that the proposed temporal-varied nonlinear velocity profile outperforms the previous linear profiles.
Keywords: Debris flow; Velocity profile; Temporal varied feature; Nonlinear; Stratification aggregation algorithm
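The abstract above describes aggregating SPH particles by depth so that velocities can be recorded at different mud depths. A minimal sketch of that general idea follows, assuming each particle is reduced to a (height above bed, speed) pair at one time step and that layers are of equal thickness. The function name, the input layout, and the equal-thickness binning are illustrative assumptions, not the authors' algorithm.

```python
def stratified_velocity_profile(particles, flow_depth, n_layers):
    """Average particle speed in each of n_layers equal-thickness depth layers.

    particles: iterable of (z, v) pairs, z = height above the bed, v = speed.
    Returns one mean speed per layer, bottom to top (None for empty layers).
    """
    if flow_depth <= 0 or n_layers <= 0:
        raise ValueError("flow_depth and n_layers must be positive")
    sums = [0.0] * n_layers
    counts = [0] * n_layers
    for z, v in particles:
        if not 0.0 <= z <= flow_depth:
            continue  # ignore particles outside the flow column
        # Map height to a layer index; the free surface falls in the top layer.
        layer = min(int(z / flow_depth * n_layers), n_layers - 1)
        sums[layer] += v
        counts[layer] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]
```

Calling this once per saved time step would give a depth-by-time table of velocities, to which a profile model such as the logarithmic one mentioned in the abstract could then be fitted.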
Online identification and extraction method of regional large-scale adjustable load-aggregation characteristics
9
Authors: Siwei Li, Liang Yue, Xiangyu Kong, Chengshan Wang. Global Energy Interconnection (EI, CSCD), 2024, No. 3, pp. 313-323 (11 pages)
This article introduces the concept of load aggregation, which involves a comprehensive analysis of loads to acquire their external characteristics for the purpose of modeling and analyzing power systems. The online identification method is a computer-involved approach for data collection, processing, and system identification, commonly used for adaptive control and prediction. This paper proposes a method for dynamically aggregating large-scale adjustable loads to support high proportions of new energy integration, aiming to study the aggregation characteristics of regional large-scale adjustable loads using online identification techniques and feature extraction methods. The experiment selected 300 central air conditioners as the research subject and analyzed their regulation characteristics, economic efficiency, and comfort. The experimental results show that as the adjustment time of the air conditioner increases from 5 minutes to 35 minutes, the stable adjustment quantity during the adjustment period decreases from 28.46 to 3.57, indicating that air conditioning loads can be controlled over a long period and have better adjustment effects in the short term. Overall, the experimental results of this paper demonstrate that analyzing the aggregation characteristics of regional large-scale adjustable loads using online identification techniques and feature extraction algorithms is effective.
Keywords: Load aggregation; Regional large-scale; Online recognition; feature extraction method
Feature Fusion Multi-View Hashing Based on Random Kernel Canonical Correlation Analysis (Cited by: 2)
10
Authors: Junshan Tan, Rong Duan, Jiaohua Qin, Xuyu Xiang, Yun Tan. Computers, Materials & Continua (SCIE, EI), 2020, No. 5, pp. 675-689 (15 pages)
Hashing technology reduces data storage and improves the efficiency of the learning system, which has made it more and more widely used in image retrieval. Multi-view data describes image information more comprehensively than traditional methods using a single view. How to use hashing to combine multi-view data for image retrieval is still a challenge. In this paper, a multi-view fusion hashing method based on RKCCA (Random Kernel Canonical Correlation Analysis) is proposed. In order to describe image content more accurately, we use the deep learning dense convolutional network feature DenseNet to construct multiple views by combining it with the GIST feature or the BoW_SIFT (Bag-of-Words model + SIFT feature) feature. The algorithm uses the RKCCA method to fuse multi-view features into association features and applies them to image retrieval. It generates binary hash codes with minimal distortion error by designing quantization regularization terms. A large number of experiments on benchmark datasets show that this method is superior to other multi-view hashing methods.
Keywords: Hashing; multi-view data; random kernel canonical correlation analysis; feature fusion; deep learning
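Hashing-based retrieval, as described in the abstract above, ends with real-valued features being turned into binary codes that are compared by Hamming distance. The sketch below shows only that generic final step, using plain sign thresholding. It does not implement RKCCA or the quantization regularization terms, and all function names are hypothetical.

```python
def binarize(features):
    """Threshold a real-valued feature vector at zero to get a binary code."""
    return tuple(1 if x > 0 else 0 for x in features)


def hamming(a, b):
    """Number of bit positions in which two equal-length codes differ."""
    if len(a) != len(b):
        raise ValueError("codes must have the same length")
    return sum(x != y for x, y in zip(a, b))


def rank_by_hamming(query_code, database_codes):
    """Database indices ordered from nearest to farthest code (ties by index)."""
    return sorted(
        range(len(database_codes)),
        key=lambda i: (hamming(query_code, database_codes[i]), i),
    )
```

Because codes are short bit strings, storage is small and comparisons are cheap, which is the efficiency argument the abstract makes for hashing.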
Multi-Index Image Retrieval Hash Algorithm Based on Multi-View Feature Coding
11
Authors: Rong Duan, Junshan Tan, Jiaohua Qin, Xuyu Xiang, Yun Tan, Neal N. Xiong. Computers, Materials & Continua (SCIE, EI), 2020, No. 12, pp. 2335-2350 (16 pages)
In recent years, with the massive growth of image data, quickly and efficiently matching the images that users need has become a challenge. Compared with a single-view feature, a multi-view feature describes image information more accurately. The advantages of hashing in reducing data storage and improving efficiency also motivate us to study how to apply it effectively to large-scale image retrieval. In this paper, a hash algorithm for multi-index image retrieval based on multi-view feature coding is proposed. By learning the data correlation between different views, this algorithm uses multi-view data with deeper image semantics to achieve better retrieval results. The algorithm uses a quantization hash method to generate binary sequences and uses the hash codes generated from the association features to construct inverted index files for the database, so as to reduce the memory burden and promote efficient matching. In order to reduce the matching error of hash codes and ensure retrieval accuracy, the algorithm uses an inverted multi-index structure instead of a single-index structure. Compared with other advanced image retrieval methods, this method has better retrieval performance.
Keywords: Hashing; multi-view feature; large-scale image retrieval; feature coding; feature matching
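The abstract above contrasts a single inverted index over hash codes with a multi-index structure. One common way to realize that idea, shown here purely as an assumed illustration, is to split each code into equal substrings and keep one inverted table per substring, so that an item is retrieved if it matches the query exactly in at least one substring. The paper's actual index layout is not given in the listing, and every name below is hypothetical.

```python
from collections import defaultdict


def split_code(code, n_tables):
    """Cut a binary code into n_tables equal-length substrings."""
    if len(code) % n_tables != 0:
        raise ValueError("code length must be divisible by n_tables")
    size = len(code) // n_tables
    return [tuple(code[i * size:(i + 1) * size]) for i in range(n_tables)]


def build_index(codes, n_tables):
    """One inverted table per substring position: substring -> item ids."""
    tables = [defaultdict(list) for _ in range(n_tables)]
    for item_id, code in enumerate(codes):
        for table, chunk in zip(tables, split_code(code, n_tables)):
            table[chunk].append(item_id)
    return tables


def search(query, codes, tables):
    """Collect items sharing any substring with the query, then re-rank them
    by full-code Hamming distance (ties broken by item id)."""
    candidates = set()
    for table, chunk in zip(tables, split_code(query, len(tables))):
        candidates.update(table.get(chunk, []))
    return sorted(
        candidates,
        key=lambda i: (sum(a != b for a, b in zip(query, codes[i])), i),
    )
```

With a single table over the full code, a query that differs from a stored code in even one bit finds nothing; splitting the code tolerates bit errors in some substrings, which is one plausible reading of how a multi-index reduces matching error.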
Auto-Weighted Neutrosophic Fuzzy Clustering for Multi-View Data
12
Authors: Zhe Liu, Jiahao Shi, Dania Santina, Yulong Huang, Nabil Mlaiki. Computer Modeling in Engineering & Sciences, 2025, No. 9, pp. 3531-3555 (25 pages)
The increasing prevalence of multi-view data has made multi-view clustering a crucial technique for discovering latent structures from heterogeneous representations. However, traditional fuzzy clustering algorithms show limitations with the inherent uncertainty and imprecision of such data, as they rely on a single-dimensional membership value. To overcome these limitations, we propose an auto-weighted multi-view neutrosophic fuzzy clustering (AW-MVNFC) algorithm. Our method leverages the neutrosophic framework, an extension of fuzzy sets, to explicitly model imprecision and ambiguity through three membership degrees. The core novelty of AW-MVNFC lies in a hierarchical weighting strategy that adaptively learns both the contribution of each data view and the importance of each feature within a view. Through a unified objective function, AW-MVNFC jointly optimizes the neutrosophic membership assignments, cluster centers, and the distributions of view and feature weights. Comprehensive experiments conducted on synthetic and real-world datasets show that our algorithm achieves more accurate and stable clustering than existing methods, demonstrating its effectiveness in handling the complexities of multi-view data.
Keywords: multi-view data; neutrosophic fuzzy clustering; view weight; feature weight; uncertainty
ST-SIGMA: Spatio-temporal semantics and interaction graph aggregation for multi-agent perception and trajectory forecasting (Cited by: 5)
13
Authors: Yang Fang, Bei Luo, Ting Zhao, Dong He, Bingbing Jiang, Qilie Liu. CAAI Transactions on Intelligence Technology (SCIE, EI), 2022, No. 4, pp. 744-757 (14 pages)
Scene perception and trajectory forecasting are two fundamental challenges that are crucial to a safe and reliable autonomous driving (AD) system. However, most proposed methods aim at addressing one of the two challenges mentioned above with a single model. To tackle this dilemma, this paper proposes spatio-temporal semantics and interaction graph aggregation for multi-agent perception and trajectory forecasting (ST-SIGMA), an efficient end-to-end method to jointly and accurately perceive the AD environment and forecast the trajectories of the surrounding traffic agents within a unified framework. ST-SIGMA adopts a trident encoder-decoder architecture to learn scene semantics and agent interaction information on bird's-eye view (BEV) maps simultaneously. Specifically, an iterative aggregation network is first employed as the scene semantic encoder (SSE) to learn diverse scene information. To preserve dynamic interactions of traffic agents, ST-SIGMA further exploits a spatio-temporal graph network as the graph interaction encoder. Meanwhile, a simple yet efficient feature fusion method is designed to fuse semantic and interaction features into a unified feature space as the input to a novel hierarchical aggregation decoder for downstream prediction tasks. Extensive experiments on the nuScenes data set have demonstrated that the proposed ST-SIGMA achieves significant improvements compared to the state-of-the-art (SOTA) methods in terms of scene perception and trajectory forecasting, respectively. The proposed approach thus outperforms SOTA in terms of model generalisation and robustness and is therefore more feasible for deployment in real-world AD scenarios.
Keywords: feature fusion; graph interaction; hierarchical aggregation; scene perception; scene semantics; trajectory forecasting
MIA-UNet: Multi-Scale Iterative Aggregation U-Network for Retinal Vessel Segmentation (Cited by: 2)
14
Authors: Linfang Yu, Zhen Qin, Yi Ding, Zhiguang Qin. Computer Modeling in Engineering & Sciences (SCIE, EI), 2021, No. 11, pp. 805-828 (24 pages)
As an important part of the new generation of information technology, the Internet of Things (IoT) has attracted wide attention and is regarded as an enabling technology for the next generation of health care systems. Fundus photography equipment is connected to the cloud platform through the IoT, so as to realize real-time uploading of fundus images and rapid issuance of diagnostic suggestions by artificial intelligence. At the same time, important security and privacy issues have emerged. The data uploaded to the cloud platform involves personal attributes, health status and medical application data of patients. Once leaked, abused or improperly disclosed, personal information security will be violated. Therefore, it is important to address the security and privacy issues of massive medical and healthcare equipment connecting to the infrastructure of IoT healthcare and health systems. To meet this challenge, we propose MIA-UNet, a multi-scale iterative aggregation U-network, which aims to achieve accurate and efficient retinal vessel segmentation for ophthalmic auxiliary diagnosis while ensuring that the network has low computational complexity to adapt to mobile terminals. In this way, users do not need to upload the data to the cloud platform and can analyze and process the fundus images on their own mobile terminals, thus eliminating the leakage of personal information. Specifically, the interconnection between encoder and decoder, as well as the internal connection between decoder subnetworks in the classic U-Net, are redefined and redesigned. Furthermore, we propose a hybrid loss function to smooth the gradient and deal with the imbalance between foreground and background. Compared with U-Net, the segmentation performance of the proposed network is significantly improved while the number of parameters increases by only 2%. When applied to three publicly available datasets, DRIVE, STARE and CHASE_DB1, the proposed network achieves accuracy/F1-score of 96.33%/84.34%, 97.12%/83.17% and 97.06%/84.10%, respectively. The experimental results show that MIA-UNet is superior to the state-of-the-art methods.
Keywords: Retinal vessel segmentation; security and privacy; redesigned skip connection; feature maps aggregation; hybrid loss function
Supervised Feature Learning for Offline Writer Identification Using VLAD and Double Power Normalization
15
Authors: Dawei Liang, Meng Wu, Yan Hu. Computers, Materials & Continua (SCIE, EI), 2023, No. 7, pp. 279-293 (15 pages)
As an indispensable part of identity authentication, offline writer identification plays a notable role in biology, forensics, and historical document analysis. However, identifying handwriting efficiently, stably, and quickly is still challenging due to the method of extracting and processing handwriting features. In this paper, we propose an efficient system to identify writers through handwritten images, which integrates local and global features from similar handwritten images. The local features are modeled by effective aggregate processing, and global features are extracted through transfer learning. Specifically, the proposed system employs a pre-trained Residual Network to mine the relationship between large image sets and specific handwritten images, while the vector of locally aggregated descriptors with double power normalization is employed in aggregating local and global features. Moreover, handwritten image segmentation, preprocessing, enhancement, optimization of neural network architecture, and normalization for local and global features are exploited, significantly improving system performance. The proposed system is evaluated on the Computer Vision Lab (CVL) datasets and the International Conference on Document Analysis and Recognition (ICDAR) 2013 datasets. The results show that it exhibits good generalizability and achieves state-of-the-art performance. Furthermore, the system performs better when training on complete handwriting patches with the normalization method. The experimental results indicate that it is important to segment handwriting reasonably while dealing with handwriting overlap, which reduces visual burstiness.
Keywords: Writer identification; power normalization; vector of locally aggregated descriptors; feature extraction
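The abstract above aggregates local features with the vector of locally aggregated descriptors (VLAD) followed by power normalization. The sketch below shows the textbook form of that pipeline: sum residuals to the nearest centroid, apply a signed power, then L2-normalize. It applies a single power step with an assumed exponent `alpha=0.5`; how the paper's "double" power normalization differs is not stated in the abstract, so it is not reproduced here. The function name and list-based inputs are illustrative.

```python
import math


def vlad(descriptors, centroids, alpha=0.5):
    """VLAD encoding with signed power normalization and L2 normalization.

    descriptors: list of local feature vectors (lists of floats).
    centroids:   list of codebook vectors of the same dimension.
    """
    dim = len(centroids[0])
    residuals = [[0.0] * dim for _ in centroids]
    for d in descriptors:
        # Assign the descriptor to its nearest centroid (squared Euclidean).
        nearest = min(
            range(len(centroids)),
            key=lambda k: sum((x - c) ** 2 for x, c in zip(d, centroids[k])),
        )
        # Accumulate the residual between descriptor and centroid.
        for j in range(dim):
            residuals[nearest][j] += d[j] - centroids[nearest][j]
    flat = [v for row in residuals for v in row]
    # Signed power: shrink large components, which dampens bursty features.
    powered = [math.copysign(abs(v) ** alpha, v) for v in flat]
    norm = math.sqrt(sum(v * v for v in powered))
    return [v / norm for v in powered] if norm > 0 else powered
```

The output length is the number of centroids times the descriptor dimension, so every image maps to a fixed-size vector regardless of how many local descriptors it has.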
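SCE-YOLO: A Lightweight UAV Visual Detection Algorithm Based on Improved YOLOv8 (Cited by: 3)
16
Authors: 张帅, 王波涛, 涂嘉怡, 陈聪实. 《计算机工程与应用》 (Computer Engineering and Applications; PKU Core), 2025, No. 13, pp. 100-112 (13 pages)
To address the high computational complexity and poor detection performance of object detection models in UAV aerial photography scenarios, a lightweight UAV object detection algorithm based on an improved YOLOv8, named SCE-YOLO, is proposed. STA_C2f replaces the C2f module in the backbone network to improve the model's feature extraction capability. An AIFI module improved with a progressive re-parameterization method serves as the spatial pyramid pooling layer, enabling high-quality interaction between scale features. A multi-scale feature aggregation and diffusion network, UAV_CFDPN, is proposed: the network structure is optimized according to the scale characteristics of small aerial targets, and a feature aggregation module (FAM) together with new feature aggregation and diffusion paths is designed, so that the model obtains rich multi-scale features and contextual information and adapts better to scale variation in object detection. An efficient shared convolution module, ES-Head, is designed to make the model lighter and more efficient while maintaining its localization and classification capability. In tests on the VisDrone2019 dataset, the mAP50 of the proposed SCE-YOLO is 0.5 percentage points lower than that of YOLOv8s, but its parameter count and computation are only 10.0% and 48.8% of those of YOLOv8s, giving it a clear advantage over other advanced algorithms in detection accuracy and lightweight design.
Keywords: object detection; YOLOv8; multi-scale features; feature aggregation; lightweight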
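Dynamic Feature Aggregation and Multi-Level Collaboration for UAV Infrared Target Instance Segmentation (Cited by: 2)
17
Authors: 何自芬, 王启刚, 张印辉, 黄滢, 彭伟, 陈光晨. 《红外与激光工程》 (Infrared and Laser Engineering; PKU Core), 2025, No. 8, pp. 246-258 (13 pages)
To address the drop in segmentation accuracy in UAV infrared imaging caused by blurred image contours at long imaging distances and by changes in target scale, this paper proposes a UAV infrared target instance segmentation model based on dynamic feature aggregation and multi-level collaboration (DFMCNet). First, a region-feature adaptive convolution module (spatial attention dynamic convolution, SADConv) is designed; using dynamic convolution kernels and an attention mechanism, it effectively alleviates the loss of detail caused by feature map dimensionality reduction and suppresses background noise. Second, a feature sensing recombination upsampling module (FRUM) is built, which uses parallel learnable weights to recombine features, preserving spatial features and strengthening attention to spatial structure information while restoring feature map resolution. Finally, a multi-scale context aggregation feature extraction module (MSFE) is introduced to capture multi-scale contextual information through cross-level feature fusion, improving the model's generalization to targets of different sizes. Experiments on the infrared aerial traffic dataset Aerial-Mancar show that DFMCNet reaches an mAP50 of 78.4%, 9.7% higher than the baseline model, and an mAP50-95 of 51.1%, 5.6% higher; compared with YOLOv12n-seg, mAP50 is 7.2% higher. These results verify its effectiveness for accurate infrared target segmentation in UAV infrared scenes.
Keywords: UAV infrared; dynamic convolution kernel; feature recombination; multi-scale aggregation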
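Fine-Grained Point Cloud Classification Based on Hierarchical Feature Enhancement (Cited by: 1)
18
Authors: 白静, 刘路, 郑虎, 蒋金哲. 《浙江大学学报(理学版)》 (Journal of Zhejiang University (Science Edition); PKU Core), 2025, No. 1, pp. 70-80 (11 pages)
To address the insufficient local feature extraction of coarse-grained point cloud classification methods on fine-grained datasets, a 3D fine-grained point cloud classification network based on hierarchical feature enhancement (HFE-Net) is proposed. A point feature enhancement module based on the Veronese map (V-PE) augments the point cloud data, helping the network learn higher-order information about normals and pose. An intra-cluster feature enhancement module based on multi-scale context awareness (CA-IntraCE) uses the K-nearest neighbors (KNN) algorithm at different scales together with cross-attention to enhance features across scales, eliminating the information loss caused by max pooling. An inter-cluster feature enhancement module based on grouped sparse sampling (GSS-InterCE) uses the farthest point sampling (FPS) algorithm to obtain sparse points and applies cross-attention to enhance features across clusters, improving the network's fine-grained discriminative ability. Experiments on the Airplane, Car, and Chair categories of the FG3D dataset show that HFE-Net achieves overall accuracies of 97.40%, 80.53%, and 83.83%, respectively, surpassing the existing best classification frameworks DC-Net and FGPNet, which indicates that HFE-Net has an advantage in classification performance.
Keywords: 3D point cloud; fine-grained classification; cross-attention; feature enhancement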
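The HFE-Net abstract relies on farthest point sampling (FPS) to obtain a sparse set of points. For readers unfamiliar with it, here is a minimal pure-Python sketch of the standard greedy FPS procedure. The function name, the `start` parameter, and the tuple-based point format are illustrative choices, not taken from the paper.

```python
def farthest_point_sampling(points, n_samples, start=0):
    """Greedy FPS: repeatedly pick the point farthest from those already chosen.

    points: list of coordinate tuples; returns the indices of the samples.
    """
    if not 0 < n_samples <= len(points):
        raise ValueError("n_samples must be between 1 and len(points)")

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    selected = [start]
    # Squared distance from every point to its nearest selected point so far.
    min_dist = [sq_dist(p, points[start]) for p in points]
    while len(selected) < n_samples:
        nxt = max(range(len(points)), key=lambda i: min_dist[i])
        selected.append(nxt)
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], sq_dist(p, points[nxt]))
    return selected
```

Each step costs one pass over the cloud, so sampling m of n points takes O(mn) distance evaluations; the result spreads samples evenly over the shape rather than following point density.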
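Image Inpainting with Edge-Texture Dual-Feature Aggregation Based on Structure Transformation Completion
19
Authors: 张荣国, 文译浩, 胡静, 王丽芳, 刘小君. 《模式识别与人工智能》 (Pattern Recognition and Artificial Intelligence; PKU Core), 2025, No. 5, pp. 397-411 (15 pages)
Existing neural networks still produce unreasonable edge structures and incomplete textures when inpainting the missing regions of damaged images. This paper therefore proposes an image inpainting method with edge-texture dual-feature aggregation based on structure transformation completion. First, a structure transformation completer based on axial attention and a contextual Transformer is designed and combined with a structure smoother to further complete and refine edge structures, strengthening the capture of local edge details and global structure, suppressing edge noise and artifacts, and repairing damaged edge structures. Then, an edge-guided feature aligner and an edge-texture dual-feature aggregator are built to adaptively learn scale and shift parameters, which effectively resolves the scale-shift problem that arises when edge structure features and texture features are dynamically aggregated at different levels of the feature space and improves the overall quality of inpainting. Finally, experiments on three datasets demonstrate the feasibility and effectiveness of the proposed method.
Keywords: image inpainting; edge guidance; structure completion; feature space; dual-feature aggregation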
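Boundary-Guided Video Salient Object Detection with Multi-Feature Aggregation
20
Authors: 张荣国, 郑晓鸽, 王丽芳, 胡静, 刘小君. 《中国图象图形学报》 (Journal of Image and Graphics; PKU Core), 2025, No. 4, pp. 1141-1154 (14 pages)
Objective: Video salient object detection aims to identify and highlight important objects or regions in a video. Existing methods fall short in mining the correlation between boundary cues and spatio-temporal features, and they do not fully consider relevant contextual information during feature aggregation, which makes detection results imprecise. A boundary-guided network with multi-feature aggregation is therefore proposed, in which salient object boundary information and salient object spatio-temporal information complement each other. Method: First, the spatial and motion features of salient objects are extracted from video frames, and salient object boundary features are coupled with salient object spatio-temporal features at different resolutions, highlighting the boundary features of moving objects and localizing video salient objects more accurately. Second, a multi-layer feature attention aggregation module is used to improve the representational power of the different features so that each is fully exploited. In addition, a hybrid loss is used during training to help the network learn to segment the salient boundary regions of moving objects more accurately and obtain the desired salient objects. Results: The method was compared with five existing methods on four datasets, and its F-measure exceeded that of the compared methods on all four. On the DAVIS (densely annotated video segmentation) dataset, F-measure was 0.2% higher than that of the second-best model, while S-measure was 0.7% below the best value. On the FBMS (Freiburg-Berkeley motion segmentation) dataset, F-measure was 0.9% higher than the second-best value. On the ViSal dataset, the mean absolute error (MAE) trailed the best method, STVS (spatial temporal video salient), by only 0.1%, while F-measure was 0.2% higher than that of STVS. On the MCL dataset, the proposed method achieved the best MAE of 2.2%, and its S-measure and F-measure were 1.6% and 0.6% higher, respectively, than those of the second-best method, SSAV (saliency-shift aware VSOD). Conclusion: The proposed method effectively improves the boundary quality of detected video salient objects.
Keywords: video images; salient object detection; boundary guidance; multi-scale features; feature aggregation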