A Fine-Grained Recognition Model based on Discriminative Region Localization and Efficient Second-Order Feature Encoding
1
Authors: Xiaorui Zhang, Yingying Wang, Wei Sun, Shiyu Zhou, Haoming Zhang, Pengpai Wang. 《Computers, Materials & Continua》, 2026, Issue 4, pp. 946-965 (20 pages)
Discriminative region localization and efficient feature encoding are crucial for fine-grained object recognition. However, existing data augmentation methods struggle to accurately locate discriminative regions when faced with complex backgrounds, small target objects, and limited training data, leading to poor recognition. Fine-grained images exhibit “small inter-class differences,” and while second-order feature encoding enhances discrimination, it often requires dual Convolutional Neural Networks (CNN), increasing training time and complexity. This study proposes a model integrating discriminative region localization and efficient second-order feature encoding. By ranking feature map channels via a fully connected layer, it selects high-importance channels to generate an enhanced map, accurately locating discriminative regions. Cropping and erasing augmentations further refine recognition. To improve efficiency, a novel second-order feature encoding module generates an attention map from the fourth convolutional group of Residual Network 50 layers (ResNet-50) and multiplies it with features from the fifth group, producing second-order features while reducing dimensionality and training time. Experiments on the Caltech-University of California, San Diego Birds-200-2011 (CUB-200-2011), Stanford Car, and Fine-Grained Visual Classification of Aircraft (FGVC Aircraft) datasets show state-of-the-art accuracy of 88.9%, 94.7%, and 93.3%, respectively.
Keywords: Fine-grained recognition; feature encoding; data augmentation; second-order feature; discriminative regions
Enhancing the genomic prediction accuracy of swine agricultural economic traits using an expanded one-hot encoding in CNN models (cited by: 1)
2
Authors: Zishuai Wang, Wangchang Li, Zhonglin Tang. 《Journal of Integrative Agriculture》, 2025, Issue 9, pp. 3574-3582 (9 pages)
Deep learning (DL) methods like multilayer perceptrons (MLPs) and convolutional neural networks (CNNs) have been applied to predict complex traits in animal and plant breeding. However, improving the genomic prediction accuracy still presents significant challenges. In this study, we applied CNNs to predict swine traits using previously published data. Specifically, we extensively evaluated the CNN model's performance by employing various sets of single nucleotide polymorphisms (SNPs) and concluded that the CNN model achieved optimal performance when utilizing SNP sets comprising 1,000 SNPs. Furthermore, we adopted a novel approach using the one-hot encoding method that transforms the 16 different genotypes into sets of eight binary variables. This innovative encoding method significantly enhanced the CNN's prediction accuracy for swine traits, outperforming the traditional one-hot encoding techniques. Our findings suggest that the expanded one-hot encoding method can improve the accuracy of DL methods in the genomic prediction of swine agricultural economic traits. This discovery has significant implications for swine breeding programs, where genomic prediction is pivotal in improving breeding strategies. Furthermore, future research endeavors can explore additional enhancements to DL methods by incorporating advanced data pre-processing techniques.
Keywords: swine; agricultural economic traits; genomic prediction; deep learning; one-hot encoding; convolutional neural networks (CNNs)
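The abstract above does not spell out how 16 genotypes become eight binary variables. One reading that fits those numbers, offered here only as an assumption and not as the authors' scheme, is to treat a genotype as an ordered pair of alleles drawn from A, C, G, T and to concatenate one 4-bit one-hot vector per allele. The names `NUCLEOTIDES`, `one_hot_allele`, and `encode_genotype` are hypothetical.

```python
# Illustrative sketch only: one possible "expanded" one-hot layout.
# Assumption: a genotype is an ordered pair of alleles from A/C/G/T,
# which gives 4 * 4 = 16 genotypes and 4 + 4 = 8 binary variables.
NUCLEOTIDES = "ACGT"


def one_hot_allele(allele):
    """Return a 4-element one-hot list for a single allele."""
    return [1 if allele == n else 0 for n in NUCLEOTIDES]


def encode_genotype(genotype):
    """Encode a two-letter genotype such as 'AG' as 8 binary variables."""
    if len(genotype) != 2 or any(a not in NUCLEOTIDES for a in genotype):
        raise ValueError(f"unsupported genotype: {genotype!r}")
    first, second = genotype
    return one_hot_allele(first) + one_hot_allele(second)


# All 16 ordered allele pairs.
ALL_GENOTYPES = [a + b for a in NUCLEOTIDES for b in NUCLEOTIDES]

if __name__ == "__main__":
    for g in ALL_GENOTYPES:
        print(g, encode_genotype(g))
```

Under this layout every genotype maps to a distinct 8-bit vector with exactly two bits set, whereas a conventional one-hot encoding of the same 16 classes would need 16 variables per SNP.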
Joint Feature Encoding and Task Alignment Mechanism for Emotion-Cause Pair Extraction
3
Authors: Shi Li, Didi Sun. 《Computers, Materials & Continua》 (SCIE, EI), 2025, Issue 1, pp. 1069-1086 (18 pages)
With the rapid expansion of social media, analyzing emotions and their causes in texts has gained significant importance. Emotion-cause pair extraction enables the identification of causal relationships between emotions and their triggers within a text, facilitating a deeper understanding of expressed sentiments and their underlying reasons. This comprehension is crucial for making informed strategic decisions in various business and societal contexts. However, recent research approaches employing multi-task learning frameworks for modeling often face challenges such as the inability to simultaneously model extracted features and their interactions, or inconsistencies in label prediction between emotion-cause pair extraction and independent assistant tasks like emotion and cause extraction. To address these issues, this study proposes an emotion-cause pair extraction methodology that incorporates joint feature encoding and task alignment mechanisms. The model consists of two primary components: First, joint feature encoding simultaneously generates features for emotion-cause pairs and clauses, enhancing feature interactions between emotion clauses, cause clauses, and emotion-cause pairs. Second, the task alignment technique is applied to reduce the labeling distance between emotion-cause pair extraction and the two assistant tasks, capturing deep semantic information interactions among tasks. The proposed method is evaluated on a Chinese benchmark corpus using 10-fold cross-validation, assessing key performance metrics such as precision, recall, and F1 score. Experimental results demonstrate that the model achieves an F1 score of 76.05%, surpassing the state-of-the-art by 1.03%. The proposed model exhibits significant improvements in emotion-cause pair extraction (ECPE) and cause extraction (CE) compared to existing methods, validating its effectiveness. This research introduces a novel approach based on joint feature encoding and task alignment mechanisms, contributing to advancements in emotion-cause pair extraction. However, the study’s limitation lies in the data sources, potentially restricting the generalizability of the findings.
Keywords: Emotion-cause pair extraction; interactive information enhancement; joint feature encoding; label consistency; task alignment mechanisms
Dual encoding feature filtering generalized attention UNET for retinal vessel segmentation
4
Authors: ISLAM Md Tauhidul, WU Da-Wen, TANG Qing-Qing, ZHAO Kai-Yang, YIN Teng, LI Yan-Fei, SHANG Wen-Yi, LIU Jing-Yu, ZHANG Hai-Xian. 《四川大学学报(自然科学版)》 (PKU Core), 2025, Issue 1, pp. 79-95 (17 pages)
Retinal blood vessel segmentation is crucial for diagnosing ocular and cardiovascular diseases. Although the introduction of U-Net in 2015 by Olaf Ronneberger significantly advanced this field, issues like limited training data, imbalanced data distribution, and inadequate feature extraction persist, hindering both segmentation performance and model generalization. To address these critical issues, the DEFFA-Unet is proposed, featuring an additional encoder to process domain-invariant pre-processed inputs, thereby yielding richer feature encoding and enhanced model generalization. A feature filtering fusion module is developed to ensure precise feature filtering and robust hybrid feature fusion. In response to the task-specific need for higher precision where false positives are very costly, traditional skip connections are replaced with the attention-guided feature reconstructing fusion module. Additionally, innovative data augmentation and balancing methods are proposed to counter data scarcity and distribution imbalance, further boosting the robustness and generalization of the model. With a comprehensive suite of evaluation metrics, extensive validations on four benchmark datasets (DRIVE, CHASEDB1, STARE, and HRF) and an SLO dataset (IOSTAR) demonstrate the proposed method’s superiority over both baseline and state-of-the-art models. In particular, the proposed method significantly outperforms the compared methods in cross-validation model generalization.
Keywords: Vessel segmentation; Data balancing; Data augmentation; Dual encoder; Attention Mechanism; Model generalization
Encoding converters for quantum communication networks
5
Authors: Hua-Xing Xu, Shao-Hua Wang, Ya-Qi Song, Ping Zhang, Chang-Lei Wang. 《Chinese Physics B》, 2025, Issue 5, pp. 64-69 (6 pages)
Quantum communication networks, such as quantum key distribution (QKD) networks, typically employ the measurement-resend mechanism between two users using quantum communication devices based on different quantum encoding types. To achieve direct communication between devices with different quantum encoding types, in this paper we propose encoding conversion schemes between the polarization bases (rectilinear, diagonal and circular bases) and the time-bin phase bases (two phase bases and time-bin basis) and design the quantum encoding converters. The theoretical analysis of the encoding conversion schemes is given in detail, and the basis correspondence of encoding conversion and the property of bit flip are revealed. The conversion relationship between polarization bases and time-bin phase bases can be easily selected by controlling a phase shifter. Since no optical switches are used in our scheme, the converter can be operated at high speed. The converters can also be modularized, which may be utilized to realize miniaturization in the future.
Keywords: quantum communication networks; encoding conversion; polarization encoding; time-bin phase encoding
A Blockchain-Based Covert Communication Model Based on Dynamic Base-K Encoding
6
Authors: Wang Zhujun, Zhang Lejun, Li Xueqing, Tian Zhihong, Su Shen, Qiu Jing, Chen Huiling, Qiu Tie, Sergey Gataullin, Guo Ran. 《China Communications》, 2025, Issue 6, pp. 319-333 (15 pages)
Blockchain, as a distributed ledger, inherently possesses tamper-resistant capabilities, creating a natural channel for covert communication. However, the immutable nature of data storage might introduce challenges to communication security. This study introduces a blockchain-based covert communication model utilizing dynamic Base-K encoding. The proposed encoding scheme utilizes the input address sequence to determine K to encode the secret message and determines the order of transactions based on K, thus ensuring effective concealment of the message. The dynamic encoding parameters enhance flexibility and address issues related to identical transaction amounts for the same secret message. Experimental results demonstrate that the proposed method maintains smooth communication and low susceptibility to tampering, achieving commendable concealment and embedding rates.
Keywords: base-K encoding; blockchain; concealment; covert communication
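As a rough illustration of the base-K step mentioned in the abstract above, the sketch below converts a byte string to base-K digits and back. It is generic radix conversion under stated assumptions, not the paper's scheme: how K is derived from the input address sequence and how digits map to transaction order are not described in the abstract, so `k` is simply a caller-supplied parameter, and `to_base_k` and `from_base_k` are hypothetical names.

```python
# Illustrative sketch only: generic base-K digit encoding of a byte string.
# Assumption: K is given; deriving K from blockchain input addresses, as the
# abstract describes, is outside the scope of this example.


def to_base_k(message: bytes, k: int) -> list:
    """Return the base-k digits (most significant first) of the message."""
    if k < 2:
        raise ValueError("k must be at least 2")
    value = int.from_bytes(message, "big")
    digits = []
    while value > 0:
        value, remainder = divmod(value, k)
        digits.append(remainder)
    # An all-zero (or empty) message is represented by a single 0 digit.
    return digits[::-1] or [0]


def from_base_k(digits, k: int, length: int) -> bytes:
    """Rebuild a message of `length` bytes from its base-k digits."""
    value = 0
    for d in digits:
        if not 0 <= d < k:
            raise ValueError(f"digit {d} is out of range for base {k}")
        value = value * k + d
    return value.to_bytes(length, "big")


if __name__ == "__main__":
    secret = b"hi"
    digits = to_base_k(secret, 7)
    print(digits, from_base_k(digits, 7, len(secret)))
```

Because the same message yields different digit sequences for different K, a K that changes per transmission changes the carrier pattern, which is the intuition behind "dynamic" encoding parameters; the receiver must know both K and the message length to decode.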
Image encoding-based bearing fault diagnosis:Review and challenges for high-speed trains
7
Authors: Huimin Li, Lingfeng Li, Bin Liu, Ge Xin. 《High-Speed Railway》, 2025, Issue 3, pp. 251-259 (9 pages)
High-Speed Trains (HSTs) have emerged as a mainstream mode of transportation in China, owing to their exceptional safety and efficiency. Ensuring the reliable operation of HSTs is of paramount economic and societal importance. Because bearings are critical rotating mechanical components of the transmission system, their fault diagnosis has attracted extensive attention. This paper provides a systematic review of image encoding-based bearing fault diagnosis methods tailored to the condition monitoring of HSTs. First, it categorizes the image encoding techniques applied in the field of bearing fault diagnosis. Then, it reviews state-of-the-art studies, encompassing both monomodal image conversion and multimodal image fusion approaches. Finally, it highlights current challenges and proposes future research directions to advance intelligent fault diagnosis in HSTs, aiming to provide a valuable reference for researchers and engineers in the field of intelligent operation and maintenance.
Keywords: High-speed trains; Image encoding; Fault diagnosis; Rotating machinery; Condition monitoring
Validity of the Gaussian phase distribution approximation for analysis of isotropic diffusion encoding applied to restricted diffusion in a cylinder
8
Author: Daniel Topgaard. 《Magnetic Resonance Letters》, 2025, Issue 4, pp. 20-27 (8 pages)
The Gaussian phase distribution approximation enables analysis of restricted diffusion encoded by general gradient waveforms but fails to account for the diffraction-like features that may occur for simple pore geometries. We investigate the range of validity of the approximation by random walk simulations of restricted diffusion in a cylinder using isotropic diffusion encoding sequences as well as conventional single gradient pulse pairs and oscillating gradient waveforms. The results show that clear deviations from the approximation may be observed at relative signal attenuations below 0.1 for one-dimensional sequences with few oscillation periods. Increasing the encoding dimensionality and/or number of oscillations while extending the total duration of the waveform diminishes the non-Gaussian effects while preserving the low apparent diffusivities characteristic of restriction.
Keywords: NMR; diffusion; Porous media; Pulsed gradient spin echo; Tensor-valued encoding
Faulty-feeder Detection Based on Sparse Waveform Encoding and Simple Convolutional Neural Network with Multi-scale Filters and One Layer of Convolution
9
Authors: Jiawei Yuan, Tong Wu, Zaibin Jiao. 《CSEE Journal of Power and Energy Systems》, 2025, Issue 5, pp. 2150-2164 (15 pages)
Faulty-feeder detection in neutral point non-effectively grounded distribution networks consistently attracts research attention since it directly affects the quality and safety of energy supply. Most modern research on faulty-feeder detection tends to apply more complex digital signal processing techniques and deeper neural networks in order to better extract and learn as many detailed characteristics as possible. However, these approaches may easily result in overfitting and high computational cost, which cannot meet requirements for detection accuracy and efficiency in practical applications. This paper proposes an innovative waveform encoding method and details a simple convolutional neural network (CNN) with one layer of convolution used for identification, which seeks to improve detection accuracy and efficiency simultaneously. First, the sparse characteristics of waveforms are utilized to encode them into compact vectors, and a waveform-vector matrix is generated. Second, to process the waveform-vector matrix, a simple CNN with multi-scale filters and one layer of convolution is established. Finally, a methodology for faulty-feeder detection is proposed, and both detection accuracy and efficiency are considerably enhanced. Comparative studies have confirmed the clear superiority of the developed method, which outperforms existing approaches in both detection accuracy and efficiency, thus highlighting its significant potential for application.
Keywords: Convolutional neural network; faulty-feeder detection; multi-scale filters; sparse waveform encoding
Enhanced Multimodal Sentiment Analysis via Integrated Spatial Position Encoding and Fusion Embedding
10
Authors: Chenquan Gan, Xu Liu, Yu Tang, Xianrong Yu, Qingyi Zhu, Deepak Kumar Jain. 《Computers, Materials & Continua》, 2025, Issue 12, pp. 5399-5421 (23 pages)
Multimodal sentiment analysis aims to understand emotions from text, speech, and video data. However, current methods often overlook the dominant role of text and suffer from feature loss during integration. Given the varying importance of each modality across different contexts, a central and pressing challenge in multimodal sentiment analysis lies in maximizing the use of rich intra-modal features while minimizing information loss during the fusion process. In response to these limitations, we propose a novel framework that integrates spatial position encoding and fusion embedding modules. In our model, text is treated as the core modality, while speech and video features are selectively incorporated through a unique position-aware fusion process. The spatial position encoding strategy preserves the internal structural information of speech and visual modalities, enabling the model to capture localized intra-modal dependencies that are often overlooked. This design enhances the richness and discriminative power of the fused representation, enabling more accurate and context-aware sentiment prediction. Finally, we conduct comprehensive evaluations on two widely recognized standard datasets in the field, CMU-MOSI and CMU-MOSEI, to validate the performance of the proposed model. The experimental results demonstrate that our model exhibits good performance and effectiveness for sentiment analysis tasks.
Keywords: Multimodal sentiment analysis; spatial position encoding; fusion embedding; feature loss reduction
Autonomous inverse encoding guides 4D nanoprinting for highly programmable shape morphing
11
Authors: Shuaiqi Ren, Zhiang Zhang, Ruokun He, Jiahao Fan, Guangming Wang, Hesheng Wang, Bing Han, Yong-Lai Zhang, Zhuo-Chen Ma. 《International Journal of Extreme Manufacturing》, 2025, Issue 3, pp. 467-482 (16 pages)
Highly programmable shape morphing of 4D-printed micro/nanostructures is urgently desired for applications in robotics and intelligent systems. However, due to the lack of autonomous holistic strategies spanning target shape input, optimal material distribution generation, and fabrication program output, 4D nanoprinting that permits arbitrary shape morphing remains a challenging task for manual design. In this study, we report an autonomous inverse encoding strategy to decipher the genetic code for material property distributions that can guide the encoded modeling toward arbitrarily pre-programmed 4D shape morphing. By tuning the laser power of each voxel at the nanoscale, the genetic code can be spatially programmed and controllable shape morphing can be realized through the inverse encoding process. Using this strategy, the 4D-printed structures can be designed and accurately shift to the target morphing of arbitrarily hand-drawn lines under stimulation. Furthermore, as a proof-of-concept, a flexible fiber micromanipulator that can approach the target region through pre-programmed shape morphing is autonomously inversely encoded according to the localized spatial environment. This strategy may contribute to the modeling and arbitrary shape morphing of micro/nanostructures fabricated via 4D nanoprinting, leading to cutting-edge applications in microfluidics, micro-robotics, minimally invasive robotic surgery, and tissue engineering.
Keywords: femtosecond laser fabrication; 4D printing; two-photon polymerization; autonomous inverse encoding; stimuli-responsive materials
Improved Sensitivity Encoding Parallel Magnetic Resonance Imaging Reconstruction Algorithm Based on Efficient Sum of Outer Products Dictionary Learning
12
Authors: DUAN Jizhong, SU Yan. 《Journal of Shanghai Jiaotong University (Science)》, 2025, Issue 3, pp. 561-571 (11 pages)
Sensitivity encoding (SENSE) is a parallel magnetic resonance imaging (MRI) reconstruction model that utilizes the sensitivity information of receiver coils to achieve image reconstruction. Existing SENSE-based reconstruction algorithms usually use nonadaptive sparsifying transforms, resulting in limited reconstruction accuracy. Therefore, we propose a new model for accurate parallel MRI reconstruction, called SOUPDIL-SENSE, which combines the L0 norm regularization term based on the efficient sum of outer products dictionary learning (SOUPDIL) with the SENSE model. The SOUPDIL-SENSE model is mainly solved by utilizing the variable splitting and alternating direction method of multipliers techniques. The experimental results on four human datasets show that the proposed algorithm effectively promotes image sparsity, eliminates the noise and artifacts of the reconstructed images, and improves the reconstruction accuracy.
Keywords: parallel magnetic resonance imaging (MRI); sensitivity encoding (SENSE); efficient sum of outer products dictionary learning (SOUPDIL); alternating direction method of multipliers
3D Human Pose Estimation Algorithm Based on Multi-Scale Encoder Fusion
13
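Authors: 包晓安, 陈恩琳, 张娜, 涂小妹, 吴彪, 张庆琪. 《浙江大学学报(工学版)》 (PKU Core), 2026, Issue 3, pp. 565-573, 584 (10 pages)
To resolve the conflict between interference from redundant information and the need for information completeness, a 3D human pose estimation method based on multi-scale encoder fusion is proposed. The method consists of a key-frame spatio-temporal encoder (KFSTE) and a global retentive self-attention encoder (GRSAE). KFSTE filters the skeleton feature sequence with a key-frame selector and then uses a temporal encoder to obtain local spatio-temporal modeling. GRSAE uses a retention encoder to perform global single-stage encoding and obtain global skeleton sequence features, avoiding the information loss caused by key-frame selection bias. The features of the two encoders are concatenated and passed through regression to predict the 3D human pose coordinates. Experimental results show that on the relatively large Human3.6M dataset, the mean per-joint position error (MPJPE) of the proposed method is 3% lower than that of MixSTE, and the method achieves the best result on 11 actions.
Keywords: 3D human pose estimation; spatio-temporal encoder; key-frame extraction; retentive self-attention encoding; multi-encoder feature fusion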
Online Action Detection Based on Linear Attention and Category-Associated Feature Learning (cited by: 1)
14
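Authors: 詹永照, 孙慧敏, 夏惠芬, 任晓鹏. 《江苏大学学报(自然科学版)》 (PKU Core), 2026, Issue 1, pp. 39-47, 63 (10 pages)
To make full and appropriate use of an action's context features, category-associated features, and predicted future features for fast detection in online action detection, an online action detection method based on linear attention and category-associated feature learning is proposed. The method improves the Transformer architecture by using a lightweight linear self-attention based on the Hadamard product to learn video context features in the Transformer, reducing computational cost. Next, the action features of the training samples are clustered, and the context features of the video sequence are learned in association with the action category features, which yields an effective category-associated feature representation. Finally, the context features, category-associated features, and predicted future features of the action are fused to detect the action at each time step, improving action discriminability. Performance experiments were conducted on typical datasets: hyperparameter values were analyzed, and the accuracy and runtime efficiency of different methods were compared. Ablation experiments and visualization analyses are also reported. The results show that on the Thumos14 (TSN-Anet), Thumos14 (TSN-Kinetics), and HDD datasets, the mAP of the proposed method is 0.2, 0.5, and 0.2 percentage points higher than that of the Colar method, respectively, indicating that the new method outperforms Colar, currently one of the more advanced methods.
Keywords: online action detection; deep learning; attention mechanism; encoding; context features; Transformer; category-associated feature learning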
Network Intrusion Detection Method Based on a Bidirectional Temporal Window Transformer
15
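Authors: 王长浩, 王明阳, 丁磊, 刘凯. 《计算机应用研究》 (PKU Core), 2026, Issue 1, pp. 271-279 (9 pages)
In recent years, highly dynamic and stealthy network attacks have posed a serious threat to the security and stability of the Internet. To address the insufficient accuracy of local temporal modeling and the poor recognition of minority classes in multi-class settings in existing network intrusion detection methods, a network anomalous-traffic detection method based on a bidirectional temporal sliding-window Transformer is proposed. The method converts network traffic data into three-dimensional sequence data that emphasizes temporal relationships, and introduces learnable embedding encoding and contextual position encoding to strengthen the representation of sequence features, improving the accuracy and stability of anomalous-traffic detection. It was validated on the public UNSW-NB15 and CIC-IDS-2017 datasets. Experimental results show that the proposed method performs well on both: detection accuracy reaches 99.79% and 99.77%, respectively, in the binary classification task, and 98.48% and 99.76%, respectively, in the multi-class task, significantly higher than other advanced deep learning models. In summary, the method effectively improves the accuracy of anomalous-traffic detection and the ability to recognize minority-class attacks, providing a new technical means for network security protection.
Keywords: intrusion detection; network traffic; bidirectional temporal window; contextual position encoding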
Robot Scene Understanding and Reasoning Based on Distributed Knowledge Encoding
16
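Authors: 王海涛, 张少林, 蒋天雨, 葛悦光, 崔少伟, 王硕. 《科学技术与工程》 (PKU Core), 2026, Issue 9, pp. 3634-3641 (8 pages)
The ability to understand and reason in complex scenes is one of the key indicators of a robot's intelligence. However, existing methods mostly focus on question-answering tasks in static environments and struggle to handle the multi-step reasoning required in dynamic scenes. A robot scene understanding and reasoning method based on distributed knowledge encoding is proposed, which strengthens the robot's scene understanding by building a scene knowledge graph with a two-layer concept-attribute structure. Entity concepts are represented as attribute sets through distributed embedding, making the semantic representation compact and computable. In addition, a classifier is designed to infer, from the initial scene and the target scene, the sequence of robot operations needed to realize the scene transformation. The proposed distributed knowledge encoding is highly interpretable and can decompose a multi-step scene transformation into an executable sequence of single-step operations, so that reasoning complexity grows linearly with the number of transformation steps. Experimental results on the public Trance dataset show that, compared with GPT-4o, the proposed method achieves better sensitivity in recognizing scene transformations and better multi-step reasoning performance. Its feasibility and robustness in real scenes were also verified on a real robot platform.
Keywords: scene understanding; knowledge graph; distributed knowledge encoding; action sequence prediction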
Object Detection Algorithm Based on a Multi-Directional Perception Deep Fusion Detection Head
17
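Authors: 包晓安, 彭书友, 张娜, 涂小妹, 张庆琪, 吴彪. 《浙江大学学报(工学版)》 (PKU Core), 2026, Issue 1, pp. 32-42 (11 pages)
To address the difficulty conventional object detection heads have in capturing global information effectively, an object detection algorithm based on a multi-directional perception deep fusion detection head is proposed. An efficient dual-axis window attention encoder (EDWE) module is designed in the detection head so that the network can deeply fuse the captured global and local information. A re-parameterized large-kernel convolution (RLK) module is applied after the feature pyramid structure to reduce the feature-space discrepancy of features coming from the backbone network and to improve the network's adaptability to small and medium-sized datasets. An encoder selective-retention module (ESM) is introduced to selectively accumulate the outputs of the EDWE module and optimize backpropagation. Experimental results show that on the larger MS-COCO2017 dataset, applying the proposed algorithm to the common models RetinaNet, FCOS, and ATSS raises AP by 2.9, 2.6, and 3.4 percentage points, respectively; on the smaller PASCAL VOC2007 dataset, it raises the AP of the three models by 1.3, 1.0, and 1.1 percentage points, respectively. Through the combined effect of the EDWE, RLK, and ESM modules, the proposed algorithm effectively improves object detection accuracy and shows clear performance gains on datasets of different scales.
Keywords: detection head; object detection; Transformer encoder; deep fusion; large-kernel convolution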
Electromagnetic Propagation Loss Prediction Algorithm Based on a CNN-Transformer Architecture
18
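Authors: 万勇, 李骏杰, 孙伟峰, 戴永寿. 《现代电子技术》 (PKU Core), 2026, Issue 6, pp. 43-48 (6 pages)
To address the insufficient prediction accuracy of traditional empirical propagation loss models, an electromagnetic propagation loss prediction algorithm based on a CNN-Transformer architecture is proposed, which builds a regression model for accurate propagation loss prediction. Effective features are selected with the Spearman coefficient method, and a CNN extracts shallow features that are highly relevant to propagation loss prediction. The sequence of ground-object features along the propagation path, obtained from satellite images, is position-encoded to better capture how the order of different ground-object features along the path affects propagation loss. Finally, the shallow features extracted by the CNN and the position-encoded ground-object features are fed into a Transformer model, whose multi-head self-attention mechanism captures global correlations among features and thereby corrects the propagation loss prediction. Experimental results show that the proposed CNN-Transformer method markedly reduces the root mean square error (RMSE) of propagation loss prediction to 3.3745 dB while maintaining a high coefficient of determination (R²) of 0.8956. The proposed algorithm provides a reference for research on wireless communication propagation characteristics and has practical value.
Keywords: electromagnetic propagation; loss prediction; Transformer; CNN; Spearman coefficient method; ground-object type; position encoding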
Text-Guided Multimodal Image Fusion with Lightweight Heterogeneous Encoding
19
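Authors: 王传云, 周明奇, 孙冬冬, 王田, 高骞, 李照奎. 《工程科学学报》 (PKU Core), 2026, Issue 2, pp. 346-359 (14 pages)
To meet the demands of resource-constrained UAV platforms for efficient infrared and visible image fusion with good perceptual performance, this paper proposes a text-guided multimodal image fusion network with lightweight heterogeneous encoding. The network uses a lightweight dual-branch heterogeneous encoder designed around the complementary roles of infrared and visible images in conveying information: the infrared encoding branch emphasizes thermal targets and edge responses, while the visible encoding branch focuses on modeling texture and detail, which avoids the feature redundancy and performance bottleneck of homogeneous encoders. A lightweight cross-modal feature fusion module is also introduced to strengthen the complementarity and fused representation of the multimodal information. Furthermore, a pre-trained vision-language model, combined with semantic text features, guides and regulates the fusion process, improving the semantic consistency and environmental adaptability of the fused images. On three public multimodal image datasets, TNO, LLVIP, and M3FD, the proposed method was systematically compared with nine representative image fusion algorithms. The results show that the network performs strongly on several mainstream metrics, including mutual information and structural similarity, and that its fused images surpass existing methods in detail sharpness, edge-structure consistency, and target distinguishability. Ablation experiments show that the inference time of the proposed model is about 50% lower than that of the baseline method, achieving higher efficiency without a significant loss in performance. Beyond quantitative evaluation, qualitative experiments based on text instructions show that the model can flexibly adjust its infrared-visible feature fusion strategy according to different semantic instructions, adapting to task scenarios such as low light, overexposure, low contrast, and noise. While preserving semantic consistency, it effectively enhances heat-source perception, structural clarity, and robustness to interference, showing a semantic controllability and content adaptability that conventional unguided methods struggle to achieve.
Keywords: multimodal image fusion; dual-branch heterogeneous encoding; text guidance; lightweight network; attention mechanism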
Design and Testing of a High-Precision Two-Degree-of-Freedom Rope Swing Angle Measurement System
20
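Authors: 金纯, 王振宇, 陈平, 周生宇, 苏衍江, 钱硕. 《实验室研究与探索》 (PKU Core), 2026, Issue 3, pp. 18-22, 28 (6 pages)
To meet the requirements of load anti-sway control and precise positioning during the hoisting of modular buildings, the swing angle of the hoisting rope must be measured accurately in real time. However, existing measurement methods based on sensors such as accelerometers and potentiometers suffer from poor linearity and accumulated drift, while vision-based measurement is easily disturbed by the environment and adapts poorly to dynamic working conditions. To address these limitations, a two-degree-of-freedom rope swing angle measurement system based on dual rotary encoders is proposed. By combining a specially designed semicircular detection plate with rotary encoders, the system acquires the two-degree-of-freedom swing angles synchronously and with high precision, and a corresponding attitude calculation model is established. Test results show that the system has an average measurement error of 0.15°, a maximum error of 0.58°, and a dynamic response period of 20 ms, and is significantly better than accelerometer-based and vision-based measurement in both accuracy and real-time performance. The results can provide reliable technical support for the high-precision installation of modular building units and for attitude control of other suspended loads.
Keywords: swing angle measurement; rotary encoder; two degrees of freedom; modular building hoisting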