38 articles found
Approximate error correction scheme for three-dimensional surface codes based reinforcement learning
1
Authors: 曲英杰, 陈钊 (+1 more author), 王伟杰, 马鸿洋. Chinese Physics B (SCIE, EI, CAS, CSCD), 2023, No. 10, pp. 229-240 (12 pages)
Quantum error correction technology is an important method to eliminate errors during the operation of quantum computers. To address the influence of errors on physical qubits, we propose an approximate error correction scheme that performs dimension mapping operations on surface codes. This error correction scheme utilizes the topological properties of error correction codes to map the surface code dimension to three dimensions. Compared to previous error correction schemes, the present three-dimensional surface code exhibits good scalability due to its higher redundancy and more efficient error correction capabilities. By reducing the number of ancilla qubits required for error correction, this approach achieves savings in measurement space and reduces resource consumption costs. To improve the decoding efficiency and solve the problem of the correlation between the surface code stabilizer and the 3D space after dimension mapping, we employ a reinforcement learning (RL) decoder based on deep Q-learning, which enables faster identification of the optimal syndrome and achieves better thresholds through conditional optimization. Compared to minimum weight perfect matching decoding, the threshold of the RL-trained model reaches 0.78%, which is 56% higher and enables large-scale fault-tolerant quantum computation.
Keywords: fault-tolerant quantum computing; surface code; approximate error correction; reinforcement learning
Decoding topological XYZ^(2) codes with reinforcement learning based on attention mechanisms
2
Authors: 陈庆辉, 姬宇欣 (+2 more authors), 王柯涵, 马鸿洋, 纪乃华. Chinese Physics B (SCIE, EI, CAS, CSCD), 2024, No. 6, pp. 262-270 (9 pages)
Quantum error correction, a technique that relies on the principle of redundancy to encode logical information into additional qubits to better protect the system from noise, is necessary to design a viable quantum computer. The new topological stabilizer code considered here, the XYZ^(2) code defined on the cellular lattice, is implemented on a hexagonal lattice of qubits and encodes the logical qubits with the help of stabilizer measurements of weight six and weight two. However, topological stabilizer codes in cellular lattice quantum systems suffer from the detrimental effects of noise due to interaction with the environment. Several decoding approaches have been proposed to address this problem. Here, we propose a state-attention-based reinforcement learning decoder for XYZ^(2) codes, which enables the decoder to focus more accurately on the information related to the current decoding position. Under the optimisation conditions, the error correction accuracy of our decoder reaches 83.27% under the depolarizing noise model, and we measure thresholds of 0.18856 and 0.19043 for XYZ^(2) codes at code spacing of 3–7 and 7–11, respectively. Our study provides directions and ideas for applying decoding schemes that combine reinforcement learning and attention mechanisms to other topological quantum error-correcting codes.
Keywords: quantum error correction; topological quantum stabilizer code; reinforcement learning; attention mechanism
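Degree optimization of BATS codes based on outer-code block encoding
3
Authors: 杨柳, 阴慧颖 (+2 more authors), 马征, 刘恒, 王士恒. 《西南交通大学学报》 (Journal of Southwest Jiaotong University) (PKU Core), 2026, No. 1, pp. 156-166 (11 pages)
Under existing outer-code block encoding schemes for batched sparse (BATS) codes, random batching of the outer code causes repeated decoding of data and wastes resources. To address this, the paper systematically studies the optimization of the theoretical number of batches and the dynamic adaptability of BATS codes under the outer-code block encoding scheme. First, for a known packet loss rate, an analytical model of batch consumption is built and a method for computing the optimal degree is derived, which addresses the difficulty existing schemes have in computing the theoretical number of batches and in determining the degree that minimizes batch consumption. Second, for scenarios where the channel packet loss rate is unknown, a reinforcement-learning-based dynamic degree optimization method is proposed that uses a learning mechanism to obtain the degree in real time. Finally, the theoretical model and the dynamic optimization method are evaluated through simulation. The theoretical analysis shows that the outer-code block transmission model and its formula for the theoretical number of batches accurately compute batch consumption and determine the optimal degree. The simulation results further show that, when the packet loss rate is unknown, the proposed reinforcement learning scheme consumes fewer batches on average than fixed-degree schemes and maintains good performance in dynamic channel environments.
Keywords: batched sparse code; block code; number of transmissions; reinforcement learning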
A SPEECH RECOGNITION METHOD USING COMPETITIVE AND SELECTIVE LEARNING NEURAL NETWORKS
4
Authors: 徐雄, 胡光锐, 严永红. Journal of Shanghai Jiaotong University (Science) (EI), 2000, No. 2, pp. 10-13 (4 pages)
On the basis of the asymptotic theory of Gersho, the isodistortion principle of vector clustering is discussed, and a competitive and selective learning (CSL) method is proposed that can avoid local optima and gives excellent results when applied to clustering in HMM models. Combined with parallel, self-organizational hierarchical neural networks (PSHNN) that reclassify the scores of every form output by the HMM, CSL markedly raises the speech recognition rate.
Keywords: speech recognition; competitive learning; classification; neural networks. Document code: A
Linear Complementary Dual Codes Constructed from Reinforcement Learning
5
Authors: WU Yansheng, MA Jin, YANG Shangdong. Journal of Systems Science & Complexity, 2025, No. 3, pp. 1388-1403 (16 pages)
Recently, linear complementary dual (LCD) codes have garnered substantial interest within coding theory research due to their diverse applications and favorable attributes. This paper directs its attention to the construction of binary and ternary LCD codes leveraging curiosity-driven reinforcement learning (RL). By establishing rewards and devising well-reasoned mappings from actions to states, it aims to facilitate the successful synthesis of binary or ternary LCD codes. Experimental results indicate that LCD codes constructed using RL exhibit slightly superior error-correction performance compared to conventionally constructed LCD codes and those developed via standard RL methodologies. The paper introduces novel binary and ternary LCD codes with enhanced minimum distance bounds. Finally, it showcases how random network distillation aids agents in exploring beyond local optima, enhancing the overall performance of the models without compromising convergence.
Keywords: artificial intelligence; error correcting code; LCD code; reinforcement learning
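For readers unfamiliar with the object studied in this entry, a brief illustration may help. By Massey's criterion, a linear code with generator matrix G is (Euclidean) LCD exactly when G·Gᵀ is nonsingular over the field. The sketch below checks that condition over GF(p) for a prime p (p = 2 or 3 for the binary and ternary cases). It is not the paper's RL construction; the function name `is_lcd` and the example matrices are illustrative assumptions, and the rows of the generator matrix are assumed to be linearly independent.

```python
def is_lcd(generator, p=2):
    """Check Massey's criterion: the code is LCD iff G * G^T is nonsingular over GF(p).

    `generator` is a list of rows (lists of ints); `p` must be prime.
    The name and interface are hypothetical, chosen for this illustration.
    """
    k = len(generator)
    # Gram matrix G * G^T with entries reduced mod p.
    gram = [[sum(a * b for a, b in zip(r1, r2)) % p for r2 in generator]
            for r1 in generator]
    # Gaussian elimination mod p; a missing pivot means the matrix is singular.
    for col in range(k):
        pivot = next((r for r in range(col, k) if gram[r][col] != 0), None)
        if pivot is None:
            return False
        gram[col], gram[pivot] = gram[pivot], gram[col]
        inv = pow(gram[col][col], -1, p)  # modular inverse (Python 3.8+)
        gram[col] = [(x * inv) % p for x in gram[col]]
        for r in range(k):
            if r != col and gram[r][col] != 0:
                factor = gram[r][col]
                gram[r] = [(x - factor * y) % p
                           for x, y in zip(gram[r], gram[col])]
    return True
```

For example, the binary repetition code of length 3 (`[[1, 1, 1]]`) passes the check, while the length-2 repetition code (`[[1, 1]]`) does not, since it is self-dual.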
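An intelligent visual servoing control method for rotor UAVs based on Dyna-Q learning (cited by: 8)
6
Authors: 史豪斌, 徐梦 (+1 more author), 刘珈妤, 李继超. 《控制与决策》 (Control and Decision) (EI, CSCD, PKU Core), 2019, No. 12, pp. 2517-2526 (10 pages)
Image-based visual servoing controls a robot by acquiring image information through the robot's vision and then closing a feedback loop on that information to drive the robot's motion. In classical visual servoing the servo gain is, in most cases, assigned manually, which leads to poor robustness and slow convergence. To address this, an intelligent visual servoing control method for rotor UAVs based on Dyna-Q is proposed that adjusts the servo gain to improve adaptability. First, target feature points are extracted with an image feature extraction algorithm based on the Freeman chain code. Then, image-based visual servoing forms closed-loop control of the feature error. Next, a decoupled visual servoing control model is proposed for the strongly coupled, underactuated dynamics of rotor UAVs. Finally, a reinforcement learning model that uses Dyna-Q learning to adjust the servo gain is built, so that after training the rotor UAV selects the servo gain autonomously. Dyna-Q learning extends classical Q-learning by building an environment model to store experience; the virtual samples generated by the model serve as learning samples for iterating the value function. Experimental results show that, compared with conventional PID control and classical image-based visual servoing, the proposed method converges faster and is more stable.
Keywords: visual servoing; Dyna-Q learning; gain adjustment; rotor UAV; Freeman chain code; reinforcement learning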
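The Dyna-Q idea summarized in this entry (Q-learning plus a learned environment model that replays stored experience) can be sketched in tabular form. This is a generic textbook-style sketch on a toy five-state corridor, not the paper's UAV gain-tuning model; `dyna_q`, `chain_step`, and all hyperparameter values are assumptions made for illustration.

```python
import random


def dyna_q(step, n_states, n_actions, start, episodes=50, planning_steps=10,
           alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
    """Tabular Dyna-Q: real-experience Q-updates plus replay from a learned model."""
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    model = {}  # (state, action) -> (reward, next_state, done)

    def update(s, a, r, s2, done):
        # One-step Q-learning backup.
        target = r if done else r + gamma * max(q[s2])
        q[s][a] += alpha * (target - q[s][a])

    for _ in range(episodes):
        s, done = start, False
        while not done:
            # Epsilon-greedy action selection with random tie-breaking.
            if rng.random() < epsilon:
                a = rng.randrange(n_actions)
            else:
                best = max(q[s])
                a = rng.choice([i for i, v in enumerate(q[s]) if v == best])
            r, s2, done = step(s, a)
            update(s, a, r, s2, done)        # learn from the real transition
            model[(s, a)] = (r, s2, done)    # remember it in the model
            for _ in range(planning_steps):  # learn from simulated transitions
                (ps, pa), (pr, ps2, pdone) = rng.choice(list(model.items()))
                update(ps, pa, pr, ps2, pdone)
            s = s2
    return q


def chain_step(s, a):
    """Toy corridor (hypothetical): action 1 moves right, 0 moves left; reward 1 on reaching state 4."""
    s2 = min(s + 1, 4) if a == 1 else max(s - 1, 0)
    done = s2 == 4
    return (1.0 if done else 0.0), s2, done


q_table = dyna_q(chain_step, n_states=5, n_actions=2, start=0)
```

The planning loop is what distinguishes Dyna-Q from plain Q-learning: each real step is followed by several backups on transitions sampled from the stored model, so value estimates propagate with fewer real interactions.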
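Actor-Critic algorithm based on Tile Coding and model learning (cited by: 3)
7
Authors: 金玉净, 朱文文 (+1 more author), 伏玉琛, 刘全. 《计算机科学》 (Computer Science) (CSCD, PKU Core), 2014, No. 6, pp. 239-242, 249 (5 pages)
Actor-Critic is a class of reinforcement learning methods with good performance and convergence guarantees. However, while learning and improving its policy the agent does not learn the dynamics of the environment, which limits the performance of Actor-Critic methods. In addition, Actor-Critic methods must represent the policy and the value function approximately, and the encoding of states and actions, together with its parameters, strongly affects them. Tile Coding is simple to use and has low computational time complexity, so Tile Coding is combined with a model-based Actor-Critic method and the resulting algorithm is applied to reinforcement learning simulation experiments. The experimental results show that the resulting algorithm performs well.
Keywords: reinforcement learning; Tile Coding; Actor-Critic; model learning; function approximation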
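Tile coding, the state encoding named in this entry, can be illustrated for a single scalar input. The sketch below assumes uniformly offset tilings over an interval and returns one active tile per tiling; the function name `tile_indices` and the default sizes are illustrative assumptions, not the encoding parameters used in the paper.

```python
def tile_indices(x, low, high, n_tilings=4, tiles_per_tiling=8):
    """Return one active tile index per tiling for a scalar x in [low, high]."""
    width = (high - low) / tiles_per_tiling
    active = []
    for t in range(n_tilings):
        # Each tiling is shifted by a different fraction of one tile width.
        offset = t * width / n_tilings
        idx = int((x - low + offset) / width)
        # Each tiling has one extra tile that absorbs the shifted upper edge.
        idx = min(idx, tiles_per_tiling)
        active.append(t * (tiles_per_tiling + 1) + idx)
    return active


def linear_value(weights, x, low, high):
    """Approximate a value as the sum of the weights of the active tiles."""
    return sum(weights[i] for i in tile_indices(x, low, high))
```

With the defaults, a weight vector of length 4 × 9 = 36 is enough. Nearby inputs activate mostly the same tiles, which is what gives the linear approximator its local generalization.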
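Polar code optimization method resistant to strong interference based on multi-objective reinforcement learning (cited by: 1)
8
Authors: 梁豪, 叶淦华 (+2 more authors), 陆锐敏, 王恒, 魏鹏. 《电子与信息学报》 (Journal of Electronics & Information Technology) (EI, CSCD, PKU Core), 2023, No. 11, pp. 4092-4100 (9 pages)
To improve the reliability and anti-interference capability of information transmission in frequency-hopping (FH) communication systems, this paper builds on a new Polar-coded slow frequency-hopping anti-interference communication system model and proposes a Polar code construction optimization method adapted to strong-interference environments. First, a multi-objective reinforcement learning algorithm is designed for a mixed channel containing normal and interfered states. The sequence of information-bit channels used in encoding is then optimized to improve the error-correction performance of the codewords, and the algorithm's execution complexity is reduced through initialization preprocessing and theoretically computed reward values. Simulation results show that, under mixed-channel conditions with strong interference, the proposed method achieves better overall bit-error performance than conventional code construction methods, with an overall coding gain of 0.5 dB over the 3rd Generation Partnership Project (3GPP) standard scheme for 5th-generation mobile communication (5G), effectively improving the reliable, interference-resistant transmission of Polar-coded frequency-hopping communication.
Keywords: channel coding; anti-interference; Polar code; reinforcement learning; reliability performance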
Cooperative Caching for Scalable Video Coding Using Value-Decomposed Dimensional Networks (cited by: 2)
9
Authors: Youjia Chen, Yuekai Cai (+2 more authors), Haifeng Zheng, Jinsong Hu, Jun Li. China Communications (SCIE, CSCD), 2022, No. 9, pp. 146-161 (16 pages)
Scalable video coding (SVC) has been widely used in video-on-demand (VOD) services to efficiently satisfy users' different video quality requirements and dynamically adjust the video stream to time-variant wireless channels. Under the 5G network structure, we consider a cooperative caching scheme inside each cluster with SVC to economically utilize the limited caching storage. A novel multi-agent deep reinforcement learning (MADRL) framework is proposed to jointly optimize the video access delay and users' satisfaction, where an aggregation node is introduced to help individual agents achieve global observations and overall system rewards. Moreover, to cope with the large action space caused by the large number of videos and users, a dimension decomposition method is embedded into the neural network in each agent, which greatly reduces the computational complexity and memory cost of the reinforcement learning. Experimental results show that: 1) the proposed value-decomposed dimensional network (VDDN) algorithm achieves an obvious performance gain versus the traditional MADRL; 2) the proposed VDDN algorithm can handle an extremely large action space and quickly converge with a low computational complexity.
Keywords: cooperative caching; multi-agent deep reinforcement learning; scalable video coding; value-decomposition network
Rich-text document styling restoration via reinforcement learning (cited by: 1)
10
Authors: Hongwei LI, Yingpeng HU (+2 more authors), Yixuan CAO, Ganbin ZHOU, Ping LUO. Frontiers of Computer Science (SCIE, EI, CSCD), 2021, No. 4, pp. 93-103 (11 pages)
Richly formatted documents, such as financial disclosures, scientific articles and government regulations, widely exist on the Web. However, since most of these documents are only for public reading, the styling information inside them is usually missing, making them improper or even burdensome to display and edit in different formats and platforms. In this study we formulate the task of document styling restoration as an optimization problem, which aims to identify the styling settings on the document elements (e.g., lines, table cells, text) so that rendering with the output styling settings results in a document in which each element holds (closely) the same position as in the original document. Considering that each styling setting is a decision, this problem can be transformed into a multi-step decision-making task over all the document elements and then solved by reinforcement learning. Specifically, Monte-Carlo Tree Search (MCTS) is leveraged to explore the different styling settings, and the policy function is learnt under the supervision of the delayed rewards. As a case study, we restore the styling information inside tables, where structural and functional data in the documents are usually presented. Experiments show that our best reinforcement method successfully restores the stylings in 87.65% of the tables, a 25.75% absolute improvement over the greedy method. We also discuss the tradeoff between inference time and restoration success rate, and argue that although the reinforcement methods cannot be used in real-time scenarios, they are suitable for offline tasks with high quality requirements. Finally, this model has been applied in a PDF parser to support cross-format display.
Keywords: styling restoration; Monte-Carlo tree search; reinforcement learning; richly formatted documents; tables
RSOFCPN: CONTROL SYSTEM STRUCTURE AND ALGORITHM DESIGN
11
Authors: 马勇, 杨煜普 (+1 more author), 张卫东, 许晓鸣. Journal of Shanghai Jiaotong University (Science) (EI), 2000, No. 2, pp. 57-61 (5 pages)
A stable control scheme for a class of unknown nonlinear systems is presented. The control architecture is composed of two parts: a fuzzy sliding mode controller (FSMC) drives the state to a designed switching hyperplane, and a reinforcement self-organizing fuzzy CPN (RSOFCPN) serves as a feedforward compensator to reduce the influence of system uncertainties. The simulation results demonstrate the effectiveness of the proposed control scheme.
Keywords: nonlinear systems; fuzzy sliding mode control; self-organized CPN; reinforcement learning. Document code: A
Arc-length technique for nonlinear finite element analysis (cited by: 10)
12
Authors: MEMON Bashir-Ahmed, SU Xiao-zu (苏小卒). Journal of Zhejiang University Science (EI, CSCD), 2004, No. 5, pp. 618-628 (11 pages)
Nonlinear solution of reinforced concrete structures, particularly the complete load-deflection response, requires tracing of the equilibrium path and proper treatment of the limit and bifurcation points. In this regard, ordinary solution techniques lead to instability near the limit points and also have problems in cases of snap-through and snap-back. Thus they fail to predict the complete load-displacement response. The arc-length method serves the purpose well in principle, has received wide acceptance in finite element analysis, and has been used extensively. However, modifications to the basic idea are vital to meet the particular needs of the analysis. This paper reviews some of the developments of the method in the last two decades, with particular emphasis on nonlinear finite element analysis of reinforced concrete structures.
Keywords: arc-length method; nonlinear analysis; finite element method; reinforced concrete; load-deflection path. Document code: A. CLC number: TU31. Affiliation: Department of Structural Engineering, Tongji University, Shanghai 200092, China. E-mail: bashirmemon@sohu.com, xiaozub@online.sh.cn. Received July 30, 2003; revision accepted Sept. 11, 2003.
Movement and behavior analysis using neural spike signals in CA1 of rat hippocampus
13
Authors: Hyejin An, Kyungjin You (+1 more author), Minwhan Jung, Hyunchool Shin. Journal of Measurement Science and Instrumentation (CAS), 2013, No. 4, pp. 392-396 (5 pages)
The hippocampus, which lies in the temporal lobe, plays an important role in spatial navigation, learning and memory. Several studies have examined place cell activity, spatial memory, prediction of future locations and various learning paradigms. However, no attempts have focused on whether there exist neurons that contribute largely to both spatial memory and learning about reward. This paper proposes that there are neurons that can simultaneously engage in forming place memory and reward learning in the CA1 area of the rat hippocampus. With a trained rat, a reward experiment was conducted in a modified 8-shaped maze with five stages, and utterance information was obtained from a CA1 neuron. The firing rate, which is the count of spikes per unit time, was calculated. The decoding was conducted with log-maximum likelihood estimation (Log-MLE) using a Gaussian distribution model. Our outcomes provide evidence of neurons which play a part in spatial memory and learning regarding reward.
Keywords: hippocampus; CA1; place cell; reward learning; spatial memory; Gaussian distribution; maximum likelihood estimation (MLE). Document code: A. Article ID: 1674-8042(2013)04-0392-05
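The decoding pipeline described in this entry (firing rate as spikes per unit time, then maximum log-likelihood under a Gaussian model) can be sketched as follows. This is a generic one-neuron illustration, not the authors' analysis code; the function names, the stage labels, and the variance floor are assumptions.

```python
import math


def firing_rate(spike_times, t_start, t_end):
    """Spikes per unit time inside the window [t_start, t_end)."""
    count = sum(1 for t in spike_times if t_start <= t < t_end)
    return count / (t_end - t_start)


def fit_gaussians(rates_by_stage):
    """Fit a mean and variance of the firing rate for each maze stage."""
    params = {}
    for stage, rates in rates_by_stage.items():
        mean = sum(rates) / len(rates)
        var = sum((r - mean) ** 2 for r in rates) / len(rates)
        params[stage] = (mean, max(var, 1e-6))  # floor avoids division by zero
    return params


def decode_stage(rate, params):
    """Return the stage whose Gaussian gives the observed rate the highest log-likelihood."""
    def log_likelihood(mean, var):
        return -0.5 * math.log(2 * math.pi * var) - (rate - mean) ** 2 / (2 * var)

    return max(params, key=lambda stage: log_likelihood(*params[stage]))
```

Working with the log-likelihood rather than the likelihood keeps the comparison numerically stable when variances are small, and it leaves the decoded stage unchanged because the logarithm is monotonic.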
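Design of anomalous-reflection coding metasurfaces based on reinforcement learning
14
Authors: 潘宇轩, 蒋伊琳 (+1 more author), 闫成越, 李金鑫. 《哈尔滨商业大学学报(自然科学版)》 (Journal of Harbin University of Commerce, Natural Sciences Edition), 2025, No. 1, pp. 38-42 (5 pages)
This paper studies how electromagnetic metasurfaces control the anomalous reflection of electromagnetic beams, using digital coding to set the additional phase of the metasurface and thereby steer the direction of the reflected beam. A coding metasurface controlled by a 32-bit binary code is designed and simulated, with the control code generated by a reinforcement learning algorithm. The algorithm searches for codes along two objectives (the anomalous reflection direction of the beam and the power of the reflected signal) to ensure that the metasurface achieves anomalous reflection while reducing energy attenuation. The coded metasurface is simulated at a fixed angle of incidence. The results show that, with control codes produced by reinforcement learning, the metasurface steers the reflection direction of a beam incident at 45° within a certain range, giving five distinct reflection angles, while the reflected signal power stays above 40% of the incident power.
Keywords: coding metasurface; coding sequence; reinforcement learning; reflection phase; anomalous reflection; beam steering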
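Research on a concurrent transmission control method for heterogeneous wireless links based on adaptive network coding (cited by: 15)
15
Authors: 赵夙, 王伟 (+1 more author), 朱晓荣, 倪钦崟. 《电子与信息学报》 (Journal of Electronics & Information Technology) (EI, CSCD, PKU Core), 2022, No. 8, pp. 2777-2784 (8 pages)
With the rise of high-rate services such as high-definition live video and virtual reality, a single network can hardly meet users' service demands. Concurrent transmission over multiple heterogeneous links can effectively aggregate bandwidth and improve quality of service. In heterogeneous wireless networks, however, link conditions are complex and variable and link quality differs across paths, so existing multipath concurrent transmission algorithms cannot adaptively make optimal decisions under complex network conditions. This paper proposes a multipath concurrent transmission control algorithm with adaptive network coding. It introduces Asynchronous Advantage Actor-Critic (A3C) reinforcement learning and, through adaptive network coding, intelligently selects the coding group size and the redundancy size according to current network conditions, thereby solving the packet reordering problem. Simulation results show that the algorithm raises the transmission rate by about 10% and improves user experience.
Keywords: wireless network; concurrent transmission; network coding; reinforcement learning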
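Collaborative optimal dispatch method for active distribution networks considering the dispatchable capacity of 5G base station energy storage (cited by: 20)
16
Authors: 陈实, 郭正伟 (+3 more authors), 周步祥, 刘艺洪, 臧天磊, 罗欢. 《电网技术》 (Power System Technology) (EI, CSCD, PKU Core), 2023, No. 12, pp. 5225-5237 (13 pages)
As mobile communication rapidly upgrades to 5G and the number of 5G base stations grows quickly, the idle energy storage in massive numbers of 5G base stations can be treated as a flexible resource that participates in power system dispatch, mitigating the adverse effects of the randomness and volatility of renewable generation. For the economic optimal dispatch of base station energy storage in an active distribution network with distributed wind generation, a base station reliability assessment model is first built that accounts for two factors, potential power interruptions in the distribution network and outage restoration time, to systematically evaluate the real-time dispatchable capacity of each base station's storage. Then, with the goal of minimizing system operating cost, an improved twin delayed deep deterministic policy gradient (TD3) algorithm based on a variational auto-encoder (VAE) model is used to solve for the optimal charging and discharging strategy of 5G base station storage. The algorithm represents the storage states of multiple base stations as latent variables to uncover hidden correlations in the data, which lowers the complexity of solving the model and improves the algorithm's performance. Iterating to convergence yields real-time control of the multi-base station energy storage (MBSES) system and an individualized charging and discharging strategy for each base station that matches its actual operating conditions. Finally, case studies verify the effectiveness of the proposed method.
Keywords: 5G base station; backup energy storage; renewable energy; dispatchable capacity; feature encoding; deep reinforcement learning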
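Automatic generation of descriptions for large-granularity Pull Requests (cited by: 2)
17
Authors: 邝砾, 施如意 (+2 more authors), 赵雷浩, 张欢, 高洪皓. 《软件学报》 (Journal of Software) (EI, CSCD, PKU Core), 2021, No. 6, pp. 1597-1611 (15 pages)
On GitHub, many project contributors neglect to write a description when submitting a Pull Request (PR), which makes the PR easy for reviewers to overlook or reject. Automatically generating PR descriptions to help contributors raise their PR acceptance rate is therefore worthwhile. However, the performance of existing PR description generation methods is affected by PR granularity, and they cannot effectively generate descriptions for large-granularity PRs. This work therefore focuses on automatic description generation for large-granularity PRs. First, the text in the PR is preprocessed, and the words in the text are used as auxiliary nodes to build a word-sentence heterogeneous graph that links the PR's sentences. Features are then extracted from the heterogeneous graph and fed to a graph neural network for graph representation learning; message passing between nodes lets sentence nodes learn richer content information. Finally, sentences carrying key information are selected to form the PR description. In addition, because PR datasets lack manually annotated ground-truth labels and so do not support supervised learning, reinforcement learning is used to guide description generation: the model is trained to minimize the negative expected reward, a process that is independent of labels and directly improves the generated results. Experiments on a real dataset show that the proposed method outperforms existing methods in F1 score and readability.
Keywords: Pull Request description; heterogeneous graph neural network; reinforcement learning; unstructured documents; summary generation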
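Image vector quantization algorithm based on fuzzy reinforcement learning (cited by: 1)
18
Authors: 姜来, 许文焕 (+1 more author), 纪震, 张基宏. 《电子学报》 (Acta Electronica Sinica) (EI, CAS, CSCD, PKU Core), 2006, No. 9, pp. 1738-1741 (4 pages)
This paper presents a new method for optimizing the design of image vector quantization codebooks. Conventional vector quantization considers only the attraction between codewords and training vectors, which constrains the search space for the optimal solution. The paper proposes a new learning mechanism, fuzzy reinforcement learning, which adds a repulsion factor to the conventional attraction factor and thereby greatly relaxes the constraint that the attraction factor places on the search space. Rather than introducing random perturbations to avoid locally optimal codebooks, the new mechanism uses the combined action of the attraction and repulsion factors to determine fairly accurately the best direction in which to move each codeword, so that the codebook as a whole approaches the global optimum. Experimental results show that the vector quantization algorithm based on fuzzy reinforcement learning consistently performs significantly better than the fuzzy K-means algorithm, and that it handles well two problems of codebook design in vector quantization: the tendency to fall into local minima and the dependence of the result on the initial codebook.
Keywords: vector quantization; image coding; fuzzy reinforcement learning; attraction factor; repulsion factor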
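Design of jamming-detection shared signals based on deep reinforcement learning (cited by: 2)
19
Authors: 肖易寒, 刘禹汐 (+1 more author), 于祥祯, 赵忠凯. 《天津大学学报(自然科学与工程技术版)》 (Journal of Tianjin University, Science and Technology) (EI, CAS, CSCD, PKU Core), 2023, No. 12, pp. 1326-1336 (11 pages)
Radar electronic warfare is becoming increasingly intelligent, and conventional jammers cannot adapt to environmental changes, which greatly reduces combat effectiveness. To address this, the detection signal is hidden inside the jamming signal to form a jamming-detection shared signal, so that the jamming signal transmitted by reconnaissance jamming equipment also performs detection. Existing shared signals suffer from low complexity and narrow spectral width, so a shared signal based on multi-carrier phase code (MCPC) is designed. It has good noise-like wide-spectrum characteristics and good range and velocity detection capability, and it can covertly detect the target signal and the surrounding environment while applying suppression jamming to the target radar. To let the shared signal adapt to sensing of, and gaming with, the battlefield environment, a deep reinforcement learning algorithm is further introduced to optimize the MCPC shared signal. First, the Q value is regularized on the basis of the dueling deep Q-learning network (Du DQN), which resolves the local-optimum problem caused by overestimation that readily arises in Du DQN. Second, the state value function is introduced into the reward to form a composite reward; the resulting network is called the composite reward-dueling deep Q-learning network based on regularization (CR-Du DQNReg). It lets the sensitivity of the MCPC shared signal to the reward adjust with its own state and adaptively optimizes the initial phase-code values, achieving better jamming and covert detection. Simulation results show that, after optimization with the CR-Du DQNReg algorithm, the maximum spectral amplitude of the MCPC shared signal rises by 17.48%, the maximum pulse-compression amplitude rises by 17.25%, and the first sidelobe amplitude of the Doppler ambiguity function falls by 12.69%; compared with conventional deep reinforcement learning algorithms, CR-Du DQNReg gives better optimization results.
Keywords: jamming-detection shared signal; multi-carrier phase code; deep reinforcement learning; composite reward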
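Code comment generation model based on type-assisted guidance (cited by: 1)
20
Authors: 刘利, 吕韦岑, 汪洋. 《无线电通信技术》 (Radio Communications Technology) (PKU Core), 2024, No. 4, pp. 807-814 (8 pages)
Code comment generation methods are usually built on the Structure-Sequence (Struct2Seq) framework, but they ignore type information such as operators and strings. Because the levels of type information depend on one another, introducing type information directly into the existing Struct2Seq framework is not suitable. To solve this problem, a Code Comment Generation based on Type-assisted Guidance (CCG-TG) model is proposed, which treats source code as an n-ary tree with type information. The model contains an associated-type encoder and a restricted-type decoder and can adaptively summarize source code. In addition, a Multi-level Reinforcement Learning (MRL) method is proposed to optimize the training of the model. Experiments on several datasets, with comparison against multiple baseline models, show that the proposed CCG-TG model performs best on all evaluation metrics.
Keywords: code comment generation; type information; structure-sequence framework; type-assisted guidance; reinforcement learning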