6,065 articles found
Residential Energy Scheduling With Solar Energy Based on Dyna Adaptive Dynamic Programming
1
Authors: Kang Xiong, Qinglai Wei, Hongyang Li. 《IEEE/CAA Journal of Automatica Sinica》, 2025, Issue 2, pp. 403-413 (11 pages).
Learning-based methods have become mainstream for solving residential energy scheduling problems. In order to improve the learning efficiency of existing methods and increase the utilization of renewable energy, we propose the Dyna action-dependent heuristic dynamic programming (Dyna-ADHDP) method, which incorporates the ideas of learning and planning from the Dyna framework in action-dependent heuristic dynamic programming. This method defines a continuous action space for precise control of an energy storage system and allows online optimization of algorithm performance during the real-time operation of the residential energy model. Meanwhile, the target network is introduced during the training process to make the training smoother and more efficient. We conducted experimental comparisons with the benchmark method using simulated and real data to verify its applicability and performance. The results confirm the method's excellent performance and generalization capabilities, as well as its excellence in increasing renewable energy utilization and extending equipment life.
Keywords: adaptive dynamic programming (ADP); dynamic residential scenarios; optimal residential energy management; smart grid
Value Iteration-Based Distributed Adaptive Dynamic Programming for Multi-Player Differential Game With Incomplete Information
2
Authors: Yun Zhang, Yuqi Wang, Yunze Cai. 《IEEE/CAA Journal of Automatica Sinica》, 2025, Issue 2, pp. 436-447 (12 pages).
In this paper, a distributed adaptive dynamic programming (ADP) framework based on value iteration is proposed for multi-player differential games. In the game setting, players have no access to the information of others' system parameters or control laws. Each player adopts an on-policy value iteration algorithm as the basic learning framework. To deal with the incomplete information structure, players collect a period of system trajectory data to compensate for the lack of information. The policy updating step is implemented by a nonlinear optimization problem aiming to search for the proximal admissible policy. Theoretical analysis shows that by adopting proximal policy searching rules, the approximated policies can converge to a neighborhood of equilibrium policies. The efficacy of our method is illustrated by three examples, which also demonstrate that the proposed method can accelerate the learning process compared with the centralized learning framework.
Keywords: distributed adaptive dynamic programming; incomplete information; multi-player differential game (MPDG); value iteration
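Energy Allocation and Gear-Shifting Strategy for Modular Electric Heavy-Duty Vehicles Based on FDP
3
Authors: 张宁, 王俊, 李子鸿, 王金湘, 殷国栋, 欧阳天成, 陈卓. 《控制理论与应用》 (PKU Core), 2025, Issue 8, pp. 1561-1569 (9 pages).
The dual-power-unit configuration for electric heavy-duty trucks proposed in this paper lacks an energy management strategy that balances vehicle economy and comfort. Applying the dynamic programming algorithm to it raises problems such as low computational efficiency, difficulty of online application, and interpolation leakage. This paper proposes a method that improves the dynamic programming algorithm by shrinking the feasible region of the state space and adding a gear-holding function. An economy-oriented strategy is formulated first; rules extracted from it are then used to shrink the feasible region of the dynamic programming state space, abandoning the choice of battery state of charge (SOC) as a state variable, and a gear-holding function is added to the cost function, yielding a more complete energy management strategy. The results show that, at the cost of a very small amount of additional energy consumption, the strategy greatly reduces the number of transmission gear shifts and motor start-stop events, effectively balancing vehicle economy and comfort.
Keywords: electric heavy-duty vehicle; modular; energy management strategy; fast dynamic programming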
Adaptive fault-tolerant control for non-minimum phase hypersonic vehicles based on adaptive dynamic programming (cited by 3)
4
Authors: Le WANG, Ruiyun QI, Bin JIANG. 《Chinese Journal of Aeronautics》 (SCIE, EI, CAS, CSCD), 2024, Issue 3, pp. 290-311 (22 pages).
In this paper, a novel adaptive Fault-Tolerant Control (FTC) strategy is proposed for non-minimum phase Hypersonic Vehicles (HSVs) that are affected by actuator faults and parameter uncertainties. The strategy is based on the output redefinition method and Adaptive Dynamic Programming (ADP). The intelligent FTC scheme consists of two main parts: a basic fault-tolerant and stable controller and an ADP-based supplementary controller. In the basic FTC part, an output redefinition approach is designed to make zero-dynamics stable with respect to the new output. Then, Ideal Internal Dynamic (IID) is obtained using an optimal bounded inversion approach, and a tracking controller is designed for the new output to realize output tracking of the non-minimum phase HSV system. For the ADP-based compensation control part, an Action-Dependent Heuristic Dynamic Programming (ADHDP) adopting an actor-critic learning structure is utilized to further optimize the tracking performance of the HSV control system. Finally, simulation results are provided to verify the effectiveness and efficiency of the proposed FTC algorithm.
Keywords: hypersonic vehicle; fault-tolerant control; non-minimum phase system; adaptive control; nonlinear control; adaptive dynamic programming
Recent Progress in Reinforcement Learning and Adaptive Dynamic Programming for Advanced Control Applications (cited by 11)
5
Authors: Ding Wang, Ning Gao, Derong Liu, Jinna Li, Frank L. Lewis. 《IEEE/CAA Journal of Automatica Sinica》 (SCIE, EI, CSCD), 2024, Issue 1, pp. 18-36 (19 pages).
Reinforcement learning (RL) has roots in dynamic programming and it is called adaptive/approximate dynamic programming (ADP) within the control community. This paper reviews recent developments in ADP along with RL and its applications to various advanced control fields. First, the background of the development of ADP is described, emphasizing the significance of regulation and tracking control problems. Some effective offline and online algorithms for ADP/adaptive critic control are displayed, where the main results towards discrete-time systems and continuous-time systems are surveyed, respectively. Then, the research progress on adaptive critic control based on the event-triggered framework and under uncertain environment is discussed, respectively, where event-based design, robust stabilization, and game design are reviewed. Moreover, the extensions of ADP for addressing control problems under complex environment attract enormous attention. The ADP architecture is revisited under the perspective of data-driven and RL frameworks, showing how they promote ADP formulation significantly. Finally, several typical control applications with respect to RL and ADP are summarized, particularly in the fields of wastewater treatment processes and power systems, followed by some general prospects for future research. Overall, the comprehensive survey on ADP and RL for advanced control applications has demonstrated its remarkable potential within the artificial intelligence era. In addition, it also plays a vital role in promoting environmental protection and industrial intelligence.
Keywords: adaptive dynamic programming (ADP); advanced control; complex environment; data-driven control; event-triggered design; intelligent control; neural networks; nonlinear systems; optimal control; reinforcement learning (RL)
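Multi-Objective Optimization Method for Railway Alignment Based on UAV Data and the ADP Algorithm (cited by 4)
6
Authors: 洪英杰, 高岩, 杨书生, 刘托, 王平, 何庆. 《铁道运输与经济》 (PKU Core), 2025, Issue 4, pp. 186-195, 204 (11 pages).
Planning and evaluating railway alignment schemes is a multi-objective decision problem that affects engineering cost, the environment, and other aspects. To explore multi-objective optimization of railway alignments, a multi-objective alignment optimization method based on construction cost, ecological indicators, and carbon emissions is proposed. Using high-precision geographic information data collected by unmanned aerial vehicles (UAVs), supervised classification is applied to intelligently identify the boundaries of buildings and structures and ecological features, and a coupled constraint set covering the complex surrounding environment is established. Based on the adaptive/approximate dynamic programming (ADP) algorithm, a deep neural network model is introduced to achieve intelligent fine adjustment of the alignment; the Pareto optimality principle is used to handle conflicts among the objectives, and the Pareto-optimal solutions are constructed in three-dimensional space, giving decision makers more room for decision. The method was applied to a high-speed railway connecting line project in East China. The results show that, compared with the manually selected alignment, the method reduces construction cost by 2.28%, and the ecological and carbon-emission optimizations reach 2.67% and 1.59%, respectively. The intelligent alignment selection method can provide designers with multiple alignment schemes for different optimization objectives, balancing the economic benefits and environmental impact of railway lines.
Keywords: railway route selection; railway alignment optimization; multi-objective dynamic programming; UAV data; scheme comparison and selection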
Adaptive Optimal Discrete-Time Output-Feedback Using an Internal Model Principle and Adaptive Dynamic Programming (cited by 1)
7
Authors: Zhongyang Wang, Youqing Wang, Zdzisław Kowalczuk. 《IEEE/CAA Journal of Automatica Sinica》 (SCIE, EI, CSCD), 2024, Issue 1, pp. 131-140 (10 pages).
In order to address the output feedback issue for linear discrete-time systems, this work suggests a brand-new adaptive dynamic programming (ADP) technique based on the internal model principle (IMP). The proposed method, termed IMP-ADP, does not require complete state feedback, merely the measurement of input and output data. More specifically, based on the IMP, the output control problem can first be converted into a stabilization problem. We then design an observer to reproduce the full state of the system by measuring the inputs and outputs. Moreover, this technique includes both a policy iteration algorithm and a value iteration algorithm to determine the optimal feedback gain without using a dynamic system model. It is important that with this concept one does not need to solve the regulator equation. Finally, this control method was tested on an inverter system of grid-connected LCLs to demonstrate that the proposed method provides the desired performance in terms of both tracking and disturbance rejection.
Keywords: adaptive dynamic programming (ADP); internal model principle (IMP); output feedback problem; policy iteration (PI); value iteration (VI)
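A Trajectory Learning Strategy for Industrial Robots Based on DTW-DP-GMM (cited by 3)
8
Authors: 肖洒, 陈旭阳, 叶锦华, 吴海彬. 《天津大学学报(自然科学与工程技术版)》 (EI, CAS, PKU Core), 2025, Issue 1, pp. 68-80 (13 pages).
When a Gaussian mixture model (GMM) is used to plan motion trajectories in robot programming by demonstration, the number of Gaussian components is hard to choose and the reproduced trajectory has low accuracy. To address these problems, a composite learning strategy for robot motion trajectories is proposed. The strategy combines the dynamic time warping (DTW) algorithm, the Gaussian mixture model, and the Douglas-Peucker (DP) algorithm. First, because the multiple trajectories collected during demonstration differ in duration, the DTW algorithm is used to align the demonstrated trajectories in the time domain. Second, the GMM algorithm extracts the features of the demonstrated trajectories, and Gaussian mixture regression (GMR) reconstructs them into a reproduced trajectory. In this process, the DP algorithm is used to estimate the number of Gaussian components, the key parameter of the GMM algorithm; compared with traditional methods, it yields a relatively accurate parameter value simply and intuitively. The DP algorithm is also used to sparsify and optimize the data points of the reproduced trajectory, which both ensures the accuracy of the robot's final motion trajectory and greatly reduces the number of data points in it. Finally, learning and planning experiments were carried out on simulated welding trajectories of different shapes. The results show that demonstrated trajectories aligned by DTW exhibit more distinct motion features, and the reproduced trajectories output by GMM-GMR learning represent them well; when learning demonstrated trajectories with GMM-GMR, the DP algorithm effectively estimates the number of Gaussian components; the final trajectories sparsified and optimized by the DP algorithm have average position errors within 0.500 mm and maximum errors within 0.800 mm, which meets the accuracy requirements of welding trajectory planning and verifies the effectiveness and superiority of the strategy.
Keywords: industrial robot; programming by demonstration; Gaussian mixture model; Douglas-Peucker algorithm; dynamic time warping; trajectory reproduction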
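Note that "DP" in the entry above stands for the Douglas-Peucker algorithm, not dynamic programming. As a rough illustration of the sparsification step the abstract describes, here is a minimal sketch of the textbook Douglas-Peucker recursion for 2D polylines. It is not the authors' implementation; the function names, the 2D setting, and the `tolerance` parameter are assumptions made for this example.

```python
import math


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment a-b (all 2D tuples)."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: a and b coincide.
        return math.hypot(px - ax, py - ay)
    # Projection parameter of p onto the segment, clamped to [0, 1].
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def douglas_peucker(points, tolerance):
    """Return a sparsified copy of `points` (hypothetical helper).

    Interior points closer than `tolerance` to the chord between the
    endpoints are dropped; otherwise the polyline is split at the
    farthest point and both halves are simplified recursively.
    """
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    max_dist, max_idx = 0.0, 0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > max_dist:
            max_dist, max_idx = d, i
    if max_dist <= tolerance:
        return [first, last]
    left = douglas_peucker(points[: max_idx + 1], tolerance)
    right = douglas_peucker(points[max_idx:], tolerance)
    # The split point ends `left` and starts `right`; keep it once.
    return left[:-1] + right
```

With a tolerance matched to the required position accuracy, the number of retained points gives a rough count of the "key" segments of a trajectory, which is one plausible reading of how such a step could inform the number of Gaussian components; the abstract does not give that detail.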
PDP: Parallel Dynamic Programming (cited by 15)
9
Authors: Fei-Yue Wang, Jie Zhang, Qinglai Wei, Xinhu Zheng, Li Li. 《IEEE/CAA Journal of Automatica Sinica》 (SCIE, EI, CSCD), 2017, Issue 1, pp. 1-5 (5 pages).
Deep reinforcement learning is a focus research area in artificial intelligence. The principle of optimality in dynamic programming is a key to the success of reinforcement learning methods. The principle of adaptive dynamic programming (ADP) is first presented instead of direct dynamic programming (DP), and the inherent relationship between ADP and deep reinforcement learning is developed. Next, analytics intelligence, as the necessary requirement for real reinforcement learning, is discussed. Finally, the principle of parallel dynamic programming, which integrates dynamic programming and analytics intelligence, is presented as the future computational intelligence. ©2014 Chinese Association of Automation.
Keywords: parallel dynamic programming; dynamic programming; adaptive dynamic programming; reinforcement learning; deep learning; neural networks; artificial intelligence
Bayesian network structure learning by dynamic programming algorithm based on node block sequence constraints
10
Authors: Chuchao He, Ruohai Di, Bo Li, Evgeny Neretin. 《CAAI Transactions on Intelligence Technology》, 2024, Issue 6, pp. 1605-1622 (18 pages).
The use of dynamic programming (DP) algorithms to learn Bayesian network structures is limited by their high space complexity and difficulty in learning the structure of large-scale networks. Therefore, this study proposes a DP algorithm based on node block sequence constraints. The proposed algorithm constrains the traversal process of the parent graph by using the M-sequence matrix to considerably reduce the time consumption and space complexity by pruning the traversal process of the order graph using the node block sequence. Experimental results show that compared with existing DP algorithms, the proposed algorithm can obtain learning results more efficiently with less than 1% loss of accuracy, and can be used for learning larger-scale networks.
Keywords: Bayesian network (BN); dynamic programming (DP); node block sequence; strongly connected component (SCC); structure learning
Performance Potential-based Neuro-dynamic Programming for SMDPs (cited by 10)
11
Authors: TANG Hao, YUAN Ji-Bin, LU Yang, CHENG Wen-Juan. 《自动化学报》 (EI, CSCD, PKU Core), 2005, Issue 4, pp. 642-645 (4 pages).
An alpha-uniformized Markov chain is defined by the concept of equivalent infinitesimal generator for a semi-Markov decision process (SMDP) with both average- and discounted-criteria. According to the relations of their performance measures and performance potentials, the optimization of an SMDP can be realized by simulating the chain. For the critic model of neuro-dynamic programming (NDP), a neuro-policy iteration (NPI) algorithm is presented, and the performance error bound is shown as there are approximate error and improvement error in each iteration step. The obtained results may be extended to Markov systems, and have much applicability. Finally, a numerical example is provided.
Keywords: decision process; SMDP; performance potential; neuro-dynamics; Markov chain; optimal design
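Analysis of Precautions for Diesel Engine STANDBY Start-Stop on Deepwater Semi-Submersible Drilling Platforms under DP Dynamic Positioning
12
Authors: 刘芝亮, 张建洲. 《河北石油职业技术大学学报》, 2025, Issue 2, pp. 45-49, 89 (6 pages).
On deepwater semi-submersible drilling platforms that use DP dynamic positioning, the electrical power system must provide high redundancy and fast response to ensure safe station keeping and continuous operation in complex sea states. The rules require that a diesel generator in standby (STANDBY) state be able to start automatically and connect to the grid as the load changes, but the conventional starting method lacks protection against hydraulic lock and therefore poses a major safety hazard. Taking a DP3-class dynamically positioned platform as an example, the automatic start-stop design logic of diesel generators in STANDBY state is analyzed systematically, with emphasis on how the slow-turning self-check system prevents and controls the risk of hydraulic lock. By comparing the performance of a system without slow-turning control and a new-generation slow-turning system, the technical advantages of the slow-turning function in detecting liquid accumulated in the cylinders before starting, providing pre-lubrication, and shortening response time are revealed.
Keywords: DP dynamic positioning; deepwater semi-submersible drilling platform; start-up design for diesel generators in STANDBY state; slow-turning system; application analysis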
An ADP-based robust control scheme for nonaffine nonlinear systems with uncertainties and input constraints
13
Authors: Shijie Luo, Kun Zhang, Wenchao Xue. 《Chinese Physics B》, 2025, Issue 6, pp. 251-260 (10 pages).
The paper develops a robust control approach for nonaffine nonlinear continuous systems with input constraints and unknown uncertainties. Firstly, this paper constructs an affine augmented system (AAS) within a pre-compensation technique for converting the original nonaffine dynamics into affine dynamics. Secondly, the paper derives a stability criterion linking the original nonaffine system and the auxiliary system, demonstrating that the obtained optimal policies from the auxiliary system can achieve the robust controller of the nonaffine system. Thirdly, an online adaptive dynamic programming (ADP) algorithm is designed for approximating the optimal solution of the Hamilton–Jacobi–Bellman (HJB) equation. Moreover, the gradient descent approach and projection approach are employed for updating the actor-critic neural network (NN) weights, with the algorithm's convergence being proven. Then, the uniformly ultimately bounded stability of state is guaranteed. Finally, in simulation, some examples are offered for validating the effectiveness of this presented approach.
Keywords: adaptive dynamic programming; robust control; nonaffine nonlinear system; neural network
Reduction of losses in electric power distribution system-dynamic reconfiguration case study
14
Authors: Branimir Novoselnik, Drago Bago, Jadranko Matuško, Mato Baotić. 《Control Theory and Technology》, 2025, Issue 1, pp. 49-63 (15 pages).
This paper deals with reduction of losses in electric power distribution system through a dynamic reconfiguration case study of a grid in the city of Mostar, Bosnia and Herzegovina. The proposed solution is based on a nonlinear model predictive control algorithm which determines the optimal switching operations of the distribution system. The goal of the control algorithm is to find the optimal radial network topology which minimizes cumulative active power losses and maximizes voltages across the network while simultaneously satisfying all system constraints. The optimization results are validated through multiple simulations (using real power demand data collected for a few characteristic days during winter and summer) which demonstrate the efficiency and usefulness of the developed control algorithm in reducing the grid losses by up to 14%.
Keywords: nonlinear model predictive control; dynamic reconfiguration; power distribution system; mixed-integer programming; real-life case study
Dynamic Optimization of Portfolios 2018 to 2024
15
Authors: Elmo Tambosi Filho. 《Chinese Business Review》, 2025, Issue 3, pp. 109-117 (9 pages).
Investors are always willing to receive more data. This has become especially true for the application of modern portfolio theory to the institutional asset allocation process, which requires quantitative estimates of risk and return. When long-term data series are unavailable for analysis, it has become common practice to use recent data only. The danger is that these data may not be representative of future performance. Although longer data series are of poorer quality, are difficult to obtain, and may reflect various political and economic regimes, they often paint a very different picture of emerging market performance. This paper presents an application of a stochastic non-linear optimization model of portfolios including transaction costs in the Brazilian financial market. To that end, portfolio theory and optimal control were used as the theoretical basis. The first strategy tries to allocate the whole available wealth, not considering the risk associated with the portfolio (deterministic result). In this case the investor obtained profits of 7.23% a month, taking into account the three risk aversion levels during the whole planning period. By contrast, the results from the stochastic algorithm give profits of 1.34% a month and 18.06% a year if the investor has low risk aversion. The profits would be 0.88% a month and 11.02% a year for a medium risk aversion investor. And with high risk aversion, the investor obtains 0.62% a month and 7.68% a year.
Keywords: dynamic modeling; stochastic optimizing; non-linear programming
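Adaptive-Range Predictive Cruise Control Strategy for Heavy-Duty Commercial Vehicles Based on IDP (cited by 2)
16
Authors: 李兴坤, 王国晖, 卢紫旺, 王玉海, 王语风, 田光宇. 《汽车工程》 (EI, CSCD, PKU Core), 2024, Issue 8, pp. 1346-1356 (11 pages).
To reduce the fuel consumption and transport costs of heavy-duty commercial vehicles, this paper coordinates the "human-vehicle-road" interaction system, fuses the vehicle with multi-dimensional information from the intelligent connected environment, and proposes an adaptive range predictive cruise control strategy (ARPCC) based on iterative dynamic programming (IDP). First, combining the vehicle state with multi-dimensional information on the environment ahead, an adaptive range model is built from vehicle longitudinal dynamics to reconstruct the road network, reduce the number of grid points, and obtain the globally optimal speed sequence with IDP. Second, on the basis of the globally optimal speed sequence, piecewise-optimal speed sequences within the adaptive range are computed, so that the vehicle control state can be solved quickly. Finally, the strategy is verified in Matlab/Simulink. The results show that, by shrinking the grid over multiple iterations, the algorithm effectively improves computational efficiency and vehicle fuel economy.
Keywords: heavy-duty commercial vehicle; adaptive range; predictive cruise; iterative dynamic programming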
Shrek: a dynamic object-oriented programming language (cited by 1)
17
Authors: 曹璟, 徐宝文, 周毓明. 《Journal of Southeast University(English Edition)》 (EI, CAS), 2009, Issue 1, pp. 31-35 (5 pages).
From a perspective of theoretical study, there are some faults in the models of the existing object-oriented programming languages. For example, C# does not support metaclasses, the primitive types of Java and C# are not objects, etc. So, this paper designs a programming language, Shrek, which integrates many language features and constructions in a compact and consistent model. The Shrek language is a class-based purely object-oriented language. It has a dynamical strong type system, and adopts a single-inheritance mechanism with Mixin as its complement. It has a consistent class instantiation and inheritance structure, and the ability of intercessive structural computational reflection, which enables it to support safe metaclass programming. It also supports multi-thread programming and automatic garbage collection, and enforces its expressive power by adopting a native method mechanism. The prototype system of the Shrek language is implemented and anticipated design goals are achieved.
Keywords: dynamic typing; metaclass programming; computational reflection; native method; object-oriented programming language
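Temperature Tracking Control of Microwave-Heated High-Titanium Slag Based on Finite-Time ADP (cited by 1)
18
Authors: 杨彪, 杜婉, 李鑫培, 高皓, 刘承, 马红涛. 《控制工程》 (CSCD, PKU Core), 2024, Issue 2, pp. 193-202 (10 pages).
Because conventional control methods do not control the microwave heating process satisfactorily, a finite-time adaptive dynamic programming algorithm for microwave heating temperature tracking, based on a data-driven model, is proposed. The algorithm comprises a model network, a critic network, and an action network, all three implemented with neural networks. The model network provides data-driven modeling of the microwave heating process, while the critic and action networks approximate the optimal performance index function and the control power, respectively. Finally, temperature tracking is converted into stabilization of the tracking error. The convergence and optimality of the algorithm are proved theoretically, and temperature tracking experiments and simulation studies on microwave heating of high-titanium slag are carried out. The results show that the algorithm tracks the heating process of high-titanium slag effectively: the prediction error of the model based on the Elman neural network is below 1 °C and the temperature tracking error is below 0.2 °C, indicating potential application value in industrial microwave heating.
Keywords: microwave heating; high-titanium slag; finite time; adaptive dynamic programming; neural network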
UAV flight strategy algorithm based on dynamic programming (cited by 7)
19
Authors: ZHANG Zixuan, WU Qinhao, ZHANG Bo, YI Xiaodong, TANG Yuhua. 《Journal of Systems Engineering and Electronics》 (SCIE, EI, CSCD), 2018, Issue 6, pp. 1293-1299 (7 pages).
Unmanned aerial vehicles (UAVs) may play an important role in data collection and offloading in vast areas deploying wireless sensor networks, and the UAV's action strategy has a vital influence on achieving applicability and computational complexity. Dynamic programming (DP) has a good application in the path planning of UAV, but there are problems in the applicability of special terrain environment and the complexity of the algorithm. Based on the analysis of DP, this paper proposes a hierarchical directional DP (DDP) algorithm based on direction determination and hierarchical model. We compare our methods with Q-learning and DP algorithm by experiments, and the results show that our method can improve the terrain applicability, meanwhile greatly reduce the computational complexity.
Keywords: motion state space; map stratification; computational complexity; dynamic programming (DP); environmental adaptability
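For orientation only, the plain DP baseline that the entry above compares against can be pictured as a cost-to-go recursion over a grid map. The sketch below is a generic textbook formulation, not the paper's DDP algorithm; the grid representation, the right/down move set, and the function name `min_cost_path` are assumptions made for this example.

```python
def min_cost_path(cost):
    """Cheapest path from the top-left to the bottom-right cell of a grid.

    `cost[r][c]` is the (hypothetical) cost of entering cell (r, c); only
    moves to the right or downward are allowed. Returns (total_cost, path).
    """
    rows, cols = len(cost), len(cost[0])
    inf = float("inf")
    # best[r][c] holds the cost-to-go from (r, c) to the goal, inclusive.
    best = [[inf] * cols for _ in range(rows)]
    for r in range(rows - 1, -1, -1):
        for c in range(cols - 1, -1, -1):
            if r == rows - 1 and c == cols - 1:
                best[r][c] = cost[r][c]
                continue
            down = best[r + 1][c] if r + 1 < rows else inf
            right = best[r][c + 1] if c + 1 < cols else inf
            best[r][c] = cost[r][c] + min(down, right)
    # Recover the path by following the cheaper successor greedily.
    path = [(0, 0)]
    r = c = 0
    while (r, c) != (rows - 1, cols - 1):
        down = best[r + 1][c] if r + 1 < rows else inf
        right = best[r][c + 1] if c + 1 < cols else inf
        if down <= right:
            r += 1
        else:
            c += 1
        path.append((r, c))
    return best[0][0], path
```

The table has one entry per cell, so time and memory grow with the map size; this is the kind of cost that motivates the hierarchical and directional restrictions mentioned in the abstract.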
Approximate Dynamic Programming for Stochastic Resource Allocation Problems (cited by 4)
20
Authors: Ali Forootani, Raffaele Iervolino, Massimo Tipaldi, Joshua Neilson. 《IEEE/CAA Journal of Automatica Sinica》 (SCIE, EI, CSCD), 2020, Issue 4, pp. 975-990 (16 pages).
A stochastic resource allocation model, based on the principles of Markov decision processes (MDPs), is proposed in this paper. In particular, a general-purpose framework is developed, which takes into account resource requests for both instant and future needs. The considered framework can handle two types of reservations (i.e., specified and unspecified time interval reservation requests), and implement an overbooking business strategy to further increase business revenues. The resulting dynamic pricing problems can be regarded as sequential decision-making problems under uncertainty, which are solved by means of stochastic dynamic programming (DP) based algorithms. In this regard, Bellman's backward principle of optimality is exploited in order to provide all the implementation mechanisms for the proposed reservation pricing algorithm. The curse of dimensionality, an inevitable issue of DP, occurs both for instant resource requests and for future resource reservations. In particular, an approximate dynamic programming (ADP) technique based on linear function approximations is applied to solve such scalability issues. Several examples are provided to show the effectiveness of the proposed approach.
Keywords: approximate dynamic programming (ADP); dynamic programming (DP); Markov decision processes (MDPs); resource allocation problem
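The backward principle of optimality mentioned in the entry above can be illustrated with a small finite-horizon MDP solver. This is a generic sketch of backward induction, not the paper's reservation pricing algorithm; the data layout (`transition[s][a]` as a list of `(probability, next_state)` pairs, `reward[s][a]` as a number) and the function name are assumptions made for this example.

```python
def backward_induction(n_states, n_actions, horizon, transition, reward):
    """Finite-horizon stochastic DP with zero terminal value.

    Returns (value, policy): value[s] is the optimal expected total reward
    from state s at stage 0, and policy[t][s] is an optimal action at
    stage t in state s.
    """
    value = [0.0] * n_states  # terminal values V_T(s) = 0
    policy = []
    for _ in range(horizon):
        new_value = [0.0] * n_states
        stage_policy = [0] * n_states
        for s in range(n_states):
            best_q, best_a = float("-inf"), 0
            for a in range(n_actions):
                # Bellman backup: immediate reward plus expected value-to-go.
                q = reward[s][a] + sum(p * value[s2] for p, s2 in transition[s][a])
                if q > best_q:
                    best_q, best_a = q, a
            new_value[s] = best_q
            stage_policy[s] = best_a
        value = new_value
        policy.insert(0, stage_policy)  # stages are filled from last to first
    return value, policy
```

Each stage sweeps every state and action, so the work grows with the size of the state space; that growth is the curse of dimensionality the abstract refers to, and it is what function approximation in ADP is meant to relieve.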