57 articles found
Value Iteration-Based Distributed Adaptive Dynamic Programming for Multi-Player Differential Game With Incomplete Information
1
Authors: Yun Zhang, Yuqi Wang, Yunze Cai. IEEE/CAA Journal of Automatica Sinica, 2025, No. 2, pp. 436-447 (12 pages)
In this paper, a distributed adaptive dynamic programming (ADP) framework based on value iteration is proposed for multi-player differential games. In the game setting, players have no access to the information of others' system parameters or control laws. Each player adopts an on-policy value iteration algorithm as the basic learning framework. To deal with the incomplete information structure, players collect a period of system trajectory data to compensate for the lack of information. The policy updating step is implemented by a nonlinear optimization problem aiming to search for the proximal admissible policy. Theoretical analysis shows that by adopting proximal policy searching rules, the approximated policies can converge to a neighborhood of equilibrium policies. The efficacy of our method is illustrated by three examples, which also demonstrate that the proposed method can accelerate the learning process compared with the centralized learning framework.
Keywords: distributed adaptive dynamic programming; incomplete information; multi-player differential game (MPDG); value iteration
Residential Energy Scheduling With Solar Energy Based on Dyna Adaptive Dynamic Programming
2
Authors: Kang Xiong, Qinglai Wei, Hongyang Li. IEEE/CAA Journal of Automatica Sinica, 2025, No. 2, pp. 403-413 (11 pages)
Learning-based methods have become mainstream for solving residential energy scheduling problems. In order to improve the learning efficiency of existing methods and increase the utilization of renewable energy, we propose the Dyna action-dependent heuristic dynamic programming (Dyna-ADHDP) method, which incorporates the ideas of learning and planning from the Dyna framework in action-dependent heuristic dynamic programming. This method defines a continuous action space for precise control of an energy storage system and allows online optimization of algorithm performance during the real-time operation of the residential energy model. Meanwhile, the target network is introduced during the training process to make the training smoother and more efficient. We conducted experimental comparisons with the benchmark method using simulated and real data to verify its applicability and performance. The results confirm the method's excellent performance and generalization capabilities, as well as its effectiveness in increasing renewable energy utilization and extending equipment life.
Keywords: adaptive dynamic programming (ADP); dynamic residential scenarios; optimal residential energy management; smart grid
Recent Progress in Reinforcement Learning and Adaptive Dynamic Programming for Advanced Control Applications (cited by: 11)
3
Authors: Ding Wang, Ning Gao, Derong Liu, Jinna Li, Frank L. Lewis. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2024, No. 1, pp. 18-36 (19 pages)
Reinforcement learning (RL) has roots in dynamic programming and it is called adaptive/approximate dynamic programming (ADP) within the control community. This paper reviews recent developments in ADP along with RL and its applications to various advanced control fields. First, the background of the development of ADP is described, emphasizing the significance of regulation and tracking control problems. Some effective offline and online algorithms for ADP/adaptive critic control are presented, where the main results for discrete-time systems and continuous-time systems are surveyed, respectively. Then, the research progress on adaptive critic control based on the event-triggered framework and under uncertain environments is discussed, respectively, where event-based design, robust stabilization, and game design are reviewed. Moreover, the extensions of ADP for addressing control problems under complex environments attract enormous attention. The ADP architecture is revisited from the perspective of data-driven and RL frameworks, showing how they promote ADP formulation significantly. Finally, several typical control applications with respect to RL and ADP are summarized, particularly in the fields of wastewater treatment processes and power systems, followed by some general prospects for future research. Overall, the comprehensive survey on ADP and RL for advanced control applications has demonstrated its remarkable potential within the artificial intelligence era. In addition, it also plays a vital role in promoting environmental protection and industrial intelligence.
Keywords: adaptive dynamic programming (ADP); advanced control; complex environment; data-driven control; event-triggered design; intelligent control; neural networks; nonlinear systems; optimal control; reinforcement learning (RL)
Adaptive fault-tolerant control for non-minimum phase hypersonic vehicles based on adaptive dynamic programming (cited by: 3)
4
Authors: Le WANG, Ruiyun QI, Bin JIANG. Chinese Journal of Aeronautics (SCIE, EI, CAS, CSCD), 2024, No. 3, pp. 290-311 (22 pages)
In this paper, a novel adaptive Fault-Tolerant Control (FTC) strategy is proposed for non-minimum phase Hypersonic Vehicles (HSVs) that are affected by actuator faults and parameter uncertainties. The strategy is based on the output redefinition method and Adaptive Dynamic Programming (ADP). The intelligent FTC scheme consists of two main parts: a basic fault-tolerant and stable controller and an ADP-based supplementary controller. In the basic FTC part, an output redefinition approach is designed to make the zero dynamics stable with respect to the new output. Then, the Ideal Internal Dynamic (IID) is obtained using an optimal bounded inversion approach, and a tracking controller is designed for the new output to realize output tracking of the non-minimum phase HSV system. For the ADP-based compensation control part, an Action-Dependent Heuristic Dynamic Programming (ADHDP) method adopting an actor-critic learning structure is utilized to further optimize the tracking performance of the HSV control system. Finally, simulation results are provided to verify the effectiveness and efficiency of the proposed FTC algorithm.
Keywords: hypersonic vehicle; fault-tolerant control; non-minimum phase system; adaptive control; nonlinear control; adaptive dynamic programming
Adaptive Optimal Discrete-Time Output-Feedback Using an Internal Model Principle and Adaptive Dynamic Programming (cited by: 1)
5
Authors: Zhongyang Wang, Youqing Wang, Zdzisław Kowalczuk. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2024, No. 1, pp. 131-140 (10 pages)
In order to address the output feedback issue for linear discrete-time systems, this work proposes a new adaptive dynamic programming (ADP) technique based on the internal model principle (IMP). The proposed method, termed IMP-ADP, does not require complete state feedback, merely the measurement of input and output data. More specifically, based on the IMP, the output control problem can first be converted into a stabilization problem. We then design an observer to reproduce the full state of the system by measuring the inputs and outputs. Moreover, this technique includes both a policy iteration algorithm and a value iteration algorithm to determine the optimal feedback gain without using a dynamic system model. Importantly, with this concept one does not need to solve the regulator equation. Finally, this control method was tested on an inverter system of grid-connected LCLs to demonstrate that the proposed method provides the desired performance in terms of both tracking and disturbance rejection.
Keywords: adaptive dynamic programming (ADP); internal model principle (IMP); output feedback problem; policy iteration (PI); value iteration (VI)
Parallel Control for Optimal Tracking via Adaptive Dynamic Programming (cited by: 25)
6
Authors: Jingwei Lu, Qinglai Wei, Fei-Yue Wang. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2020, No. 6, pp. 1662-1674 (13 pages)
This paper studies the problem of optimal parallel tracking control for continuous-time general nonlinear systems. Unlike existing optimal state feedback control, the control input of the optimal parallel control is introduced into the feedback system. However, due to the introduction of control input into the feedback system, the optimal state feedback control methods cannot be applied directly. To address this problem, an augmented system and an augmented performance index function are first proposed. Thus, the general nonlinear system is transformed into an affine nonlinear system. The difference between the optimal parallel control and the optimal state feedback control is analyzed theoretically. It is proven that the optimal parallel control with the augmented performance index function can be seen as the suboptimal state feedback control with the traditional performance index function. Moreover, an adaptive dynamic programming (ADP) technique is utilized to implement the optimal parallel tracking control using a critic neural network (NN) to approximate the value function online. The stability analysis of the closed-loop system is performed using the Lyapunov theory, and the tracking error and NN weight errors are uniformly ultimately bounded (UUB). Also, the optimal parallel controller guarantees the continuity of the control input under the circumstance that there are finite jump discontinuities in the reference signals. Finally, the effectiveness of the developed optimal parallel control method is verified in two cases.
Keywords: adaptive dynamic programming (ADP); nonlinear optimal control; parallel controller; parallel control theory; parallel system; tracking control; neural network (NN)
Residential Energy Scheduling for Variable Weather Solar Energy Based on Adaptive Dynamic Programming (cited by: 18)
7
Authors: Derong Liu, Yancai Xu, Qinglai Wei, Xinliang Liu. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2018, No. 1, pp. 36-46 (11 pages)
The residential energy scheduling of solar energy is an important research area of the smart grid. On the demand side, factors such as household loads, storage batteries, the outside public utility grid and renewable energy resources are combined together as a nonlinear, time-varying, indefinite and complex system, which is difficult to manage or optimize. Many nations have already applied residential real-time pricing to balance the burden on their grid. In order to enhance the electricity efficiency of the residential micro grid, this paper presents an action dependent heuristic dynamic programming (ADHDP) method to solve the residential energy scheduling problem. The highlights of this paper are listed below. First, the weather-type classification is adopted to establish three types of programming models based on the features of the solar energy. In addition, the priorities of different energy resources are set to reduce the loss of electrical energy transmissions. Second, three ADHDP-based neural networks, which can update themselves during applications, are designed to manage the flows of electricity. Third, simulation results show that the proposed scheduling method has effectively reduced the total electricity cost and improved the load balancing process. The comparison with the particle swarm optimization algorithm further proves that the present method has a promising effect on energy management to save cost.
Keywords: action dependent heuristic dynamic programming; adaptive dynamic programming; control strategy; residential energy management; smart grid
Adaptive event-triggered distributed optimal guidance design via adaptive dynamic programming (cited by: 7)
8
Authors: Teng LONG, Yan CAO, Jingliang SUN, Guangtong XU. Chinese Journal of Aeronautics (SCIE, EI, CAS, CSCD), 2022, No. 7, pp. 113-127 (15 pages)
In this paper, the multi-missile cooperative guidance system is formulated as a general nonlinear multi-agent system. To save the limited communication resources, an adaptive event-triggered optimal guidance law is proposed by designing a synchronization-error-driven triggering condition, which brings together the consensus control with the Adaptive Dynamic Programming (ADP) technique. Then, the developed event-triggered distributed control law can be employed by finding an approximate solution of the event-triggered coupled Hamilton-Jacobi-Bellman (HJB) equation. To address this issue, the critic network architecture is constructed, in which an adaptive weight updating law is designed for estimating the cooperative optimal cost function online. Therefore, the event-triggered closed-loop system is decomposed into two subsystems: the system with flow dynamics and the system with jump dynamics. By using the Lyapunov method, the stability of this closed-loop system is guaranteed and all signals are ensured to be Uniformly Ultimately Bounded (UUB). Furthermore, the Zeno behavior is avoided. Simulation results are finally provided to demonstrate the effectiveness of the proposed method.
Keywords: adaptive dynamic programming; distributed control; event-triggered; guidance and control; multi-agent system
Policy iteration optimal tracking control for chaotic systems by using an adaptive dynamic programming approach (cited by: 2)
9
Authors: Qinglai Wei (魏庆来), Derong Liu (刘德荣), Yancai Xu (徐延才). Chinese Physics B (SCIE, EI, CAS, CSCD), 2015, No. 3, pp. 87-94 (8 pages)
A policy iteration algorithm of adaptive dynamic programming (ADP) is developed to solve the optimal tracking control for a class of discrete-time chaotic systems. By system transformations, the optimal tracking problem is transformed into an optimal regulation one. The policy iteration algorithm for discrete-time chaotic systems is first described. Then, the convergence and admissibility properties of the developed policy iteration algorithm are presented, which show that the transformed chaotic system can be stabilized under an arbitrary iterative control law and the iterative performance index function simultaneously converges to the optimum. By implementing the policy iteration algorithm via neural networks, the developed optimal tracking control scheme for chaotic systems is verified by a simulation.
Keywords: adaptive critic designs; adaptive dynamic programming; approximate dynamic programming; neuro-dynamic programming
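As a rough illustration of the policy iteration idea summarized in the abstract above, the following Python sketch alternates policy evaluation and policy improvement on a scalar linear-quadratic regulation problem. It is not the authors' algorithm: the scalar system `x[t+1] = a*x[t] + b*u[t]`, the stage cost `q*x^2 + r*u^2`, the function name `policy_iteration`, and all numeric values are assumptions chosen for illustration, and the closed-form evaluation step stands in for the neural-network approximation used in the paper.

```python
def policy_iteration(a, b, q, r, k0, iters=20):
    """Policy iteration for the assumed scalar system x[t+1] = a*x[t] + b*u[t].

    The policy is u = -k*x and the value function is V(x) = p*x^2.
    k0 must be admissible, i.e. |a - b*k0| < 1, so that its cost is finite.
    """
    k = k0
    p = None
    for _ in range(iters):
        closed_loop = a - b * k
        if abs(closed_loop) >= 1.0:
            raise ValueError("policy is not admissible (closed loop is unstable)")
        # Policy evaluation: solve p = q + r*k^2 + closed_loop^2 * p for p.
        p = (q + r * k * k) / (1.0 - closed_loop * closed_loop)
        # Policy improvement: minimise r*u^2 + p*(a*x + b*u)^2 over u = -k*x.
        k = a * b * p / (r + b * b * p)
    return p, k


# Hypothetical open-loop-unstable example (a > 1) with an admissible start k0 = 1.
p, k = policy_iteration(a=1.2, b=1.0, q=1.0, r=1.0, k0=1.0)

# At convergence p should satisfy the scalar discrete-time Riccati equation.
riccati_residual = 1.0 + 1.2 ** 2 * p - (1.2 * p) ** 2 / (1.0 + p) - p
```

Each iterate stays admissible in this linear-quadratic setting, which mirrors the admissibility property the abstract refers to; the residual above gives a quick check that the iteration has reached the optimum.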
Neural-network-based stochastic linear quadratic optimal tracking control scheme for unknown discrete-time systems using adaptive dynamic programming (cited by: 2)
10
Authors: Xin Chen, Fang Wang. Control Theory and Technology (EI, CSCD), 2021, No. 3, pp. 315-327 (13 pages)
In this paper, a stochastic linear quadratic optimal tracking scheme is proposed for unknown linear discrete-time (DT) systems based on the adaptive dynamic programming (ADP) algorithm. First, an augmented system composed of the original system and the command generator is constructed, and then an augmented stochastic algebraic equation is derived based on the augmented system. Next, to obtain the optimal control strategy, the stochastic case is converted into the deterministic one by system transformation, and then an ADP algorithm is proposed with convergence analysis. For the purpose of realizing the ADP algorithm, three back propagation neural networks, namely a model network, a critic network and an action network, are devised to approximate the unknown system model, the optimal value function and the optimal control strategy, respectively. Finally, the obtained optimal control strategy is applied to the original stochastic system, and two simulations are provided to demonstrate the effectiveness of the proposed algorithm.
Keywords: stochastic system; optimal tracking control; adaptive dynamic programming; neural networks
Event-based performance guaranteed tracking control for constrained nonlinear system via adaptive dynamic programming method
11
Authors: Xingyi Zhang, Zijie Guo, Hongru Ren, Hongyi Li. Journal of Automation and Intelligence, 2023, No. 4, pp. 239-247 (9 pages)
An optimal tracking control problem for a class of nonlinear systems with guaranteed performance and asymmetric input constraints is discussed in this paper. The control policy is implemented by an adaptive dynamic programming (ADP) algorithm under two event-based triggering mechanisms. It is often challenging to design an optimal control law due to the system deviation caused by asymmetric input constraints. First, a prescribed performance control technique is employed to guarantee the tracking errors within predetermined boundaries. Subsequently, considering the asymmetric input constraints, a discounted non-quadratic cost function is introduced. Moreover, in order to reduce controller updates, an event-triggered control law is developed for the ADP algorithm. After that, to further simplify the controller design, this work is extended to a self-triggered case for relaxing the need for continuous signal monitoring by hardware devices. By employing the Lyapunov method, the uniform ultimate boundedness of all signals is proved to be guaranteed. Finally, a simulation example on a mass–spring–damper system subject to asymmetric input constraints is provided to validate the effectiveness of the proposed control scheme.
Keywords: adaptive dynamic programming (ADP); asymmetric input constraints; prescribed performance control; event-triggered control; optimal tracking control
Multi-Home Energy Coordination Scheduling Based on Adaptive Dynamic Programming (cited by: 1)
12
Authors: Kang Xiong, Qinglai Wei. The International Journal of Intelligent Control and Systems, 2024, No. 1, pp. 30-36 (7 pages)
Implementing cooperative scheduling of multi-home microgrid energy and reducing the dependence on the main grid have become the focus of microgrid energy management research. This paper proposes a new multi-agent adaptive dynamic programming (MAADP) method for the cooperative control of distributed home energy. Each home is defined as a learning agent that needs to reasonably schedule the energy storage system to meet the respective load demand while accomplishing cooperative scheduling among the individual homes. In addition, an energy clearing center (ECC) is introduced to complete the energy exchange between each microgrid to protect the benefits of all parties. The proposed method adopts the learning strategy of “centralized learning and decentralized execution” to avoid the leakage of private information. The experimental comparison with the benchmark method verifies that the method can realize the cooperative scheduling of each home and reduce the dependence on the main grid.
Keywords: multi-home scenarios; adaptive dynamic programming (ADP); optimal home energy management; smart grid
Event-Based Cooperative Control for Uncertain Multiagent System Using Parallel Adaptive Dynamic Programming
13
Authors: Shanshan Jiao, Qinglai Wei, Fei Dai, Jianchao Wu, Miaosheng Qiu, Bin Zhang, Yongjin Luo, Kunxin Huang, Genpo Ma. The International Journal of Intelligent Control and Systems, 2024, No. 3, pp. 127-133 (7 pages)
This study explores a new robust consensus control strategy for uncertain multiagent systems and provides an event-based solution to adaptive dynamic programming (ADP) based optimal control. Rather than the control function, the feedback system established symmetrical to the physical system allows the optimal consensus control issue to be handled by the optimal control protocol of an augmented affine system. The feedback system focuses on an auxiliary variable formed in light of the optimality principle and the virtual control input built on a critic neural network (NN). Analysis reveals that the auxiliary variable benefits from decreasing the influence of uncertainty on control performance, while the proposed approach is implemented with fewer communication resources since the critic NN is updated as events occur. Finally, evidence from simulation findings validates the theoretical results.
Keywords: optimal consensus control; event-based control; robust control; parallel control; adaptive dynamic programming
Adaptive Dynamic Programming-Based Attitude Optimal Tracking Control for a Quadrotor with Unmeasured Velocities and Model Uncertainties
14
Authors: Junrui Guo, Xiaoyang Gao, Tieshan Li. Guidance, Navigation and Control, 2024, No. 2, pp. 25-46 (22 pages)
This paper proposes an optimal output feedback tracking control scheme for the quadrotor unmanned aerial vehicle (UAV) attitude system with unmeasured angular velocities and model uncertainties. First, a neural network (NN) is used to approximate the model uncertainties. Then, an NN velocity observer is established to estimate the unmeasured angular velocities. Further, a quadrotor output feedback attitude optimal tracking controller is designed, which consists of an adaptive controller designed by the backstepping method and an optimal compensation term designed by adaptive dynamic programming. All signals in the closed-loop system are proved to be bounded. Finally, a numerical simulation example shows that the quadrotor attitude tracking scheme is effective and feasible.
Keywords: quadrotor; optimal tracking control; neural network velocity observer; model uncertainties; adaptive dynamic programming
Energy Maximization Absorption of Wave Energy Converter Based on Fourier Pseudo-Spectral Method and Adaptive Dynamic Programming
15
Authors: Xinyu Bao, Zhen Chen, Ming Li. The International Journal of Intelligent Control and Systems, 2024, No. 3, pp. 108-118 (11 pages)
In this paper, we propose a novel noncausal control framework to address the energy maximization problem of wave energy converters (WECs) subject to constraints. The energy maximization problem of WECs is a constrained optimal control problem. The proposed control framework converts this problem into a reference trajectory tracking problem through the Fourier pseudo-spectral method (FPSM) and utilizes the online tracking adaptive dynamic programming (OTADP) algorithm to realize real-time trajectory tracking for practical use in the ocean environment. Using the wave prediction technique, the optimal trajectory is generated online through a receding horizon (RH) implementation. A critic neural network (NN) is applied to approximate the optimal cost value function and calculate the error-tracking control by solving the associated Hamilton-Jacobi-Bellman (HJB) equation. The proposed WEC control framework improves computational efficiency and makes the online control feasible in practice. Simulation results show the effects of the receding horizon implementation of FPSM with different window lengths and window functions, while verifying the performances of tracking control and energy absorption of WECs in two different sea conditions.
Keywords: wave energy converter; Fourier pseudo-spectral control; adaptive dynamic programming; energy maximization; optimal trajectory tracking control
A Robust Adaptive Dynamic Programming Principle for Sensorimotor Control with Signal-Dependent Noise (cited by: 2)
16
Authors: JIANG Yu, JIANG Zhong-Ping. Journal of Systems Science & Complexity (SCIE, EI, CSCD), 2015, No. 2, pp. 261-288 (28 pages)
As human beings, people coordinate movements and interact with the environment through sensory information and motor adaptation in their daily lives. Many characteristics of these interactions can be studied using optimization-based models, which assume that the precise knowledge of both the sensorimotor system and its interactive environment is available for the central nervous system (CNS). However, both static and dynamic uncertainties occur inevitably in daily movements. When these uncertainties are taken into consideration, the previously developed models based on optimization theory may fail to explain how the CNS can still coordinate human movements which are also robust with respect to the uncertainties. In order to address this problem, this paper presents a novel computational mechanism for sensorimotor control from a perspective of robust adaptive dynamic programming (RADP). Sharing some essential features of reinforcement learning, which was originally observed from mammals, the RADP model for sensorimotor control suggests that, instead of identifying the system dynamics of both the motor system and the environment, the CNS computes iteratively a robust optimal control policy using the real-time sensory data. An online learning algorithm is provided in this paper, with rigorous convergence and stability analysis. Then, it is applied to simulate several experiments reported from the past literature. By comparing the proposed numerical results with these experimentally observed data, the authors show that the proposed model can reproduce movement trajectories which are consistent with experimental observations. In addition, the RADP theory provides a unified framework that connects optimality and robustness properties in the sensorimotor system.
Keywords: adaptive dynamic programming; human motor adaptation; robust optimal control
Optimal regulation of uncertain dynamic systems using adaptive dynamic programming (cited by: 2)
17
Authors: Hao Xu, Qiming Zhao, S. Jagannathan. Journal of Control and Decision (EI), 2014, No. 3, pp. 226-256 (31 pages)
In this tutorial paper, the finite-horizon optimal adaptive regulation of linear and nonlinear dynamic systems with unknown system dynamics is presented in a forward-in-time manner using adaptive dynamic programming (ADP). An adaptive estimator (AE) is introduced with the idea of Q-learning to relax the requirement of system dynamics in the case of linear systems, while a neural network-based identifier is utilised for nonlinear systems. The time-varying nature of the solution to the Bellman/Hamilton–Jacobi–Bellman equation is handled by utilising a time-dependent basis function, while the terminal constraint is incorporated as part of the update law of the AE/identifier in solving the optimal feedback control. Utilising an initial admissible control, the proposed optimal regulation scheme of the uncertain linear and nonlinear system yields a forward-in-time and online solution without using value and/or policy iterations. An adaptive observer is utilised for linear systems in order to relax the need for state availability so that the optimal adaptive control design depends only on the reconstructed states. Finally, the optimal control is covered for nonlinear networked control systems wherein the feedback loop is closed via a communication network. Effectiveness of the proposed approach is verified by simulation results. The end result is a variant of a roll-out scheme in ADP wherein an initial admissible policy is selected as the base policy and the control policy is enhanced using a one-time policy improvement at each sampling interval.
Keywords: adaptive dynamic programming; finite horizon optimal control
State of the Art of Adaptive Dynamic Programming and Reinforcement Learning (cited by: 1)
18
Authors: Derong Liu, Mingming Ha, Shan Xue. CAAI Artificial Intelligence Research, 2022, No. 2, pp. 93-110 (18 pages)
This article introduces the state-of-the-art development of adaptive dynamic programming and reinforcement learning (ADPRL). First, algorithms in reinforcement learning (RL) are introduced and their roots in dynamic programming are illustrated. Adaptive dynamic programming (ADP) is then introduced following a brief discussion of dynamic programming. Researchers in ADP and RL have enjoyed the fast developments of the past decade from algorithms, to convergence and optimality analyses, and to stability results. Several key steps in the recent theoretical developments of ADPRL are mentioned with some future perspectives. In particular, convergence and optimality results of value iteration and policy iteration are reviewed, followed by an introduction to the most recent results on stability analysis of value iteration algorithms.
Keywords: adaptive dynamic programming; approximate dynamic programming; adaptive critic designs; neuro-dynamic programming; neural dynamic programming; reinforcement learning; intelligent control; learning control; optimal control
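To make the notion of value iteration convergence mentioned in the abstract above more concrete, here is a minimal Python sketch on a scalar linear-quadratic problem. It is only an illustration under stated assumptions, not an algorithm taken from the surveyed article: the system `x[t+1] = a*x[t] + b*u[t]`, the stage cost `q*x^2 + r*u^2`, the function name `value_iteration`, and the parameter values are all hypothetical.

```python
def value_iteration(a, b, q, r, iters=60):
    """Value iteration for the assumed scalar system x[t+1] = a*x[t] + b*u[t].

    The value function is parameterised as V_j(x) = p_j * x^2, starting from
    V_0 = 0, so the Bellman update reduces to a scalar Riccati recursion.
    """
    p = 0.0  # V_0(x) = 0
    for _ in range(iters):
        # Bellman update: p_{j+1} = min over u of q*x^2 + r*u^2 + p_j*(a*x + b*u)^2, per unit x^2.
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    # Greedy feedback gain for the final value function, u = -k*x.
    k = a * b * p / (r + b * b * p)
    return p, k


# With a = b = q = r = 1 the fixed point solves p^2 - p - 1 = 0 (the golden ratio).
p, k = value_iteration(a=1.0, b=1.0, q=1.0, r=1.0)
golden = (1 + 5 ** 0.5) / 2
```

Unlike policy iteration, this recursion needs no admissible initial policy: it starts from a zero value function and the iterates increase monotonically toward the optimal cost in this example.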
Adaptive dynamic programming for linear impulse systems
19
Authors: Xiao-hua WANG, Juan-juan YU, Yao HUANG, Hua WANG, Zhong-hua MIAO. Journal of Zhejiang University-Science C (Computers and Electronics) (SCIE, EI), 2014, No. 1, pp. 43-50 (8 pages)
We investigate the optimization of linear impulse systems with the reinforcement learning based adaptive dynamic programming (ADP) method. For linear impulse systems, the optimal objective function is shown to be a quadratic form of the pre-impulse states. The ADP method provides solutions that iteratively converge to the optimal objective function. If an initial guess of the pre-impulse objective function is selected as a quadratic form of the pre-impulse states, the objective function iteratively converges to the optimal one through ADP. Though direct use of the quadratic objective function of the states within the ADP method is theoretically possible, the numerical singularity problem may occur due to the matrix inversion therein when the system dimensionality increases. A neural network based ADP method can circumvent this problem. A neural network with polynomial activation functions is selected to approximate the pre-impulse objective function and trained iteratively using the ADP method to achieve optimal control. After successful training, optimal impulse control can be derived. Simulations are presented for illustrative purposes.
Keywords: adaptive dynamic programming (ADP); impulse system; optimal control; neural network
Decentralised adaptive learning-based control of robot manipulators with unknown parameters
20
Authors: Emil Mühlbradt Sveen, Jing Zhou, Morten Kjeld Ebbesen, Mohammad Poursina. Journal of Automation and Intelligence, 2025, No. 2, pp. 136-144 (9 pages)
This paper studies motor joint control of a 4-degree-of-freedom (DoF) robotic manipulator using a learning-based Adaptive Dynamic Programming (ADP) approach. The manipulator's dynamics are modelled as an open-loop 4-link serial kinematic chain with 4 Degrees of Freedom (DoF). Decentralised optimal controllers are designed for each link using the ADP approach based on a set of cost matrices and data collected from exploration trajectories. The proposed control strategy employs an off-line, off-policy iterative approach to derive four optimal control policies, one for each joint, under exploration strategies. The objective of the controller is to control the position of each joint. Simulation and experimental results show that four independent optimal controllers are found, each under similar exploration strategies, and the proposed ADP approach successfully yields optimal linear control policies despite the presence of these complexities. The experimental results conducted on the Quanser Qarm robotic platform demonstrate the effectiveness of the proposed ADP controllers in handling significant dynamic nonlinearities, such as actuation limitations, output saturation, and filter delays.
Keywords: adaptive dynamic programming; optimal control; robot manipulator; 4-DoF; unknown dynamics