Dear Editor, In this letter, a constrained networked predictive control strategy is proposed for the optimal control problem of complex nonlinear high-order fully actuated (HOFA) systems with noises. The method can effectively deal with nonlinearities, constraints, and noises in the system, optimize the performance metric, and provide an upper bound on the stable output of the system.
In this paper, we present an optimal neuro-control scheme for continuous-time (CT) nonlinear systems with asymmetric input constraints. Initially, we introduce a discounted cost function for the CT nonlinear systems in order to handle the asymmetric input constraints. Then, we develop a Hamilton-Jacobi-Bellman equation (HJBE), which arises in the discounted cost optimal control problem. To obtain the optimal neurocontroller, we utilize a critic neural network (CNN) to solve the HJBE under the framework of reinforcement learning. The CNN's weight vector is tuned via the gradient descent approach. Based on the Lyapunov method, we prove that uniform ultimate boundedness of the CNN's weight vector and the closed-loop system is guaranteed. Finally, we verify the effectiveness of the presented optimal neuro-control strategy through simulations of two examples.
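As a rough illustration of how a discount factor enters the HJBE, the block below writes the generic discounted-cost form. It is a sketch under assumptions, not the paper's own formulation: the input-affine dynamics $\dot{x} = f(x) + g(x)u$, the discount rate $\gamma > 0$, and the running cost $r(x,u)$ are all symbols introduced here, and the abstract does not state which cost the authors use to encode the asymmetric constraints.

```latex
% Discounted value function (assumed generic form)
V(x(t)) = \int_{t}^{\infty} e^{-\gamma(\tau - t)}\, r\bigl(x(\tau), u(\tau)\bigr)\, d\tau

% Differentiating along trajectories gives \dot{V} = \gamma V - r, so the
% optimal value function V^{*} satisfies the discounted HJBE
0 = \min_{u} \Bigl[ r(x,u) - \gamma V^{*}(x)
      + \nabla V^{*}(x)^{\top} \bigl( f(x) + g(x)u \bigr) \Bigr]
```

The only difference from the undiscounted case is the extra term $-\gamma V^{*}(x)$. A critic network that approximates $V^{*}$ can then be tuned by gradient descent on the squared residual of this equation.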
In this study, an adaptive neuro-observer-based optimal control (ANOPC) policy is introduced for unknown nonaffine nonlinear systems with control input constraints. The Hamilton–Jacobi–Bellman (HJB) framework is employed to minimize a non-quadratic cost function corresponding to the constrained control input. ANOPC consists of both analytical and algebraic parts. In the analytical part, an observer-based neural network (NN) first approximates the uncertain system dynamics, and then another NN structure solves the HJB equation. In the algebraic part, the optimal control input that does not exceed the saturation bounds is generated. The weights of the two NNs associated with the observer and the controller are updated simultaneously in an online manner. The uniform ultimate boundedness (UUB) of all signals of the whole closed-loop system is ensured through Lyapunov's direct method. Finally, two numerical examples are provided to confirm the effectiveness of the proposed control strategy.
This paper presents the effect of a high voltage direct current (HVDC) transmission system based on a voltage source converter (VSC) on subsynchronous resonance (SSR) and low frequency oscillations (LFO) in a power system. A novel adaptive neural controller based on a neural identifier is also proposed for the HVDC system, which is capable of damping out LFO and subsynchronous oscillations (SSO). For comparison, results of the neural damping controller are compared with those of a lead-lag controller tuned by quantum particle swarm optimization (QPSO). It is shown that the adaptive damping controller not only improves the stability of the power system but also overcomes drawbacks of conventional compensators with fixed parameters. To determine the most effective input of the HVDC system for applying the supplementary controller signal, an analysis based on singular value decomposition is performed. To evaluate the performance of the proposed controller, transient simulations of the detailed nonlinear system are considered.
Reinforcement learning (RL) has roots in dynamic programming, and it is called adaptive/approximate dynamic programming (ADP) within the control community. This paper reviews recent developments in ADP along with RL and its applications to various advanced control fields. First, the background of the development of ADP is described, emphasizing the significance of regulation and tracking control problems. Some effective offline and online algorithms for ADP/adaptive critic control are presented, and the main results for discrete-time systems and continuous-time systems are surveyed in turn. Then, the research progress on adaptive critic control based on the event-triggered framework and under uncertain environments is discussed, where event-based design, robust stabilization, and game design are reviewed. Moreover, the extensions of ADP for addressing control problems under complex environments attract enormous attention. The ADP architecture is revisited from the perspective of data-driven and RL frameworks, showing how they promote ADP formulation significantly. Finally, several typical control applications of RL and ADP are summarized, particularly in the fields of wastewater treatment processes and power systems, followed by some general prospects for future research. Overall, the comprehensive survey on ADP and RL for advanced control applications has demonstrated their remarkable potential within the artificial intelligence era. In addition, it also plays a vital role in promoting environmental protection and industrial intelligence.
Returning to the Moon has recently become a prominent topic. Many studies have shown that soft landing is a challenging problem in lunar exploration. The lunar soft landing in this paper begins from a 100 km circular lunar parking orbit. Once the landing area has been selected and it is time to deorbit for landing, a ΔV burn of 19.4 m/s is performed to establish a 100×15 km elliptical orbit. At perilune, the landing jets are ignited, and a propulsive landing is performed. A guidance and control scheme for lunar soft landing is proposed in the paper, which combines optimal control theory with nonlinear neuro-control. An optimal nonlinear control law based on an artificial neural network is presented, built on the optimum trajectory from perilune to the lunar surface obtained from Pontryagin's maximum principle under the terminal boundary conditions and performance index. Owing to the nonlinear mapping capability of the neural network, the optimal control laws can therefore be realized in the soft landing system. The feasibility and validity of the control laws are verified in a simulation experiment.
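The deorbit figure quoted above can be sanity-checked with the vis-viva equation. The following is a minimal sketch, not the paper's guidance scheme: it assumes an impulsive burn at the 100 km apolune, a lunar gravitational parameter of 4902.8 km^3/s^2, and a mean lunar radius of 1737.4 km. The function name `deorbit_delta_v` and both constants are introduced here for illustration.

```python
import math

# Assumed physical constants (not given in the abstract).
MU_MOON = 4902.8   # lunar gravitational parameter, km^3/s^2
R_MOON = 1737.4    # mean lunar radius, km


def deorbit_delta_v(h_circ_km: float, h_peri_km: float) -> float:
    """Return the impulsive delta-V (m/s) needed to lower perilune.

    The spacecraft starts in a circular orbit at altitude h_circ_km and
    burns retrograde so that the new orbit has apolune at h_circ_km and
    perilune at h_peri_km.
    """
    r_apo = R_MOON + h_circ_km           # burn radius (apolune of new orbit)
    r_peri = R_MOON + h_peri_km          # target perilune radius
    a = 0.5 * (r_apo + r_peri)           # semi-major axis of transfer ellipse

    v_circ = math.sqrt(MU_MOON / r_apo)                    # circular speed
    v_apo = math.sqrt(MU_MOON * (2.0 / r_apo - 1.0 / a))   # vis-viva at apolune
    return (v_circ - v_apo) * 1000.0     # km/s -> m/s


if __name__ == "__main__":
    dv = deorbit_delta_v(100.0, 15.0)
    print(f"Deorbit delta-V: {dv:.1f} m/s")
```

With these constants the result lands between 19 and 20 m/s, in line with the 19.4 m/s stated in the abstract. The small difference depends on the values chosen for the lunar radius and gravitational parameter.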
Funding: Supported in part by the National Natural Science Foundation of China (62173255, 62188101) and the Shenzhen Key Laboratory of Control Theory and Intelligent Systems (ZDSYS20220330161800001).
Funding: Supported by the Natural Sciences and Engineering Research Council of Canada (N00892) and in part by the National Natural Science Foundation of China (51405436, 51375452, 61573174).
Funding: Supported by the National Natural Science Foundation of China (61973228, 61973330).
Funding: Supported by the National High Technology Research and Development Program of China (863 Program) (2006AA04Z183), the National Natural Science Foundation of China (60621001, 60534010, 60572070, 60774048, 60728307), and the Program for Changjiang Scholars and Innovative Research Groups of China (60728307, 4031002).
Funding: Supported in part by the National Natural Science Foundation of China (62222301, 62073085, 62073158, 61890930-5, 62021003), the National Key Research and Development Program of China (2021ZD0112302, 2021ZD0112301, 2018YFC1900800-5), and the Beijing Natural Science Foundation (JQ19013).
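Abstract: General nonlinear discrete-time systems are highly nonlinear and have uncertain models and unknown dynamics. Traditional control methods usually represent the real system with a simplified or linearized model, which introduces inherent errors. To address this, this paper proposes an approximately optimal iterative dynamic programming method based on multi-dimensional Taylor networks (MTN) for general nonlinear discrete-time systems. All control processes run online, and no offline training step is required. The method adopts an actor-critic framework and introduces three MTNs: a utility MTN determines the performance index without relying on information about the system's internal dynamics; a critic MTN approximates the performance function; and an action MTN adjusts the control policy online within the dynamic programming framework. The overall control system uses a dual closed-loop structure, in which the outer loop achieves tracking control using the main feedback signal and the inner loop further improves dynamic performance through an auxiliary feedback signal. By fully exploiting the structural characteristics of the MTN, the paper significantly reduces the computational complexity of the controller and greatly improves the dynamic response speed of the iterative adaptive programming algorithm. Tracking simulations with step and sinusoidal signals are carried out on a hydraulic servo system. The simulation results show that the iterative-learning-based MTN discrete-time adaptive optimal controller has good tracking performance and dynamic response characteristics, which verifies the effectiveness and practicality of the proposed method.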