4 articles found
A Homotopy Method for Continuous-Time Model-Free LQR Control Based on Policy Iteration
1
Authors: Wenwu Fan, Junlin Xiong. IEEE/CAA Journal of Automatica Sinica, 2025, No. 8, pp. 1673-1682 (10 pages)
In recent years, reinforcement learning control theory has been well developed. However, model-free value iteration needs many iterations to achieve the desired precision, and model-free policy iteration requires an initial stabilizing control policy. It is therefore important to develop a fast model-free algorithm that solves the continuous-time linear quadratic control problem without an initial stabilizing control policy. In this paper, we construct a homotopy path on which each point corresponds to a linear quadratic regulator problem. Based on policy iteration, model-based and model-free homotopy algorithms are proposed to solve the optimal control problem of continuous-time linear systems along the homotopy path. Our algorithms are sped up using first-order differential information and do not require an initial stabilizing control policy. Finally, several practical examples are used to illustrate our results.
Keywords: homotopy path; initial stabilizing control policy; linear quadratic control; policy iteration; reinforcement learning
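To make the idea in this abstract concrete, the following is a minimal sketch for a scalar system dx/dt = a*x + b*u with cost integrand q*x^2 + r*u^2. It is not the construction from the paper: the homotopy used here (subtracting a shift from a so that the zero gain is stabilizing, then reducing the shift to zero while warm-starting policy iteration) is an assumed, simplified stand-in, and the names `policy_iteration`, `homotopy_lqr`, and `steps` are hypothetical.

```python
import math


def policy_iteration(a, b, q, r, k, iters=20):
    """Kleinman-style policy iteration for dx/dt = a*x + b*u with u = -k*x."""
    for _ in range(iters):
        closed_loop = a - b * k
        if closed_loop >= 0:
            raise ValueError("gain k is not stabilizing")
        # Policy evaluation: solve 2*(a - b*k)*p + q + r*k**2 = 0 for p.
        p = (q + r * k * k) / (-2.0 * closed_loop)
        # Policy improvement: k = b*p/r.
        k = b * p / r
    return p, k


def homotopy_lqr(a, b, q, r, steps=8):
    """Assumed homotopy: replace a by (a - shift) and drive shift to zero.

    With shift0 > a, the shifted system is stable, so k = 0 is a valid
    starting policy. Each stage warm-starts from the previous gain; the
    step count must be large enough that this gain stays stabilizing.
    """
    shift0 = max(a, 0.0) + 1.0
    k = 0.0
    for i in range(steps + 1):
        shift = shift0 * (1 - i / steps)
        p, k = policy_iteration(a - shift, b, q, r, k)
    return p, k


if __name__ == "__main__":
    p, k = homotopy_lqr(1.0, 1.0, 1.0, 1.0)
    # Closed-form solution of the scalar Riccati equation for comparison.
    p_exact = 1.0 + math.sqrt(2.0)
    print(p, k, p_exact)
```

For the unstable system a = 1 the plain `policy_iteration` call with k = 0 raises an error, while the shifted sequence of problems reaches the Riccati solution of the original system without ever being given a stabilizing gain.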
A Lyapunov characterization of robust policy optimization
2
Authors: Leilei Cui, Zhong-Ping Jiang. Control Theory and Technology (indexed in EI, CSCD), 2023, No. 3, pp. 374-389 (16 pages)
In this paper, we study the robustness property of policy optimization (particularly the Gauss-Newton gradient descent algorithm, which is equivalent to policy iteration in reinforcement learning) subject to noise at each iteration. By invoking the concept of input-to-state stability and utilizing Lyapunov's direct method, it is shown that, if the noise is sufficiently small, the policy iteration algorithm converges to a small neighborhood of the optimal solution even in the presence of noise at each iteration. Explicit expressions of the upper bound on the noise and the size of the neighborhood to which the policies ultimately converge are provided. Based on Willems' fundamental lemma, a learning-based policy iteration algorithm is proposed. The persistent excitation condition can be readily guaranteed by checking the rank of the Hankel matrix related to an exploration signal. The robustness of the learning-based policy iteration to measurement noise and unknown system disturbances is theoretically demonstrated by the input-to-state stability of the policy iteration. Several numerical simulations are conducted to demonstrate the efficacy of the proposed method.
Keywords: policy optimization; policy iteration (PI); input-to-state stability (ISS); Lyapunov's direct method
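The Hankel rank check mentioned in this abstract can be illustrated with a small sketch. It assumes a scalar exploration signal and the standard definition that a signal is persistently exciting of order L when its depth-L Hankel matrix has full row rank; the exact matrix used in the paper is not given here, and the function names are hypothetical. Exact rational arithmetic is used to keep the example dependency-free; a numerical rank from a singular value decomposition would be the more usual choice in practice.

```python
from fractions import Fraction


def hankel(signal, depth):
    """Depth-row Hankel matrix of a scalar signal: entry (i, j) is signal[i + j]."""
    cols = len(signal) - depth + 1
    if cols < 1:
        raise ValueError("signal is too short for the requested depth")
    return [[Fraction(signal[i + j]) for j in range(cols)] for i in range(depth)]


def rank(matrix):
    """Rank by Gaussian elimination over exact fractions."""
    rows = [row[:] for row in matrix]
    found = 0
    n_cols = len(rows[0]) if rows else 0
    for col in range(n_cols):
        pivot = next((r for r in range(found, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[found], rows[pivot] = rows[pivot], rows[found]
        for r in range(found + 1, len(rows)):
            factor = rows[r][col] / rows[found][col]
            rows[r] = [x - factor * y for x, y in zip(rows[r], rows[found])]
        found += 1
        if found == len(rows):
            break
    return found


def is_persistently_exciting(signal, order):
    """True if the depth-`order` Hankel matrix has full row rank."""
    return rank(hankel(signal, order)) == order


if __name__ == "__main__":
    print(is_persistently_exciting([1, 0, 0, 1, 0, 1, 1, 0], 3))
    print(is_persistently_exciting([1, 1, 1, 1, 1, 1], 2))
```

A constant signal fails the check for any order above one because all Hankel rows coincide, which is why exploration signals are usually chosen to be rich (for example, random or multi-sine).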
Joint Statement on Climate Change by Leaders of China and the European Union: The Way Forward After the 10th Anniversary of the Adoption of the Paris Agreement
3
Beijing Review, 2025, No. 34, pp. I0007-I0008 (2 pages)
On the occasion of the 50th anniversary of the establishment of diplomatic relations between China and the European Union (EU) and the 10th anniversary of the adoption of the Paris Agreement, the Chinese and EU leaders hereby: Reiterate that in the fluid and turbulent international situation today, it is crucial that all countries, notably the major economies, maintain policy continuity and stability and step up efforts to address climate change.
Keywords: international situation; diplomatic relations; Paris Agreement; policy continuity; maintain policy continuity; stability; climate change; major economies