Funding: The National Natural Science Foundation of China (62173172).
Abstract: Inverse reinforcement learning optimal control operates under the learner-expert framework. The learner system imitates the expert system's demonstrated behaviors without requiring a predefined cost function, so it can handle optimal control problems effectively. This paper proposes an inverse reinforcement learning optimal control method for Takagi-Sugeno (T-S) fuzzy systems. Based on the learner system, an expert system is constructed, where the learner system knows only the expert system's optimal control policy. To reconstruct the unknown cost function, we first develop a model-based inverse reinforcement learning algorithm for the case where the system dynamics are known. The developed model-based learning algorithm consists of two learning stages: an inner reinforcement learning loop and an outer inverse optimal control loop. The inner loop obtains the optimal control policy under the learner's current cost function, and the outer loop updates the learner's state-penalty matrices using only the expert's optimal control policy. Then, to eliminate the requirement that the system dynamics be known, a data-driven integral learning algorithm is presented. It is proved that the two presented algorithms are convergent and that the developed inverse reinforcement learning optimal control scheme ensures the controlled fuzzy learner system is asymptotically stable. Finally, we apply the proposed fuzzy optimal control to a truck-trailer system, and computer simulation results verify the effectiveness of the presented approach.
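The two-loop structure described in the abstract can be illustrated on a scalar linear-quadratic problem. The sketch below is a toy illustration, not the paper's T-S fuzzy algorithm: the `lqr_gain` helper, the scalar system values, and the gain-mismatch update rule for the state penalty are all illustrative assumptions, whereas the paper updates state-penalty matrices of fuzzy subsystems.

```python
import numpy as np

def lqr_gain(a, b, q, r, iters=500):
    """Inner RL loop (toy version): solve the scalar discrete-time
    Riccati equation by fixed-point iteration, return the optimal gain."""
    p = q
    for _ in range(iters):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    return a * b * p / (r + b * b * p)

# Scalar system x' = a x + b u shared by expert and learner (made-up values).
a, b, r = 0.9, 0.5, 1.0
q_expert = 2.0
k_expert = lqr_gain(a, b, q_expert, r)   # all the learner ever observes

# Outer inverse-optimal-control loop: adjust the learner's state penalty q
# until the learner's optimal gain matches the expert's demonstrated gain.
q, alpha = 0.1, 2.0
for _ in range(300):
    k = lqr_gain(a, b, q, r)
    q += alpha * (k_expert - k)          # gain mismatch drives the update

print(abs(lqr_gain(a, b, q, r) - k_expert))  # residual shrinks toward 0
```

In this scalar case the optimal gain is monotone in the state penalty, so the mismatch-driven update converges; the paper's matrix-valued setting requires the convergence analysis given there.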
Funding: Supported by the Support Plan on Science and Technology for Youth Innovation of Universities in Shandong Province (No. 2021KJ086), the National Natural Science Foundation of China (Nos. 62003234 and 61873179), and the Natural Science Foundation of Shandong Province (No. ZR2020QF048).
Abstract: This paper presents an online integral reinforcement learning (RL) solution for problems with hierarchical decision makers. Specifically, we reformulate this model as a leader-follower game, in which the control input and a deterministic disturbance act as decision makers at different levels of the hierarchy: the control input plays the role of the leader, while the disturbance plays the role of the follower. The main contributions of this paper can be summarized as follows. First, we introduce online RL to deal with systems whose information is partially unknown, meaning that accurate dynamic information is not required. Second, we solve the leader-follower coupled Hamilton-Jacobi (HJ) and Riccati equations approximately online using the derived algorithm. Third, we provide tuning laws for the cost functions and controllers, which simultaneously ensure closed-loop stability.
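A coupled Riccati equation of the kind the abstract mentions can be sketched offline for a linear two-player system. This is a hedged illustration only: it assumes known dynamics and an invented 2x2 example, and it iterates with model-based Lyapunov solves, unlike the paper's online, partially model-free algorithm.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Illustrative system dx/dt = A x + B u + D w with game cost
# integral of (x'Qx + u'Ru - gamma^2 w'w); u leads, w follows.
A = np.array([[0.0, 1.0], [-1.0, -2.0]])
B = np.array([[0.0], [1.0]])
D = np.array([[0.0], [0.5]])
Q = np.eye(2)
R = np.array([[1.0]])
gamma = 5.0

# Policy iteration on the coupled (game) Riccati equation: evaluate the
# current gain pair with a Lyapunov solve, then improve both gains from P.
K = np.zeros((1, 2))             # control gain (leader)
L = np.zeros((1, 2))             # disturbance gain (follower)
for _ in range(30):
    Ac = A - B @ K + D @ L
    rhs = -(Q + K.T @ R @ K - gamma**2 * L.T @ L)
    P = solve_continuous_lyapunov(Ac.T, rhs)   # Ac'P + P Ac = rhs
    K = np.linalg.solve(R, B.T @ P)
    L = D.T @ P / gamma**2

# Residual of the game algebraic Riccati equation; near zero at the fix point.
residual = (A.T @ P + P @ A + Q
            - P @ B @ np.linalg.solve(R, B.T @ P)
            + P @ D @ D.T @ P / gamma**2)
print(np.linalg.norm(residual))
```

At the fixed point the Lyapunov step collapses exactly to the game algebraic Riccati equation, which is why the residual can be used as a convergence check.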
Funding: Supported by the Russian Science Foundation (grant #19-72-30043). The calculations were performed on the Zhores cluster at the Skolkovo Institute of Science and Technology.
Abstract: We introduce a Deep Reinforcement Learning (DRL) model for crystal structure relaxation and compare different types of neural network architectures and reinforcement learning algorithms for this purpose. Numerical experiments are conducted on Al-Fe structures, with potential energy surfaces generated using EAM potentials. We examine the influence of parameter settings on model performance and benchmark the best-performing models against classical optimization algorithms. Additionally, the model's capacity to generalize learned interaction patterns from smaller atomic systems to more complex systems is assessed. The results demonstrate the potential of DRL models to enhance the efficiency of structure relaxation compared to classical optimizers.
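The RL framing of structure relaxation can be sketched as an environment whose state is the atomic coordinates, whose action is a displacement, and whose reward is the energy decrease. Everything below is a hypothetical toy: a Lennard-Jones cluster stands in for the paper's EAM potential-energy surfaces, and a greedy random search stands in for the learned DRL policy.

```python
import numpy as np

def lj_energy(pos, eps=1.0, sigma=1.0):
    """Total Lennard-Jones energy of a small cluster (toy stand-in
    for the EAM potentials used in the paper)."""
    e = 0.0
    n = len(pos)
    for i in range(n):
        for j in range(i + 1, n):
            r = np.linalg.norm(pos[i] - pos[j])
            e += 4 * eps * ((sigma / r) ** 12 - (sigma / r) ** 6)
    return e

class RelaxEnv:
    """Minimal RL-style environment: state = atomic coordinates,
    action = per-atom displacement, reward = resulting energy decrease."""
    def __init__(self, pos):
        self.pos = pos.copy()
    def step(self, action):
        new_pos = self.pos + action
        reward = lj_energy(self.pos) - lj_energy(new_pos)
        self.pos = new_pos
        return self.pos, reward

rng = np.random.default_rng(0)
env = RelaxEnv(rng.uniform(0.0, 2.0, size=(4, 3)))  # random 4-atom cluster
e0 = lj_energy(env.pos)

# Placeholder "policy": greedy random search over small displacements.
# A DRL agent would replace this with a learned displacement policy.
for _ in range(500):
    action = rng.normal(scale=0.02, size=env.pos.shape)
    if lj_energy(env.pos + action) < lj_energy(env.pos):
        env.step(action)

print(e0, lj_energy(env.pos))  # relaxed energy is lower than the start
```

The comparison against classical optimizers in the paper amounts to replacing the placeholder policy with, e.g., a gradient-based minimizer over the same energy surface and counting energy evaluations.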