Trajectory tracking for nonlinear robotic systems remains a fundamental yet challenging problem in control engineering, particularly when both precision and efficiency must be ensured. Conventional control methods are often effective for stabilization but may not directly optimize long-term performance. To address this limitation, this study develops an integrated framework that combines optimal control principles with reinforcement learning for a single-link robotic manipulator. The proposed scheme adopts an actor–critic structure, where the critic network approximates the value function associated with the Hamilton–Jacobi–Bellman equation, and the actor network generates near-optimal control signals in real time. This dual adaptation enables the controller to refine its policy online without explicit system knowledge. Stability of the closed-loop system is analyzed through Lyapunov theory, ensuring boundedness of the tracking error. Numerical simulations on the single-link manipulator demonstrate that the method achieves accurate trajectory following while maintaining low control effort. The results further show that the actor–critic learning mechanism accelerates convergence of the control policy compared with conventional optimization-based strategies. This work highlights the potential of reinforcement learning integrated with optimal control for robotic manipulators and provides a foundation for future extensions to more complex multi-degree-of-freedom systems. The proposed controller is further validated in a physics-based virtual Gazebo environment, demonstrating stable adaptation and real-time feasibility.
Dear Editor, In this letter, we focus on the algebraic relationship between the coefficient matrices and the solution of the stochastic algebraic Riccati equation. It is revealed that, if the coefficient matrices are in an algebra, then the solution (and also the control gain in many cases) is also in the same algebra. The main result is verified by a numerical simulation.
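The structure-preservation claim can be illustrated numerically. The sketch below is a simplified analogue: it uses the deterministic continuous-time algebraic Riccati equation rather than the stochastic equation treated in the letter, solves it by the standard Hamiltonian-matrix method, and checks that diagonal coefficient matrices (a simple matrix algebra) yield a diagonal solution. The particular matrices are hypothetical examples, not taken from the letter.

```python
import numpy as np

def solve_care(A, B, Q, R):
    # Continuous-time ARE  A'P + PA - P B R^{-1} B' P + Q = 0,
    # solved via the stable invariant subspace of the Hamiltonian matrix.
    n = A.shape[0]
    H = np.block([[A, -B @ np.linalg.inv(R) @ B.T],
                  [-Q, -A.T]])
    w, V = np.linalg.eig(H)
    stable = V[:, w.real < 0]              # eigenvectors of the n stable eigenvalues
    X1, X2 = stable[:n, :], stable[n:, :]
    return np.real(X2 @ np.linalg.inv(X1))

# Diagonal matrices form a simple commutative algebra: the solution stays diagonal.
A = np.diag([1.0, -2.0, 0.5])
B = np.diag([1.0, 1.0, 2.0])
Q = np.diag([3.0, 1.0, 2.0])
R = np.eye(3)
P = solve_care(A, B, Q, R)
off_diagonal = np.abs(P - np.diag(np.diag(P))).max()  # ~0: P lies in the same algebra
```

Because the problem decouples into scalar Riccati equations, each diagonal entry of P can also be checked against the scalar quadratic formula.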
Dear Editor, This letter proposes a reinforcement learning-based predictive learning algorithm for unknown continuous-time nonlinear systems with observation loss. First, we construct a temporal nonzero-sum game over predictive control input sequences, deriving multiple optimal predictive control input sequences from its solution.
In this paper, an adaptive neural-network (NN) output feedback optimal control problem is studied for a class of strict-feedback nonlinear systems with unknown internal dynamics, input saturation, and state constraints. Neural networks are used to approximate the unknown internal dynamics, and an adaptive NN state observer is developed to estimate the unmeasurable states. Under the framework of the backstepping design, by employing the actor-critic architecture and constructing the tan-type Barrier Lyapunov function (BLF), the virtual and actual optimal controllers are developed. To accomplish optimal control effectively, a simplified reinforcement learning (RL) algorithm is designed by deriving the updating laws from the negative gradient of a simple positive function, instead of employing existing optimal control methods. In addition, to ensure that all the signals in the closed-loop system are bounded and the output can follow the reference signal within a bounded error, all state variables are confined within their compact sets at all times. Finally, a simulation example is given to illustrate the effectiveness of the proposed control strategy.
Dear Editor, In this letter, a constrained networked predictive control strategy is proposed for the optimal control problem of complex nonlinear high-order fully actuated (HOFA) systems with noises. The method can effectively deal with nonlinearities, constraints, and noises in the system, optimize the performance metric, and provide an upper bound on the stable output of the system.
Based on an analysis of the thermal process of a CDQ (coke dry quenching)-Boiler system, a mathematical model for optimized operation and control of the CDQ-Boiler system was developed. It includes a mathematical model of the heat transfer process in the CDQ unit, a mathematical model of the heat transfer process in the boiler, and a combustion model for the circulating gas in the CDQ-Boiler system. The model was verified against field data; then a series of simulations under several typical operating conditions of the CDQ-Boiler was carried out, and in turn the online relation formulas between productivity and the optimal circulating gas, and between productivity and the optimal secondary air, were obtained. These relations have been successfully used in a CDQ-Boiler computer control system at Baosteel to realize online optimized guidance and control, and high efficiency of the CDQ-Boiler system has been achieved.
Quantum optimal control (QOC) relies on accurately modeling system dynamics and is often challenged by unknown or inaccessible interactions in real systems. Taking an unknown collective spin system as an example, this work introduces a machine-learning-based, data-driven scheme to overcome these challenges, with a trained neural network (NN) assuming the role of a surrogate model that captures the system's dynamics and subsequently enables QOC to be performed on the NN instead of on the real system. The trained NN surrogate proves effective for practical QOC tasks and is further demonstrated to be adaptable to different experimental conditions, remaining robust across varying system sizes and pulse durations.
The electromagnetic levitation system (EMLS) is the most important part of any magnetic levitation system. However, it is characterized by highly nonlinear dynamics and instability. Furthermore, uncertainties in the dynamics of an electromagnetic levitation system make controller design more difficult. Therefore, it is necessary to design a robust control law that ensures the system's stability in the presence of these uncertainties. In this framework, the dynamics of an electromagnetic levitation system are addressed in terms of matched and unmatched uncertainties. The robust control problem is translated into an optimal control problem in which the uncertainties of the electromagnetic levitation system are directly reflected in the cost function. The optimal control method is then used to solve the robust control problem: the solution to the optimal control problem for the electromagnetic levitation system is indeed a solution to the robust control problem under matched and unmatched uncertainties. Simulation and experimental results demonstrate the performance of the designed control scheme. Performance indices such as the integral absolute error (IAE), integral square error (ISE), integral time absolute error (ITAE), and integral time square error (ITSE) are compared for both types of uncertainty to showcase the robustness of the designed control scheme.
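The four indices used for comparison are standard error integrals and are straightforward to compute from a sampled error signal. A minimal sketch follows; the exponentially decaying error here is a hypothetical stand-in, not data from the EMLS experiments.

```python
import numpy as np

def tracking_indices(t, e):
    # Trapezoidal approximations of the four classical error integrals,
    # assuming a uniformly sampled tracking error e over time grid t.
    dt = t[1] - t[0]
    trap = lambda y: float(np.sum(y[:-1] + y[1:]) * dt / 2)
    ae = np.abs(e)
    return {
        "IAE":  trap(ae),        # integral of |e(t)|
        "ISE":  trap(e**2),      # integral of e(t)^2
        "ITAE": trap(t * ae),    # time-weighted absolute error
        "ITSE": trap(t * e**2),  # time-weighted squared error
    }

t = np.linspace(0.0, 5.0, 2001)
e = np.exp(-t)                   # hypothetical decaying tracking error
idx = tracking_indices(t, e)
```

The time-weighted variants (ITAE, ITSE) penalize errors that persist late in the response, which is why they are often preferred for judging settling behavior.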
Dear Editor, This letter proposes a convex optimization-based model predictive control (MPC) autonomous guidance method for the Mars ascent vehicle (MAV). We use the modified Chebyshev–Picard iteration (MCPI) to solve optimization sub-problems within the MPC framework, eliminating the dynamic constraints in solving the optimal control problem and enhancing the convergence performance of the algorithm. Moreover, this method can repeatedly perform trajectory optimization calculations at a high frequency, achieving timely correction of the optimal control command. Numerical simulations demonstrate that the method can satisfy the requirements of rapid computation and reliability for the MAV system when uncertainties and perturbations are considered.
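For readers unfamiliar with Picard-type solvers, the sketch below shows the plain Picard iteration that MCPI accelerates: the solution of an initial value problem is refined by repeatedly substituting the current iterate into the integral form of the ODE. This is a simplified uniform-grid version, not the Chebyshev-collocated scheme used in the letter, and the test ODE is a hypothetical example.

```python
import numpy as np

def picard_solve(f, x0, t, iters=30):
    # Plain Picard iteration: x_{k+1}(t) = x0 + integral_0^t f(tau, x_k(tau)) dtau,
    # with the integral evaluated by cumulative trapezoids on a uniform grid.
    # (MCPI accelerates exactly this recursion by collocating at Chebyshev nodes.)
    x = np.full_like(t, x0)
    dt = t[1] - t[0]
    for _ in range(iters):
        g = f(t, x)
        cumulative = np.concatenate(([0.0], np.cumsum((g[:-1] + g[1:]) * dt / 2)))
        x = x0 + cumulative
    return x

t = np.linspace(0.0, 1.0, 201)
x = picard_solve(lambda t, x: -x, 1.0, t)   # dx/dt = -x, x(0) = 1, solution exp(-t)
```

Because each sweep integrates the whole trajectory at once, the iteration parallelizes naturally over the time grid, which is one reason Picard-based schemes suit repeated high-frequency trajectory optimization.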
This article presents an adaptive optimal control method for a semi-active suspension system. A model of the suspension system is built in which the uncertain parameters and the exogenous disturbance are described. The adaptive optimal control law is the sum of an optimal control component and an adaptive control component. First, the optimal control law is designed for the model of the suspension system after ignoring the uncertain parameters and the exogenous disturbance caused by the road surface; this law expresses the desired dynamic characteristics of the suspension system. Next, the adaptive component is designed to compensate for the effects of the uncertain parameters and the road-induced exogenous disturbance; it includes adaptive parameter rules that estimate the uncertain parameters and the exogenous disturbance. When the exogenous disturbances are eliminated, the system responds with the designed optimal controller. By theoretically separating the dynamics of the semi-active suspension system, this solution allows two separate controllers to be designed easily, reduces the computational burden and the number of tools required, and thus allows for more convenient hardware implementation. The simulation results also show the effectiveness of the proposed solution in damping oscillations.
Inverse reinforcement learning optimal control operates under a learner-expert framework. The learner system can imitate the expert system's demonstrated behaviors and does not require a predefined cost function, so it can handle optimal control problems effectively. This paper proposes an inverse reinforcement learning optimal control method for Takagi-Sugeno (T-S) fuzzy systems. Based on the learner system, an expert system is constructed, where the learner system only knows the expert system's optimal control policy. To reconstruct the unknown cost function, we first develop a model-based inverse reinforcement learning algorithm for the case where the system dynamics are known. The developed model-based learning algorithm consists of two learning stages: an inner reinforcement learning loop and an outer inverse optimal control loop. The inner loop seeks to obtain the optimal control policy via the learner's cost function, and the outer loop aims to update the learner's state-penalty matrices using only the expert's optimal control policy. Then, to eliminate the requirement that the system dynamics be known, a data-driven integral learning algorithm is presented. It is proved that the two presented algorithms are convergent and that the developed inverse reinforcement learning optimal control scheme ensures that the controlled fuzzy learner systems are asymptotically stable. Finally, we apply the proposed fuzzy optimal control to the truck-trailer system, and the computer simulation results verify the effectiveness of the presented approach.
This paper investigates the problem of fuzzy adaptive finite-time inverse optimal control for active suspension systems (ASSs). Fuzzy logic systems (FLSs) are utilized to learn the unknown nonlinear dynamics, and an auxiliary system is established. Based on finite-time stability theory and inverse optimal theory, a fuzzy adaptive finite-time inverse optimal control method is proposed. It is proven that the formulated control approach guarantees the stability of the controlled systems while ensuring that errors converge to a small neighborhood of zero within finite time. Moreover, optimized control performance can be achieved. Finally, the simulation results demonstrate the effectiveness of the proposed fuzzy adaptive finite-time inverse optimal control scheme.
This paper presents an optimized shared control algorithm for human–AI interaction, implemented through a digital twin framework where the physical system and human operator act as the real agent while an AI-driven digital system functions as the virtual agent. In this digital twin architecture, the real agent acquires an optimal control strategy through observed actions, while the AI virtual agent mirrors the real agent to establish a digital replica system and corresponding control policy. Both the real and virtual optimal controllers are approximated using reinforcement learning (RL) techniques. Specifically, critic neural networks (NNs) are employed to learn the virtual and real optimal value functions, while actor NNs are trained to derive their respective optimal controllers. A novel shared mechanism is introduced to integrate both virtual and real value functions into a unified learning framework, yielding an optimal shared controller. This controller adaptively adjusts the confidence ratio between virtual and real agents, enhancing the system's efficiency and flexibility in handling complex control tasks. The stability of the closed-loop system is rigorously analyzed using the Lyapunov method. The effectiveness of the proposed AI–human interactive system is validated through two numerical examples: a representative nonlinear system and an unmanned aerial vehicle (UAV) control system.
This paper introduces an optimized backstepping control method for Flexible Airbreathing Hypersonic Vehicles (FAHVs). The approach incorporates nonlinear disturbance observation and reinforcement learning to address complex control challenges. The Minimal Learning Parameter (MLP) technique is applied to manage unknown nonlinear dynamics, significantly reducing the computational load usually associated with Neural Network (NN) weight updates. To improve the robustness of the control system, an MLP-based nonlinear disturbance observer is designed, which estimates lumped disturbances, including flexibility effects, model uncertainties, and external disruptions within the FAHVs. In parallel, the control strategy integrates reinforcement learning using an MLP-based actor-critic framework within the backstepping design to achieve both optimality and robustness. The actor performs control actions, while the critic assesses the optimal performance index function. To minimize this index function, an adaptive gradient descent method constructs both the actor and critic. Lyapunov analysis is employed to demonstrate that all signals in the closed-loop system are semiglobally uniformly ultimately bounded. Simulation results confirm that the proposed control strategy delivers high control performance, marked by improved accuracy and reduced energy consumption.
Optimal impulse control and impulse games provide cutting-edge frameworks for modeling systems where control actions occur at discrete time points and for optimizing objectives under discontinuous interventions. This review synthesizes the theoretical advancements, computational approaches, emerging challenges, and possible research directions in the field. First, we briefly review the fundamental theory of continuous-time optimal control, including Pontryagin's maximum principle (PMP) and the dynamic programming principle (DPP). Second, we present the foundational results in optimal impulse control, including necessary conditions and sufficient conditions. Third, we systematize impulse game methodologies, from Nash equilibrium existence theory to the connection between Nash equilibria and system stability. Fourth, we summarize the numerical algorithms, including intelligent computation approaches. Finally, we examine the new trends and challenges in theory and applications as well as computational considerations.
This paper highlights the utilization of parallel control and adaptive dynamic programming (ADP) for event-triggered robust parallel optimal consensus control (ETRPOC) of uncertain nonlinear continuous-time multiagent systems (MASs). First, the parallel control system, which consists of a virtual control variable and a specific auxiliary variable obtained from the coupled Hamiltonian, allows general systems to be transformed into affine systems. Notably, the introduction of the parallel control technique provides an unprecedented perspective on eliminating the negative effects of disturbance. Then, an event-triggered mechanism is adopted to save communication resources while ensuring the system's stability. The solution of the coupled Hamilton–Jacobi (HJ) equation is approximated using a critic neural network (NN), whose weights are updated in response to events. Furthermore, theoretical analysis reveals that the weight estimation error is uniformly ultimately bounded (UUB). Finally, numerical simulations demonstrate the effectiveness of the developed ETRPOC method.
Dear Editor, Attackers constantly attempt to intrude covertly into networked control systems (NCSs) by dynamically changing their false data injection attack (FDIA) strategies, while defenders try their best to resist attacks by designing defense strategies based on identifying the attack strategy, thereby maintaining stable operation of the NCSs. To solve this attack-defense game problem, this letter investigates optimal secure control of NCSs under FDIAs. First, to account for the alterations of energy caused by false data, a novel attack-defense game model is constructed, which considers the changes of energy caused by the actions of the defender and attacker in the forward and feedback channels.
Fiber quality measurement in spinning preparation is crucial for optimizing waste and meeting yarn quality specifications. The brand-new Uster AFIS 6, the next-generation laboratory instrument from Uster Technologies, uniquely tests man-made fiber properties in addition to cotton. It provides critical data to optimize fiber process control for cotton, man-made fibers, and blended yarns.
Compared to other energy sources, nuclear reactors offer several advantages as a spacecraft power source, including compact size, high power density, and long operating life. These qualities make nuclear power an ideal energy source for future deep space exploration. A whole-system model of the space nuclear reactor, consisting of the reactor neutron kinetics, reactivity control, reactor heat transfer, heat exchanger, and thermoelectric converter, was developed. In addition, an electrical power control system was designed based on the developed dynamic model. The GRS method was used to quantitatively calculate the uncertainty of the coupling parameters of the neutronics, thermal-hydraulics, and control system for the space reactor. The Spearman correlation coefficient was applied in the sensitivity analysis of system input parameters with respect to output parameters. The calculation results showed that the uncertainty of the output parameters caused by the coupling parameters had the most considerable variation, with a relative standard deviation < 2.01%. The effective delayed neutron fraction was most sensitive to electrical power. To obtain optimal control performance, the non-dominated sorting genetic algorithm was employed to optimize the controller parameters based on the uncertainty quantification calculation. Two typical transient simulations were conducted to test the adaptive ability of the optimized controller in the uncertain dynamic system, including a 100% full power (FP) to 90% FP step load reduction transient and a 5% FP/min linear variable load transient. The results showed that, considering the influence of system uncertainty, the optimized controller could improve the response speed and load-following accuracy of the electrical power control; its effectiveness and superiority have been verified.
In this paper, we consider the maximal positive definite solution of a nonlinear matrix equation. Using the idea of Algorithm 2.1 in Zhang (2013), a new inversion-free method with a stepsize parameter is proposed to obtain the maximal positive definite solution of the nonlinear matrix equation X + A^(*)X^(-α)A = Q for the case 0 < α ≤ 1. Based on this method, a new iterative algorithm is developed, and its convergence proof is given. Finally, two numerical examples are provided to show the effectiveness of the proposed method.
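For context, the basic fixed-point iteration for the α = 1 case of this equation can be sketched as follows. This is the classical scheme, which performs a linear solve with X_k at every step, in contrast to the inversion-free stepsize method the paper proposes; the matrices A and Q below are hypothetical examples.

```python
import numpy as np

def maximal_solution(A, Q, tol=1e-12, max_iter=500):
    # Classical fixed-point iteration for X + A* X^{-1} A = Q (the alpha = 1 case):
    # X_{k+1} = Q - A* X_k^{-1} A, started from X_0 = Q, which converges
    # monotonically to the maximal positive definite solution when one exists.
    X = Q.astype(float).copy()
    for _ in range(max_iter):
        # linear solve computes X_k^{-1} A without forming the explicit inverse
        X_new = Q - A.conj().T @ np.linalg.solve(X, A)
        if np.linalg.norm(X_new - X) < tol:
            return X_new
        X = X_new
    return X

A = np.array([[0.2, 0.1],
              [0.0, 0.2]])
Q = np.eye(2)
X = maximal_solution(A, Q)
residual = np.linalg.norm(X + A.conj().T @ np.linalg.solve(X, A) - Q)
```

Starting from X_0 = Q makes the iterates a decreasing sequence of positive definite matrices, which is why the limit is the maximal solution rather than an arbitrary one.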
Funding: Supported in part by the National Science and Technology Council under Grant NSTC 114-2221-E-027-104.
Funding: Supported in part by the National Natural Science Foundation of China (62433014, 62373287, 62573324, 62333005, 62273255); in part by the International Exchange Program for Graduate Students of Tongji University (4360143306); in part by the Fundamental Research Funds for the Central Universities (22120230311); by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany's Excellence Strategy (EXC 2075-390740016, 468094890); and by the Stuttgart Center for Simulation Science (SimTech) and the International Max Planck Research School for Intelligent Systems (IMPRS-IS), which supports Y. Xie.
Funding: This work was supported by the National Natural Science Foundation of China (61822307, 61773188).
Funding: Supported in part by the National Natural Science Foundation of China (62173255, 62188101) and the Shenzhen Key Laboratory of Control Theory and Intelligent Systems (ZDSYS20220330161800001).
Funding: Supported by the Innovation Program for Quantum Science and Technology (Grant No. 2021ZD0302100) and the National Natural Science Foundation of China (Grant Nos. 12361131576, 92265205, and 92476205).
Funding: Supported by the National Defense Basic Scientific Research Program (JCKY2021603B030), the National Natural Science Foundation of China (62273118, 12150008), and the Natural Science Foundation of Heilongjiang Province (LH2022F023).
Abstract: Dear Editor, This letter proposes a convex-optimization-based model predictive control (MPC) autonomous guidance method for the Mars ascent vehicle (MAV). We use the modified Chebyshev-Picard iteration (MCPI) to solve the optimization sub-problems within the MPC framework, which eliminates the dynamic constraints in solving the optimal control problem and improves the convergence of the algorithm. Moreover, the method can repeatedly perform trajectory optimization at high frequency, enabling timely correction of the optimal control command. Numerical simulations demonstrate that the method satisfies the rapid-computation and reliability requirements of the MAV system under uncertainties and perturbations.
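The Picard-iteration core of MCPI, solving the integral form of the dynamics by successive substitution, can be sketched in simplified form without the Chebyshev polynomial basis that gives the method its name. The scalar test problem x' = -x is a hypothetical stand-in for the MAV dynamics.

```python
import numpy as np

# Picard iteration x_{k+1}(t) = x0 + integral_0^t f(x_k(s)) ds on a fixed
# grid (trapezoidal quadrature replaces the Chebyshev machinery of MCPI).
# Test problem: x' = -x, x(0) = 1, exact solution exp(-t).
f = lambda x: -x
t = np.linspace(0.0, 1.0, 201)
x = np.ones_like(t)                  # initial guess: constant trajectory
for _ in range(30):
    g = f(x)
    # cumulative trapezoidal integral of f(x_k) from 0 to each grid point
    integral = np.concatenate(([0.0],
        np.cumsum(0.5 * (g[1:] + g[:-1]) * np.diff(t))))
    x = 1.0 + integral               # substitute back into the integral form
err = np.max(np.abs(x - np.exp(-t)))
print(err)
```

Each sweep updates the whole trajectory at once, which is what makes the approach amenable to the repeated high-frequency re-optimization the letter exploits.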
Funding: Supported in part by the Thai Nguyen University of Technology, Vietnam.
Abstract: This article presents an adaptive optimal control method for a semi-active suspension system. A model of the suspension system is built that includes uncertain parameters and the exogenous disturbance from the road surface. The adaptive optimal control law is the sum of an optimal control component and an adaptive control component. First, the optimal control law is designed for the suspension model with the uncertain parameters and road-induced disturbance ignored; it expresses the desired dynamic characteristics of the suspension system. Next, the adaptive component is designed to compensate for the effects of the uncertain parameters and the road-induced disturbance, with adaptive parameter-update rules that estimate both. When the exogenous disturbance vanishes, the system responds as under the designed optimal controller. By theoretically separating the dynamics of the semi-active suspension system, this solution allows the two controllers to be designed independently, reduces the computational burden and the number of tools required, and thus eases hardware implementation. Simulation results also show the oscillation-damping effectiveness of the proposed solution.
Funding: The National Natural Science Foundation of China (62173172).
Abstract: Inverse reinforcement learning optimal control operates in a learner-expert framework: the learner system imitates the expert system's demonstrated behavior and does not require a predefined cost function, so it can handle optimal control problems effectively. This paper proposes an inverse reinforcement learning optimal control method for Takagi-Sugeno (T-S) fuzzy systems. An expert system is constructed on the basis of the learner system, with the learner knowing only the expert's optimal control policy. To reconstruct the unknown cost function, we first develop a model-based inverse reinforcement learning algorithm for the case where the system dynamics are known. This algorithm consists of two learning stages: an inner reinforcement learning loop and an outer inverse optimal control loop. The inner loop seeks the optimal control policy for the learner's current cost function, while the outer loop updates the learner's state-penalty matrices using only the expert's optimal control policy. Then, to remove the requirement that the system dynamics be known, a data-driven integral learning algorithm is presented. Both algorithms are proved convergent, and the developed inverse reinforcement learning optimal control scheme is shown to render the controlled fuzzy learner systems asymptotically stable. Finally, we apply the proposed fuzzy optimal control to a truck-trailer system, and computer simulation results verify the effectiveness of the presented approach.
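The two-loop structure (inner loop: optimal policy for the current cost estimate; outer loop: state-penalty update from the expert's policy) can be sketched on a scalar linear-quadratic stand-in rather than a T-S fuzzy system. The bisection outer loop is an illustrative substitute for the paper's update rule, and all numbers are hypothetical.

```python
import numpy as np

# Scalar LQ stand-in: plant x_{k+1} = a x_k + b u_k, cost q x^2 + r u^2.
# The learner sees only the expert's gain and recovers the state penalty q.
a, b, r = 0.9, 0.5, 1.0

def riccati_gain(q, iters=500):
    """Inner loop: value iteration on the scalar discrete Riccati equation,
    returning the optimal feedback gain for state penalty q."""
    p = q
    for _ in range(iters):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    return a * b * p / (r + b * b * p)

q_expert = 2.0
k_expert = riccati_gain(q_expert)     # only this policy is demonstrated

# Outer loop: adjust q until the learner's gain matches the expert's
# (bisection works here because the gain is monotone in q for this plant).
lo, hi = 1e-6, 100.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if riccati_gain(mid) < k_expert:
        lo = mid
    else:
        hi = mid
q_learned = 0.5 * (lo + hi)
print(q_learned)
```

The separation mirrors the abstract: the inner loop never touches the expert, and the outer loop never solves a control problem directly.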
Funding: Supported by the National Natural Science Foundation of China under Grant 62173172.
Abstract: This paper investigates fuzzy adaptive finite-time inverse optimal control for active suspension systems (ASSs). Fuzzy logic systems (FLSs) are utilized to learn the unknown nonlinear dynamics, and an auxiliary system is established. Based on finite-time stability theory and inverse optimal theory, a fuzzy adaptive finite-time inverse optimal control method is proposed. It is proven that the formulated approach guarantees the stability of the controlled systems while ensuring that the errors converge to a small neighborhood of zero within finite time; moreover, optimized control performance is achieved. Finally, simulation results demonstrate the effectiveness of the proposed scheme.
Funding: Supported by the China Postdoctoral Science Foundation (Project ID: 2024M762602), the National Natural Science Foundation of China under Grant No. 62306232, and the Natural Science Basic Research Program of Shaanxi Province under Grant No. 2023-JC-QN-0662.
Abstract: This paper presents an optimized shared control algorithm for human-AI interaction, implemented through a digital twin framework where the physical system and human operator act as the real agent while an AI-driven digital system functions as the virtual agent. In this digital twin architecture, the real agent acquires an optimal control strategy through observed actions, while the AI virtual agent mirrors the real agent to establish a digital replica system and corresponding control policy. Both the real and virtual optimal controllers are approximated using reinforcement learning (RL) techniques. Specifically, critic neural networks (NNs) are employed to learn the virtual and real optimal value functions, while actor NNs are trained to derive their respective optimal controllers. A novel shared mechanism is introduced to integrate both virtual and real value functions into a unified learning framework, yielding an optimal shared controller. This controller adaptively adjusts the confidence ratio between virtual and real agents, enhancing the system's efficiency and flexibility in handling complex control tasks. The stability of the closed-loop system is rigorously analyzed using the Lyapunov method. The effectiveness of the proposed AI-human interactive system is validated through two numerical examples: a representative nonlinear system and an unmanned aerial vehicle (UAV) control system.
Funding: Co-supported by the National Natural Science Foundation of China (Nos. 62303380, 62176214, 62101590, 62003268).
Abstract: This paper introduces an optimized backstepping control method for Flexible Air-breathing Hypersonic Vehicles (FAHVs). The approach incorporates nonlinear disturbance observation and reinforcement learning to address complex control challenges. The Minimal Learning Parameter (MLP) technique is applied to handle unknown nonlinear dynamics, significantly reducing the computational load usually associated with Neural Network (NN) weight updates. To improve the robustness of the control system, an MLP-based nonlinear disturbance observer is designed to estimate the lumped disturbances within the FAHVs, including flexibility effects, model uncertainties, and external disruptions. In parallel, the control strategy integrates reinforcement learning through an MLP-based actor-critic framework within the backstepping design, achieving both optimality and robustness. The actor performs the control actions, while the critic assesses the optimal performance index function; an adaptive gradient-descent method constructs both so as to minimize this index. Lyapunov analysis demonstrates that all signals in the closed-loop system are semiglobally uniformly ultimately bounded. Simulation results confirm that the proposed strategy delivers high control performance, with improved accuracy and reduced energy consumption.
Abstract: Optimal impulse control and impulse games provide cutting-edge frameworks for modeling systems in which control actions occur at discrete time points and for optimizing objectives under discontinuous interventions. This review synthesizes the theoretical advances, computational approaches, emerging challenges, and possible research directions in the field. First, we briefly review the fundamental theory of continuous-time optimal control, including Pontryagin's maximum principle (PMP) and the dynamic programming principle (DPP). Second, we present the foundational results in optimal impulse control, including necessary and sufficient conditions. Third, we systematize impulse game methodologies, from Nash equilibrium existence theory to the connection between Nash equilibria and system stability. Fourth, we summarize numerical algorithms, including intelligent computation approaches. Finally, we examine new trends and challenges in theory, applications, and computation.
基金supported in part by the National Key Research and Development Program of China(2021YFE0206100)the National Natural Science Foundation of China(62425310,62073321)+2 种基金the National Defense Basic Scientific Research Program(JCKY2019203C029,JCKY2020130C025)the Science and Technology Development FundMacao SAR(FDCT-22-009-MISE,0060/2021/A2,0015/2020/AMJ)
Abstract: This paper applies parallel control and adaptive dynamic programming (ADP) to event-triggered robust parallel optimal consensus control (ETRPOC) of uncertain nonlinear continuous-time multiagent systems (MASs). First, the parallel control system, which consists of a virtual control variable and a specific auxiliary variable obtained from the coupled Hamiltonian, allows general systems to be transformed into affine systems. Notably, the introduction of the parallel control technique offers a fresh perspective on eliminating the negative effects of disturbances. Then, an event-triggered mechanism is adopted to save communication resources while ensuring the system's stability. The solution of the coupled Hamilton-Jacobi (HJ) equation is approximated by a critic neural network (NN) whose weights are updated in response to events. Furthermore, theoretical analysis reveals that the weight estimation error is uniformly ultimately bounded (UUB). Finally, numerical simulations demonstrate the effectiveness of the developed ETRPOC method.
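The communication-saving idea behind the event-triggered mechanism can be sketched on a scalar plant: the controller receives a fresh state sample only when the gap since the last transmission crosses a threshold. This is a generic illustration, not the ETRPOC scheme; the plant, gain, and threshold are hypothetical.

```python
# Event-triggered state feedback on the unstable scalar plant x' = x + u,
# Euler-discretized. Control is recomputed only at events, so the number of
# transmissions is far below the number of simulation steps.
dt, T, K = 0.001, 5.0, 2.0       # step size, horizon, feedback gain
sigma = 0.05                     # trigger threshold on |x - x_hat|
x, x_hat = 1.0, 1.0              # true state and last transmitted sample
events = 0
steps = int(T / dt)
for _ in range(steps):
    if abs(x - x_hat) > sigma:   # event-trigger condition
        x_hat = x                # transmit: update the controller's copy
        events += 1
    u = -K * x_hat               # control uses the held (zero-order-hold) sample
    x += dt * (x + u)            # plant update: x' = x + u
print(events, steps, abs(x))
```

The trade-off is visible in the trigger threshold: a larger sigma means fewer events but a larger ultimate bound on the state, matching the UUB-type guarantees quoted in the abstract.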
Funding: Supported in part by the National Science Foundation of China (62373240, 62273224, U24A20259).
Abstract: Dear Editor, Attackers covertly intrude into networked control systems (NCSs) by dynamically changing their false data injection attack (FDIA) strategies, while defenders do their best to resist the attacks by identifying the attack strategy and designing a defense strategy accordingly, thereby maintaining stable operation of the NCSs. To solve this attack-defense game problem, this letter investigates optimal secure control of NCSs under FDIAs. First, to capture the energy alterations caused by false data, a novel attack-defense game model is constructed that accounts for the energy changes caused by the actions of the defender and the attacker in the forward and feedback channels.
Abstract: Fiber quality measurement in spinning preparation is crucial for optimizing waste and meeting yarn quality specifications. The brand-new Uster AFIS 6, the next-generation laboratory instrument from Uster Technologies, uniquely tests man-made fiber properties in addition to cotton. It provides critical data to optimize fiber process control for cotton, man-made fibers, and blended yarns.
Funding: Supported by the National Natural Science Foundation of China (12305185), the Natural Science Foundation of Hunan Province, China (No. 2023JJ50122), the International Cooperative Research Project of the Ministry of Education, China (No. HZKY20220355), and the Scientific Research Foundation of the Education Department of Hunan Province, China (No. 22A0307).
Abstract: Compared with other energy sources, nuclear reactors offer several advantages as a spacecraft power source, including compact size, high power density, and long operating life, making nuclear power an ideal energy source for future deep-space exploration. A whole-system model of a space nuclear reactor was developed, comprising the reactor neutron kinetics, reactivity control, reactor heat transfer, heat exchanger, and thermoelectric converter, and an electrical power control system was designed based on this dynamic model. The GRS method was used to quantitatively calculate the uncertainty of the coupling parameters of the neutronics, thermal-hydraulics, and control system of the space reactor, and the Spearman correlation coefficient was applied in the sensitivity analysis of system input parameters with respect to output parameters. The calculations showed that the uncertainty of the output parameters caused by the coupling parameters exhibited the largest variation, with a relative standard deviation below 2.01%, and that the effective delayed neutron fraction was the parameter to which electrical power was most sensitive. To obtain optimal control performance, the non-dominated sorting genetic algorithm was employed to optimize the controller parameters based on the uncertainty quantification results. Two typical transients were simulated to test the adaptive ability of the optimized controller in the uncertain dynamic system: a step load reduction from 100% full power (FP) to 90% FP, and a 5% FP/min linear load variation. The results showed that, with system uncertainty taken into account, the optimized controller improved the response speed and load-following accuracy of the electrical power control, verifying its effectiveness and superiority.
Funding: Supported in part by the Natural Science Foundation of Guangxi (2023GXNSFAA026246), in part by the Central Government's Guide to Local Science and Technology Development Fund (GuikeZY23055044), and in part by the National Natural Science Foundation of China (62363003).
Abstract: In this paper, we consider the maximal positive definite solution of the nonlinear matrix equation X + A*X^(-α)A = Q for the case 0 < α ≤ 1. Using the idea of Algorithm 2.1 in ZHANG (2013), a new inversion-free method with a stepsize parameter is proposed to obtain this maximal positive definite solution. Based on this method, a new iterative algorithm is developed and its convergence proof is given. Finally, two numerical examples are provided to show the effectiveness of the proposed method.
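For orientation, the basic fixed-point iteration for X + A*X^(-α)A = Q can be sketched as follows. This is the textbook scheme, not the paper's inversion-free stepsize variant (it computes a matrix fractional power, i.e., an implicit inversion); the matrices A and Q here are hypothetical.

```python
import numpy as np

def frac_power(S, p):
    """S^p for a symmetric positive definite matrix S, via eigendecomposition."""
    w, V = np.linalg.eigh(S)
    return (V * w**p) @ V.T          # V @ diag(w^p) @ V^T

# Fixed-point iteration X_{k+1} = Q - A^T X_k^{-alpha} A (real A, so A* = A^T),
# started from X_0 = Q, which bounds the maximal solution from above.
alpha = 0.5
A = np.array([[0.2, 0.1],
              [0.0, 0.3]])
Q = 2.0 * np.eye(2)

X = Q.copy()
for _ in range(200):
    X = Q - A.T @ frac_power(X, -alpha) @ A

residual = np.linalg.norm(X + A.T @ frac_power(X, -alpha) @ A - Q)
print(residual)
```

Starting from Q keeps every iterate symmetric, and for sufficiently small A the map is a contraction, so the iterates settle at the maximal positive definite solution; the paper's contribution is avoiding the inversion hidden in `frac_power` at each step.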