Funding: supported in part by the National Natural Science Foundation of China (62303117, T2325018, 92367109), the Xiangjiang Scholar Program (XJ2023018), the Key Laboratory of System Control and Information Processing (Scip20240108), the Aeronautical Science Foundation of China (20230001144001), and the Fujian Provincial Natural Science Foundation (2024J01130098).
Abstract: This work presents an adaptive tracking guidance method for robotic fish. The scheme enables the robots to suppress external disturbances and eliminate motion jitter. An adaptive integral surge line-of-sight guidance rule is designed to eliminate dynamic disturbance and sideslip effects. Finite-time yaw and surge speed observers are designed to estimate the disturbance terms in the model; their estimates compensate the system's control input and improve the robots' tracking accuracy. Moreover, this work develops a terminal sliding mode controller and a third-order differentiator to determine the rotational torque and reduce the robots' motion jitter. Lyapunov theory then proves the uniform ultimate boundedness of the proposed method. Simulation and physical experiments confirm that the method improves the tracking error convergence speed and stability of robotic fish.
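The abstract names an adaptive integral line-of-sight (LOS) guidance rule but does not give its form. As a rough illustration of the underlying idea, the sketch below implements a textbook-style integral LOS heading law for straight-line path following, where the integral state absorbs the sideslip that constant disturbances induce; the lookahead distance `Delta`, integral gain `sigma`, and all variable names are assumptions rather than the paper's notation.

```python
import math

def ilos_heading(psi_p, y_e, y_int, Delta=1.0, sigma=0.1):
    """Textbook-style integral LOS heading command (a sketch, not the
    paper's exact rule). psi_p: path tangent angle [rad]; y_e:
    cross-track error [m]; y_int: integral state that compensates the
    steady-state sideslip caused by constant disturbances."""
    return psi_p - math.atan2(y_e + sigma * y_int, Delta)

def ilos_integral_step(y_e, y_int, Delta=1.0, sigma=0.1, dt=0.01):
    """Common anti-windup update: the integral grows slowly while the
    cross-track error is still large, which limits overshoot."""
    y_int_dot = Delta * y_e / ((y_e + sigma * y_int) ** 2 + Delta ** 2)
    return y_int + dt * y_int_dot
```

A guidance loop would call `ilos_heading` each control cycle, pass the result to the heading controller (the terminal sliding mode controller, in the paper's case), and update `y_int` with `ilos_integral_step`.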
Funding: the National Natural Science Foundation of China (61603094).
Abstract: Behavior-based autonomous systems rely on human intelligence to resolve multi-mission conflicts by designing mission priority rules and nonlinear controllers. In this work, a novel two-layer reinforcement learning behavioral control (RLBC) method is proposed to reduce such dependence through trial-and-error learning. Specifically, in the upper layer, a reinforcement learning mission supervisor (RLMS) is designed to learn the optimal mission priority. Compared with existing mission supervisors, the RLMS improves the dynamic performance of mission priority adjustment by maximizing cumulative rewards and reduces hardware storage demand when neural networks are used. In the lower layer, a reinforcement learning controller (RLC) is designed to learn the optimal control policy. Compared with existing behavioral controllers, the RLC reduces the control cost of mission priority adjustment by balancing control performance against consumption. All error signals are proved to be semi-globally uniformly ultimately bounded (SGUUB). Simulation results show that the number of mission priority adjustments and the control cost are significantly reduced compared with some existing mission supervisors and behavioral controllers, respectively.
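The abstract specifies only that the RLMS learns mission priorities by maximizing cumulative reward. As a minimal sketch of that upper layer, assuming a discretized conflict state and an externally supplied reward, a tabular Q-learning supervisor over priority orderings could look like the following; the mission names, learning gains, and state encoding are illustrative assumptions (the paper's RLMS uses neural networks rather than a table).

```python
import random
from itertools import permutations

class MissionSupervisor:
    """Toy tabular stand-in for the paper's RLMS: learns which mission
    priority ordering to apply in each conflict state via Q-learning."""

    def __init__(self, missions=("collision_avoidance", "tracking", "formation"),
                 alpha=0.1, gamma=0.95, eps=0.1):
        self.actions = list(permutations(missions))  # every priority ordering
        self.q = {}                                  # (state, ordering) -> value
        self.alpha, self.gamma, self.eps = alpha, gamma, eps

    def choose(self, state):
        """Epsilon-greedy choice of a priority ordering for this state."""
        if random.random() < self.eps:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q.get((state, a), 0.0))

    def update(self, state, action, reward, next_state):
        """One-step Q-learning update toward the cumulative reward."""
        best_next = max(self.q.get((next_state, a), 0.0) for a in self.actions)
        old = self.q.get((state, action), 0.0)
        self.q[(state, action)] = old + self.alpha * (reward + self.gamma * best_next - old)
```

A reward that penalizes both priority switches and control effort is one way the abstract's two criteria, the number of mission priority adjustments and the control cost, could enter the learning problem.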
Funding: supported by the National Natural Science Foundation of China (No. 61603094).
Abstract: In this study, a novel reinforcement learning task supervisor (RLTS) with memory in a behavioral control framework is proposed for human–multi-robot coordination systems (HMRCSs). Existing HMRCSs suffer from high decision-making time costs and large task tracking errors caused by repeated human intervention, which restricts the autonomy of multi-robot systems (MRSs). Moreover, existing task supervisors in the null-space-based behavioral control (NSBC) framework require many manually formulated priority-switching rules, which makes it difficult to realize an optimal behavioral priority adjustment strategy in the case of multiple robots and multiple tasks. The proposed RLTS with memory integrates a deep Q-network (DQN) with a long short-term memory (LSTM) knowledge base within the NSBC framework, to achieve an optimal behavioral priority adjustment strategy in the presence of task conflicts and to reduce the frequency of human intervention. Specifically, the RLTS memorizes the human intervention history whenever the robot system is not confident in an emergency, and reloads that history when it encounters a situation that humans have previously resolved. Simulation results demonstrate the effectiveness of the proposed RLTS. Finally, an experiment with a group of mobile robots subject to external noise and disturbances validates the effectiveness of the proposed RLTS with memory in uncertain real-world environments.
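The abstract leaves the knowledge-base mechanics open beyond "memorize and reload". The sketch below is a deliberately simplified stand-in, assuming low DQN confidence is detected from the gap between the two best Q-values and that stored states are matched by nearest-neighbour distance; the paper uses an LSTM knowledge base, so the thresholds, matching rule, and all names here are assumptions.

```python
import numpy as np

class InterventionMemory:
    """Simplified stand-in for the RLTS knowledge base: stores the
    priority a human chose in a given state and recalls it for nearby
    states (the paper uses an LSTM, not nearest-neighbour lookup)."""

    def __init__(self, confidence_gap=0.2, match_radius=0.5):
        self.states, self.priorities = [], []
        self.gap = confidence_gap
        self.radius = match_radius

    def confident(self, q_values):
        """Treat a small gap between the best two Q-values as low confidence."""
        top2 = np.sort(np.asarray(q_values, dtype=float))[-2:]
        return (top2[1] - top2[0]) > self.gap

    def recall(self, state):
        """Return the memorized human priority for a nearby state, if any."""
        if not self.states:
            return None
        state = np.asarray(state, dtype=float)
        dists = [np.linalg.norm(state - s) for s in self.states]
        i = int(np.argmin(dists))
        return self.priorities[i] if dists[i] < self.radius else None

    def memorize(self, state, human_priority):
        """Store a human intervention so it need not be requested again."""
        self.states.append(np.asarray(state, dtype=float))
        self.priorities.append(human_priority)
```

In use, the supervisor would first check `confident(q_values)`; when confidence is low it tries `recall(state)` and falls back to asking the human (then calling `memorize`) only if no stored intervention matches, which is the abstract's route to fewer interventions.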
Funding: Project supported by the National Natural Science Foundation of China (No. 92367109).
Abstract: Reinforcement learning behavioral control (RLBC) is limited to an individual agent without any swarm mission, because it models behavior priority learning as a Markov decision process. In this paper, a novel multi-agent reinforcement learning behavioral control (MARLBC) method is proposed to overcome this limitation through joint learning. Specifically, a multi-agent reinforcement learning mission supervisor (MARLMS) is designed for a group of nonlinear second-order systems to assign behavior priorities at the decision layer. By modeling behavior priority switching as a cooperative Markov game, the MARLMS learns an optimal joint behavior priority, reducing dependence on human intelligence and high-performance computing hardware. At the control layer, a group of second-order reinforcement learning controllers are designed to learn the optimal control policies to track position and velocity signals simultaneously. In particular, input saturation constraints are strictly enforced by designing a group of adaptive compensators. Numerical simulation results show that the proposed MARLBC achieves a lower switching frequency and control cost than finite-time and fixed-time behavioral control and RLBC methods.
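Both this and the preceding study build on the NSBC framework, in which whichever priority ordering the supervisor selects is realized by projecting each lower-priority task velocity into the null space of the higher-priority task Jacobians. The following sketch shows that standard composition (generic NSBC, not either paper's specific controller); the task Jacobians and velocities in the example are made-up placeholders.

```python
import numpy as np

def nsbc_merge(jacobians, task_velocities):
    """Standard null-space-based behavioral control composition.
    Tasks are ordered from highest to lowest priority; each task's
    velocity contribution is filtered through the accumulated
    null-space projector so it cannot disturb higher-priority tasks."""
    n = jacobians[0].shape[1]     # dimension of the robot velocity space
    v = np.zeros(n)
    N = np.eye(n)                 # running null-space projector
    for J, sigma_dot in zip(jacobians, task_velocities):
        J_pinv = np.linalg.pinv(J)
        v += N @ (J_pinv @ sigma_dot)        # add this task's filtered velocity
        N = N @ (np.eye(n) - J_pinv @ J)     # shrink the null space for the rest
    return v

# Illustrative use: a 3-DOF velocity space where a 2-D "avoid" task takes
# priority over a 1-D "track" task (both Jacobians are invented examples).
J_avoid = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
J_track = np.array([[0.0, 0.0, 1.0]])
v = nsbc_merge([J_avoid, J_track], [np.array([0.2, -0.1]), np.array([0.5])])
```

Swapping the order of the two tasks changes `v`, which is exactly why the supervisors in these papers learn the priority ordering instead of fixing it by hand.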