This paper studies motor joint control of a 4-degree-of-freedom (DoF) robotic manipulator using a learning-based Adaptive Dynamic Programming (ADP) approach. The manipulator's dynamics are modelled as an open-loop four-link serial kinematic chain. Decentralised optimal controllers are designed for each link using the ADP approach, based on a set of cost matrices and data collected from exploration trajectories. The proposed control strategy employs an off-line, off-policy iterative approach to derive four optimal control policies, one for each joint, under exploration strategies. The objective of each controller is to regulate the position of its joint. Simulation and experimental results show that four independent optimal controllers are obtained, each under similar exploration strategies, and that the proposed ADP approach successfully yields optimal linear control policies despite the system's nonlinearities. Experiments conducted on the Quanser QArm robotic platform demonstrate the effectiveness of the proposed ADP controllers in handling significant dynamic nonlinearities, such as actuation limitations, output saturation, and filter delays.
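To make the data-driven step concrete, the following is a minimal sketch of off-policy policy iteration for a single joint. It assumes a discrete-time double-integrator joint model and illustrative cost matrices (none of these values, nor the gain initialisation, come from the paper): the quadratic Q-function of the current gain is fitted by least squares over exploration data, and the gain is then improved, mirroring the off-line, off-policy iteration described above.

```python
# Minimal sketch of off-policy ADP policy iteration for one joint.
# Assumptions (not from the paper): a discrete-time double-integrator joint
# model x_{k+1} = A x_k + B u_k and illustrative cost matrices Q, R.
import numpy as np

dt = 0.01
A = np.array([[1.0, dt], [0.0, 1.0]])      # joint position/velocity dynamics (assumed)
B = np.array([[0.0], [dt]])
Q = np.diag([100.0, 1.0])                  # assumed state cost
R = np.array([[0.1]])                      # assumed control cost
rng = np.random.default_rng(0)
K = np.array([[10.0, 5.0]])                # assumed initial stabilising gain

def collect(K, steps=500):
    """Exploration trajectory under a noisy behaviour policy."""
    x, data = np.array([1.0, 0.0]), []
    for _ in range(steps):
        u = -K @ x + 0.5 * rng.standard_normal(1)
        x_next = A @ x + B @ u
        data.append((x, u, x_next))
        x = x_next
    return data

def quad_features(z):
    """Features of z^T H z for symmetric H (half-vectorisation of H)."""
    n = len(z)
    return np.array([z[i] * z[j] * (1.0 if i == j else 2.0)
                     for i in range(n) for j in range(i, n)])

for _ in range(10):                        # policy iteration
    Phi, y = [], []
    for x, u, x_next in collect(K):
        z = np.concatenate([x, u])
        z_next = np.concatenate([x_next, -K @ x_next])   # evaluate the target policy
        Phi.append(quad_features(z) - quad_features(z_next))
        y.append(x @ Q @ x + u @ R @ u)
    theta, *_ = np.linalg.lstsq(np.array(Phi), np.array(y), rcond=None)

    H, idx = np.zeros((3, 3)), 0           # rebuild symmetric H from theta
    for i in range(3):
        for j in range(i, 3):
            H[i, j] = H[j, i] = theta[idx]
            idx += 1
    K = np.linalg.solve(H[2:, 2:], H[:2, 2:].T)   # improved gain: u = -K x

print("learned joint gain:", K)
```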
In this paper, a distributed adaptive dynamic programming (ADP) framework based on value iteration is proposed for multi-player differential games. In the game setting, players have no access to the information of others' system parameters or control laws. Each player adopts an on-policy value iteration algorithm as the basic learning framework. To deal with the incomplete information structure, players collect a period of system trajectory data to compensate for the lack of information. The policy updating step is implemented by a nonlinear optimization problem aiming to search for the proximal admissible policy. Theoretical analysis shows that by adopting proximal policy searching rules, the approximated policies can converge to a neighborhood of equilibrium policies. The efficacy of our method is illustrated by three examples, which also demonstrate that the proposed method can accelerate the learning process compared with the centralized learning framework.
Learning-based methods have become mainstream for solving residential energy scheduling problems. In order to improve the learning efficiency of existing methods and increase the utilization of renewable energy, we propose the Dyna action-dependent heuristic dynamic programming (Dyna-ADHDP) method, which incorporates the ideas of learning and planning from the Dyna framework into action-dependent heuristic dynamic programming. This method defines a continuous action space for precise control of an energy storage system and allows online optimization of algorithm performance during the real-time operation of the residential energy model. Meanwhile, a target network is introduced during the training process to make training smoother and more efficient. We conducted experimental comparisons with the benchmark method using simulated and real data to verify its applicability and performance. The results confirm the method's excellent performance and generalization capabilities, as well as its strength in increasing renewable energy utilization and extending equipment life.
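As a hedged illustration of the learning-plus-planning split (tabular Dyna-Q rather than the continuous-action Dyna-ADHDP proposed in the paper), the toy below schedules a discretised battery against an invented price profile; every quantity in it is a placeholder.

```python
# Tabular Dyna-Q toy for a simplified storage-scheduling task.
# Assumptions (illustrative only): discretised battery levels, three actions
# (discharge, idle, charge), a fixed daily price profile, and a made-up reward.
import numpy as np

rng = np.random.default_rng(1)
levels, hours = 11, 24                      # battery states 0..10, one step per hour
actions = (-1, 0, +1)                       # discharge / idle / charge
price = 1.0 + 0.8 * np.sin(np.arange(hours) / hours * 2 * np.pi)  # assumed tariff

Q = np.zeros((hours, levels, len(actions)))
model = {}                                  # (state, action) -> (reward, next state)
alpha, gamma, eps, planning_steps = 0.1, 0.95, 0.1, 20

def step(h, soc, a_idx):
    """Environment: pay the current price when charging, earn it when discharging."""
    delta = actions[a_idx]
    soc_next = int(np.clip(soc + delta, 0, levels - 1))
    reward = -price[h] * max(delta, 0) + price[h] * max(-delta, 0)
    return reward, ((h + 1) % hours, soc_next)

for episode in range(300):
    s = (0, 5)                              # midnight, half-charged
    for _ in range(hours):
        h, soc = s
        a = rng.integers(len(actions)) if rng.random() < eps else int(np.argmax(Q[h, soc]))
        r, s_next = step(h, soc, a)
        hn, socn = s_next
        # Direct RL update from the real transition (learning).
        Q[h, soc, a] += alpha * (r + gamma * Q[hn, socn].max() - Q[h, soc, a])
        model[(s, a)] = (r, s_next)         # learn the model
        # Planning: replay simulated transitions drawn from the learned model.
        keys = list(model.keys())
        for _ in range(planning_steps):
            (ph, psoc), pa = keys[rng.integers(len(keys))]
            pr, (pnh, pnsoc) = model[((ph, psoc), pa)]
            Q[ph, psoc, pa] += alpha * (pr + gamma * Q[pnh, pnsoc].max() - Q[ph, psoc, pa])
        s = s_next

print("greedy action at hour 0, half charge:", actions[int(np.argmax(Q[0, 5]))])
```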
Reliable and efficient communication is essential for Unmanned Aerial Vehicle (UAV) networks, especially in dynamic and resource-constrained environments such as disaster management, surveillance, and environmental monitoring. Frequent topology changes, high mobility, and limited energy availability pose significant challenges to maintaining stable and high-performance routing. Traditional routing protocols, such as Ad hoc On-Demand Distance Vector (AODV), Load-Balanced Optimized Predictive Ad hoc Routing (LB-OPAR), and Destination-Sequenced Distance Vector (DSDV), often experience performance degradation under such conditions. To address these limitations, this study evaluates the effectiveness of Dynamic Adaptive Routing (DAR), a protocol designed to adapt routing decisions in real time based on network dynamics and resource constraints. The research utilizes the Network Simulator 3 (NS-3) platform to conduct controlled simulations, measuring key performance indicators such as latency, Packet Delivery Ratio (PDR), energy consumption, and throughput. Comparative analysis reveals that DAR consistently outperforms conventional protocols, achieving a 20%-30% reduction in latency, a 25% decrease in energy consumption, and marked improvements in throughput and PDR. These results highlight DAR's ability to maintain high communication reliability while optimizing resource usage in challenging operational scenarios. By providing empirical evidence of DAR's advantages in highly dynamic UAV network environments, this study contributes to advancing adaptive routing strategies. The findings not only validate DAR's robustness and scalability but also lay the groundwork for integrating artificial intelligence-driven decision-making and real-world UAV deployment. Future work will explore cross-layer optimization, multi-UAV coordination, and experimental validation in field trials, aiming to further enhance communication resilience and energy efficiency in next-generation aerial networks.
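As a small illustration of how the reported indicators can be computed from simulation traces, the sketch below assumes a simple packet-record format (send time, receive time, size); it is not the NS-3 tooling used in the study.

```python
# Computing PDR, average latency, and throughput from a packet trace.
# The trace format (send time, receive time or None, payload bytes) is assumed.
from dataclasses import dataclass
from typing import Optional

@dataclass
class PacketRecord:
    sent_at: float                 # seconds
    received_at: Optional[float]   # None if the packet was dropped
    size_bytes: int

def summarise(trace: list, duration_s: float) -> dict:
    delivered = [p for p in trace if p.received_at is not None]
    pdr = len(delivered) / len(trace) if trace else 0.0
    latency = (sum(p.received_at - p.sent_at for p in delivered) / len(delivered)
               if delivered else float("nan"))
    throughput_bps = 8 * sum(p.size_bytes for p in delivered) / duration_s
    return {"PDR": pdr, "avg_latency_s": latency, "throughput_bps": throughput_bps}

# Example with a hypothetical three-packet trace over a 10 s window.
trace = [PacketRecord(0.0, 0.12, 512), PacketRecord(1.0, None, 512), PacketRecord(2.0, 2.09, 1024)]
print(summarise(trace, duration_s=10.0))
```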
Dynamic adaptability is a key feature of biological macromolecules, enabling selective binding and catalysis [1]. From DNA supercoiling to enzyme conformational changes, biological systems have evolved intricate ways to dynamically adjust their structures to accommodate functional needs. Mimicking this adaptability in synthetic systems is an ongoing challenge in supramolecular chemistry.
We have succeeded in a two-slit interference simulation by assuming that a travelling particle interacts with its environment, obtaining information on the environmental condition according to the adaptive dynamics of Ohya, and thus proposed the possibility that entanglement comes from the interaction with the environment (Ando et al., 2023). This concept implies that there should be no isolated or inertial system other than our unique universe space. Taking this message into account and assuming that the signal velocity is constant with respect to our unique universe space, we reconsidered the inertial system and the relativity theories of Galilei and Einstein and found several misunderstandings and errors. Time delay and Lorentz contraction were derived similarly to the predictions of special relativity theory, but the Lorentz transformation and the four-dimensional space-time view were not. They must have implicitly and unconsciously assumed that signals transfer information without influencing any system, which differs from our adaptive dynamical view. We propose that their relativity theories should be reinterpreted in view of adaptive dynamics.
We applied adaptive dynamics to the double-slit interference phenomenon using a particle model and obtained partially successful results in our previous report. The patterns corresponded qualitatively well with experiments. Several properties, such as the concave single-slit pattern and the large influence of a slight displacement of the emission position, differed from the experimental results. In this study we tried other slit conditions and obtained patterns consistent with experiments. We do not claim that adaptive dynamics is the principle of quantum mechanics, but the present results support the plausibility of adaptive dynamics as a candidate basis of quantum mechanics. We discuss the advantages of the adaptive dynamical view for the foundations of quantum mechanics.
A dynamics-based adaptive control approach is proposed for a planar dual-arm space robot in the presence of closed-loop constraints and uncertain inertial parameters of the payload. The controller is capable of controlling the position and attitude of both the satellite base and the payload grasped by the manipulator end effectors. The equations of motion in reduced-order form for the constrained system are derived by incorporating the constraint equations in terms of accelerations into Kane's equations of the unconstrained system. Model analysis shows that the resulting equations perfectly meet the requirement of adaptive controller design. Consequently, by using an indirect approach, an adaptive control scheme is proposed to accomplish position/attitude trajectory tracking control with the uncertain parameters being estimated on-line. The actuator redundancy due to the closed-loop constraints is utilized to minimize a weighted norm of the joint torques. Global asymptotic stability is proven by using Lyapunov's method, and simulation results are also presented to demonstrate the effectiveness of the proposed approach.
The impact dynamics, impact effect, and post-impact unstable motion suppression of a free-floating space manipulator capturing a satellite on orbit are analyzed. Firstly, the dynamics equation of the free-floating space manipulator is derived using the second Lagrangian equation. Combining the momentum conservation principle, the impact dynamics and effect between the space manipulator end-effector and the satellite during the capture process are analyzed with the momentum impulse method. Focusing on the unstable motion of the space manipulator due to this impact effect, a robust adaptive compound control algorithm is designed to suppress the unstable motion. There is no need to control the free-floating base position, which saves jet fuel. Finally, simulations are presented to show the impact effect and verify the validity of the control algorithm.
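The momentum-impulse idea can be illustrated with a one-dimensional, perfectly plastic capture, a drastic simplification of the multibody case treated in the paper; the masses and velocities are invented.

```python
# One-dimensional illustration of the momentum-impulse view of capture:
# after a perfectly plastic impact the end-effector and satellite move together.
# Values are illustrative, not from the paper.
def plastic_capture(m_ee, v_ee, m_sat, v_sat):
    """Return the common post-impact velocity and the impulse on the end-effector."""
    v_common = (m_ee * v_ee + m_sat * v_sat) / (m_ee + m_sat)   # momentum conservation
    impulse_on_ee = m_ee * (v_common - v_ee)                    # change of momentum
    return v_common, impulse_on_ee

v, j = plastic_capture(m_ee=30.0, v_ee=0.2, m_sat=80.0, v_sat=-0.1)
print(f"post-impact velocity: {v:.4f} m/s, impulse on end-effector: {j:.2f} N*s")
```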
Reinforcement learning (RL) has roots in dynamic programming, and it is called adaptive/approximate dynamic programming (ADP) within the control community. This paper reviews recent developments in ADP along with RL and its applications to various advanced control fields. First, the background of the development of ADP is described, emphasizing the significance of regulation and tracking control problems. Some effective offline and online algorithms for ADP/adaptive critic control are presented, where the main results for discrete-time systems and continuous-time systems are surveyed, respectively. Then, the research progress on adaptive critic control based on the event-triggered framework and under uncertain environments is discussed, where event-based design, robust stabilization, and game design are reviewed. Moreover, the extensions of ADP for addressing control problems under complex environments attract enormous attention. The ADP architecture is revisited from the perspective of data-driven and RL frameworks, showing how they promote the ADP formulation significantly. Finally, several typical control applications with respect to RL and ADP are summarized, particularly in the fields of wastewater treatment processes and power systems, followed by some general prospects for future research. Overall, this comprehensive survey on ADP and RL for advanced control applications demonstrates their remarkable potential within the artificial intelligence era. In addition, they also play a vital role in promoting environmental protection and industrial intelligence.
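For readers new to the area, the discrete-time LQR value iteration that many ADP schemes generalise fits in a few lines; the system and cost matrices below are arbitrary illustrative values.

```python
# Value iteration for discrete-time LQR:
#   P_{k+1} = Q + A^T P_k A - A^T P_k B (R + B^T P_k B)^{-1} B^T P_k A,
#   K_k     = (R + B^T P_k B)^{-1} B^T P_k A.
# Matrices are illustrative placeholders.
import numpy as np

A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q, R = np.eye(2), np.array([[1.0]])

P = np.zeros((2, 2))
for _ in range(500):                       # iterate the Bellman (Riccati) operator
    S = R + B.T @ P @ B
    K = np.linalg.solve(S, B.T @ P @ A)
    P = Q + A.T @ P @ A - A.T @ P @ B @ K

print("converged gain K:", K)
print("value matrix P:\n", P)
```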
In this paper, a novel adaptive Fault-Tolerant Control (FTC) strategy is proposed for non-minimum phase Hypersonic Vehicles (HSVs) that are affected by actuator faults and parameter uncertainties. The strategy is based on the output redefinition method and Adaptive Dynamic Programming (ADP). The intelligent FTC scheme consists of two main parts: a basic fault-tolerant and stable controller and an ADP-based supplementary controller. In the basic FTC part, an output redefinition approach is designed to make the zero dynamics stable with respect to the new output. Then, the Ideal Internal Dynamics (IID) are obtained using an optimal bounded inversion approach, and a tracking controller is designed for the new output to realize output tracking of the non-minimum phase HSV system. For the ADP-based compensation control part, an Action-Dependent Heuristic Dynamic Programming (ADHDP) scheme adopting an actor-critic learning structure is utilized to further optimize the tracking performance of the HSV control system. Finally, simulation results are provided to verify the effectiveness and efficiency of the proposed FTC algorithm.
This paper presents a novel cooperative value iteration (VI)-based adaptive dynamic programming method for multi-player differential game models with a convergence proof. The players are divided into two groups in the learning process and adapt their policies sequentially. Our method removes the dependence on admissible initial policies, which is one of the main drawbacks of PI-based frameworks. Furthermore, this algorithm enables the players to adapt their control policies without full knowledge of others' system parameters or control laws. The efficacy of our method is illustrated by three examples.
In order to address the output feedback issue for linear discrete-time systems, this work suggests a brand-new adaptive dynamic programming (ADP) technique based on the internal model principle (IMP). The proposed method, termed IMP-ADP, does not require complete state feedback, merely the measurement of input and output data. More specifically, based on the IMP, the output control problem can first be converted into a stabilization problem. We then design an observer to reproduce the full state of the system by measuring the inputs and outputs. Moreover, this technique includes both a policy iteration algorithm and a value iteration algorithm to determine the optimal feedback gain without using a dynamic system model. Importantly, with this concept one does not need to solve the regulator equation. Finally, this control method was tested on a grid-connected LCL inverter system to demonstrate that the proposed method provides the desired performance in terms of both tracking and disturbance rejection.
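The observer step can be sketched as a standard Luenberger observer that rebuilds the state from inputs and outputs; the plant matrices and observer gain below are illustrative and unrelated to the grid-connected LCL inverter model of the paper.

```python
# Luenberger observer sketch: x_hat_{k+1} = A x_hat_k + B u_k + L (y_k - C x_hat_k).
# Plant matrices and the gain L are illustrative; L is chosen so that A - L C is stable.
import numpy as np

A = np.array([[1.0, 0.1], [0.0, 0.9]])
B = np.array([[0.0], [0.1]])
C = np.array([[1.0, 0.0]])
L = np.array([[0.8], [0.5]])               # assumed gain; eigenvalues of A - L C lie inside the unit circle

x = np.array([1.0, -0.5])                  # true state (unknown to the controller)
x_hat = np.zeros(2)                        # observer state, built from u and y only
for k in range(50):
    u = np.array([0.1 * np.sin(0.2 * k)])  # any known input signal
    y = C @ x                              # measured output
    x_hat = A @ x_hat + B @ u + L @ (y - C @ x_hat)
    x = A @ x + B @ u                      # plant update

print("estimation error after 50 steps:", x - x_hat)
```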
Normally, large numbers of particles are required to accurately simulate the metal cutting process, which consumes a lot of computing time and storage. Adaptive techniques can help decrease the number of particles, hence reducing the runtime. This paper presents a novel adaptive smoothed particle hydrodynamics (SPH) method for metal cutting simulation. The spatial resolution changes adaptively according to the distance to the tool tip by means of particle splitting and merging. More particles are placed in the region where the workpiece and the tool are in contact. Since the contact region constantly changes during the cutting process, two quadrilateral frames are adopted in the adaptive algorithm to dynamically change the distribution of particles: one frame for refinement, the other for coarsening. These frames move at the same speed as the tool. To test the computational efficiency, the metal cutting process is simulated using SPH with three different adaptive approaches. Numerical results show that the proposed adaptive algorithm with dynamic refinement and coarsening can significantly reduce the runtime.
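A hedged sketch of distance-driven splitting and merging is given below; the one-to-four split pattern, the merge rule, and the radii are simplified placeholders rather than the paper's quadrilateral-frame scheme.

```python
# Toy 2-D particle splitting/merging driven by distance to the tool tip.
# The split pattern (one parent -> four children), the pairwise merge rule, and
# the radii r_fine / r_coarse are illustrative, not the paper's adaptive frames.
import numpy as np

def adapt_particles(pos, mass, tip, r_fine=0.5, r_coarse=2.0, h=0.1):
    """Split particles close to the tip, merge neighbouring pairs far from it."""
    d = np.linalg.norm(pos - tip, axis=1)
    new_pos, new_mass = [], []

    # Splitting: replace each near particle with four children of a quarter mass.
    offsets = h * np.array([[1, 1], [1, -1], [-1, 1], [-1, -1]]) / 4.0
    for p, m, di in zip(pos, mass, d):
        if di < r_fine:
            for o in offsets:
                new_pos.append(p + o)
                new_mass.append(m / 4.0)
        else:
            new_pos.append(p)
            new_mass.append(m)

    # Merging: greedily pair far-away particles and replace them by their centre of mass.
    pos2, mass2 = np.array(new_pos), np.array(new_mass)
    far = np.where(np.linalg.norm(pos2 - tip, axis=1) > r_coarse)[0]
    merged, keep_pos, keep_mass = set(), [], []
    for i in far:
        if i in merged:
            continue
        partners = [j for j in far if j != i and j not in merged
                    and np.linalg.norm(pos2[i] - pos2[j]) < 4 * h]
        if partners:
            j = partners[0]
            m = mass2[i] + mass2[j]
            keep_pos.append((mass2[i] * pos2[i] + mass2[j] * pos2[j]) / m)
            keep_mass.append(m)
            merged.update({i, j})
    survivors = [k for k in range(len(pos2)) if k not in merged]
    keep_pos += [pos2[k] for k in survivors]
    keep_mass += [mass2[k] for k in survivors]
    return np.array(keep_pos), np.array(keep_mass)

rng = np.random.default_rng(2)
pos = rng.uniform(-3, 3, size=(200, 2))
mass = np.full(200, 1.0)
new_pos, new_mass = adapt_particles(pos, mass, tip=np.array([0.0, 0.0]))
print(len(pos), "->", len(new_pos), "particles; total mass",
      round(float(mass.sum()), 3), "->", round(float(new_mass.sum()), 3))
```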
The helicopter Trailing-Edge Flaps (TEFs) technology is one of the recent hot topics in morphing wing research. By employing controlled deflection, TEFs can effectively reduce the vibration level of helicopters. Thus, designing specific vibration reduction control methods for helicopters equipped with trailing-edge flaps is of significant practical value. This paper studies the optimal control problem for helicopter vibration systems with TEFs under the framework of adaptive dynamic programming combined with Reinforcement Learning (RL). Time delays and disturbances, caused by the complexity of helicopter dynamics, inevitably deteriorate the control performance of vibration reduction. To solve this problem, a zero-sum game formulation with a linear quadratic form for reducing vibration of helicopter systems is presented with a virtual predictor. In this context, an off-policy reinforcement learning algorithm is developed to determine the optimal control policy. The algorithm utilizes only vertical vibration load data to achieve a policy that reduces vibration, attains a Nash equilibrium, and addresses disturbances while compensating for the time delay, without knowledge of the dynamics of the helicopter system. The effectiveness of the proposed method is demonstrated on a virtual platform.
This paper presents an optimized shared control algorithm for human–AI interaction, implemented through a digital twin framework where the physical system and human operator act as the real agent while an AI-driven digital system functions as the virtual agent. In this digital twin architecture, the real agent acquires an optimal control strategy through observed actions, while the AI virtual agent mirrors the real agent to establish a digital replica system and corresponding control policy. Both the real and virtual optimal controllers are approximated using reinforcement learning (RL) techniques. Specifically, critic neural networks (NNs) are employed to learn the virtual and real optimal value functions, while actor NNs are trained to derive their respective optimal controllers. A novel shared mechanism is introduced to integrate both virtual and real value functions into a unified learning framework, yielding an optimal shared controller. This controller adaptively adjusts the confidence ratio between virtual and real agents, enhancing the system's efficiency and flexibility in handling complex control tasks. The stability of the closed-loop system is rigorously analyzed using the Lyapunov method. The effectiveness of the proposed AI–human interactive system is validated through two numerical examples: a representative nonlinear system and an unmanned aerial vehicle (UAV) control system.
Integral reinforcement learning (IRL) is an effective tool for solving optimal control problems of nonlinear systems, and it has been widely utilized in optimal controller design for discrete-time nonlinear systems. However, solving the Hamilton-Jacobi-Bellman (HJB) equations for nonlinear systems requires precise and complicated dynamics. Moreover, the research and application of IRL in continuous-time (CT) systems must be further improved. To develop the IRL of a CT nonlinear system, a data-based adaptive neural dynamic programming (ANDP) method is proposed to investigate the optimal control problem of uncertain CT multi-input systems, such that knowledge of the dynamics in the HJB equation is unnecessary. First, the multi-input model is approximated using a neural network (NN), which can be utilized to design an integral reinforcement signal. Subsequently, two criterion networks and one action network are constructed based on the integral reinforcement signal. A nonzero-sum Nash equilibrium can be reached by learning the optimal strategies of the multi-input model. In this scheme, the NN weights are constantly updated using an adaptive algorithm. The weight convergence and the system stability are analyzed in detail. The optimal control problem of a multi-input nonlinear CT system is effectively solved using the ANDP scheme, and the results are verified by a simulation study.
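The integral reinforcement signal is simply the running cost accumulated over one sampling interval; a minimal sketch using trapezoidal integration of sampled data is shown below, with an illustrative cost function and trajectory.

```python
# Integral reinforcement signal over one interval [t, t+T]:
#   IR = integral of (x^T Q x + u^T R u) d tau,
# approximated by the trapezoidal rule over sampled data.
# Q, R and the sampled trajectory are illustrative placeholders.
import numpy as np

def integral_reinforcement(xs, us, Q, R, dt):
    """xs: (N, n) state samples, us: (N, m) input samples over one interval."""
    costs = np.array([x @ Q @ x + u @ R @ u for x, u in zip(xs, us)])
    return dt * (costs[0] / 2 + costs[1:-1].sum() + costs[-1] / 2)

Q, R, dt = np.eye(2), 0.1 * np.eye(1), 0.01
ts = np.arange(0, 0.5, dt)                        # one reinforcement interval, sampled every 10 ms
xs = np.column_stack([np.cos(ts), -np.sin(ts)])   # assumed state samples
us = (-0.5 * np.sin(ts)).reshape(-1, 1)           # assumed input samples
print("integral reinforcement over the interval:", integral_reinforcement(xs, us, Q, R, dt))
```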
A mixed adaptive dynamic programming (ADP) scheme based on zero-sum game theory is developed to address optimal control problems of autonomous underwater vehicle (AUV) systems subject to disturbances and safety constraints. By combining prior dynamic knowledge and actual sampled data, the proposed approach effectively mitigates the defects caused by an inaccurate dynamic model and significantly improves the training speed of the ADP algorithm. Initially, the dataset is enriched with sufficient reference data collected from a nominal model without considering modelling bias. The controlled object also interacts with the real environment and continuously gathers adequate sampled data into the dataset. To comprehensively leverage the advantages of model-based and model-free methods during training, an adaptive tuning factor is introduced based on the dataset, which possesses model-referenced information and conforms to the distribution of the real-world environment; this factor balances the influence of the model-based control law and the data-driven policy gradient on the direction of policy improvement. As a result, the proposed approach accelerates the learning speed compared with data-driven methods while also enhancing the tracking performance compared with model-based control methods. Moreover, the optimal control problem under disturbances is formulated as a zero-sum game, and the actor-critic-disturbance framework is introduced to approximate the optimal control input, cost function, and disturbance policy, respectively. Furthermore, the convergence property of the proposed algorithm based on the value iteration method is analysed. Finally, an example of AUV path following based on improved line-of-sight guidance is presented to demonstrate the effectiveness of the proposed method.
This paper highlights the utilization of parallel control and adaptive dynamic programming (ADP) for event-triggered robust parallel optimal consensus control (ETRPOC) of uncertain nonlinear continuous-time multiagent systems (MASs). First, the parallel control system, which consists of a virtual control variable and a specific auxiliary variable obtained from the coupled Hamiltonian, allows general systems to be transformed into affine systems. Notably, the introduction of the parallel control technique provides an unprecedented perspective on eliminating the negative effects of disturbances. Then, an event-triggered mechanism is adopted to save communication resources while ensuring the system's stability. The coupled Hamilton-Jacobi (HJ) equation's solution is approximated using a critic neural network (NN), whose weights are updated in response to events. Furthermore, theoretical analysis reveals that the weight estimation error is uniformly ultimately bounded (UUB). Finally, numerical simulations demonstrate the effectiveness of the developed ETRPOC method.
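The event-triggered flavour can be sketched independently of the consensus setting: parameters are refreshed only when the state has drifted far enough from its value at the last event. The scalar system, threshold, and update law below are generic placeholders, not the coupled-HJ design of the paper.

```python
# Generic event-triggered update sketch: the controller parameters are refreshed
# only when |x - x_event| exceeds a threshold, saving communication and computation.
# The scalar dynamics, threshold, and parameter update rule are illustrative placeholders.
def simulate(threshold=0.05, steps=200, dt=0.05):
    x, x_event = 1.0, 1.0
    k = 1.0                                    # feedback gain held between events
    events = 0
    for _ in range(steps):
        if abs(x - x_event) > threshold:       # trigger condition
            x_event = x                        # sample-and-hold the state
            k = 1.0 + 0.5 * abs(x_event)       # placeholder parameter update at the event
            events += 1
        u = -k * x_event                       # control uses the last sampled state
        x = x + dt * (0.5 * x + u)             # simple scalar dynamics (assumed)
    return events

print("updates triggered:", simulate(), "out of 200 steps")
```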
This paper develops a robust control approach for nonaffine nonlinear continuous systems with input constraints and unknown uncertainties. Firstly, an affine augmented system (AAS) is constructed via a pre-compensation technique to convert the original nonaffine dynamics into affine dynamics. Secondly, a stability criterion linking the original nonaffine system and the auxiliary system is derived, demonstrating that the optimal policies obtained from the auxiliary system yield a robust controller for the nonaffine system. Thirdly, an online adaptive dynamic programming (ADP) algorithm is designed to approximate the optimal solution of the Hamilton–Jacobi–Bellman (HJB) equation. Moreover, the gradient descent approach and a projection approach are employed to update the actor-critic neural network (NN) weights, and the algorithm's convergence is proven. The uniformly ultimately bounded stability of the system state is then guaranteed. Finally, simulation examples are provided to validate the effectiveness of the presented approach.
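The projection step in the weight update can be illustrated as a plain gradient step followed by projection onto a norm ball, which keeps the estimates bounded; the loss, bound, and dimensions below are placeholders.

```python
# Projected gradient step for NN weight tuning: take a gradient-descent step and
# then project the weight vector back onto the ball ||w|| <= w_max, which keeps
# the estimates bounded.  The quadratic loss and the bound are illustrative.
import numpy as np

def projected_step(w, grad, lr=0.1, w_max=2.0):
    w = w - lr * grad                       # gradient-descent step
    norm = np.linalg.norm(w)
    if norm > w_max:                        # projection onto the norm ball
        w = w * (w_max / norm)
    return w

# Example: drive w toward a target while the projection keeps ||w|| <= 2.
target = np.array([3.0, -1.5, 0.5])         # lies outside the ball on purpose
w = np.zeros(3)
for _ in range(100):
    grad = 2.0 * (w - target)               # gradient of ||w - target||^2
    w = projected_step(w, grad)
print("final weights:", w, "norm:", round(float(np.linalg.norm(w)), 3))
```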