Iterative Learning Control (ILC) provides an effective framework for optimizing repetitive tasks, making it particularly suitable for high-precision applications in both precision manufacturing and intelligent transportation systems (ITS). This paper presents a systematic review of ILC's developmental progress, current methodologies, and practical implementations across these two critical domains. The review first analyzes the key technical challenges encountered when integrating ILC into precision manufacturing workflows. Through case studies, it evaluates demonstrated improvements in positioning accuracy, surface finish quality, and production throughput. Furthermore, the study examines ILC's applications in ITS, with particular focus on vehicular motion control, including autonomous vehicle trajectory tracking, platoon coordination, and traffic signal timing optimization, where its data-driven characteristics enhance adaptability to dynamic environments. Finally, the paper proposes targeted future research directions that are essential for fully realizing ILC's potential in advancing these interconnected yet distinct fields.
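To make the idea of learning from repeated trials concrete, the following is a minimal sketch of a P-type ILC update, one of the simplest textbook forms of ILC. It is an illustration only and is not taken from the reviewed paper: the first-order plant, its coefficients `a` and `b`, the learning gain, and the sinusoidal reference are all assumptions chosen so that the peak error shrinks on every trial.

```python
import math


def run_trial(u, a=0.3, b=1.0):
    """Simulate one trial of a hypothetical plant x[t+1] = a*x[t] + b*u[t]."""
    x = 0.0  # every trial restarts from the same initial state
    outputs = []
    for u_t in u:
        x = a * x + b * u_t
        outputs.append(x)  # outputs[t] is the first sample that u[t] can affect
    return outputs


def p_type_ilc(reference, trials=15, gain=0.8):
    """Refine the input across trials with u_next[t] = u[t] + gain * e[t]."""
    u = [0.0] * len(reference)
    peak_errors = []
    for _ in range(trials):
        y = run_trial(u)
        e = [r - y_t for r, y_t in zip(reference, y)]
        peak_errors.append(max(abs(e_t) for e_t in e))
        # Learning step: reuse the error of this trial to correct the next input.
        u = [u_t + gain * e_t for u_t, e_t in zip(u, e)]
    return u, peak_errors


# Assumed reference trajectory for the repeated task.
reference = [math.sin(0.2 * t) for t in range(1, 21)]
u_learned, peak_errors = p_type_ilc(reference)
print(peak_errors[0], peak_errors[-1])
```

With these assumed values the trial-to-trial error map is a contraction, so the peak tracking error decreases monotonically. For a different plant or gain that property would have to be checked again.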
This paper presents a hierarchical formation control strategy to address the challenges of formation control for multiple Unmanned Aerial Vehicles (UAVs) within a cooperative consensus framework. The proposed strategy incorporates a reference command generation layer, which derives UAV attitude commands based on formation requirements, and a tracking control layer to ensure accurate execution. Collaborative variables, including trajectory position and flight speed, are defined using a three-dimensional track particle and autopilot model, enabling the development of a consensus-based formation control law. Desired attitude angles are computed through altitude-hold and coordinated-turn strategies. A sliding surface is designed based on reference models derived from flight quality metrics, while an adaptive controller compensates for aerodynamic model uncertainties. To enhance learning capabilities, a prediction error mechanism based on a series-parallel estimation model is introduced, enabling collaborative learning and the sharing of network weight estimation parameters within the multi-agent system. This facilitates the design of a distributed composite learning law. Lyapunov stability analysis confirms the local exponential stability of the tracking error. Simulations of a twelve-UAV formation, along with a comparative analysis of two algorithms, demonstrate the system's capability for formation maintenance and high-precision tracking control.
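The notion of a consensus-based formation law can be illustrated with a much simpler model than the one in the abstract. The sketch below assumes single-integrator agents in the plane, a fixed undirected ring communication graph, and an Euler step size of 0.2; none of these come from the paper, which uses a three-dimensional track particle and autopilot model. Each agent drives its offset-corrected position toward that of its neighbors, so the group settles into the desired shape.

```python
def formation_consensus(positions, offsets, neighbors, step=0.2, iterations=100):
    """Discrete-time consensus on z_i = position_i - offset_i (hypothetical setup)."""
    pos = [list(p) for p in positions]
    dims = len(pos[0])
    for _ in range(iterations):
        new_pos = []
        for i, p in enumerate(pos):
            correction = [0.0] * dims
            for j in neighbors[i]:
                for k in range(dims):
                    # Disagreement between neighbor j and agent i in offset-corrected coordinates.
                    correction[k] += (pos[j][k] - offsets[j][k]) - (p[k] - offsets[i][k])
            new_pos.append([p[k] + step * correction[k] for k in range(dims)])
        pos = new_pos  # synchronous update of all agents
    return pos


# Four agents that should form a unit square; all values are illustrative.
initial = [[0.0, 0.0], [5.0, 1.0], [-2.0, 3.0], [4.0, -4.0]]
offsets = [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
final = formation_consensus(initial, offsets, ring)
print(final)
```

For this ring graph every agent has two neighbors, so a step size below 0.5 keeps the iteration stable. A larger or differently connected graph would need its own bound.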
To address the challenge of achieving decentralized, scalable, and adaptive control for large-scale multiple unmanned aerial vehicle (multi-UAV) swarms in dynamic urban environments with obstacles and wind perturbations, we propose a hybrid framework integrating adaptive reinforcement learning (RL), multi-modal perception fusion, and enhanced pigeon flock optimization (PFO) with curiosity-driven exploration to enable robust autonomous and formation control. The framework leverages meta-learning to optimize RL policies for real-time adaptation, fuses sensor data for precise state estimation, and enhances PFO with learned leader-follower dynamics and exploration rewards to maintain cohesive formations and explore uncertain areas. For swarms of 10–30 UAVs, it achieves 34% faster convergence, a 61% lower stability root mean square error (RMSE), 88% fewer collisions, and 85.6%–92.3% success rates in target detection and encirclement, outperforming standard multi-agent RL, pure PFO, and single-modality RL. Three-dimensional trajectory visualizations confirm cohesive formations, collision-free maneuvers, and efficient exploration in urban search-and-rescue scenarios. Innovations include meta-RL for rapid adaptation, multi-modal fusion for robust perception, and curiosity-driven PFO for scalable, decentralized control, advancing real-world multi-UAV swarm autonomy and coordination.
Robust cooperative unmanned aerial vehicle (UAV) formation in complex 3D environments is hampered by reward sparsity and inefficient collaboration. To address this, we propose context-aware relational agent learning (CORAL), a novel multi-agent deep reinforcement learning framework. CORAL integrates two modules: (1) a novelty-based intrinsic reward module to drive efficient exploration and (2) an explicit relational learning module that allows agents to predict peer intentions and enhance coordination. Built on a multi-agent Actor-Critic architecture, CORAL enables agents to balance self-interest with group objectives. Comprehensive evaluations in a high-fidelity simulation show that our method significantly outperforms state-of-the-art baselines such as multi-agent deep deterministic policy gradient (MADDPG) and monotonic value function factorisation for deep multi-agent reinforcement learning (QMIX) in path planning efficiency, collision avoidance, and scalability.
This work proposes an iterative learning model predictive control (ILMPC) approach based on an adaptive fault observer (FOBILMPC) for fault-tolerant control and trajectory tracking of air-breathing hypersonic vehicles. To optimize the control input, the online control law uses model predictive control (MPC) built on the concept of iterative learning control (ILC). By using offline data to reduce the errors of the linearized model, the strategy can increase the robustness of the control system and ensure that disturbances are suppressed. An adaptive fault observer is designed on the basis of the proposed ILMPC approach to enhance overall fault tolerance by estimating and compensating for actuator disturbances and the degree of fault. During the derivation, a linearized model of the longitudinal dynamics is established. Numerical simulations demonstrate that the proposed ILMPC approach reduces tracking error and speeds up convergence compared with the offline controller, which makes it a promising candidate for the design of hypersonic vehicle control systems.
This paper investigates a multiplayer Pareto game for affine nonlinear stochastic systems subject to both external disturbances and internal multiplicative noise. The Pareto cooperative optimal strategies with the H∞ constraint are obtained by integrating H2/H∞ theory with Pareto game theory. First, a nonlinear stochastic bounded real lemma (SBRL) is derived, explicitly accounting for non-zero initial conditions. Through the analysis of four cross-coupled Hamilton-Jacobi equations (HJEs), we establish necessary and sufficient conditions for the existence of Pareto optimal strategies with the H∞ constraint. Second, to address the complexity of solving these nonlinear partial differential HJEs, we propose a neural network (NN) framework with synchronous tuning rules for the actor, critic, and disturbance components, based on a reinforcement learning (RL) approach. The designed tuning rules ensure convergence of the actor-critic-disturbance components to the desired values, enabling the realization of robust Pareto control strategies. The convergence of the proposed algorithm is rigorously analyzed using a constructed Lyapunov function for the NN weight errors. Finally, a numerical simulation example is provided to demonstrate the effectiveness of the proposed methods and main results.
This study developed a modeling methodology for statistical optimization-based geologic hazard susceptibility assessment, aiming to enhance the comprehensive performance and classification accuracy of the assessment models. First, the cumulative probability method revealed that a low probability (15%) of geologic hazards between any two geologic hazard points occurred outside a buffer zone with a radius of 2297 m (i.e., the distance threshold). The training dataset was established, consisting of negative samples (non-hazard points) randomly generated based on the distance threshold, positive samples (i.e., historical hazards), and 13 conditioning factors. Then, models were built using five machine learning algorithms, namely random forest (RF), gradient boosting decision tree (GBDT), naive Bayes (NB), logistic regression (LR), and support vector machine (SVM). The comprehensive performance of the models was assessed using the area under the receiver operating characteristic curve (AUC) and overall accuracy (OA) as indicators, revealing that RF exhibited the best performance, with OA and AUC values of 2.7127 and 0.981, respectively. Furthermore, the machine learning models constructed by considering the distance threshold outperformed those built using the unoptimized dataset. The characteristic factors were ranked using the mutual information method, with their scores decreasing in the order of rainfall (0.1616), altitude (0.06), normalized difference vegetation index (NDVI; 0.04), and distance from roads (0.03). Finally, the geologic hazard susceptibility classification was assessed using the natural breaks method combined with a clustering algorithm. The results indicate that the clustering algorithm exhibited higher classification accuracy than the natural breaks method. The findings of this study demonstrate that the proposed model optimization scheme can provide a scientific basis for the prevention and control of geologic hazards.
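The two indicators named in this abstract, overall accuracy (OA) and the area under the ROC curve (AUC), can be computed from labels and predicted scores as sketched below. This is a generic illustration with made-up data, not the study's evaluation code. The 0.5 decision threshold and the pairwise (rank-based) form of AUC are assumptions made for brevity. By construction, both quantities lie between 0 and 1.

```python
def overall_accuracy(labels, scores, threshold=0.5):
    """Fraction of samples whose thresholded score matches the 0/1 label."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


def roc_auc(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties count half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Illustrative data: 1 = hazard point, 0 = non-hazard point.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]
print(overall_accuracy(labels, scores), roc_auc(labels, scores))
```

The pairwise AUC is quadratic in the number of samples, which is fine for a small check like this one. A sort-based version would be preferable for large datasets.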
Ensuring the consistent mechanical performance of three-dimensional (3D)-printed continuous fiber-reinforced composites is a significant challenge in additive manufacturing. The current reliance on manual monitoring exacerbates this challenge by rendering the process vulnerable to environmental changes and unexpected factors, resulting in defects and inconsistent product quality, particularly in unmanned long-term operations or printing in extreme environments. To address these issues, we developed a process monitoring and closed-loop feedback control strategy for the 3D printing process. Real-time printing image data were captured and analyzed using a well-trained neural network model, and a real-time control module enabling closed-loop feedback control of the flow rate was developed. The neural network model, which was based on image processing and artificial intelligence, recognized flow rate values with an accuracy of 94.70%. The experimental results showed significant improvements in both the surface performance and mechanical properties of printed composites, with a three- to six-fold improvement in tensile strength and elastic modulus, demonstrating the effectiveness of the strategy. This study provides a generalized process monitoring and feedback control method for the 3D printing of continuous fiber-reinforced composites, and offers a potential solution for remote online monitoring and closed-loop adjustment in unmanned or extreme space environments.
The graded density impactor (GDI) dynamic loading technique is crucial for acquiring the dynamic physical property parameters of materials used in weapons. The accuracy and timeliness of GDI structural design are key to achieving controllable stress-strain rate loading. In this study, we have, for the first time, combined one-dimensional fluid computational software with machine learning methods. We first elucidated the mechanisms by which GDI structures control stress and strain rates. Subsequently, we constructed a machine learning model to create a structure-property response surface. The results show that altering the loading velocity and interlayer thickness has a pronounced regulatory effect on stress and strain rates. In contrast, the impedance distribution index and target thickness have less significant effects on stress regulation, although there is a matching relationship between target thickness and interlayer thickness. Compared with traditional design methods, the machine learning approach offers a 10^4 to 10^5 times increase in efficiency and the potential to achieve a global optimum, holding promise for guiding the design of GDI.
Autonomous legged robots, capable of navigating uneven terrain, can perform a diverse array of tasks. However, designing locomotion controllers remains challenging. In particular, designing a controller based on durable and reliable proprioceptive sensors is essential for achieving adaptability. Presently, the controller must either be manually designed for specific robots and tasks, or developed using machine-learning techniques, which require extensive training time and result in complex controllers. Inspired by animal locomotion, we propose a simple yet comprehensive closed-loop modular framework that utilizes minimal proprioceptive feedback (i.e., the Coxa-Femur (CF) joint angle), enabling a quadruped robot to efficiently navigate unpredictable and uneven terrains, including steps and slopes. The framework comprises a basic neural control network capable of rapidly learning optimized motor patterns, and a straightforward module for sensory feedback sharing and integration. In a series of experiments, we show that integrating sensory feedback into the base neural control network aids the robot in continually learning robust motor patterns on flat, step, and slope terrain, compared with the open-loop base framework. Sharing sensory feedback information across the four legs enables a quadruped robot to proactively navigate unpredictable steps with minimal interaction. Furthermore, the controller remains functional even in the absence of sensor signals. This control configuration was successfully transferred to a physical robot without any modifications.
The increasingly stringent performance requirements in integrated circuit manufacturing, characterized by smaller feature sizes and higher productivity, require the wafer stage to execute extreme motions with nanometer-level accuracy. This demanding requirement has led to the widespread application of iterative learning control (ILC), given the repetitive nature of wafer scanning. ILC enables substantial performance improvement by using past measurement data in combination with knowledge of the system model. However, challenges arise when the data are contaminated by stochastic noise, or when the system model exhibits significant uncertainties, constraining the achievable performance. In response to this issue, an extended state observer (ESO) based adaptive ILC approach is proposed in the frequency domain. Despite being model-based, it utilizes only a rough system model and then compensates for the resulting model uncertainties using an ESO, thereby achieving high robustness against uncertainties with minimal modeling effort. Additionally, an adaptive learning law is developed to mitigate the limited performance in the presence of stochastic noise, yielding high convergence accuracy without compromising convergence speed. Simulation and experimental comparisons with existing model-based and data-driven inversion-based ILC validate the effectiveness and superiority of the proposed method.
Reinforcement learning (RL) has been widely studied as an efficient class of machine learning methods for adaptive optimal control under uncertainties. In recent years, the applications of RL in optimised decision-making and motion control of intelligent vehicles have received increasing attention. Due to the complex and dynamic operating environments of intelligent vehicles, it is necessary to improve the learning efficiency and generalisation ability of RL-based decision and control algorithms under different conditions. This survey systematically examines the theoretical foundations, algorithmic advancements and practical challenges of applying RL to intelligent vehicle systems operating in complex and dynamic environments. The major algorithm frameworks of RL are first introduced, and the recent advances in RL-based decision-making and control of intelligent vehicles are overviewed. In addition to self-learning decision and control approaches using state measurements, developments in deep reinforcement learning (DRL) methods for end-to-end driving control of intelligent vehicles are summarised. Open problems and directions for further research are also discussed.
This study presents an emergency control method for sub-synchronous oscillations in wind power grid-connected systems based on transfer learning, addressing the insufficient generalization ability of traditional methods in complex real-world scenarios. By combining deep reinforcement learning with a transfer learning framework, cross-scenario knowledge transfer is achieved, significantly enhancing the adaptability of the control strategy. First, a sub-synchronous oscillation emergency control model for the wind power grid integration system is constructed under fixed scenarios based on deep reinforcement learning. A reward evaluation system based on the active power oscillation pattern of the system is proposed, introducing penalty functions for the number of machine-shedding rounds and the number of machines shed. This avoids the economic losses and grid security risks caused by the excessive one-time shedding of wind turbines. Furthermore, transfer learning is introduced into model training to enhance the model's generalization capability in dealing with complex scenarios of actual wind power grid integration systems. By introducing the Maximum Mean Discrepancy (MMD) algorithm to calculate the distribution differences between source data and target data, the online decision-making reliability of the emergency control model is improved. Finally, the effectiveness of the proposed emergency control method for multi-scenario sub-synchronous oscillation in wind power grid integration systems based on transfer learning is analyzed using the New England 39-bus system.
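As an illustration of how a Maximum Mean Discrepancy can quantify the gap between source and target data, the sketch below computes the standard biased estimate of the squared MMD with a Gaussian kernel. It is a generic example rather than the study's implementation: the kernel choice, the bandwidth of 1.0, and the toy one-dimensional samples are all assumptions.

```python
import math


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two equal-length feature vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * bandwidth ** 2))


def mmd_squared(source, target, bandwidth=1.0):
    """Biased estimate of squared MMD: E[k(s,s')] + E[k(t,t')] - 2*E[k(s,t)]."""
    def mean_kernel(a, b):
        total = sum(gaussian_kernel(p, q, bandwidth) for p in a for q in b)
        return total / (len(a) * len(b))

    return (mean_kernel(source, source)
            + mean_kernel(target, target)
            - 2.0 * mean_kernel(source, target))


# Toy samples standing in for source-scenario and target-scenario features.
source = [(0.0,), (0.1,), (-0.1,)]
near_target = [(0.05,), (0.15,), (-0.05,)]
far_target = [(3.0,), (3.1,), (2.9,)]
print(mmd_squared(source, near_target), mmd_squared(source, far_target))
```

A value near zero suggests the two samples are hard to tell apart under the chosen kernel, while a larger value indicates a bigger distribution shift. The result depends on the bandwidth, so in practice that parameter would need tuning.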
The integration of artificial intelligence into the development and production of mechatronic products offers a substantial opportunity to enhance efficiency, adaptability, and system performance. This paper examines the utilization of reinforcement learning as a control strategy, with a particular focus on its deployment in pivotal stages of the product development lifecycle, specifically between system architecture and system integration and verification. A controller based on reinforcement learning was developed and evaluated in comparison to traditional proportional-integral controllers in dynamic and fault-prone environments. The results illustrate the superior adaptability, stability, and optimization potential of the reinforcement learning approach, particularly in addressing dynamic disturbances and ensuring robust performance. The study illustrates how reinforcement learning can facilitate the transition from conceptual design to implementation by automating optimization processes, enabling interface automation, and enhancing system-level testing. Based on the aforementioned findings, this paper presents future directions for research, which include the integration of domain-specific knowledge into the reinforcement learning process and the validation of this process in real-world environments. The results underscore the potential of artificial intelligence-driven methodologies to revolutionize the design and deployment of intelligent mechatronic systems.
This paper introduces an optimized backstepping control method for Flexible Airbreathing Hypersonic Vehicles (FAHVs). The approach incorporates nonlinear disturbance observation and reinforcement learning to address complex control challenges. The Minimal Learning Parameter (MLP) technique is applied to manage unknown nonlinear dynamics, significantly reducing the computational load usually associated with Neural Network (NN) weight updates. To improve the robustness of the control system, an MLP-based nonlinear disturbance observer is designed, which estimates lumped disturbances, including flexibility effects, model uncertainties, and external disruptions within the FAHVs. In parallel, the control strategy integrates reinforcement learning using an MLP-based actor-critic framework within the backstepping design to achieve both optimality and robustness. The actor performs control actions, while the critic assesses the optimal performance index function. To minimize this index function, an adaptive gradient descent method is used to construct both the actor and the critic. Lyapunov analysis is employed to demonstrate that all signals in the closed-loop system are semiglobally uniformly ultimately bounded. Simulation results confirm that the proposed control strategy delivers high control performance, marked by improved accuracy and reduced energy consumption.
Additive manufacturing (AM) promotes the production of metallic parts with significant design flexibility, yet its use in critical applications is hindered by challenges in ensuring consistent quality and performance. Process variability often leads to defects, insufficient geometric accuracy and inadequate material properties, which are difficult to manage effectively because traditional quality control methods are limited in modeling high-dimensional nonlinear relationships and enabling adaptive control. Machine learning (ML) offers a transformative approach to model intricate process-structure-property relationships by leveraging the rich data environment of AM. The study presents a comprehensive examination of ML-driven quality assurance implementations in metallic AM. First, it examines the use of ML in predicting and understanding the fundamental multi-physics fields that influence the quality of a fabricated component, including temperature fields, fluid dynamics and stress/strain evolution. Subsequently, the application of ML in optimizing key quality attributes, including defect detection and mitigation (porosity, cracks, etc.), geometric fidelity enhancement (dimensional accuracy, surface roughness, etc.) and material property tailoring (mechanical strength, fatigue life, corrosion resistance, etc.), is discussed in detail. Finally, the development of ML-driven real-time closed-loop control systems for intelligent quality assurance and the strategies for addressing data scarcity and cross-scenario transferability in metal AM are discussed. This article provides a novel perspective on the profound potential of ML technology for metal AM quality control applications, highlights the challenges faced during research, and outlines future development directions.
To enhance the frequency stability and lower the regulation mileage payment of a multi-area integrated energy system (IES) that supports the power Internet of Things (IoT), this paper proposes a data-driven cooperative method for automatic generation control (AGC). The method consists of adaptive fractional-order proportional-integral (FOPI) controllers and a novel efficient integration exploration multi-agent twin delayed deep deterministic policy gradient (EIE-MATD3) algorithm. The FOPI controllers are designed for each area based on the performance-based frequency regulation market mechanism. The EIE-MATD3 algorithm is used to tune the coefficients of the FOPI controllers in real time using centralized training and decentralized execution. The algorithm incorporates imitation learning and efficient integration exploration to obtain a more robust coordinated control strategy. An experiment on the four-area China Southern Grid (CSG) real-time digital system shows that the proposed method can improve the control performance and reduce the regulation mileage payment of each area in the IES.
Hydraulic legged robots have potential for high-dynamic motion due to their large power-to-weight ratios. However, it is challenging to ensure both stability and continuity in the motion of such robots. In this study, we propose a jumping motion control framework based on deep reinforcement learning that enables hydraulic limb leg units to perform stable and continuous jumping motions. First, to accurately represent the performance of a physical prototype, a quasi-realistic model incorporating physical feasibility constraints is constructed. This model is informed by analysis of the relevant fluid dynamics, and incorporates a trajectory generator and a motion tracking controller. To achieve stable and continuous jumping performance, a deep reinforcement learning algorithm is developed, which jointly optimizes the trajectory generator and the motion tracking controller. Through validation on the physical prototype, we demonstrate that the proposed method reduces the maximum deviation and the average deviation by over 47% and 60%, respectively, and improves landing compliance by up to 7.7% compared to a baseline optimization algorithm, the non-dominated sorting genetic algorithm (NSGA-II). The proposed control framework may serve as a reference for high-dynamic motion control of legged robots and multi-objective optimization across several decision variables.
A mixed adaptive dynamic programming (ADP) scheme based on zero-sum game theory is developed to address optimal control problems of autonomous underwater vehicle (AUV) systems subject to disturbances and safety constraints. By combining prior dynamic knowledge and actual sampled data, the proposed approach effectively mitigates the defects caused by an inaccurate dynamic model and significantly improves the training speed of the ADP algorithm. Initially, the dataset is enriched with sufficient reference data collected from a nominal model without considering modelling bias. In addition, the controlled object interacts with the real environment and continuously adds sampled data to the dataset. To comprehensively leverage the advantages of model-based and model-free methods during training, an adaptive tuning factor is introduced based on the dataset, which contains model-referenced information and conforms to the distribution of the real-world environment. This factor balances the influence of the model-based control law and the data-driven policy gradient on the direction of policy improvement. As a result, the proposed approach learns faster than data-driven methods while also improving tracking performance in comparison to model-based control methods. Moreover, the optimal control problem under disturbances is formulated as a zero-sum game, and the actor-critic-disturbance framework is introduced to approximate the optimal control input, cost function, and disturbance policy, respectively. Furthermore, the convergence property of the proposed algorithm based on the value iteration method is analysed. Finally, an example of AUV path following based on improved line-of-sight guidance is presented to demonstrate the effectiveness of the proposed method.
Funding (ILC review for precision manufacturing and ITS): funded by the Wuxi Young Scientific and Technological Talent Support Initiative, project number TJXD-2024-203, and the Natural Science Foundation of the Jiangsu Higher Education Institutions of China, grant number 24KJB470027.
Funding: Co-supported in part by the National Natural Science Foundation of China (No. 62403131), in part by the Jiangsu Funding Program for Excellent Postdoctoral Talent, China (No. 2024ZB267), and in part by the Shenzhen Science and Technology Program, China (No. JCYJ20230807145500002).
Abstract: This paper presents a hierarchical formation control strategy to address the challenges of formation control for multiple Unmanned Aerial Vehicles (UAVs) within a cooperative consensus framework. The proposed strategy incorporates a reference command generation layer, which derives UAV attitude commands based on formation requirements, and a tracking control layer to ensure accurate execution. Collaborative variables, including trajectory position and flight speed, are defined using a three-dimensional track particle and autopilot model, enabling the development of a consensus-based formation control law. Desired attitude angles are computed through altitude-hold and coordinated-turn strategies. A sliding surface is designed based on reference models derived from flight quality metrics, while an adaptive controller compensates for aerodynamic model uncertainties. To enhance learning capabilities, a prediction error mechanism based on a series-parallel estimation model is introduced, enabling collaborative learning and the sharing of network weight estimation parameters within the multi-agent system. This facilitates the design of a distributed composite learning law. Lyapunov stability analysis confirms the local exponential stability of the tracking error. Simulations of a twelve-UAV formation, along with a comparative analysis of two algorithms, demonstrate the system's capability for formation maintenance and high-precision tracking control.
Funding: Supported by the National Natural Science Foundation of China (No. 62350048).
Abstract: To address the challenge of achieving decentralized, scalable, and adaptive control for large-scale multiple unmanned aerial vehicle (multi-UAV) swarms in dynamic urban environments with obstacles and wind perturbations, we propose a hybrid framework integrating adaptive reinforcement learning (RL), multi-modal perception fusion, and enhanced pigeon flock optimization (PFO) with curiosity-driven exploration to enable robust autonomous and formation control. The framework leverages meta-learning to optimize RL policies for real-time adaptation, fuses sensor data for precise state estimation, and enhances PFO with learned leader-follower dynamics and exploration rewards to maintain cohesive formations and explore uncertain areas. For swarms of 10–30 UAVs, it achieves 34% faster convergence, 61% lower stability root mean square error (RMSE), 88% fewer collisions, and 85.6%–92.3% success rates in target detection and encirclement, outperforming standard multi-agent RL, pure PFO, and single-modality RL. Three-dimensional trajectory visualizations confirm cohesive formations, collision-free maneuvers, and efficient exploration in urban search-and-rescue scenarios. Innovations include meta-RL for rapid adaptation, multi-modal fusion for robust perception, and curiosity-driven PFO for scalable, decentralized control, advancing real-world multi-UAV swarm autonomy and coordination.
Funding: Supported by the STI 2030 Major Projects (No. 2022ZD0208804) and the National Natural Science Foundation of China (No. 62473017).
Abstract: Robust cooperative unmanned aerial vehicle (UAV) formation in complex 3D environments is hampered by reward sparsity and inefficient collaboration. To address this, we propose context-aware relational agent learning (CORAL), a novel multi-agent deep reinforcement learning framework. CORAL synergistically integrates two modules: (1) a novelty-based intrinsic reward module to drive efficient exploration and (2) an explicit relational learning module that allows agents to predict peer intentions and enhance coordination. Built on a multi-agent Actor-Critic architecture, CORAL enables agents to balance self-interest with group objectives. Comprehensive evaluations in a high-fidelity simulation show that our method significantly outperforms state-of-the-art baselines such as multi-agent deep deterministic policy gradient (MADDPG) and monotonic value function factorisation for deep multi-agent reinforcement learning (QMIX) in path planning efficiency, collision avoidance, and scalability.
Funding: Supported by the National Natural Science Foundation of China (12072090).
Abstract: This work proposes an iterative learning model predictive control (ILMPC) approach based on an adaptive fault observer (FOBILMPC) for fault-tolerant control and trajectory tracking in air-breathing hypersonic vehicles. To improve the control input, this online control law uses model predictive control (MPC) built on the concept of iterative learning control (ILC). By using offline data to reduce the errors of the linearized model, the strategy can effectively increase the robustness of the control system and guarantee that disturbances are suppressed. An adaptive fault observer is designed on the basis of the proposed ILMPC approach to enhance overall fault tolerance by estimating and compensating for actuator disturbances and the degree of fault. During the derivation, a linearized model of the longitudinal dynamics is established. Numerical simulations demonstrate that the proposed ILMPC approach reduces tracking error and speeds up convergence compared with the offline controller, which makes it a promising candidate for the design of hypersonic vehicle control systems.
Funding: Supported by the National Natural Science Foundation of China (12426609, 62203220, 62373229); the Taishan Scholar Project Foundation of Shandong Province (tsqnz20230619, tsqn202408110); the Fundamental Research Foundation of the Central Universities (23Cx06024A); the Natural Science Foundation of Shandong Province (ZR2024QF096); and the Outstanding Youth Innovation Team in Shandong Higher Education Institutions (2023KJ061).
Abstract: This paper investigates a multiplayer Pareto game for affine nonlinear stochastic systems disturbed by both external noise and internal multiplicative noise. The Pareto cooperative optimal strategies with the H_∞ constraint are resolved by integrating H_2/H_∞ theory with Pareto game theory. First, a nonlinear stochastic bounded real lemma (SBRL) is derived, explicitly accounting for non-zero initial conditions. Through the analysis of four cross-coupled Hamilton-Jacobi equations (HJEs), we establish necessary and sufficient conditions for the existence of Pareto optimal strategies with the H_∞ constraint. Second, to address the complexity of solving these nonlinear partial differential HJEs, we propose a neural network (NN) framework with synchronous tuning rules for the actor, critic, and disturbance components, based on a reinforcement learning (RL) approach. The designed tuning rules ensure convergence of the actor-critic-disturbance components to the desired values, enabling the realization of robust Pareto control strategies. The convergence of the proposed algorithm is rigorously analyzed using a constructed Lyapunov function for the NN weight errors. Finally, a numerical simulation example is provided to demonstrate the effectiveness of the proposed methods and main results.
Funding: Supported by the project "Loess Plateau Region-Watershed-Slope Geological Hazard Multi-Scale Collaborative Intelligent Early Warning System" of the National Key R&D Program of China (2022YFC3003404); a project of the Shaanxi Youth Science and Technology Star (2021KJXX-87); and public welfare geological survey projects of the Shaanxi Institute of Geologic Survey (20180301, 201918, 202103, and 202413).
Abstract: This study developed a modeling methodology for statistical optimization-based geologic hazard susceptibility assessment, aiming to enhance the comprehensive performance and classification accuracy of the assessment models. First, the cumulative probability method revealed that a low probability (15%) of geologic hazards between any two geologic hazard points occurred outside a buffer zone with a radius of 2297 m (i.e., the distance threshold). The training dataset was then established, consisting of negative samples (non-hazard points) randomly generated based on the distance threshold, positive samples (i.e., historical hazards), and 13 conditioning factors. Next, models were built using five machine learning algorithms, namely random forest (RF), gradient boosting decision tree (GBDT), naive Bayes (NB), logistic regression (LR), and support vector machine (SVM). The comprehensive performance of the models was assessed using the area under the receiver operating characteristic curve (AUC) and overall accuracy (OA) as indicators, revealing that RF exhibited the best performance, with OA and AUC values of 2.7127 and 0.981, respectively. Furthermore, the machine learning models constructed by considering the distance threshold outperformed those built using the unoptimized dataset. The characteristic factors were ranked using the mutual information method, with their scores decreasing in the order of rainfall (0.1616), altitude (0.06), normalized difference vegetation index (NDVI; 0.04), and distance from roads (0.03). Finally, the geologic hazard susceptibility classification was assessed using the natural breaks method combined with a clustering algorithm. The results indicate that the clustering algorithm exhibited higher classification accuracy than the natural breaks method. The findings of this study demonstrate that the proposed model optimization scheme can provide a scientific basis for the prevention and control of geologic hazards.
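One plausible reading of generating negative samples "based on the distance threshold" is rejection sampling: draw random points in the study area and keep only those at least 2297 m from every historical hazard. The sketch below follows that reading; the planar coordinates in metres, the bounding box, and the function name `sample_negatives` are illustrative assumptions, not details from the study.

```python
import math
import random


def sample_negatives(hazards, bounds, n, threshold=2297.0, seed=0, max_tries=100_000):
    """Draw up to n random non-hazard points lying at least `threshold` metres
    away from every hazard point (assumed planar x/y coordinates in metres)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    xmin, ymin, xmax, ymax = bounds
    negatives = []
    for _ in range(max_tries):
        if len(negatives) == n:
            break
        p = (rng.uniform(xmin, xmax), rng.uniform(ymin, ymax))
        # Reject candidates that fall inside the buffer of any hazard point.
        if all(math.dist(p, h) >= threshold for h in hazards):
            negatives.append(p)
    return negatives


if __name__ == "__main__":
    hazards = [(0.0, 0.0), (5000.0, 5000.0)]  # hypothetical hazard locations
    points = sample_negatives(hazards, (0.0, 0.0, 10000.0, 10000.0), n=20)
    print(len(points), "negative samples drawn")
```

The `max_tries` cap keeps the loop finite when the buffers cover most of the study area, in which case fewer than `n` points are returned.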
Funding: Supported by the National Key Research and Development Program of China (Grant No. 2023YFB4604100); the National Key Research and Development Program of China (Grant No. 2022YFB3806104); the Key Research and Development Program in Shaanxi Province (Grant No. 2021LLRH-08-17); the Young Elite Scientists Sponsorship Program by CAST (No. 2023QNRC001); the K C Wong Education Foundation of China; the Youth Innovation Team of Shaanxi Universities of China; and the Key Research and Development Program of Shaanxi Province (Grant 2021LLRH-08-3.1).
Abstract: Ensuring the consistent mechanical performance of three-dimensional (3D)-printed continuous fiber-reinforced composites is a significant challenge in additive manufacturing. The current reliance on manual monitoring exacerbates this challenge by rendering the process vulnerable to environmental changes and unexpected factors, resulting in defects and inconsistent product quality, particularly in unmanned long-term operations or printing in extreme environments. To address these issues, we developed a process monitoring and closed-loop feedback control strategy for the 3D printing process. Real-time printing image data were captured and analyzed using a well-trained neural network model, and a real-time control module was developed to enable closed-loop feedback control of the flow rate. The neural network model, which was based on image processing and artificial intelligence, recognized flow rate values with an accuracy of 94.70%. The experimental results showed significant improvements in both the surface performance and mechanical properties of the printed composites, with three- to six-fold improvements in tensile strength and elastic modulus, demonstrating the effectiveness of the strategy. This study provides a generalized process monitoring and feedback control method for the 3D printing of continuous fiber-reinforced composites and offers a potential solution for remote online monitoring and closed-loop adjustment in unmanned or extreme space environments.
Funding: Supported by the Guangdong Major Project of Basic and Applied Basic Research (Grant No. 2021B0301030001); the National Key Research and Development Program of China (Grant No. 2021YFB3802300); and the Foundation of National Key Laboratory of Shock Wave and Detonation Physics (Grant No. JCKYS2022212004).
Abstract: The graded density impactor (GDI) dynamic loading technique is crucial for acquiring the dynamic physical property parameters of materials used in weapons. The accuracy and timeliness of GDI structural design are key to achieving controllable stress-strain rate loading. In this study, we have, for the first time, combined one-dimensional fluid computational software with machine learning methods. We first elucidated the mechanisms by which GDI structures control stress and strain rates. Subsequently, we constructed a machine learning model to create a structure-property response surface. The results show that altering the loading velocity and interlayer thickness has a pronounced regulatory effect on stress and strain rates. In contrast, the impedance distribution index and target thickness have less significant effects on stress regulation, although there is a matching relationship between target thickness and interlayer thickness. Compared with traditional design methods, the machine learning approach offers a 10^4 to 10^5 times increase in efficiency and the potential to achieve a global optimum, holding promise for guiding the design of GDI.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 62233008 and 51705247) and the State Key Laboratory of Mechanics and Control for Aerospace Structures of Nanjing University of Aeronautics and Astronautics.
Abstract: Autonomous legged robots, capable of navigating uneven terrain, can perform a diverse array of tasks. However, designing locomotion controllers remains challenging. In particular, designing a controller based on durable and reliable proprioceptive sensors is essential for achieving adaptability. Presently, the controller must either be manually designed for specific robots and tasks, or developed using machine-learning techniques, which require extensive training time and result in complex controllers. Inspired by animal locomotion, we propose a simple yet comprehensive closed-loop modular framework that utilizes minimal proprioceptive feedback (i.e., the Coxa-Femur (CF) joint angle), enabling a quadruped robot to efficiently navigate unpredictable and uneven terrains, including steps and slopes. The framework comprises a basic neural control network capable of rapidly learning optimized motor patterns, and a straightforward module for sensory feedback sharing and integration. In a series of experiments, we show that integrating sensory feedback into the base neural control network aids the robot in continually learning robust motor patterns on flat, step, and slope terrain, compared with the open-loop base framework. Sharing sensory feedback information across the four legs enables a quadruped robot to proactively navigate unpredictable steps with minimal interaction. Furthermore, the controller remains functional even in the absence of sensor signals. This control configuration was successfully transferred to a physical robot without any modifications.
Funding: Supported by the National Natural Science Foundation of China (52375530, 52075132); the Natural Science Foundation of Heilongjiang Province (YQ2022E025); the State Key Laboratory of Precision Electronic Manufacturing Technology and Equipment (Guangdong University of Technology) (JMDZ202312); the Fundamental Research Funds for the Central Universities (HIT.OCEF.2024034); the China Postdoctoral Science Foundation (2019M651278, 2020T130155); the Heilongjiang Province Postdoctoral Science Foundation (LBH-Z19066); and the Space Drive and Manipulation Mechanism Laboratory of BICE and National Key Laboratory of Space Intelligent Control (No. BICE-SDMM-2024-01).
Abstract: The increasingly stringent performance requirements in integrated circuit manufacturing, characterized by smaller feature sizes and higher productivity, require the wafer stage to execute extreme motion with nanometer-level accuracy. Given the repetitive nature of wafer scanning, this demanding requirement has led to the widespread application of iterative learning control (ILC). ILC enables substantial performance improvement by using past measurement data in combination with knowledge of the system model. However, challenges arise when the data are contaminated by stochastic noise or when the system model exhibits significant uncertainties, which constrains the achievable performance. In response to this issue, an extended state observer (ESO)-based adaptive ILC approach is proposed in the frequency domain. Despite being model-based, it utilizes only a rough system model and then compensates for the resulting model uncertainties using an ESO, thereby achieving high robustness against uncertainties with minimal modeling effort. Additionally, an adaptive learning law is developed to mitigate the limited performance in the presence of stochastic noise, yielding high convergence accuracy without compromising convergence speed. Simulation and experimental comparisons with existing model-based and data-driven inversion-based ILC validate the effectiveness and superiority of the proposed method.
Funding: Supported by the National Natural Science Foundation of China under Grants T2521006, 62403483, 62533021, and U24A20279.
Abstract: Reinforcement learning (RL) has been widely studied as an efficient class of machine learning methods for adaptive optimal control under uncertainties. In recent years, the applications of RL in optimised decision-making and motion control of intelligent vehicles have received increasing attention. Due to the complex and dynamic operating environments of intelligent vehicles, it is necessary to improve the learning efficiency and generalisation ability of RL-based decision and control algorithms under different conditions. This survey systematically examines the theoretical foundations, algorithmic advancements, and practical challenges of applying RL to intelligent vehicle systems operating in complex and dynamic environments. The major algorithm frameworks of RL are first introduced, and the recent advances in RL-based decision-making and control of intelligent vehicles are overviewed. In addition to self-learning decision and control approaches using state measurements, the developments of deep RL (DRL) methods for end-to-end driving control of intelligent vehicles are summarised. The open problems and directions for further research are also discussed.
Funding: Funded by the Science and Technology Project of State Grid Xinjiang Electric Power Co., Ltd. (grant number SGXJ0000TKJS2400168).
Abstract: This study presents a transfer-learning-based emergency control method for sub-synchronous oscillations in wind power grid-connected systems, addressing the insufficient generalization ability of traditional methods in complex real-world scenarios. By combining deep reinforcement learning with a transfer learning framework, cross-scenario knowledge transfer is achieved, significantly enhancing the adaptability of the control strategy. First, a sub-synchronous oscillation emergency control model for the wind power grid integration system is constructed under fixed scenarios based on deep reinforcement learning. A reward evaluation system based on the active power oscillation pattern of the system is proposed, introducing penalty functions for the number of machine-shedding rounds and the number of machines shed. This avoids the economic losses and grid security risks caused by the excessive one-time shedding of wind turbines. Furthermore, transfer learning is introduced into model training to enhance the model's generalization capability in dealing with complex scenarios of actual wind power grid integration systems. By introducing the Maximum Mean Discrepancy (MMD) algorithm to calculate the distribution differences between source data and target data, the online decision-making reliability of the emergency control model is improved. Finally, the effectiveness of the proposed transfer-learning-based emergency control method for multi-scenario sub-synchronous oscillation in wind power grid integration systems is analyzed using the New England 39-bus system.
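To make the MMD step concrete, the following is a minimal sketch of the standard biased estimate of squared MMD between a source sample and a target sample. The Gaussian kernel, its bandwidth, and the toy feature vectors are assumptions for illustration; the abstract does not state which kernel or features the authors use.

```python
import math


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two equal-length feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mean_kernel(xs, ys, bandwidth):
    """Average kernel value over all pairs (x, y) with x in xs and y in ys."""
    total = sum(gaussian_kernel(x, y, bandwidth) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(source, target, bandwidth=1.0):
    """Biased estimate of squared MMD: E[k(s,s')] + E[k(t,t')] - 2 E[k(s,t)]."""
    return (mean_kernel(source, source, bandwidth)
            + mean_kernel(target, target, bandwidth)
            - 2.0 * mean_kernel(source, target, bandwidth))


if __name__ == "__main__":
    source = [(0.0, 0.0), (1.0, 0.5), (0.5, 1.0)]      # toy source-scenario features
    target = [(x + 3.0, y + 3.0) for x, y in source]   # shifted target scenario
    print("same distribution:", mmd_squared(source, source))
    print("shifted distribution:", mmd_squared(source, target))
```

A value near zero suggests the two samples are distributed similarly, while larger values indicate a shift between scenarios that a transfer-learning scheme would need to account for.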
Abstract: The integration of artificial intelligence into the development and production of mechatronic products offers a substantial opportunity to enhance efficiency, adaptability, and system performance. This paper examines the use of reinforcement learning as a control strategy, with a particular focus on its deployment in pivotal stages of the product development lifecycle, specifically between system architecture and system integration and verification. A controller based on reinforcement learning was developed and compared with traditional proportional-integral controllers in dynamic and fault-prone environments. The results illustrate the superior adaptability, stability, and optimization potential of the reinforcement learning approach, particularly in addressing dynamic disturbances and ensuring robust performance. The study shows how reinforcement learning can facilitate the transition from conceptual design to implementation by automating optimization processes, enabling interface automation, and enhancing system-level testing. Based on these findings, the paper presents future research directions, which include integrating domain-specific knowledge into the reinforcement learning process and validating this process in real-world environments. The results underscore the potential of artificial intelligence-driven methodologies to revolutionize the design and deployment of intelligent mechatronic systems.
Funding: Co-supported by the National Natural Science Foundation of China (Nos. 62303380, 62176214, 62101590, 62003268).
Abstract: This paper introduces an optimized backstepping control method for Flexible Airbreathing Hypersonic Vehicles (FAHVs). The approach incorporates nonlinear disturbance observation and reinforcement learning to address complex control challenges. The Minimal Learning Parameter (MLP) technique is applied to manage unknown nonlinear dynamics, significantly reducing the computational load usually associated with Neural Network (NN) weight updates. To improve the robustness of the control system, an MLP-based nonlinear disturbance observer is designed, which estimates lumped disturbances, including flexibility effects, model uncertainties, and external disruptions within the FAHVs. In parallel, the control strategy integrates reinforcement learning using an MLP-based actor-critic framework within the backstepping design to achieve both optimality and robustness. The actor performs control actions, while the critic assesses the optimal performance index function. To minimize this index function, an adaptive gradient descent method constructs both the actor and critic. Lyapunov analysis is employed to demonstrate that all signals in the closed-loop system are semiglobally uniformly ultimately bounded. Simulation results confirm that the proposed control strategy delivers high control performance, marked by improved accuracy and reduced energy consumption.
Funding: Supported by the National Key R&D Program of China (No. 2024YFB4609700); the Major Research Plan of the National Natural Science Foundation of China (No. 92266102); the National Natural Science Foundation of China (No. 52271135, No. 52433016); the Open Project of Key Laboratory of Green Fabrication and Surface Technology of Advanced Metal Materials, China (No. GFST2024KF05); the Innovative Research Group Project of Hubei Provincial Natural Science Foundation, China (No. 2025AFA014); the ECU DVC Strategic Research Support Fund, Australia (No. 23965); and the Natural Science Foundation of Hubei Province, China (No. 2025AFD399).
Abstract: Additive manufacturing (AM) promotes the production of metallic parts with significant design flexibility, yet its use in critical applications is hindered by challenges in ensuring consistent quality and performance. Process variability often leads to defects, insufficient geometric accuracy, and inadequate material properties, which are difficult to manage effectively because traditional quality control methods are limited in modeling high-dimensional nonlinear relationships and enabling adaptive control. Machine learning (ML) offers a transformative approach to modeling intricate process-structure-property relationships by leveraging the rich data environment of AM. The study presents a comprehensive examination of ML-driven quality assurance implementations in metallic AM. First, it examines the use of ML in predicting and understanding the fundamental multi-physics fields that influence the quality of a fabricated component, including temperature fields, fluid dynamics, and stress/strain evolution. Subsequently, the application of ML in optimizing key quality attributes, including defect detection and mitigation (porosity, cracks, etc.), geometric fidelity enhancement (dimensional accuracy, surface roughness, etc.), and material property tailoring (mechanical strength, fatigue life, corrosion resistance, etc.), is discussed in detail. Finally, the development of ML-driven real-time closed-loop control systems for intelligent quality assurance and strategies for addressing data scarcity and cross-scenario transferability in metal AM are discussed. This article provides a novel perspective on the potential of ML technology for metal AM quality control applications, highlights the challenges faced during research, and outlines future development directions.
Funding: Supported by the National Natural Science Foundation of China (52307118).
Abstract: To enhance the frequency stability and lower the regulation mileage payment of a multiarea integrated energy system (IES) that supports the power Internet of Things (IoT), this paper proposes a data-driven cooperative method for automatic generation control (AGC). The method consists of adaptive fractional-order proportional-integral (FOPI) controllers and a novel efficient integration exploration multiagent twin delayed deep deterministic policy gradient (EIE-MATD3) algorithm. The FOPI controllers are designed for each area based on the performance-based frequency regulation market mechanism. The EIE-MATD3 algorithm is used to tune the coefficients of the FOPI controllers in real time using centralized training and decentralized execution. The algorithm incorporates imitation learning and efficient integration exploration to obtain a more robust coordinated control strategy. An experiment on the four-area China Southern Grid (CSG) real-time digital system shows that the proposed method can improve the control performance and reduce the regulation mileage payment of each area in the IES.
Funding: Supported by the National Natural Science Foundation of China (No. U24B2049) and the Scientific Research Fund of Zhejiang Provincial Education Department (No. Y202457047), China.
Abstract: Hydraulic legged robots have potential for high-dynamic motion due to their large power-to-weight ratios. However, it is challenging to ensure both stability and continuity in the motion of such robots. In this study, we propose a jumping motion control framework based on deep reinforcement learning that enables hydraulic limb leg units to perform stable and continuous jumping motions. First, to accurately represent the performance of a physical prototype, a quasi-realistic model incorporating physical feasibility constraints is constructed. This model is informed by analysis of the relevant fluid dynamics, and incorporates a trajectory generator and a motion tracking controller. To achieve stable and continuous jumping performance, a deep reinforcement learning algorithm is developed, which jointly optimizes the trajectory generator and the motion tracking controller. Through validation on the physical prototype, we demonstrate that the proposed method reduces the maximum deviation and the average deviation by over 47% and 60%, respectively, and improves landing compliance by up to 7.7% compared to a baseline optimization algorithm, the non-dominated sorting genetic algorithm (NSGA-II). The proposed control framework may serve as a reference for high-dynamic motion control of legged robots and multi-objective optimization across several decision variables.
Funding: National Key Research and Development Program of China, Grant/Award Number: 2021YFC2801700; Defense Industrial Technology Development Program, Grant/Award Numbers: JCKY2021110B024, JCKY2022110C072; Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project, Grant/Award Number: 2022ZD0116305; Natural Science Foundation of Hefei, China, Grant/Award Number: 202321; National Natural Science Foundation of China, Grant/Award Numbers: U2013601, U20A20225; Yangtze River Delta S&T Innovation Community Joint Research Project, Grant/Award Number: 2022CSJGG0900; Anhui Province Natural Science Funds for Distinguished Young Scholar, Grant/Award Number: 2308085J02; State Key Laboratory of Intelligent Green Vehicle and Mobility, Grant/Award Number: KFY2417; State Key Laboratory of Advanced Design and Manufacturing Technology for Vehicle, Grant/Award Number: 32215010.
Abstract: A mixed adaptive dynamic programming (ADP) scheme based on zero-sum game theory is developed to address optimal control problems of autonomous underwater vehicle (AUV) systems subject to disturbances and safety constraints. By combining prior dynamic knowledge and actual sampled data, the proposed approach effectively mitigates the deficiencies caused by an inaccurate dynamic model and significantly improves the training speed of the ADP algorithm. Initially, the dataset is enriched with sufficient reference data collected from a nominal model without considering modelling bias. In addition, the controlled object interacts with the real environment and continuously adds sampled data to the dataset. To comprehensively leverage the advantages of model-based and model-free methods during training, an adaptive tuning factor is introduced based on the dataset, which possesses model-referenced information and conforms to the distribution of the real-world environment. This factor balances the influence of the model-based control law and the data-driven policy gradient on the direction of policy improvement. As a result, the proposed approach learns faster than data-driven methods while also tracking better than model-based control methods. Moreover, the optimal control problem under disturbances is formulated as a zero-sum game, and the actor-critic-disturbance framework is introduced to approximate the optimal control input, cost function, and disturbance policy, respectively. Furthermore, the convergence property of the proposed algorithm based on the value iteration method is analysed. Finally, an example of AUV path following based on improved line-of-sight guidance is presented to demonstrate the effectiveness of the proposed method.