Formation Control of Unmanned Vehicles Based on the Stackelberg Game Model and Hierarchical Reinforcement Learning Method
(Original title: 基于Stackelberg模型和分层强化学习的无人机构形控制)
Abstract: This paper addresses the formation (configuration) control problem of multi-UAV systems and proposes a control method based on the Stackelberg game model and hierarchical reinforcement learning. First, the system dynamics are modeled within a leader-follower framework, and the Stackelberg model is used to construct a game system with N+1 participants, which turns the task into an optimal control and differential game problem. Optimal control and dynamic programming are then used to derive each participant's optimal strategy together with the corresponding coupled Hamilton-Jacobi-Bellman (HJB) equations. Building on this, a two-stage hierarchical value iteration reinforcement learning algorithm is proposed to compute the optimal solution iteratively. In the value function approximation, neural network activation functions are introduced for fitting, which improves the computational efficiency and accuracy of the algorithm. Compared with traditional methods, the proposed algorithm can effectively solve for the Stackelberg-Nash equilibrium between the leader and the multiple followers. Finally, numerical simulation experiments verify the feasibility and effectiveness of the method in practical tasks.
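To make the notion of a Stackelberg-Nash equilibrium between one leader and several followers concrete, the following is a minimal sketch on a static toy game with quadratic costs. It is not the paper's differential game, its HJB equations, or its value iteration algorithm. The cost functions and every name in it (`c`, `r`, `g`, `k`, `follower_nash`, `leader_cost`, `stackelberg_nash`) are assumptions chosen only for illustration.

```python
def follower_nash(u0, c, g, k):
    """Nash equilibrium of the followers for a fixed leader action u0.

    Assumed toy cost of follower i:
        J_i = 0.5*u_i**2 + g*u_i*sum_{j != i} u_j - u_i*(c[i] + k*u0)
    First-order condition: u_i + g*sum_{j != i} u_j = c[i] + k*u0.
    Requires g != 1 and 1 - g + n*g != 0 (e.g. 0 <= g < 1).
    """
    n = len(c)
    b = [ci + k * u0 for ci in c]
    # Summing the n first-order conditions gives the total follower action.
    total = sum(b) / (1.0 - g + n * g)
    return [(bi - g * total) / (1.0 - g) for bi in b]


def leader_cost(u0, c, r, g, k):
    """Assumed leader cost: own effort plus follower deviation from targets r."""
    u = follower_nash(u0, c, g, k)
    return 0.5 * u0 ** 2 + 0.5 * sum((ui - ri) ** 2 for ui, ri in zip(u, r))


def stackelberg_nash(c, r, g, k):
    """Leader moves first and anticipates the followers' Nash response."""
    # The response is affine in u0, so two evaluations give offset and slope.
    offset = follower_nash(0.0, c, g, k)
    at_one = follower_nash(1.0, c, g, k)
    slope = [x1 - x0 for x1, x0 in zip(at_one, offset)]
    # Setting d(leader_cost)/d(u0) = 0 for the resulting quadratic in u0.
    num = sum(s * (ri - ai) for s, ri, ai in zip(slope, r, offset))
    den = 1.0 + sum(s * s for s in slope)
    u0 = num / den
    return u0, follower_nash(u0, c, g, k)


if __name__ == "__main__":
    u0, u = stackelberg_nash(c=[0.0, 1.0, -1.0], r=[1.0, 2.0, 0.5], g=0.2, k=1.0)
    print("leader action:", u0)
    print("follower actions:", u)
```

The two-level structure (followers solved for any given leader action, then the leader optimized over that response) mirrors the hierarchy named in the abstract only in spirit. In the dynamic setting described there, both levels would be value functions obtained iteratively rather than closed-form expressions.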
Authors: ZHU Junyan (朱俊彦); YANG Ya (杨亚); WANG Wei (王伟); ZHU Hongxu (朱鸿绪); SUN Ran (孙然)
Affiliations: School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093; Shanghai Electromechanical Engineering Institute, Shanghai 201109; School of Aeronautics and Astronautics, Shanghai Jiao Tong University, Shanghai 200240
Source: Flight Control & Detection (《飞控与探测》), 2025, No. 6, pp. 95-106 (12 pages)
Funding: National Natural Science Foundation of China, General Program (52272408).
Keywords: UAV formation; formation control; Stackelberg game; reinforcement learning; adaptive dynamic programming