Funding: Supported by the National Key R&D Program of China: Gravitational Wave Detection Project (Nos. 2021YFC2026, 2021YFC2202601, 2021YFC2202603) and the National Natural Science Foundation of China (Nos. 12172288 and 12472046).
Abstract: This paper investigates how a servicing spacecraft (inspector) should maneuver to inspect a non-cooperative spacecraft (evader) in cislunar space. The evader, which may be a malfunctioning or uncontrolled satellite, introduces uncertainties due to its potential maneuvering capabilities. To address this challenge, the scenario is modeled as a special orbital game, incorporating the unique complexities of the cislunar environment. A variable-duration, turn-based inspection and anti-inspection game model is designed. The model defines both players' rules, constraints, and victory conditions, providing a framework for non-cooperative inspection. Strategies for both players are developed and validated based on their dynamical properties. The inspector's strategy integrates two-body Lambert transfers with shooting methods, while the evader's strategy aims to maximize the inspector's fuel consumption. Simulation results show that the evader's optimal strategy involves deliberate fluctuations in its lunar periapsis altitude, with the inspector's required ΔV up to eight times greater than the evader's. The impact of game constraints is evaluated, and the effectiveness of deploying the inspector in low lunar orbit is compared with that of deploying it at the Earth-Moon Lagrange point L1. The strengths and weaknesses of both are shown. These findings provide valuable insights for future orbital servicing and orbital games.
Funding: Co-supported by the Foundation of Shanghai Astronautics Science and Technology Innovation, China (No. SAST2022-114); the National Natural Science Foundation of China (Nos. 62303378, 124B2031, 12202281); and the Foundation of China National Key Laboratory of Science and Technology on Test Physics & Numerical Mathematics, China (No. 08-YY-2023-R11).
Abstract: The problem of collision avoidance for non-cooperative targets has received significant attention from researchers in recent years. Non-cooperative targets exhibit uncertain states and unpredictable behaviors, making collision avoidance significantly more challenging than that for space debris. Much existing research focuses on the continuous-thrust model, whereas the impulsive maneuver model is more appropriate for long-duration and long-distance avoidance missions. Additionally, it is important to minimize the impact on the original mission while avoiding non-cooperative targets. Moreover, existing avoidance algorithms are computationally complex and time-consuming, especially given the limited computing capability of on-board computers, which poses challenges for practical engineering applications. To overcome these difficulties, this paper makes the following key contributions: (A) a turn-based (sequential decision-making), limited-area impulsive collision avoidance model that considers the time delay of precision orbit determination is established for the first time; (B) a novel Selection Probability Learning Adaptive Search-depth Search Tree (SPL-ASST) algorithm is proposed for non-cooperative target avoidance, which improves decision-making efficiency by introducing an adaptive-search-depth mechanism and a neural network into the traditional Monte Carlo Tree Search (MCTS). Numerical simulations confirm the effectiveness and efficiency of the proposed method.
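For readers unfamiliar with the baseline that this abstract extends, the following is a minimal sketch of the selection step in a traditional MCTS, using the standard UCB1 rule. It is not the SPL-ASST algorithm: the abstract does not describe how the neural network or the adaptive search depth enter the search, so none of that is modeled here. The function names, the `(total_value, visits)` tuple layout, and the exploration constant `c` are assumptions made for illustration only.

```python
import math


def ucb1_score(total_value, visits, parent_visits, c=math.sqrt(2)):
    """Standard UCB1 score of a child node (illustrative helper).

    total_value   -- sum of rollout returns backed up through the child
    visits        -- number of times the child has been visited
    parent_visits -- number of times the parent has been visited
    c             -- exploration constant (assumed value, not from the paper)
    """
    if visits == 0:
        # Unvisited children are always tried first.
        return math.inf
    exploitation = total_value / visits
    exploration = c * math.sqrt(math.log(parent_visits) / visits)
    return exploitation + exploration


def select_child(children, parent_visits):
    """Return the index of the child with the highest UCB1 score.

    children is a list of (total_value, visits) tuples.
    """
    return max(
        range(len(children)),
        key=lambda i: ucb1_score(children[i][0], children[i][1], parent_visits),
    )


# Hypothetical statistics for three candidate maneuvers.
visited_only = [(3.0, 5), (1.0, 1)]
with_unvisited = [(3.0, 5), (1.0, 1), (0.0, 0)]
choice_visited = select_child(visited_only, parent_visits=6)
choice_unvisited = select_child(with_unvisited, parent_visits=6)
```

In the first call the rarely visited child wins because its exploration bonus outweighs the higher mean value of the other child. In the second call the unvisited child is selected outright.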
Funding: Supported by the National Key R&D Program of China: Gravitational Wave Detection Project (Grant Nos. 2021YFC22026, 2021YFC2202601, 2021YFC2202603) and the National Natural Science Foundation of China (Grant Nos. 12172288 and 12472046).
Abstract: This paper investigates impulsive orbital attack-defense (AD) games under multiple constraints and victory conditions, involving three spacecraft: attacker, target, and defender. In the AD scenario, the attacker aims to breach the defender's interception to rendezvous with the target, while the defender seeks to protect the target by blocking or actively pursuing the attacker. Four different maneuvering constraints and five potential game outcomes are incorporated to model AD game problems more accurately. The added complexity reduces the effectiveness of traditional methods such as differential games and game-tree searches. To address these challenges, this study proposes a multi-agent deep reinforcement learning solution with variable reward functions. Two attack strategies, Direct attack (DA) and Bypass attack (BA), are developed for the attacker, each focusing on different mission priorities. Similarly, two defense strategies, Direct interdiction (DI) and Collinear interdiction (CI), are designed for the defender, each optimizing specific defensive actions through tailored reward functions. Each reward function incorporates both process rewards (e.g., distance and angle) and outcome rewards, derived from physical principles and validated via geometric analysis. Extensive simulations of four strategy confrontations demonstrate average defensive success rates of 75% for DI vs. DA, 40% for DI vs. BA, 80% for CI vs. DA, and 70% for CI vs. BA. Results indicate that CI outperforms DI for defenders, while BA outperforms DA for attackers. Moreover, defenders achieve their objectives more effectively under identical maneuvering capabilities. Trajectory evolution analyses further illustrate the effectiveness of the proposed variable reward function-driven strategies. These strategies and analyses offer valuable guidance for practical orbital defense scenarios and lay a foundation for future multi-agent game research.
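The strategy ranking stated in this abstract can be checked directly from the four reported success rates. The short sketch below averages the defensive success rate per defense strategy and per attack strategy. The unweighted mean is an assumption, since the abstract does not say how many trials each confrontation used, and the helper name `mean_by` is hypothetical.

```python
# Defensive success rates (percent) as reported in the abstract,
# keyed by (defense strategy, attack strategy).
rates = {
    ("DI", "DA"): 75,
    ("DI", "BA"): 40,
    ("CI", "DA"): 80,
    ("CI", "BA"): 70,
}


def mean_by(index, table):
    """Unweighted mean rate grouped by key[index] (0 = defense, 1 = attack).

    Assumes every confrontation carries equal weight, which the
    abstract does not state explicitly.
    """
    groups = {}
    for key, rate in table.items():
        groups.setdefault(key[index], []).append(rate)
    return {name: sum(values) / len(values) for name, values in groups.items()}


defense_mean = mean_by(0, rates)  # defender's success, per defense strategy
attack_mean = mean_by(1, rates)   # defender's success, per attack strategy

print(defense_mean)  # {'DI': 57.5, 'CI': 75.0}
print(attack_mean)   # {'DA': 77.5, 'BA': 55.0}
```

Under this assumption CI yields a higher mean defensive success rate than DI (75.0% vs. 57.5%), and defenders succeed less often against BA than against DA (55.0% vs. 77.5%), which is consistent with the ranking in the abstract.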
Funding: Supported by the National Key R&D Program of China: Gravitational Wave Detection Project (Nos. 2021YFC22026, 2021YFC2202601, 2021YFC2202603) and the National Natural Science Foundation of China (Nos. 12172288 and 12472046).
Abstract: This paper conducts a comprehensive study on the multi-constrained two-on-one impulsive orbital pursuit–evasion game (OPEG). First, considering constraints such as maneuverability, fuel reserves, and mission duration, a mathematical game model for the two-on-one impulsive OPEG is established. The model transforms the game, in which cooperation and competition coexist, into a multi-constrained three-party optimization problem suitable for solving with multi-agent deep reinforcement learning. Then, an intelligent solution method for cooperative game strategies based on the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) algorithm is proposed. In the reward function design, a reward function based on fixed-time triggering is introduced to address the information loss caused by long impulse intervals. To ensure good convergence of the algorithm and guide the spacecraft to learn effective cooperative strategies during training, an immediate reward function is designed, incorporating outcome rewards, guidance rewards, and cooperative rewards. Numerical simulations validate the feasibility and effectiveness of the proposed method. To further analyze the cooperative mechanisms learned by the spacecraft during training, a comparative experiment with the one-on-one impulsive OPEG is designed. The experimental results demonstrate that the two pursuers in the two-on-one impulsive OPEG not only develop various strategies such as “pre-emptive interception”, “pincer interception”, and “trailing pursuit” during training, but also improve mission success rates and reduce mission durations through coordinated efforts. Additionally, this paper reveals how the relative initial state distribution between the two pursuing spacecraft and the evading spacecraft affects the effectiveness of cooperation.
Funding: Supported by the National Natural Science Foundation of China under grant U21B6001.
Abstract: This paper studies the challenges posed by space debris during spacecraft operations, focusing specifically on the potential threats arising from the disintegration of a nearby satellite. Such a disintegration could produce debris fragments with diverse velocities, markedly complicating the avoidance maneuvers that the active spacecraft needs in order to sustain its original orbit. To address this issue, the principle of the maneuvering reachable domain is introduced to model the debris swarm, incorporating uncertainties in both velocity and trajectory following disintegration. Based on this model, a three-impulse maneuver strategy is proposed to guide the active spacecraft through the debris swarm and enable its safe return to the original orbit. To demonstrate the efficacy of this approach, two distinct types of orbital scenarios are simulated: one with a short rendezvous period and the other characterized by an extended hovering duration. The simulation results indicate that the debris swarm propagation model, combined with the maneuver strategy, enables the active spacecraft to maneuver through the debris swarm along an optimal evasion trajectory, thereby ensuring its safe return to the original orbit.
Abstract: Optimal, many-revolution spacecraft trajectories are challenging to solve. A connection is made for a class of models between optimal direct and indirect solutions. For transfers that minimize thrust-acceleration-squared, primer vector theory maps direct, many-impulsive-maneuver trajectories to the indirect, continuous-thrust-acceleration equivalent. The mapping algorithm is independent of how the direct solution is obtained and requires only a solver for a boundary value problem and its partial derivatives. A Lambert solver is used for the two-body problem in this work. The mapping is simple because the impulsive maneuvers and co-states share the same linear space around an optimal trajectory. For numerical results, the direct coast-impulse solutions are demonstrated to converge to the indirect continuous solutions as the number of impulses and segments increases. The two-body design space is explored with a set of three many-revolution, many-segment examples changing semimajor axis, eccentricity, and inclination. The first two examples involve a small change to either semimajor axis or eccentricity, and the third example is a transfer to geosynchronous orbit. Using a single processor, the optimization runtime is seconds to minutes for revolution counts of 10 to 100, and on the order of one hour for examples with up to 500 revolutions. Any of these thrust-acceleration-squared solutions is a good candidate to start a homotopy to a higher-fidelity minimization problem with practical constraints.