In repeated zero-sum games, instead of constantly playing an equilibrium strategy of the stage game, learning to exploit the opponent from historical interactions can typically obtain a higher utility. However, when playing against a fully adaptive opponent, one would have difficulty identifying the opponent's adaptive dynamics and further exploiting its potential weakness. In this paper, we study the problem of optimizing against an adaptive opponent who uses no-regret learning. No-regret learning is a classic and widely used branch of adaptive learning algorithms. We propose a general framework for online modeling of no-regret opponents and exploiting their weaknesses. With this framework, one can approximate the opponent's no-regret learning dynamics and then develop a response plan to obtain a significant profit based on inferences of the opponent's strategies. We employ two system identification architectures, the recurrent neural network (RNN) and the nonlinear autoregressive exogenous model, and adopt an efficient greedy response plan within the framework. Theoretically, we prove the capability of our RNN architecture to approximate specific no-regret dynamics. Empirically, we demonstrate that during interactions at a low level of non-stationarity, our architectures can approximate the dynamics with a low error, and the derived policies can exploit the no-regret opponent to obtain a decent utility.
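To make the idea of a greedy response to a no-regret opponent concrete, here is a minimal sketch. It is not the framework from the abstract: the opponent is assumed to run Hedge (multiplicative weights) on rock-paper-scissors, and the exploiter is assumed to know the opponent's current mixed strategy exactly, whereas the abstract describes estimating it with an RNN or a nonlinear autoregressive exogenous model. The names `PAYOFF`, `hedge_strategy`, `greedy_response`, and `simulate` are hypothetical.

```python
import math

# Payoff to the row player (the exploiter) in rock-paper-scissors.
# The column player (the Hedge opponent) receives the negative.
PAYOFF = [[0, -1, 1],
          [1, 0, -1],
          [-1, 1, 0]]

def hedge_strategy(cum_losses, eta):
    """Mixed strategy with weights proportional to exp(-eta * cumulative loss)."""
    low = min(cum_losses)  # shift losses for numerical stability
    weights = [math.exp(-eta * (c - low)) for c in cum_losses]
    total = sum(weights)
    return [w / total for w in weights]

def greedy_response(opp_strategy):
    """Row action with the highest expected payoff against a column strategy."""
    expected = [sum(PAYOFF[i][j] * opp_strategy[j] for j in range(3))
                for i in range(3)]
    best = max(range(3), key=lambda i: expected[i])
    return best, expected[best]

def simulate(rounds=200, eta=0.1):
    """Average expected payoff of the greedy exploiter (assumes a known opponent)."""
    cum_losses = [0.0, 0.0, 0.0]  # opponent's cumulative loss per column action
    total = 0.0
    for _ in range(rounds):
        q = hedge_strategy(cum_losses, eta)
        action, value = greedy_response(q)
        total += value
        # The opponent's loss for column j is the row player's payoff.
        for j in range(3):
            cum_losses[j] += PAYOFF[action][j]
    return total / rounds
```

Because the value of rock-paper-scissors is zero, a best response to any mixed strategy has non-negative expected payoff, so `simulate()` never returns a negative average in this idealized setting.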
In this paper, a zero-sum game Nash equilibrium computation problem with a common constraint set is investigated under two time-varying multi-agent subnetworks, where the two subnetworks have opposite payoff functions. A novel distributed projection subgradient algorithm with a random sleep scheme is developed to reduce the computation required of agents when computing the Nash equilibrium. In our algorithm, each agent uses an independent and identically distributed Bernoulli decision to determine whether to compute the subgradient and perform the projection operation or to keep the previous consensus estimate, which effectively reduces the amount of computation and the calculation time. Moreover, the traditional stepsize assumption adopted in existing methods is removed, and the stepsizes in our algorithm are randomized and diminishing. Besides, we prove that all agents converge to the Nash equilibrium with probability 1 under our algorithm. Finally, a simulation example verifies the validity of our algorithm.
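The random sleep idea can be illustrated with a single update of one agent. This is only a sketch under simplifying assumptions that are not taken from the paper: the state is a scalar, the common constraint set is an interval, the agent is on the minimizing side, and the consensus estimate is passed in already computed. The names `project_interval` and `random_sleep_step` are hypothetical.

```python
import random

def project_interval(x, lo, hi):
    """Euclidean projection of a scalar onto the interval [lo, hi]."""
    return max(lo, min(hi, x))

def random_sleep_step(consensus, subgrad, stepsize, wake_prob, lo, hi, rng=random):
    """One update of a minimizing agent under a Bernoulli wake/sleep decision.

    With probability wake_prob the agent is awake: it evaluates the
    subgradient at its consensus estimate and projects the result.
    Otherwise it sleeps and simply keeps the consensus estimate.
    """
    if rng.random() < wake_prob:
        return project_interval(consensus - stepsize * subgrad(consensus), lo, hi)
    return consensus
```

Setting `wake_prob` to 1.0 recovers an ordinary projected subgradient step, while smaller values skip the subgradient and projection work on a fraction of iterations.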
In this paper, we consider multiobjective two-person zero-sum games with vector payoffs and vector fuzzy payoffs. We translate such games into the corresponding multiobjective programming problems and introduce the pessimistic Pareto optimal solution concept by assuming that a player supposes the opponent adopts the strategy most disadvantageous to that player. It is shown that any pessimistic Pareto optimal solution can be obtained on the basis of linear programming techniques even if the membership functions for the objective functions are nonlinear. Moreover, we propose interactive algorithms based on the bisection method to obtain a pessimistic compromise solution from among the set of all pessimistic Pareto optimal solutions. To show the efficiency of the proposed method, we illustrate interactive processes of an application to a vegetable shipment problem.
Nowadays, China is the largest developing country in the world, and the US is the largest developed country in the world. Sino-US economic and trade relations are of great significance to the two nations and may have a prominent impact on the stability and development of the global economy.
There are a few studies that focus on solution methods for finding a Nash equilibrium of zero-sum games. We discuss the use of Karmarkar’s interior point method to solve the Nash equilibrium problems of a zero-sum game, and prove that it is theoretically a polynomial time algorithm. We implement the Karmarkar method, and a preliminary computational result shows that it performs well for zero-sum games. We also mention an affine scaling method that would help us compute Nash equilibria of general zero-sum games effectively.
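For context, an interior point method applies here because a finite two-person zero-sum game reduces to a linear program. The standard formulation is sketched below. The notation is an assumption, not taken from the paper: $a_{ij}$ is the payoff to the row player (the maximizer), $x$ is the row player's mixed strategy over $m$ actions, and $v$ is the guaranteed payoff. The paper may use a different but equivalent form suited to Karmarkar's method.

```latex
\begin{aligned}
\max_{x,\,v}\quad & v \\
\text{s.t.}\quad & \sum_{i=1}^{m} a_{ij}\, x_i \ge v, \qquad j = 1,\dots,n, \\
& \sum_{i=1}^{m} x_i = 1, \qquad x_i \ge 0, \quad i = 1,\dots,m.
\end{aligned}
```

At the optimum, $v$ is the value of the game and $x$ is an equilibrium strategy for the row player. The column player's equilibrium strategy is given by the dual program.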
To keep the secrecy performance from being badly influenced by untrusted relays (URs), a multi-UR network using an amplify-and-forward (AF) cooperative scheme is put forward, which takes relay weight and harmful factor into account. A nonzero-sum game is established to capture the interaction among URs and detection strategies. Secrecy capacity is investigated as the game payoff to indicate the untrusted behaviors of the relays. The maximum probabilities of the relay behaviors and the optimal system detection strategy can be obtained by using the proposed algorithm.
Non-orthogonal multiple access (NOMA), a potentially promising technology in the 5G/B5G era, suffers from ubiquitous security threats due to the broadcast nature of the wireless medium. In this paper, we focus on artificial-signal-assisted and relay-assisted secure downlink transmission schemes against external eavesdropping in the context of physical layer security. To characterize the non-cooperative confrontation around the secrecy rate between the legitimate communication party and the eavesdropper, their interactions are modeled as a two-person zero-sum game. The existence of the Nash equilibrium of the proposed game models is proved, and the pure-strategy and mixed-strategy Nash equilibrium profiles in the two schemes are solved and analyzed, respectively. Numerical simulations are conducted to validate the analytical results and show that the two schemes improve the secrecy rate and further enhance the physical layer security performance of NOMA systems.
This paper investigates the multi-player non-zero-sum game problem for unknown linear continuous-time systems with unmeasurable states. Using only input and output data, a data-driven learning control approach is proposed to estimate N-tuple dynamic output feedback control policies that form a Nash equilibrium solution to the multi-player non-zero-sum game problem. In particular, the explicit form of the dynamic output feedback Nash strategy is constructed by embedding the internal dynamics and solving coupled algebraic Riccati equations. Coupled policy-iteration-based iterative learning equations are established to estimate the N-tuple feedback control gains without prior knowledge of the system matrices. Finally, an example is used to illustrate the effectiveness of the proposed approach.
Dear Editor, This letter addresses the impulse game problem for a general class of deterministic, multi-player, nonzero-sum differential games wherein all participants adopt impulse controls. Our objective is to formulate this impulse game problem with a modified objective function that includes interaction costs among the players in a discontinuous fashion, and subsequently to derive a verification theorem for identifying the feedback Nash equilibrium strategy.
Building heating, ventilating, and air conditioning (HVAC) systems have one of the largest energy footprints worldwide, which necessitates the design of intelligent control algorithms that improve energy utilization while still providing thermal comfort. In this work, the authors formulate the HVAC equipment dynamics in the setting of a two-player non-zero-sum cooperative game, which enables two decision variables (mass flow rate and supply air temperature) to jointly optimize control utilization and thermal setpoint tracking by simultaneously exchanging their policies. The HVAC zone serves as the game environment for these two decision variables, which act as the two players in the game. It is assumed that dynamic models of the HVAC equipment are not available. Furthermore, neither the state nor any estimates of the HVAC disturbances (heat gains, outside variations, etc.) are accessible; only the measurement of the zone temperature is available for feedback. Under these constraints, the authors develop a new data-driven Q-learning scheme employing policy iteration and value iteration with a bias compensation mechanism that accounts for unmeasurable disturbances and circumvents the need for full-state measurement. The proposed algorithms are shown to converge to the optimal solution corresponding to the generalized algebraic Riccati equations (GAREs) in dynamic games.
In a first for the African continent, Senegal will host the Dakar 2026 Youth Olympic Games (YOG) from 31 October to 13 November. The Dakar 2026 YOG carry a strong symbolic ambition, embodied by their motto “Africa welcomes, Dakar celebrates.” Host Senegal sees the event as a catalyst for its influence, the modernisation of its infrastructure, and the mobilisation of its youth.
The problem of maneuvering a servicing spacecraft (inspector) to inspect a noncooperative spacecraft (evader) in cislunar space is investigated in this paper. The evader, which may be a malfunctioning or uncontrolled satellite, introduces uncertainties due to its potential maneuvering capabilities. To address this challenge, the scenario is modeled as a special orbital game, incorporating the unique complexities of the cislunar environment. A variable-duration, turn-based inspection and anti-inspection game model is designed. The model defines both players' rules, constraints, and victory conditions, providing a framework for non-cooperative inspection. Strategies for both players are developed and validated based on their dynamical properties. The inspector's strategy integrates two-body Lambert transfers with shooting methods, while the evader's strategy aims to maximize the inspector's fuel consumption. Simulation results show that the evader's optimal strategy involves deliberate fluctuations in its lunar periapsis altitude, with the inspector's required ΔV up to eight times greater than the evader's. The impact of game constraints is evaluated, and the effectiveness of deploying the inspector in low lunar orbit is compared with deploying it at the Earth-Moon Lagrange point L1. The strengths and weaknesses of both deployments are shown. These findings provide valuable insights for future orbital servicing and orbital games.
Vaccination is a key strategy to curb the spread of epidemics. Heterologous vaccination, unlike homologous vaccination, which acts on a single target and forms a single immune barrier, covers multiple targets for broader protection. Yet heterologous vaccination involves a complex decision process that conventional game-theoretic approaches, such as classical, evolutionary, and minority games, cannot adequately capture. The parallel minority game (PMG) can handle bounded-rational, multi-choice decisions, but its application in vaccine research remains rare. In this study, we propose a vaccination-transmission coupled dynamic mechanism based on the parallel minority game and simulate it on a two-dimensional lattice. Using actual observational data and a mean-field mathematical model, we verify the effectiveness of this mechanism in simulating realistic vaccination behavior and transmission dynamics. We further analyze the impact of key parameters, such as vaccine efficacy differences and the proportion of individuals eligible for vaccine switching, on containment effectiveness. Our results demonstrate that heterologous vaccination surpasses homologous vaccination in containment effectiveness, particularly when vaccine efficacy varies significantly. This work provides a novel framework and empirical evidence for understanding individual decision-making and population-wide immunity formation in multi-vaccine settings.
In strategic decision-making tasks, determining how the defender and the attacker should assign limited, costly resources is a central problem. However, it is hard for a pre-allocated resource assignment to adapt to dynamic fighting scenarios, and there exist situations where the scenario and rules of the Colonel Blotto (CB) game are too restrictive for the real world. To address these issues, a support stage is added to supplement the pre-allocated results, and a novel two-stage competitive resource assignment problem is formulated based on the CB game and the stochastic Lanchester equation (SLE). Further, the force attrition in these two stages is formulated as a stochastic process to capture the complexity of the fighting process, including the case where the player with fewer resources defeats the player with more resources and wins the battlefield. To solve this two-stage resource assignment problem, nested solving and no-regret learning are proposed to search for the optimal resource assignment strategies. Numerical experiments are conducted to analyze the effectiveness of the proposed model and to study the assignment strategies in various cases.
In this paper, based on the ACP (artificial societies, computational experiments, and parallel execution) approach, a parallel control method is proposed for zero-sum games of unknown time-varying systems. The process of constructing a sequence of artificial systems, implementing the computational experiments, and conducting the parallel execution is presented. The artificial systems are constructed to model the real system. Computational experiments adopting adaptive dynamic programming (ADP) are shown to derive control laws for a sequence of artificial systems. The purpose of the parallel execution step is to derive the control laws for the real system. Finally, simulation experiments are provided to show the effectiveness of the proposed method.
This paper studies two-person nonzero-sum games for denumerable continuous-time Markov chains determined by transition rates, with an expected average criterion. The transition rates are allowed to be unbounded, and the payoff functions may be unbounded from above and from below. We give suitable conditions under which the existence of a Nash equilibrium is ensured. More precisely, using the so-called "vanishing discount" approach, a Nash equilibrium for the average criterion is obtained as a limit point of a sequence of equilibrium strategies for the discounted criterion as the discount factors tend to zero. Our results are illustrated with a birth-and-death game.
In this paper, a zero-sum game Nash equilibrium computation problem with event-triggered communication is investigated under an undirected weight-balanced multi-agent network. A novel distributed event-triggered projection subgradient algorithm is developed to reduce the communication burden within the subnetworks. In the proposed algorithm, when the difference between the current state of an agent and its state at the last trigger time exceeds a given threshold, the agent is triggered to communicate with its neighbours. Moreover, we prove that all agents converge to the Nash equilibrium under the proposed algorithm. Finally, two simulation examples verify that, under an appropriate threshold, our algorithm not only reduces the communication burden but also keeps the convergence speed and accuracy close to those of the time-triggered method.
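The triggering rule described above can be sketched in a few lines. This is only an illustration under assumptions that are not from the paper: the agent's state is a scalar, the threshold is constant, and the initial state is always transmitted. The function name `event_triggered_broadcasts` is hypothetical.

```python
def event_triggered_broadcasts(states, threshold):
    """Return the time indices at which an agent would communicate.

    The agent transmits whenever its current state differs from the state
    it last transmitted by more than the threshold.
    """
    if not states:
        return []
    triggers = [0]          # assumption: the initial state is always sent
    last_sent = states[0]
    for k in range(1, len(states)):
        if abs(states[k] - last_sent) > threshold:
            triggers.append(k)
            last_sent = states[k]
    return triggers
```

For the trajectory `[0.0, 0.05, 0.2, 0.25, 0.5]` and a threshold of `0.1`, the agent communicates at steps 0, 2, and 4 rather than at every step, which is the source of the reduced communication burden.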
This paper presents a novel optimal synchronization control method for multi-agent systems with input saturation. Multi-agent game theory is introduced to transform the optimal synchronization control problem into a multi-agent nonzero-sum game. Then, the Nash equilibrium can be achieved by solving the coupled Hamilton–Jacobi–Bellman (HJB) equations with nonquadratic input energy terms. A novel off-policy reinforcement learning method is presented to obtain the Nash equilibrium solution without the system models, and critic neural networks (NNs) and actor NNs are introduced to implement the presented method. Theoretical analysis is provided, which shows that the iterative control laws converge to the Nash equilibrium. Simulation results show the good performance of the presented method.
Funding: the Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (No. 2018AAA0100901).
Funding: Supported by the National High Technology Research and Development Program of China (863 Program) (2006AA04Z183), the National Natural Science Foundation of China (60621001, 60534010, 60572070, 60774048, 60728307), and the Program for Changjiang Scholars and Innovative Research Groups of China (60728307, 4031002).
Funding: Supported by the National Natural Science Foundation of China (No. 61101223).
Funding: Supported by the National Natural Science Foundation of China under Grants U1836104, 61801073, 61931004, and 62072250, the National Key Research and Development Program of China under Grant 2021QY0700, and the Startup Foundation for Introducing Talent of NUIST under Grant 2021r039.
Funding: Supported by the National Key R&D Program of China under Grant No. 2021ZD0112600, the National Natural Science Foundation of China under Grant No. 62373058, the Beijing Natural Science Foundation under Grant No. L233003, the National Science Fund for Distinguished Young Scholars of China under Grant No. 62025301, the Postdoctoral Fellowship Program of CPSF under Grant No. GZC20233407, and the Basic Science Center Programs of NSFC under Grant No. 62088101.
Funding: Supported in part by the National Natural Science Foundation of China (62173051), the Fundamental Research Funds for the Central Universities (2024CDJCGJ012, 2023CDJXY-010), the Chongqing Technology Innovation and Application Development Special Key Project (CSTB2022TIADCUX0015, CSTB2022TIAD-KPX0162), and the China Postdoctoral Science Foundation (2024M763865).
Funding: Supported by the National Key R&D Program of China: Gravitational Wave Detection Project (Nos. 2021YFC2026, 2021YFC2202601, 2021YFC2202603) and the National Natural Science Foundation of China (Nos. 12172288 and 12472046).
Funding: Project supported by the National Natural Science Foundation of China (Grant Nos. 12571549, 12571592, 12471463, 12022113, 12101573).
Funding: Supported by the National Natural Science Foundation of China (61702528, 61806212, 62173336).
Abstract: In strategic decision-making tasks, determining how to assign limited, costly resources to the defender and the attacker is a central problem. However, a pre-allocated resource assignment is hard to adapt to dynamic fighting scenarios, and there are situations where the scenario and rules of the Colonel Blotto (CB) game are too restrictive for the real world. To address these issues, a support stage is added to supplement the pre-allocated results, and a novel two-stage competitive resource assignment problem is formulated based on the CB game and the stochastic Lanchester equation (SLE). Further, the force attrition in these two stages is formulated as a stochastic process to capture the complex course of the fighting, including the case in which the player with fewer resources defeats the player with more resources and wins the battlefield. To solve this two-stage resource assignment problem, nested solving and no-regret learning are proposed to search for the optimal resource assignment strategies. Numerical experiments are conducted to analyze the effectiveness of the proposed model and to study the assignment strategies in various cases.
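As a rough illustration of what no-regret learning does in a zero-sum setting, the sketch below runs regret matching in self-play on a small matrix game. It is not the paper's two-stage CB/SLE model or its nested solver: the 2x2 payoff matrix, the function name `regret_matching`, and the iteration count are all assumptions chosen for brevity, and regret matching is only one member of the no-regret family.

```python
def regret_matching(payoff, iterations=20000):
    """Self-play regret matching on a zero-sum matrix game.

    payoff[i][j] is the row player's payoff; the column player receives
    its negative. Returns the time-averaged strategies of both players.
    """
    n_rows, n_cols = len(payoff), len(payoff[0])
    row_regret, col_regret = [0.0] * n_rows, [0.0] * n_cols
    row_sum, col_sum = [0.0] * n_rows, [0.0] * n_cols

    def strategy(regret):
        # Play in proportion to positive cumulative regret; uniform if none.
        positive = [max(r, 0.0) for r in regret]
        total = sum(positive)
        if total > 0.0:
            return [p / total for p in positive]
        return [1.0 / len(regret)] * len(regret)

    for _ in range(iterations):
        x, y = strategy(row_regret), strategy(col_regret)
        # Expected payoff of each pure action against the opponent's mix.
        row_u = [sum(payoff[i][j] * y[j] for j in range(n_cols))
                 for i in range(n_rows)]
        col_u = [-sum(payoff[i][j] * x[i] for i in range(n_rows))
                 for j in range(n_cols)]
        row_ev = sum(x[i] * row_u[i] for i in range(n_rows))
        col_ev = sum(y[j] * col_u[j] for j in range(n_cols))
        for i in range(n_rows):
            row_regret[i] += row_u[i] - row_ev
            row_sum[i] += x[i]
        for j in range(n_cols):
            col_regret[j] += col_u[j] - col_ev
            col_sum[j] += y[j]

    return ([s / iterations for s in row_sum],
            [s / iterations for s in col_sum])


# Hypothetical 2x2 game; its unique equilibrium mixes (0.4, 0.6) for both players.
GAME = [[2.0, -1.0], [-1.0, 1.0]]
row_avg, col_avg = regret_matching(GAME)
```

In a zero-sum game the time-averaged strategies of two no-regret learners approach an equilibrium, so `row_avg` and `col_avg` should end up close to (0.4, 0.6) for this toy matrix.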
Funding: Supported in part by the National Key R&D Program of China (No. 2021YFE0206100), the National Natural Science Foundation of China (Nos. 62073321 and 62273036), the National Defense Basic Scientific Research Program (No. JCKY2019203C029), the Science and Technology Development Fund, Macao SAR (Nos. FDCT-22-009-MISE, 0060/2021/A2 and 0015/2020/AMJ), and the State Key Lab of Rail Traffic Control & Safety (No. RCS2021K005).
Abstract: In this paper, based on the ACP (artificial societies, computational experiments, and parallel execution) approach, a parallel control method is proposed for zero-sum games of unknown time-varying systems. The process of constructing a sequence of artificial systems, implementing the computational experiments, and conducting the parallel execution is presented. The artificial systems are constructed to model the real system. Computational experiments adopting adaptive dynamic programming (ADP) are shown to derive control laws for a sequence of artificial systems. The purpose of the parallel execution step is to derive the control laws for the real system. Finally, simulation experiments are provided to show the effectiveness of the proposed method.
Funding: Supported by the National Science Foundation for Distinguished Young Scholars of China (Grant No. 10925107) and the Guangdong Province Universities and Colleges Pearl River Scholar Funded Scheme (2011).
Abstract: This paper studies two-person nonzero-sum games for denumerable continuous-time Markov chains determined by transition rates, under an expected average criterion. The transition rates are allowed to be unbounded, and the payoff functions may be unbounded from above and from below. We give suitable conditions under which the existence of a Nash equilibrium is ensured. More precisely, using the so-called "vanishing discount" approach, a Nash equilibrium for the average criterion is obtained as a limit point of a sequence of equilibrium strategies for the discounted criterion as the discount factors tend to zero. Our results are illustrated with a birth-and-death game.
Abstract: In this paper, a zero-sum game Nash equilibrium computation problem with event-triggered communication is investigated under an undirected weight-balanced multi-agent network. A novel distributed event-triggered projection subgradient algorithm is developed to reduce the communication burden within the subnetworks. In the proposed algorithm, when the difference between the current state of an agent and its state at the last trigger time exceeds a given threshold, the agent is triggered to communicate with its neighbours. Moreover, we prove that all agents converge to the Nash equilibrium under the proposed algorithm. Finally, two simulation examples verify that, under an appropriate threshold, our algorithm not only reduces the communication burden but also keeps the convergence speed and accuracy close to those of the time-triggered method.
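The trigger rule described in this abstract can be sketched in a few lines. This is a minimal illustration that assumes a Euclidean distance and a fixed scalar threshold; the abstract does not specify the norm or how the threshold is chosen, and the names `should_broadcast` and `trigger_times` are hypothetical. The projection subgradient update itself is omitted.

```python
import math


def should_broadcast(current, last_sent, threshold):
    """True when the state has drifted more than `threshold` (Euclidean
    distance, an assumption) from the last state sent to neighbours."""
    return math.dist(current, last_sent) > threshold


def trigger_times(states, threshold):
    """Return the step indices at which an agent would communicate,
    given its sequence of local states. Step 0 always broadcasts."""
    last_sent = states[0]
    events = [0]
    for k, state in enumerate(states[1:], start=1):
        if should_broadcast(state, last_sent, threshold):
            last_sent = state  # neighbours now hold this newer state
            events.append(k)
    return events


# Hypothetical trajectory of one agent's 2-D state.
trajectory = [(0.0, 0.0), (0.1, 0.0), (0.3, 0.0), (0.35, 0.0), (0.9, 0.0)]
print(trigger_times(trajectory, threshold=0.25))  # [0, 2, 4]
```

Between trigger times the neighbours keep using the last broadcast state, which is where the communication saving comes from; a larger threshold means fewer messages but staler information.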
Funding: Project supported by the National Key R&D Program of China (No. 2018YFB1702300) and the National Natural Science Foundation of China (Nos. 61722312 and 61533017).
Abstract: This paper presents a novel optimal synchronization control method for multi-agent systems with input saturation. Multi-agent game theory is introduced to transform the optimal synchronization control problem into a multi-agent nonzero-sum game. Then, the Nash equilibrium can be achieved by solving the coupled Hamilton–Jacobi–Bellman (HJB) equations with nonquadratic input energy terms. A novel off-policy reinforcement learning method is presented to obtain the Nash equilibrium solution without the system models, and critic neural networks (NNs) and actor NNs are introduced to implement the presented method. Theoretical analysis is provided, which shows that the iterative control laws converge to the Nash equilibrium. Simulation results show the good performance of the presented method.