Moving Target Defense (MTD) necessitates scientifically effective decision-making methodologies for defensive technology implementation. While most MTD decision studies focus on accurately identifying optimal strategies, the issue of optimal defense timing remains underexplored. Current default approaches (periodic or overly frequent MTD triggers) lead to suboptimal trade-offs among system security, performance, and cost. The timing of MTD strategy activation critically impacts both defensive efficacy and operational overhead, yet existing frameworks inadequately address this temporal dimension. To bridge this gap, this paper proposes a Stackelberg-FlipIt game model that formalizes asymmetric cyber conflicts as alternating control over attack surfaces, thereby capturing the dynamic security state evolution of MTD systems. We introduce a belief factor to quantify information asymmetry during adversarial interactions, enhancing the precision of MTD trigger timing. Leveraging this game-theoretic foundation, we employ Multi-Agent Reinforcement Learning (MARL) to derive adaptive temporal strategies, optimized via a novel four-dimensional reward function that holistically balances security, performance, cost, and timing. Experimental validation using IP address mutation against scanning attacks demonstrates stable strategy convergence and accelerated defense response, significantly improving cybersecurity affordability and effectiveness.
Highly intelligent Unmanned Combat Aerial Vehicle (UCAV) formation is expected to bring out strengths in Beyond-Visual-Range (BVR) air combat. Although Multi-Agent Reinforcement Learning (MARL) shows outstanding performance in cooperative decision-making, it is challenging for existing MARL algorithms to quickly converge to an optimal strategy for UCAV formation in BVR air combat, where confrontation is complicated and reward is extremely sparse and delayed. To solve this problem, this paper proposes an Advantage Highlight Multi-Agent Proximal Policy Optimization (AHMAPPO) algorithm. First, at every step, AHMAPPO records the degree to which the best formation exceeds the average of formations in parallel environments and carries out additional advantage sampling according to it. Then, the sampling result is introduced into the updating process of the actor network to improve its optimization efficiency. Finally, the simulation results reveal that, compared with some state-of-the-art MARL algorithms, AHMAPPO can obtain a better strategy using fewer sample episodes in the UCAV formation BVR air combat simulation environment built in this paper, which reflects the critical features of BVR air combat. AHMAPPO can significantly increase the convergence efficiency of the strategy for UCAV formation in BVR air combat, with a maximum increase of 81.5% relative to other algorithms.
The strategy evolution process of game players is highly uncertain due to random emergent situations and other external disturbances. This paper investigates the issue of strategy interaction and behavioral decision-making among game players in simulated confrontation scenarios within a random interference environment. It considers the possible risks that random disturbances may pose to the autonomous decision-making of game players, as well as the impact of participants' manipulative behaviors on the state changes of the players. A nonlinear mathematical model is established to describe the strategy decision-making process of the participants in this scenario. Subsequently, the strategy selection interaction relationship, strategy evolution stability, and dynamic decision-making process of the game players are investigated and verified by simulation experiments. The results show that maneuver-related parameters and random environmental interference factors have different effects on the selection and evolutionary speed of the agent's strategies. Especially in a highly uncertain environment, even small information asymmetry or miscalculation may have a significant impact on decision-making. This also confirms the feasibility and effectiveness of the method proposed in the paper, which can better explain the behavioral decision-making process of the agent in the interaction process. This study provides feasibility analysis ideas and theoretical references for improving multi-agent interactive decision-making and the interpretability of the game system model.
Constructing a cross-border power energy system with multi-agent power energy as an alliance is important for studying cross-border power-trading markets. This study considers multiple neighboring countries in the form of alliances, introduces neighboring countries' exchange rates into the cross-border multi-agent power-trading market, and proposes a method to study each agent's dynamic decision-making behavior based on evolutionary game theory. To this end, this study uses three national agents as examples, constructs a tripartite evolutionary game model, and analyzes the evolution process of the decision-making behavior of each agent member state under the initial willingness value, cost of payment, and additional revenue of the alliance. This research helps realize cross-border energy operations so that the transaction agent can achieve greater trade profits, and provides a theoretical basis for cooperation and stability between multiple agents.
The critical role of patient-reported outcome measures (PROMs) in enhancing clinical decision-making and promoting patient-centered care has gained profound significance in scientific research. PROMs encapsulate a patient's health status directly from their perspective, encompassing various domains such as symptom severity, functional status, and overall quality of life. By integrating PROMs into routine clinical practice and research, healthcare providers can achieve a more nuanced understanding of patient experiences and tailor treatments accordingly. The deployment of PROMs supports dynamic patient-provider interactions, fostering better patient engagement and adherence to treatment plans. Moreover, PROMs are pivotal in clinical settings for monitoring disease progression and treatment efficacy, particularly in chronic and mental health conditions. However, challenges in implementing PROMs include data collection and management, integration into existing health systems, and acceptance by patients and providers. Overcoming these barriers necessitates technological advancements, policy development, and continuous education to enhance the acceptability and effectiveness of PROMs. The paper concludes with recommendations for future research and policy-making aimed at optimizing the use and impact of PROMs across healthcare settings.
To solve problems of poor security guarantee and insufficient training efficiency in the conventional reinforcement learning methods for decision-making, this study proposes a hybrid framework to combine deep reinforcement learning with rule-based decision-making methods. A risk assessment model for lane-change maneuvers considering uncertain predictions of surrounding vehicles is established as a safety filter to improve learning efficiency while correcting dangerous actions for safety enhancement. On this basis, a Risk-fused DDQN is constructed utilizing the model-based risk assessment and supervision mechanism. The proposed reinforcement learning algorithm sets up a separate experience buffer for dangerous trials and punishes such actions, which is shown to improve the sampling efficiency and training outcomes. Compared with conventional DDQN methods, the proposed algorithm improves the convergence value of cumulated reward by 7.6% and 2.2% in the two constructed scenarios in the simulation study and reduces the number of training episodes by 52.2% and 66.8%, respectively. The success rate of lane change is improved by 57.3% while the time headway is increased by at least 16.5% in real vehicle tests, which confirms the higher training efficiency, scenario adaptability, and security of the proposed Risk-fused DDQN.
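The abstract above describes a separate experience buffer for dangerous trials and a penalty on such actions, but gives no implementation details. The following is a minimal sketch of that idea, assuming a fixed penalty and a fixed share of dangerous samples per batch. The class name `DualReplayBuffer`, the parameters `danger_fraction` and `danger_penalty`, and their values are hypothetical and not taken from the paper. The target function uses the standard Double DQN rule (the online network selects the next action, the target network evaluates it).

```python
import random
from collections import deque


class DualReplayBuffer:
    """Two replay buffers: one for ordinary transitions, one for dangerous ones.

    Hypothetical sketch: the penalty value and sampling ratio are assumptions.
    """

    def __init__(self, capacity=10000, danger_fraction=0.25, danger_penalty=-1.0):
        self.safe = deque(maxlen=capacity)
        self.danger = deque(maxlen=capacity)
        self.danger_fraction = danger_fraction
        self.danger_penalty = danger_penalty

    def add(self, transition, dangerous):
        """Store (state, action, reward, next_state, done); penalize dangerous actions."""
        state, action, reward, next_state, done = transition
        if dangerous:
            penalized = (state, action, reward + self.danger_penalty, next_state, done)
            self.danger.append(penalized)
        else:
            self.safe.append(transition)

    def sample(self, batch_size):
        """Draw a batch that mixes dangerous and ordinary transitions."""
        n_danger = min(len(self.danger), int(batch_size * self.danger_fraction))
        n_safe = min(len(self.safe), batch_size - n_danger)
        return (random.sample(list(self.danger), n_danger)
                + random.sample(list(self.safe), n_safe))


def double_dqn_target(reward, done, gamma, q_online_next, q_target_next):
    """Standard Double DQN target for one transition.

    q_online_next / q_target_next: lists of action values at the next state
    from the online and target networks, respectively.
    """
    if done:
        return reward
    best_action = max(range(len(q_online_next)), key=q_online_next.__getitem__)
    return reward + gamma * q_target_next[best_action]
```

Keeping dangerous transitions in their own buffer guarantees they appear in every batch even when they are rare, which is one plausible reading of how the separate buffer improves sampling efficiency.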
Cooperative multi-agent reinforcement learning (MARL) is a key technology for enabling cooperation in complex multi-agent systems. It has achieved remarkable progress in areas such as gaming, autonomous driving, and multi-robot control. Empowering cooperative MARL with multi-task decision-making capabilities is expected to further broaden its application scope. In multi-task scenarios, cooperative MARL algorithms need to address three types of multi-task problems: reward-related multi-task, arising from different reward functions; multi-domain multi-task, caused by differences in state and action spaces and state transition functions; and scalability-related multi-task, resulting from the dynamic variation in the number of agents. Most existing studies focus on scalability-related multi-task problems. However, with the increasing integration between large language models (LLMs) and multi-agent systems, a growing number of LLM-based multi-agent systems have emerged, enabling more complex multi-task cooperation. This paper provides a comprehensive review of the latest advances in this field. By combining multi-task reinforcement learning with cooperative MARL, we categorize and analyze the three major types of multi-task problems under multi-agent settings, offering more fine-grained classifications and summarizing key insights for each. In addition, we summarize commonly used benchmarks and discuss future directions of research in this area, which hold promise for further enhancing the multi-task cooperation capabilities of multi-agent systems and expanding their practical applications in the real world.
This paper addresses the consensus problem of nonlinear multi-agent systems subject to external disturbances and uncertainties under denial-of-service (DoS) attacks. Firstly, an observer-based state feedback control method is employed to achieve secure control by estimating the system's state in real time. Secondly, by combining a memory-based adaptive event-triggered mechanism with neural networks, the paper aims to approximate the nonlinear terms in the networked system and efficiently conserve system resources. Finally, based on a two-degree-of-freedom model of a vehicle affected by crosswinds, this paper constructs a multi-unmanned ground vehicle (Multi-UGV) system to validate the effectiveness of the proposed method. Simulation results show that the proposed control strategy can effectively handle external disturbances such as crosswinds in practical applications, ensuring the stability and reliable operation of the Multi-UGV system.
Due to the numerous variables to take into account as well as the inherent ambiguity and uncertainty, evaluating educational institutions can be difficult. The concept of a possibility Pythagorean fuzzy hypersoft set (pPyFHSS) is more flexible in this regard than other theoretical fuzzy set-like models, even though some attempts have been made in the literature to address such uncertainties. This study investigates the elementary notions of pPyFHSS, including its set-theoretic operations: union, intersection, complement, and the OR- and AND-operations. Some results related to these operations are also modified for pPyFHSS. Additionally, the similarity measures between pPyFHSSs are formulated with the assistance of numerical examples and results. Lastly, an intelligent decision-assisted mechanism is developed with the proposal of a robust algorithm based on similarity measures for solving multi-attribute decision-making (MADM) problems. A case study that helps decision-makers assess the best educational institution is discussed to validate the suggested system. The algorithmic results are compared with the most pertinent model to evaluate the adaptability of pPyFHSS, as it generalizes the classical possibility fuzzy set-like theoretical models. Similarly, while considering significant evaluating factors, the flexibility of pPyFHSS is observed through structural comparison.
This paper investigates the challenges associated with Unmanned Aerial Vehicle (UAV) collaborative search and target tracking in dynamic and unknown environments characterized by limited field of view. The primary objective is to explore the unknown environments to locate and track targets effectively. To address this problem, we propose a novel Multi-Agent Reinforcement Learning (MARL) method based on Graph Neural Network (GNN). Firstly, a method is introduced for encoding continuous-space multi-UAV problem data into spatial graphs which establish essential relationships among agents, obstacles, and targets. Secondly, a Graph AttenTion network (GAT) model is presented, which focuses exclusively on adjacent nodes, learns attention weights adaptively, and allows agents to better process information in dynamic environments. Reward functions are specifically designed to tackle exploration challenges in environments with sparse rewards. By introducing a framework that integrates centralized training and distributed execution, the advancement of models is facilitated. Simulation results show that the proposed method outperforms the existing MARL method in search rate and tracking performance with fewer collisions. The experiments show that the proposed method can be extended to applications with a larger number of agents, which provides a potential solution to the challenging problem of multi-UAV autonomous tracking in dynamic unknown environments.
Accurate medical diagnosis, which involves identifying diseases based on patient symptoms, is often hindered by uncertainties in data interpretation and retrieval. Advanced fuzzy set theories have emerged as effective tools to address these challenges. In this paper, new mathematical approaches for handling uncertainty in medical diagnosis are introduced using q-rung orthopair fuzzy sets (q-ROFS) and interval-valued q-rung orthopair fuzzy sets (IVq-ROFS). Three aggregation operators are proposed in our methodologies: the q-ROF weighted averaging (q-ROFWA), the q-ROF weighted geometric (q-ROFWG), and the q-ROF weighted neutrality averaging (q-ROFWNA), which enhance decision-making under uncertainty. These operators are paired with ranking methods such as the similarity measure, score function, and inverse score function to improve the accuracy of disease identification. Additionally, the impact of varying q-rung values is explored through a sensitivity analysis, extending the analysis beyond the typical maximum value of 3. The Basic Uncertain Information (BUI) method is employed to simulate expert opinions, and aggregation operators are used to combine these opinions in a group decision-making context. Our results provide a comprehensive comparison of methodologies, highlighting their strengths and limitations in diagnosing diseases based on uncertain patient data.
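The abstract above names the q-ROFWA operator and a score function without stating their formulas. As a rough illustration, the sketch below uses the definition of q-ROFWA commonly given in the q-rung orthopair fuzzy literature, where a value is a pair (μ, ν) with μ^q + ν^q ≤ 1, together with the usual score μ^q − ν^q. The paper's own operators may differ in detail, and the function names `q_rofwa` and `score` are illustrative only.

```python
from math import prod


def q_rofwa(values, weights, q):
    """q-rung orthopair fuzzy weighted averaging (commonly cited form).

    values:  list of (mu, nu) pairs with mu**q + nu**q <= 1
    weights: non-negative weights summing to 1
    Returns the aggregated (mu, nu) pair.
    """
    # Membership: 1 minus the weighted product of (1 - mu^q), taken to the 1/q power.
    mu = (1.0 - prod((1.0 - m ** q) ** w
                     for (m, _), w in zip(values, weights))) ** (1.0 / q)
    # Non-membership: weighted geometric mean of the nu values.
    nu = prod(n ** w for (_, n), w in zip(values, weights))
    return mu, nu


def score(value, q):
    """Usual score function for ranking q-ROF values: mu^q - nu^q."""
    mu, nu = value
    return mu ** q - nu ** q
```

For example, `q_rofwa([(0.9, 0.3), (0.6, 0.7)], [0.5, 0.5], q=3)` aggregates two expert assessments with equal weight, and `score` can then rank the aggregated values for competing diagnoses.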
BACKGROUND: Understanding a patient's clinical status and setting priorities for their care are two aspects of the constantly changing process of clinical decision-making. One analytical technique that can be helpful in uncertain situations is clinical judgment. Clinicians must deal with contradictory information, lack of time to make decisions, and long-term factors when emergencies occur. AIM: To examine the ethical issues healthcare professionals faced during the coronavirus disease 2019 (COVID-19) pandemic and the factors affecting clinical decision-making. METHODS: This pilot study, a preliminary investigation to gather information and test the feasibility of a larger investigation, was conducted over 6 months. We invited responses from clinicians worldwide who managed patients with COVID-19. The survey focused on topics related to their professional roles and personal relationships. We examined five core areas influencing critical care decision-making: patients' personal factors, family-related factors, informed consent, communication and media, and hospital administrative policies on clinical decision-making. The collected data were analyzed using the χ² test for categorical variables. RESULTS: A total of 102 clinicians from 23 specialties and 17 countries responded to the survey. Age was a significant factor in treatment planning (n=88) and ventilator access (n=78). Sex had no bearing on how decisions were made. Most doctors reported maintaining patient confidentiality regarding privacy and informed consent. Approximately 50% of clinicians reported a moderate influence of clinical work, with many citing it as one of the most important factors affecting their health and relationships. Clinicians from developing countries had a significantly higher score for considering a patient's financial status when creating a treatment plan than their counterparts from developed countries. Regarding personal experiences, some respondents noted that treatment plans and preferences changed from wave to wave, and that there was a rapid turnover of studies and evidence. Hospital and government policies also played a role in critical decision-making. Rather than assessing the appropriateness of treatment, some doctors observed that hospital policies regarding medications were driven by patient demand. CONCLUSION: Factors other than medical considerations frequently affect management choices. The disparity in treatment choices became more apparent during the pandemic. We highlight the difficulties and contradictions between moral standards and the realities physicians encountered during this medical emergency. False information, large patient populations, and limited resources caused problems for clinicians. These factors impacted decision-making, which, in turn, affected patient care and healthcare staff well-being.
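The methods above state that categorical variables were analyzed with the χ² test. As a small worked illustration of how that statistic is computed for a contingency table, here is a sketch in plain Python. The counts in the usage note are hypothetical and are not taken from the study, and the function name `chi_square_statistic` is illustrative.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table.

    table: list of rows, each a list of observed counts.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(col_totals) - 1)
    return stat, dof


# Hypothetical counts (NOT from the study): rows = two respondent groups,
# columns = "considers factor" vs "does not consider factor".
stat, dof = chi_square_statistic([[10, 20], [20, 10]])
```

With one degree of freedom, a statistic above 3.841 corresponds to P < 0.05, so these made-up counts would indicate a significant association between group and response.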
Objectives: This study aimed to clarify the relationship between the content of proxy decision-making made by families of patients with malignant brain tumors regarding treatment policies and daily care and the cues leading to those decisions. Methods: Semi-structured personal interviews were used to collect data. Seven family members of patients with malignant brain tumors were selected to participate in the study by purposive sampling from June to August 2022 in the Patient Family Association of Japan. Responses were content analyzed to explore the relationship between the content of decisions regarding "treatment policies" and "daily care" and the cues influencing those decisions. Semi-structured interviews were analyzed using thematic analysis. Results: The contents of proxy decisions regarding "treatment policies" included implementation, interruption, and termination of initial treatments, free medical treatments, use of respirators, and end-of-life sedation, and included six cues: treatment policies suggested by the primary physician; information and knowledge about the disease and treatment obtained by the family from limited resources; perceived life threat from symptom worsening; words and reactions from the patient regarding treatment; the patient's personality and way of life inferred from their treatment preferences; and the family's thoughts and values hoping for better treatment for the patient. Decisions for "daily care" included meal content and methods, excretion, mobility, maintaining cleanliness, rehabilitation, continuation or resignation from work, treatment settings (outpatient or inpatient), and ways to spend time outside, and included seven cues: words and thoughts from the patient about their way of life; the patient's reactions and life history inferred from their preferred way of living; things the patient can do to maintain daily life and roles; awareness of the increasing inability to do things in daily life; the family's underlying thoughts and values about how to spend the remaining time; approval from family members regarding the care setting; and advice from medical professionals on living at home. Conclusions: For "treatment policies," guidelines from medical professionals were a key cue, while for "daily care," the small signs from the patients in their daily lives served as cues for proxy decision-making. This may be due to the lack of information available to families and the limited time available for discussion with the patient. Families of patients with malignant brain tumors repeatedly use multiple cues to make proxy decisions under high uncertainty. Therefore, nurses supporting proxy decision-making should assess the family's situation and provide cues that facilitate informed and confident decisions.
Group living is widespread across diverse taxa, and the mechanisms underlying collective decision-making in contexts of variable role division are critical for understanding the dynamics of group stability. While studies on collective behavior in small animals such as fish and insects are well-established, similar research on large wild animals remains challenging due to the limited availability of sufficient and systematic field data. Here, we aimed to explore the collective decision-making pattern and its sexual difference for the dimorphic Tibetan antelopes Pantholops hodgsonii (chiru) in Xizang Autonomous Region, China, by analyzing individual leadership distribution, as well as the joining process, considering factors such as calving stages and joining ranks. The distinct correlations of decision participants' ratio with group size and decision duration underscore the trade-off between accuracy and speed in decision-making. Male antelopes display a more democratic decision-making pattern, while females exhibit more prompt responses after calving at an early stage. This study uncovers a partially shared decision-making strategy among Tibetan antelopes, suggesting flexible self-organization in group decision processes aligned with animal life cycle progression.
BACKGROUND: Mesalamine is the recommended first-line treatment for inducing and maintaining remission in mild-to-moderate ulcerative colitis (UC). However, adherence in real-world settings is frequently suboptimal. Encouraging collaborative patient-provider relationships may foster better adherence and patient outcomes. AIM: To quantify the association between patient participation in treatment decision-making and adherence to oral mesalamine in UC. METHODS: We conducted a 12-month, prospective, non-interventional cohort study at 113 gastroenterology practices in Germany. Eligible patients were aged ≥18 years, had a confirmed UC diagnosis, had no prior mesalamine treatment, and provided informed consent. At the first visit, we collected data on demographics, clinical characteristics, patient preference for mesalamine formulation (tablets or granules), and disease knowledge. Self-reported adherence and disease activity were assessed at all visits. Correlation analyses and logistic regression were used to examine associations between adherence and various factors. RESULTS: Of the 605 consecutively screened patients, 520 were included in the study. The median age was 41 years (range: 18-91), with a male-to-female ratio of 1.1:1.0. Approximately 75% of patients reported good adherence at each study visit. In correlation analyses, patient participation in treatment decision-making was significantly associated with better adherence across all visits (P=0.04). In the regression analysis at 12 months, this association was evident among patients who both preferred and received prolonged-release mesalamine granules (odds ratio=2.73, P=0.001). Patients reporting good adherence also experienced significant improvements in disease activity over 12 months (P<0.001). CONCLUSION: Facilitating patient participation in treatment decisions and accommodating medication preferences may improve adherence to mesalamine. This may require additional effort but has the potential to improve long-term management of UC.
This paper mainly focuses on the velocity-constrained consensus problem of discrete-time heterogeneous multi-agent systems with nonconvex constraints and arbitrarily switching topologies, where each agent has first-order or second-order dynamics. To solve this problem, a distributed algorithm is proposed based on a contraction operator. By employing the properties of the stochastic matrix, it is shown that all agents' position states converge to a common point and that second-order agents' velocity states remain in the corresponding nonconvex constraint sets and converge to zero, as long as the joint communication topology has one directed spanning tree. Finally, numerical simulation results are provided to verify the effectiveness of the proposed algorithm.
Information plays a crucial role in guiding behavioral decisions during public health emergencies. Individuals communicate to acquire relevant knowledge about an epidemic, which influences their decisions to adopt protective measures. However, whether to disseminate specific information is also a behavioral decision. In light of this understanding, we develop a coupled information–vaccination–epidemic model to depict these co-evolutionary dynamics in a three-layer network. Negative information dissemination and vaccination are treated as separate decision-making processes. We then examine the combined effects of herd and risk motives on information dissemination and vaccination decisions through the lens of game theory. The microscopic Markov chain approach (MMCA) is used to describe the dynamic process and to derive the epidemic threshold. Simulation results indicate that increasing the cost of negative information dissemination and providing timely clarification can effectively control the epidemic. Furthermore, a phenomenon of diminishing marginal utility is observed as the cost of dissemination increases, suggesting that authorities do not need to overinvest in suppressing negative information. Conversely, reducing the cost of vaccination and increasing vaccine efficacy emerge as more effective strategies for outbreak control. In addition, we find that the scale of the epidemic is greater when the herd motive dominates behavioral decision-making. In conclusion, this study provides a new perspective for understanding the complexity of epidemic spreading by starting with the construction of different behavioral decisions.
Decision-making of connected and automated vehicles (CAV) includes a sequence of driving maneuvers that improve safety and efficiency, and is characterized by complex scenarios, strong uncertainty, and high real-time requirements. Deep reinforcement learning (DRL) exhibits excellent real-time decision-making capability, adaptability to complex scenarios, and generalization ability. However, it is arduous to guarantee complete driving safety and efficiency under the constraints of training samples and costs. This paper proposes a Mixture of Experts (MoE) method based on Soft Actor-Critic (SAC), where the upper-level discriminator dynamically decides whether to activate the lower-level DRL expert or the heuristic expert based on the features of the input state. To further enhance the performance of the DRL expert, a buffer zone is introduced in the reward function, preemptively applying penalties before unsafe situations occur. To minimize collision and off-road rates, the heuristic expert is built on the Intelligent Driver Model (IDM) and the Minimizing Overall Braking Induced by Lane changes (MOBIL) strategy. Finally, tested in typical simulation scenarios, MoE shows a 13.75% improvement in driving efficiency compared with the traditional DRL method with continuous action space. It ensures high safety with zero collision and zero off-road rates while maintaining high adaptability.
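The heuristic expert above relies on the Intelligent Driver Model (IDM). For readers unfamiliar with it, the sketch below implements the standard IDM car-following acceleration, a = a_max · [1 − (v/v0)^δ − (s*/s)²] with desired gap s* = s0 + v·T + v·Δv / (2·√(a_max·b)). The default parameter values are typical textbook choices and are assumptions, not the settings used in the paper. Clamping the dynamic part of s* at zero is a common variant, also assumed here.

```python
import math


def idm_acceleration(v, gap, dv, v0=30.0, T=1.5, a_max=1.0, b=2.0, s0=2.0, delta=4.0):
    """Intelligent Driver Model acceleration in m/s^2.

    v:     own speed (m/s)
    gap:   bumper-to-bumper distance to the leading vehicle (m), must be > 0
    dv:    approach rate, own speed minus leader speed (m/s)
    v0:    desired speed, T: desired time headway, a_max: maximum acceleration,
    b:     comfortable deceleration, s0: minimum standstill gap,
    delta: acceleration exponent (all defaults are assumed, illustrative values)
    """
    # Desired dynamic gap; the interaction term is clamped so it never shrinks s* below s0.
    s_star = s0 + max(0.0, v * T + v * dv / (2.0 * math.sqrt(a_max * b)))
    # Free-road term (1 - (v/v0)^delta) minus the interaction term (s*/gap)^2.
    return a_max * (1.0 - (v / v0) ** delta - (s_star / gap) ** 2)
```

On an empty road (very large `gap`) a stopped vehicle accelerates at roughly `a_max` and the acceleration falls to about zero as `v` approaches `v0`. When closing quickly on a nearby leader, the interaction term dominates and the function returns a strong deceleration.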
Funding (Moving Target Defense paper): funded by the National Natural Science Foundation of China, No. 62302520.
Funding: Co-supported by the National Natural Science Foundation of China (No. 52272382), the Aeronautical Science Foundation of China (No. 20200017051001), and the Fundamental Research Funds for the Central Universities, China.
Abstract: Highly intelligent Unmanned Combat Aerial Vehicle (UCAV) formation is expected to bring out strengths in Beyond-Visual-Range (BVR) air combat. Although Multi-Agent Reinforcement Learning (MARL) shows outstanding performance in cooperative decision-making, it is challenging for existing MARL algorithms to quickly converge to an optimal strategy for UCAV formation in BVR air combat, where confrontation is complicated and reward is extremely sparse and delayed. To solve this problem, this paper proposes an Advantage Highlight Multi-Agent Proximal Policy Optimization (AHMAPPO) algorithm. First, at every step, AHMAPPO records the degree to which the best formation exceeds the average of formations in parallel environments and carries out additional advantage sampling according to it. Then, the sampling result is introduced into the updating process of the actor network to improve its optimization efficiency. Finally, the simulation results reveal that, compared with some state-of-the-art MARL algorithms, AHMAPPO can obtain a better strategy using fewer sample episodes in the UCAV formation BVR air combat simulation environment built in this paper, which can reflect the critical features of BVR air combat. AHMAPPO can significantly increase the convergence efficiency of the strategy for UCAV formation in BVR air combat, with a maximum increase of 81.5% relative to other algorithms.
Abstract: The strategy evolution process of game players is highly uncertain due to random emergent situations and other external disturbances. This paper investigates the issue of strategy interaction and behavioral decision-making among game players in simulated confrontation scenarios within a random interference environment. It considers the possible risks that random disturbances may pose to the autonomous decision-making of game players, as well as the impact of participants' manipulative behaviors on the state changes of the players. A nonlinear mathematical model is established to describe the strategy decision-making process of the participants in this scenario. Subsequently, the strategy selection interaction relationship, strategy evolution stability, and dynamic decision-making process of the game players are investigated and verified by simulation experiments. The results show that maneuver-related parameters and random environmental interference factors have different effects on the selection and evolutionary speed of the agent's strategies. Especially in a highly uncertain environment, even small information asymmetry or miscalculation may have a significant impact on decision-making. This also confirms the feasibility and effectiveness of the method proposed in the paper, which can better explain the behavioral decision-making process of the agent in the interaction process. This study provides feasibility analysis ideas and theoretical references for improving multi-agent interactive decision-making and the interpretability of the game system model.
Funding: National Key R&D Program of China (Grant No. 2022YFB2703500); National Natural Science Foundation of China (Grant No. 52277104); National Key R&D Program of Yunnan Province (202303AC100003); Applied Basic Research Foundation of Yunnan Province (202301AT070455, 202101AT070080); Revitalizing Talent Support Program of Yunnan Province (KKRD202204024).
Abstract: Constructing a cross-border power energy system with multi-agent power energy as an alliance is important for studying cross-border power-trading markets. This study considers multiple neighboring countries in the form of alliances, introduces neighboring countries' exchange rates into the cross-border multi-agent power-trading market, and proposes a method to study each agent's dynamic decision-making behavior based on evolutionary game theory. To this end, this study uses three national agents as examples, constructs a tripartite evolutionary game model, and analyzes the evolution process of the decision-making behavior of each agent member state under the initial willingness value, cost of payment, and additional revenue of the alliance. This research helps realize cross-border energy operations so that the transaction agent can achieve greater trade profits, and provides a theoretical basis for cooperation and stability between multiple agents.
Abstract: The critical role of patient-reported outcome measures (PROMs) in enhancing clinical decision-making and promoting patient-centered care has gained profound significance in scientific research. PROMs encapsulate a patient's health status directly from their perspective, encompassing various domains such as symptom severity, functional status, and overall quality of life. By integrating PROMs into routine clinical practice and research, healthcare providers can achieve a more nuanced understanding of patient experiences and tailor treatments accordingly. The deployment of PROMs supports dynamic patient-provider interactions, fostering better patient engagement and adherence to treatment plans. Moreover, PROMs are pivotal in clinical settings for monitoring disease progression and treatment efficacy, particularly in chronic and mental health conditions. However, challenges in implementing PROMs include data collection and management, integration into existing health systems, and acceptance by patients and providers. Overcoming these barriers necessitates technological advancements, policy development, and continuous education to enhance the acceptability and effectiveness of PROMs. The paper concludes with recommendations for future research and policy-making aimed at optimizing the use and impact of PROMs across healthcare settings.
Funding: Supported by the National Key Research and Development Program of China (Grant No. 2022YFE0117100), the National Science Foundation of China (Grant Nos. 52102468, 52325212), and the Fundamental Research Funds for the Central Universities.
Abstract: To solve the problems of poor security guarantees and insufficient training efficiency in conventional reinforcement learning methods for decision-making, this study proposes a hybrid framework that combines deep reinforcement learning with rule-based decision-making methods. A risk assessment model for lane-change maneuvers that considers uncertain predictions of surrounding vehicles is established as a safety filter to improve learning efficiency while correcting dangerous actions for safety enhancement. On this basis, a Risk-fused DDQN is constructed using the model-based risk assessment and supervision mechanism. The proposed reinforcement learning algorithm sets up a separate experience buffer for dangerous trials and punishes such actions, which is shown to improve the sampling efficiency and training outcomes. Compared with conventional DDQN methods, the proposed algorithm improves the convergence value of cumulative reward by 7.6% and 2.2% in the two constructed scenarios in the simulation study and reduces the number of training episodes by 52.2% and 66.8%, respectively. The success rate of lane change is improved by 57.3%, while the time headway is increased by at least 16.5% in real vehicle tests, which confirms the higher training efficiency, scenario adaptability, and security of the proposed Risk-fused DDQN.
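The "separate experience buffer for dangerous trials" can be pictured with a small sketch. This is only an illustration under stated assumptions, not the paper's implementation: the class name `DualReplayBuffer`, the fixed `risk_penalty`, and the `risky_fraction` sampling ratio are all hypothetical, and the abstract does not say how the two buffers are mixed.

```python
import random
from collections import deque


class DualReplayBuffer:
    """Hypothetical two-buffer replay memory: safe and risky transitions are kept apart."""

    def __init__(self, capacity=10000, risk_penalty=1.0, risky_fraction=0.25, seed=0):
        self.safe = deque(maxlen=capacity)
        self.risky = deque(maxlen=capacity)
        self.risk_penalty = risk_penalty      # assumed constant penalty for risky actions
        self.risky_fraction = risky_fraction  # assumed share of each batch drawn from risky trials
        self.rng = random.Random(seed)

    def add(self, state, action, reward, next_state, done, is_risky):
        # A risk assessment model (not shown here) is assumed to supply is_risky.
        if is_risky:
            self.risky.append((state, action, reward - self.risk_penalty, next_state, done))
        else:
            self.safe.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Draw a fixed share from the risky buffer, the rest from the safe one.
        n_risky = min(len(self.risky), int(round(batch_size * self.risky_fraction)))
        n_safe = min(len(self.safe), batch_size - n_risky)
        batch = self.rng.sample(list(self.risky), n_risky)
        batch += self.rng.sample(list(self.safe), n_safe)
        return batch
```

Keeping risky transitions in their own buffer means they are not crowded out by the far more numerous safe ones, which is one plausible reading of why the abstract reports better sampling efficiency.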
Funding: The National Natural Science Foundation of China (62136008, 62293541); the Beijing Natural Science Foundation (4232056); the Beijing Nova Program (20240484514).
Abstract: Cooperative multi-agent reinforcement learning (MARL) is a key technology for enabling cooperation in complex multi-agent systems. It has achieved remarkable progress in areas such as gaming, autonomous driving, and multi-robot control. Empowering cooperative MARL with multi-task decision-making capabilities is expected to further broaden its application scope. In multi-task scenarios, cooperative MARL algorithms need to address 3 types of multi-task problems: reward-related multi-task, arising from different reward functions; multi-domain multi-task, caused by differences in state and action spaces and state transition functions; and scalability-related multi-task, resulting from the dynamic variation in the number of agents. Most existing studies focus on scalability-related multi-task problems. However, with the increasing integration between large language models (LLMs) and multi-agent systems, a growing number of LLM-based multi-agent systems have emerged, enabling more complex multi-task cooperation. This paper provides a comprehensive review of the latest advances in this field. By combining multi-task reinforcement learning with cooperative MARL, we categorize and analyze the 3 major types of multi-task problems under multi-agent settings, offering more fine-grained classifications and summarizing key insights for each. In addition, we summarize commonly used benchmarks and discuss future directions of research in this area, which hold promise for further enhancing the multi-task cooperation capabilities of multi-agent systems and expanding their practical applications in the real world.
Funding: The National Natural Science Foundation of China (W2431048); the Science and Technology Research Program of Chongqing Municipal Education Commission, China (KJZDK202300807); the Chongqing Natural Science Foundation, China (CSTB2024NSCQQCXMX0052).
Abstract: This paper addresses the consensus problem of nonlinear multi-agent systems subject to external disturbances and uncertainties under denial-of-service (DoS) attacks. Firstly, an observer-based state feedback control method is employed to achieve secure control by estimating the system's state in real time. Secondly, by combining a memory-based adaptive event-triggered mechanism with neural networks, the paper aims to approximate the nonlinear terms in the networked system and efficiently conserve system resources. Finally, based on a two-degree-of-freedom model of a vehicle affected by crosswinds, this paper constructs a multi-unmanned ground vehicle (Multi-UGV) system to validate the effectiveness of the proposed method. Simulation results show that the proposed control strategy can effectively handle external disturbances such as crosswinds in practical applications, ensuring the stability and reliable operation of the Multi-UGV system.
Funding: Supported by the Deanship of Graduate Studies and Scientific Research at Qassim University (QU-APC-2024-9/1).
Abstract: Due to the numerous variables to take into account as well as the inherent ambiguity and uncertainty, evaluating educational institutions can be difficult. The concept of a possibility Pythagorean fuzzy hypersoft set (pPyFHSS) is more flexible in this regard than other theoretical fuzzy set-like models, even though some attempts have been made in the literature to address such uncertainties. This study investigates the elementary notions of pPyFHSS, including its set-theoretic operations: union, intersection, complement, and the OR- and AND-operations. Some results related to these operations are also modified for pPyFHSS. Additionally, the similarity measures between pPyFHSSs are formulated with the assistance of numerical examples and results. Lastly, an intelligent decision-assisted mechanism is developed with the proposal of a robust algorithm based on similarity measures for solving multi-attribute decision-making (MADM) problems. A case study that helps decision-makers assess the best educational institution is discussed to validate the suggested system. The algorithmic results are compared with the most pertinent model to evaluate the adaptability of pPyFHSS, as it generalizes the classical possibility fuzzy set-like theoretical models. Similarly, while considering significant evaluating factors, the flexibility of pPyFHSS is observed through structural comparison.
Funding: Supported by the National Natural Science Foundation of China (Nos. 12272104, U22B2013).
Abstract: This paper investigates the challenges associated with Unmanned Aerial Vehicle (UAV) collaborative search and target tracking in dynamic and unknown environments characterized by limited field of view. The primary objective is to explore the unknown environments to locate and track targets effectively. To address this problem, we propose a novel Multi-Agent Reinforcement Learning (MARL) method based on Graph Neural Network (GNN). Firstly, a method is introduced for encoding continuous-space multi-UAV problem data into spatial graphs which establish essential relationships among agents, obstacles, and targets. Secondly, a Graph AttenTion network (GAT) model is presented, which focuses exclusively on adjacent nodes, learns attention weights adaptively, and allows agents to better process information in dynamic environments. Reward functions are specifically designed to tackle exploration challenges in environments with sparse rewards. By introducing a framework that integrates centralized training and distributed execution, the advancement of models is facilitated. Simulation results show that the proposed method outperforms the existing MARL method in search rate and tracking performance with fewer collisions. The experiments show that the proposed method can be extended to applications with a larger number of agents, which provides a potential solution to the challenging problem of multi-UAV autonomous tracking in dynamic unknown environments.
Abstract: Accurate medical diagnosis, which involves identifying diseases based on patient symptoms, is often hindered by uncertainties in data interpretation and retrieval. Advanced fuzzy set theories have emerged as effective tools to address these challenges. In this paper, new mathematical approaches for handling uncertainty in medical diagnosis are introduced using q-rung orthopair fuzzy sets (q-ROFS) and interval-valued q-rung orthopair fuzzy sets (IVq-ROFS). Three aggregation operators are proposed in our methodologies: the q-ROF weighted averaging (q-ROFWA), the q-ROF weighted geometric (q-ROFWG), and the q-ROF weighted neutrality averaging (q-ROFWNA), which enhance decision-making under uncertainty. These operators are paired with ranking methods such as the similarity measure, score function, and inverse score function to improve the accuracy of disease identification. Additionally, the impact of varying q-rung values is explored through a sensitivity analysis, extending the analysis beyond the typical maximum value of 3. The Basic Uncertain Information (BUI) method is employed to simulate expert opinions, and aggregation operators are used to combine these opinions in a group decision-making context. Our results provide a comprehensive comparison of methodologies, highlighting their strengths and limitations in diagnosing diseases based on uncertain patient data.
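To make the aggregation step concrete, here is a minimal sketch of a q-ROF weighted averaging operator and a score function in the form commonly used in the q-rung orthopair literature. It is an assumption that the paper's q-ROFWA matches this form; the function names `q_rofwa` and `score` and the example values are hypothetical.

```python
import math


def q_rofwa(pairs, weights, q=3):
    """Aggregate (membership, non-membership) pairs with weights summing to 1.

    Commonly cited form (assumed here, not taken from the paper):
        mu = (1 - prod((1 - mu_i**q) ** w_i)) ** (1/q)
        nu = prod(nu_i ** w_i)
    """
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    for mu, nu in pairs:
        # q-rung orthopair constraint: mu^q + nu^q <= 1
        if mu ** q + nu ** q > 1.0 + 1e-12:
            raise ValueError("pair violates mu^q + nu^q <= 1")
    prod_mu = math.prod((1.0 - mu ** q) ** w for (mu, _), w in zip(pairs, weights))
    prod_nu = math.prod(nu ** w for (_, nu), w in zip(pairs, weights))
    return (1.0 - prod_mu) ** (1.0 / q), prod_nu


def score(pair, q=3):
    """Score function mu^q - nu^q, one common way to rank aggregated pairs."""
    mu, nu = pair
    return mu ** q - nu ** q
```

Raising q enlarges the set of admissible pairs, which is why a sensitivity analysis over q, as the abstract describes, can change how candidate diagnoses are ranked.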
Abstract: BACKGROUND: Understanding a patient's clinical status and setting priorities for their care are two aspects of the constantly changing process of clinical decision-making. One analytical technique that can be helpful in uncertain situations is clinical judgment. Clinicians must deal with contradictory information, lack of time to make decisions, and long-term factors when emergencies occur. AIM: To examine the ethical issues healthcare professionals faced during the coronavirus disease 2019 (COVID-19) pandemic and the factors affecting clinical decision-making. METHODS: This pilot study, a preliminary investigation to gather information and test the feasibility of a larger investigation, was conducted over 6 months; we invited responses from clinicians worldwide who managed patients with COVID-19. The survey focused on topics related to their professional roles and personal relationships. We examined five core areas influencing critical care decision-making: patients' personal factors, family-related factors, informed consent, communication and media, and hospital administrative policies on clinical decision-making. The collected data were analyzed using the χ² test for categorical variables. RESULTS: A total of 102 clinicians from 23 specialties and 17 countries responded to the survey. Age was a significant factor in treatment planning (n=88) and ventilator access (n=78). Sex had no bearing on how decisions were made. Most doctors reported maintaining patient confidentiality regarding privacy and informed consent. Approximately 50% of clinicians reported a moderate influence of clinical work, with many citing it as one of the most important factors affecting their health and relationships. Clinicians from developing countries had a significantly higher score for considering a patient's financial status when creating a treatment plan than their counterparts from developed countries. Regarding personal experiences, some respondents noted that treatment plans and preferences changed from wave to wave, and that there was a rapid turnover of studies and evidence. Hospital and government policies also played a role in critical decision-making. Rather than assessing the appropriateness of treatment, some doctors observed that hospital policies regarding medications were driven by patient demand. CONCLUSION: Factors other than medical considerations frequently affect management choices. The disparity in treatment choices became more apparent during the pandemic. We highlight the difficulties and contradictions between moral standards and the realities physicians encountered during this medical emergency. False information, large patient populations, and limited resources caused problems for clinicians. These factors impacted decision-making, which, in turn, affected patient care and healthcare staff well-being.
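For readers unfamiliar with the χ² test named in the Methods, the sketch below computes Pearson's χ² statistic for a contingency table of counts. The table is invented purely for illustration and is not the study's data; the function names are hypothetical, no continuity correction is applied, and the p-value helper is valid only for a 2×2 table (one degree of freedom).

```python
import math


def chi_square_statistic(table):
    """Pearson chi-square statistic for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


def p_value_df1(stat):
    """Upper-tail p-value of the chi-square distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


# Hypothetical counts: rows are two respondent groups, columns are yes/no answers.
example = [[30, 20], [15, 35]]
stat = chi_square_statistic(example)
p = p_value_df1(stat)
```

A small p-value would indicate that the answer distribution differs between the two groups more than chance alone would explain.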
Abstract: Objectives: This study aimed to clarify the relationship between the content of proxy decision-making made by families of patients with malignant brain tumors regarding treatment policies and daily care and the cues leading to those decisions. Methods: Semi-structured personal interviews were used to collect data. Seven family members of patients with malignant brain tumors were selected to participate in the study by purposive sampling from June to August 2022 in the Patient Family Association of Japan. Responses were content analyzed to explore the relationship between the content of decisions regarding "treatment policies" and "daily care" and the cues influencing those decisions. Semi-structured interviews were analyzed by using thematic analysis. Results: The contents of proxy decisions regarding "treatment policies" included implementation, interruption, and termination of initial treatments, free medical treatments, use of respirators, and end-of-life sedation, and included six cues: treatment policies suggested by the primary physician; information and knowledge about the disease and treatment obtained by the family from limited resources; perceived life threat from symptom worsening; words and reactions from the patient regarding treatment; the patient's personality and way of life inferred from their treatment preferences; and the family's thoughts and values hoping for better treatment for the patient. Decisions for "daily care" included meal content and methods, excretion, mobility, maintaining cleanliness, rehabilitation, continuation or resignation from work, treatment settings (outpatient or inpatient), and ways to spend time outside, and included seven cues: words and thoughts from the patient about their way of life; the patient's reactions and life history inferred from their preferred way of living; things the patient can do to maintain daily life and roles; awareness of the increasing inability to do things in daily life; the family's underlying thoughts and values about how to spend the remaining time; approval from family members regarding the care setting; and advice from medical professionals on living at home. Conclusions: For "treatment policies," guidelines from medical professionals were a key cue, while for "daily care," the small signs from the patients in their daily lives served as cues for proxy decision-making. This may be due to the lack of information available to families and the limited time available for discussion with the patient. Families of patients with malignant brain tumors repeatedly use multiple cues to make proxy decisions under high uncertainty. Therefore, nurses supporting proxy decision-making should assess the family's situation and provide cues that facilitate informed and confident decisions.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 32101237), the China Postdoctoral Science Foundation (Grant No. 2021M691522), the National Key Research and Development Program (Grant No. 2022YFC3202104), and the Tibet Major Science and Technology Project (Grant No. XZ201901-GA-06).
Abstract: Group living is widespread across diverse taxa, and the mechanisms underlying collective decision-making in contexts of variable role division are critical for understanding the dynamics of group stability. While studies on collective behavior in small animals such as fish and insects are well-established, similar research on large wild animals remains challenging due to the limited availability of sufficient and systematic field data. Here, we aimed to explore the collective decision-making pattern and its sexual difference for the dimorphic Tibetan antelopes Pantholops hodgsonii (chiru) in Xizang Autonomous Region, China, by analyzing individual leadership distribution, as well as the joining process, considering factors such as calving stages and joining ranks. The distinct correlations of decision participants' ratio with group size and decision duration underscore the trade-off between accuracy and speed in decision-making. Male antelopes display a more democratic decision-making pattern, while females exhibit more prompt responses after calving at an early stage. This study uncovers a partially shared decision-making strategy among Tibetan antelopes, suggesting flexible self-organization in group decision processes aligned with animal life cycle progression.
Abstract: BACKGROUND: Mesalamine is the recommended first-line treatment for inducing and maintaining remission in mild-to-moderate ulcerative colitis (UC). However, adherence in real-world settings is frequently suboptimal. Encouraging collaborative patient-provider relationships may foster better adherence and patient outcomes. AIM: To quantify the association between patient participation in treatment decision-making and adherence to oral mesalamine in UC. METHODS: We conducted a 12-month, prospective, non-interventional cohort study at 113 gastroenterology practices in Germany. Eligible patients were aged ≥ 18 years, had a confirmed UC diagnosis, had no prior mesalamine treatment, and provided informed consent. At the first visit, we collected data on demographics, clinical characteristics, patient preference for mesalamine formulation (tablets or granules), and disease knowledge. Self-reported adherence and disease activity were assessed at all visits. Correlation analyses and logistic regression were used to examine associations between adherence and various factors. RESULTS: Of the 605 consecutively screened patients, 520 were included in the study. The median age was 41 years (range: 18-91), with a male-to-female ratio of 1.1:1.0. Approximately 75% of patients reported good adherence at each study visit. In correlation analyses, patient participation in treatment decision-making was significantly associated with better adherence across all visits (P=0.04). In the regression analysis at 12 months, this association was evident among patients who both preferred and received prolonged-release mesalamine granules (odds ratio=2.73, P=0.001). Patients reporting good adherence also experienced significant improvements in disease activity over 12 months (P<0.001). CONCLUSION: Facilitating patient participation in treatment decisions and accommodating medication preferences may improve adherence to mesalamine. This may require additional effort but has the potential to improve long-term management of UC.
Funding: 2024 Jiangsu Province Youth Science and Technology Talent Support Project; 2024 Yancheng Key Research and Development Plan (Social Development) projects "Research and Application of Multi Agent Offline Distributed Trust Perception Virtual Wireless Sensor Network Algorithm" and "Research and Application of a New Type of Fishery Ship Safety Production Monitoring Equipment".
Abstract: This paper mainly focuses on the velocity-constrained consensus problem of discrete-time heterogeneous multi-agent systems with nonconvex constraints and arbitrarily switching topologies, where each agent has first-order or second-order dynamics. To solve this problem, a distributed algorithm is proposed based on a contraction operator. By employing the properties of the stochastic matrix, it is shown that all agents' position states converge to a common point, and that second-order agents' velocity states remain in the corresponding nonconvex constraint sets and converge to zero, as long as the joint communication topology has one directed spanning tree. Finally, numerical simulation results are provided to verify the effectiveness of the proposed algorithms.
Funding: Project supported by the National Natural Science Foundation of China (Grant No. 72174121), the Program for Professor of Special Appointment (Eastern Scholar) at Shanghai Institutions of Higher Learning, and the Soft Science Research Project of Shanghai (Grant No. 22692112600).
Abstract: Information plays a crucial role in guiding behavioral decisions during public health emergencies. Individuals communicate to acquire relevant knowledge about an epidemic, which influences their decisions to adopt protective measures. However, whether to disseminate specific information is also a behavioral decision. In light of this understanding, we develop a coupled information–vaccination–epidemic model to depict these co-evolutionary dynamics in a three-layer network. Negative information dissemination and vaccination are treated as separate decision-making processes. We then examine the combined effects of herd and risk motives on information dissemination and vaccination decisions through the lens of game theory. The microscopic Markov chain approach (MMCA) is used to describe the dynamic process and to derive the epidemic threshold. Simulation results indicate that increasing the cost of negative information dissemination and providing timely clarification can effectively control the epidemic. Furthermore, a phenomenon of diminishing marginal utility is observed as the cost of dissemination increases, suggesting that authorities do not need to overinvest in suppressing negative information. Conversely, reducing the cost of vaccination and increasing vaccine efficacy emerge as more effective strategies for outbreak control. In addition, we find that the scale of the epidemic is greater when the herd motive dominates behavioral decision-making. In conclusion, this study provides a new perspective for understanding the complexity of epidemic spreading by starting with the construction of different behavioral decisions.
Funding: Supported by the National Key R&D Program of China (Grant No. 2022YFB2503203) and the National Natural Science Foundation of China (Grant No. U1964206).
Abstract: Decision-making of connected and automated vehicles (CAV) includes a sequence of driving maneuvers that improve safety and efficiency, and is characterized by complex scenarios, strong uncertainty, and high real-time requirements. Deep reinforcement learning (DRL) exhibits excellent real-time decision-making capability, adaptability to complex scenarios, and generalization ability. However, it is arduous to guarantee complete driving safety and efficiency under the constraints of training samples and costs. This paper proposes a Mixture of Experts (MoE) method based on Soft Actor-Critic (SAC), where the upper-level discriminator dynamically decides whether to activate the lower-level DRL expert or the heuristic expert based on the features of the input state. To further enhance the performance of the DRL expert, a buffer zone is introduced in the reward function, preemptively applying penalties before insecure situations occur. To minimize collision and off-road rates, the heuristic expert is designed using the Intelligent Driver Model (IDM) and the Minimizing Overall Braking Induced by Lane changes (MOBIL) strategy. Finally, tested in typical simulation scenarios, MoE shows a 13.75% improvement in driving efficiency compared with the traditional DRL method with continuous action space. It ensures high safety with zero collision and zero off-road rates while maintaining high adaptability.
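The Intelligent Driver Model mentioned in this abstract is a standard car-following law, and a short sketch may help show what a heuristic longitudinal expert computes. The formula below is the standard IDM; the parameter defaults and the function name `idm_acceleration` are illustrative assumptions, not values from the paper.

```python
import math


def idm_acceleration(v, gap, dv, v0=30.0, T=1.5, s0=2.0, a_max=1.0, b=1.5, delta=4.0):
    """Standard IDM acceleration (m/s^2).

    v     : ego speed (m/s)
    gap   : bumper-to-bumper distance to the leading vehicle (m), must be > 0
    dv    : approach rate, ego speed minus leader speed (m/s)
    v0    : desired speed, T: desired time headway, s0: minimum gap,
    a_max : maximum acceleration, b: comfortable deceleration, delta: exponent
    (all defaults are assumed illustrative values)
    """
    # Desired dynamic gap; the interaction term is clipped at zero, a common convention.
    s_star = s0 + max(0.0, v * T + v * dv / (2.0 * math.sqrt(a_max * b)))
    # Free-road term minus interaction term.
    return a_max * (1.0 - (v / v0) ** delta - (s_star / gap) ** 2)
```

With a very large gap the interaction term vanishes and the vehicle simply accelerates toward `v0`; with a small gap and a positive approach rate the result becomes strongly negative, which is the braking behavior a rule-based safety expert relies on.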