To address the shortcomings of the traditional Genetic Algorithm (GA) in multi-agent path planning, such as prolonged planning time, slow convergence, and solution instability, this paper proposes an Asynchronous Genetic Algorithm (AGA) to solve multi-agent path planning problems effectively. To enhance the real-time performance and computational efficiency of Multi-Agent Systems (MAS) in path planning, the AGA incorporates an Equal-Size Clustering Algorithm (ESCA) based on the K-means clustering method. The ESCA divides the primary task evenly into a series of subtasks, thereby reducing the gene length in the subsequent GA process. The algorithm then employs GA to solve each subtask sequentially. To evaluate the effectiveness of the proposed method, a simulation program was designed to perform path planning for 100 trajectories, and the results were compared with those of State-Of-The-Art (SOTA) methods. The simulation results demonstrate that, although the solutions provided by AGA are suboptimal, it exhibits significant advantages in terms of execution speed and solution stability compared to other algorithms.
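The equal-size splitting step described in this abstract can be illustrated with a minimal sketch. The abstract does not give the ESCA procedure itself, so the capacity-limited nearest-centroid assignment below is an assumption, and the function name `equal_size_clusters` and its parameters are hypothetical. It only shows how a K-means-style loop can be constrained so that no subtask exceeds `ceil(n / k)` points.

```python
import math
import random


def equal_size_clusters(points, k, iterations=10, seed=0):
    """Split points into k groups, none larger than ceil(len(points) / k).

    Illustrative only: a capacity-limited variant of the K-means loop,
    not the ESCA procedure from the paper.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    capacity = math.ceil(len(points) / k)
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        # Rank every (point, centroid) pair by distance, closest first.
        pairs = sorted(
            (math.dist(p, c), i, j)
            for i, p in enumerate(points)
            for j, c in enumerate(centroids)
        )
        assigned = set()
        for _, i, j in pairs:
            # Take the closest centroid that still has room.
            if i not in assigned and len(clusters[j]) < capacity:
                clusters[j].append(points[i])
                assigned.add(i)
        # K-means update step; an empty cluster keeps its old centroid.
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
            if cluster else centroids[j]
            for j, cluster in enumerate(clusters)
        ]
    return clusters
```

When the number of points is divisible by `k`, every group ends up with exactly the same size, which is what keeps the gene length of each GA subtask fixed.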
Shenzhen, a major city in southern China, has experienced rapid advancements in Unmanned Aerial Vehicle (UAV) technology, resulting in extensive logistics networks with thousands of daily flights. However, frequent disruptions due to its subtropical monsoon climate, including typhoons and gusty winds, present ongoing challenges. Despite the growing focus on operational costs and third-party risks, research on low-altitude urban wind fields remains scarce. This study addresses this gap by integrating wind field analysis into UAV path planning, introducing key innovations to the classical model. First, UAV wind resistance and turbulence constraints are analyzed, mapping high-wind-speed and turbulence-prone zones in the airspace. Second, wind dynamics are incorporated into path planning by considering airspeed and groundspeed variation, optimizing waypoint selection and flight speed adjustments to improve overall energy efficiency. Additionally, a wind-aware Theta* algorithm is proposed, leveraging wind vectors to expedite the search process, while Computational Fluid Dynamics (CFD) techniques are employed to calculate wind fields. A case study of Shenzhen, examining wind patterns over the past decade, demonstrates a 6.23% improvement in groundspeed and a 7.69% reduction in energy consumption compared to wind-agnostic models. This framework advances UAV logistics by enhancing route safety and energy efficiency, contributing to more cost-effective operations.
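The relation between airspeed and groundspeed that this abstract relies on can be made concrete with the standard wind-triangle calculation. This is a minimal sketch under the assumption of a steady, horizontal wind and a UAV that crabs to hold its ground track; the function name `groundspeed_along_track` is hypothetical and is not the model from the paper.

```python
import math


def groundspeed_along_track(airspeed, wind, track):
    """Groundspeed (m/s) when holding a desired ground track in steady wind.

    airspeed: speed of the UAV relative to the air mass (m/s)
    wind:     (wx, wy) wind velocity over the ground (m/s)
    track:    (tx, ty) desired direction of travel over the ground
    """
    norm = math.hypot(*track)
    tx, ty = track[0] / norm, track[1] / norm
    # Wind component along the track: tailwind (+) or headwind (-).
    along = wind[0] * tx + wind[1] * ty
    # Wind component across the track, which the UAV must cancel by crabbing.
    cross = -wind[0] * ty + wind[1] * tx
    if abs(cross) > airspeed:
        raise ValueError("crosswind exceeds airspeed; track cannot be held")
    # Airspeed left over for forward motion after cancelling the crosswind.
    return along + math.sqrt(airspeed**2 - cross**2)
```

For example, a 15 m/s airspeed with a 5 m/s tailwind gives 20 m/s over the ground, while a pure 9 m/s crosswind leaves only 12 m/s, which is why a wind-aware planner may prefer a longer route with more favourable wind.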
Traditional sampling-based path planning algorithms, such as the rapidly-exploring random tree star (RRT*), encounter critical limitations in unstructured orchard environments, including low sampling efficiency in narrow passages, slow convergence, and high computational costs. To address these challenges, this paper proposes a novel hybrid global path planning algorithm integrating Gaussian sampling and quadtree optimization (RRT*-GSQ). This methodology aims to enhance path planning by synergistically combining a Gaussian mixture sampling strategy to improve node generation in critical regions, an adaptive step-size and direction optimization mechanism for enhanced obstacle avoidance, a Quadtree-AABB collision detection framework to lower computational complexity, and a dynamic iteration control strategy for more efficient convergence. In obstacle-free and obstructed scenarios, compared with the conventional RRT*, the proposed algorithm reduced the number of node evaluations by 67.57% and 62.72%, and decreased the search time by 79.72% and 78.52%, respectively. In path tracking tests, the proposed algorithm achieved substantial reductions in RMSE of the final path compared to the conventional RRT*. Specifically, the lateral RMSE was reduced by 41.5% in obstacle-free environments and 59.3% in obstructed environments, while the longitudinal RMSE was reduced by 57.2% and 58.5%, respectively. Furthermore, the maximum absolute errors in both lateral and longitudinal directions were constrained within 0.75 m. Field validation experiments in an operational orchard confirmed the algorithm's practical effectiveness, showing reductions in the mean tracking error of 47.6% (obstacle-free) and 58.3% (obstructed), alongside a 5.1% and 7.2% shortening of the path length compared to the baseline method. The proposed algorithm effectively enhances path planning efficiency and navigation accuracy for robots, presenting a superior solution for high-precision autonomous navigation of agricultural robots in orchard environments and holding significant value for engineering applications.
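The axis-aligned bounding box (AABB) test behind the collision detection framework mentioned in this abstract can be sketched briefly. This is a generic broad-phase check, not the Quadtree-AABB framework from the paper: the quadtree indexing is omitted, and the names `AABB`, `overlaps`, `segment_box`, and `candidate_obstacles` are hypothetical.

```python
from typing import NamedTuple


class AABB(NamedTuple):
    """Axis-aligned bounding box in the plane."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def overlaps(a: AABB, b: AABB) -> bool:
    """True if the two boxes intersect (touching edges count as overlap)."""
    return (a.xmin <= b.xmax and b.xmin <= a.xmax
            and a.ymin <= b.ymax and b.ymin <= a.ymax)


def segment_box(p, q) -> AABB:
    """Smallest box containing the tree edge from p to q."""
    return AABB(min(p[0], q[0]), min(p[1], q[1]),
                max(p[0], q[0]), max(p[1], q[1]))


def candidate_obstacles(p, q, obstacles):
    """Broad phase: keep only obstacles whose box overlaps the edge's box."""
    box = segment_box(p, q)
    return [o for o in obstacles if overlaps(box, o)]
```

Only the obstacles returned by `candidate_obstacles` would need an exact segment-versus-obstacle test. A quadtree reduces the cost further by avoiding the linear scan over all obstacles.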
Multi-agent reinforcement learning (MARL) has proven its effectiveness in cooperative multi-agent systems (MASs) but still faces issues with the curse of dimensionality and learning efficiency. The main difficulty is caused by the strong inter-agent coupling embedded in an MARL problem, which is yet to be fully exploited in existing algorithms. In this work, we identify a learning graph characterizing the dependence between individual rewards and individual policies. Then we propose a graph-based reward aggregation (GRA) method, which utilizes the inherent coupling relationship among agents to eliminate redundant information. Specifically, GRA passes information among cooperating agents through graph attention networks to obtain aggregated rewards that contribute to the fitting of the value function, making each agent learn a decentralized executable cooperation policy. In addition, we propose a variant of GRA, named GRA-decen, which achieves decentralized training and decentralized execution (DTDE) when each agent only has access to information from a subset of the agents in the learning process. We conduct experiments in different environments and demonstrate the practicality and scalability of our algorithms.
In dynamic and uncertain reconnaissance missions, effective task assignment and path planning for multiple unmanned aerial vehicles (UAVs) present significant challenges. A stochastic multi-UAV reconnaissance scheduling problem is formulated as a combinatorial optimization task with nonlinear objectives and coupled constraints. To solve the non-deterministic polynomial (NP)-hard problem efficiently, a novel learning-enhanced pigeon-inspired optimization (L-PIO) algorithm is proposed. The algorithm integrates a Q-learning mechanism to dynamically regulate control parameters, enabling adaptive exploration–exploitation trade-offs across different optimization phases. Additionally, geometric abstraction techniques are employed to approximate complex reconnaissance regions using maximum inscribed rectangles and spiral path models, allowing for precise cost modeling of UAV paths. The formal objective function is developed to minimize global flight distance and completion time while maximizing reconnaissance priority and task coverage. A series of simulation experiments are conducted under three scenarios: static task allocation, dynamic task emergence, and UAV failure recovery. Comparative analysis with several updated algorithms demonstrates that L-PIO exhibits superior robustness, adaptability, and computational efficiency. The results verify the algorithm's effectiveness in addressing dynamic reconnaissance task planning in real-time multi-UAV applications.
This paper presents an adaptive multi-agent coordination (AMAC) strategy suitable for complex scenarios, which only requires information exchange between neighbouring robots. Unlike traditional multi-agent coordination methods that are solved by neural dynamics, the proposed strategy displays greater flexibility, adaptability and scalability. Furthermore, the proposed AMAC strategy is reconstructed as a time-varying complex-valued matrix equation. By introducing a dynamic error function, a fixed-time convergent zeroing neural network (FTCZNN) model is designed for the online solution of the AMAC strategy, with its convergence time upper bound derived theoretically. Finally, the effectiveness and applicability of the coordination control method are demonstrated by numerical simulations and physical experiments. Numerical results indicate that this method can reduce the formation error to the order of 10^(-6) within 1.8 s.
This paper addresses the synchronization of follower agents' state vectors with that of a leader in high-order nonlinear multi-agent systems. The proposed low-complexity control scheme employs high-gain observers to estimate higher-order synchronization errors, enabling the controller to rely solely on relative output measurements. This approach significantly reduces the dependence on full-state information, which is often infeasible or costly in practical engineering applications. An output feedback control strategy is developed to overcome these limitations while ensuring robust and effective synchronization. Simulation results are provided to demonstrate the effectiveness of the proposed approach and validate the theoretical findings.
Multi-Agent Systems (MAS), which consist of multiple interacting agents, are crucial in Cyber-Physical Systems (CPS), because they improve system adaptability, efficiency, and robustness through parallel processing and collaboration. However, most existing unsupervised meta-learning methods are centralized and not suitable for multi-agent systems, where data are stored in a distributed manner and are not accessible to all agents. Meta-GMVAE, based on the Variational Autoencoder (VAE) and set-level variational inference, represents a sophisticated unsupervised meta-learning model that improves generative performance by efficiently learning data representations across various tasks, increasing adaptability and reducing sample requirements. Inspired by these advancements, we propose a novel Distributed Unsupervised Meta-Learning (DUML) framework based on Meta-GMVAE and a fusion strategy. Furthermore, we present a DUML algorithm based on the Gaussian Mixture Model (DUMLGMM), where the parameters of the Gaussian mixture are solved by an Expectation-Maximization algorithm. Simulations on the Omniglot and Mini-ImageNet datasets show that DUMLGMM can achieve the performance of the corresponding centralized algorithm and outperform the non-cooperative algorithm.
This paper focuses on the leader-following positive consensus problems of heterogeneous switched multi-agent systems. First, a state-feedback controller with dynamic compensation is introduced to achieve positive consensus under average dwell time switching. Then sufficient conditions are derived to guarantee the positive consensus. The gain matrices of the control protocol are described using a matrix decomposition approach and the corresponding computational complexity is reduced by resorting to linear programming and co-positive Lyapunov functions. Finally, two numerical examples are provided to illustrate the results obtained.
In recent years, researchers have leveraged single-agent reinforcement learning to boost educational outcomes and deliver personalized interventions; yet this paradigm provides no capacity for inter-agent interaction. Multi-agent reinforcement learning (MARL) overcomes this limitation by allowing several agents to learn simultaneously within a shared environment, each choosing actions that maximize its own or the group's rewards. By explicitly modeling and exploiting agent-to-agent dynamics, MARL can align those interactions with pedagogical goals such as peer tutoring, collaborative problem-solving, or gamified competition, thus opening richer avenues for adaptive and socially informed learning experiences. This survey investigates the impact of MARL on educational outcomes by examining evidence of its effectiveness in enhancing learner performance, engagement, and equity, and in reducing teacher workload, compared to single-agent or traditional approaches. It explores the educational domains and pedagogical problems addressed by MARL, identifies the algorithmic families used, and analyzes their influence on learning. The review also assesses experimental settings and evaluation metrics to determine ecological validity, and outlines current challenges and future research directions in applying MARL to education.
With the advent of sixth-generation mobile communications (6G), space-air-ground integrated networks have become mainstream. This paper focuses on collaborative scheduling for mobile edge computing (MEC) under a three-tier heterogeneous architecture composed of mobile devices, unmanned aerial vehicles (UAVs), and macro base stations (BSs). This scenario typically faces fast channel fading, dynamic computational loads, and energy constraints, whereas classical queuing-theoretic or convex-optimization approaches struggle to yield robust solutions in highly dynamic settings. To address this issue, we formulate a multi-agent Markov decision process (MDP) for an air-ground-fused MEC system, unify link selection, bandwidth/power allocation, and task offloading into a continuous action space, and propose a joint scheduling strategy that is based on an improved MATD3 algorithm. The improvements include Alternating Layer Normalization (ALN) in the actor to suppress gradient variance, Residual Orthogonalization (RO) in the critic to reduce the correlation between the twin Q-value estimates, and a dynamic-temperature reward to enable adaptive trade-offs during training. On a multi-user, dual-link simulation platform, we conduct ablation and baseline comparisons. The results reveal that the proposed method has better convergence and stability. Compared with MADDPG, TD3, and DSAC, our algorithm achieves more robust performance across key metrics.
To maximize the profits of power grid operators (GOs), load aggregators (LAs) and electricity customers (ECs), this paper proposes a hierarchical demand response (HDR) framework that considers competing interaction based on multi-agent deep deterministic policy gradient (MaDDPG). The ECs are divided into conventional ECs and electric vehicles (EVs), which are managed by an EC agent (ECA) and an EV agent (EVA), respectively, to exploit the flexibility of the HDR framework. Thus, the HDR is a tri-layer model determined by five types of agents engaging in competing interaction to maximize their own profits. To address the limitations of mathematical expression and participation scale in the Stackelberg game within the HDR model, a dynamic interaction mechanism is adopted. Moreover, to tackle the HDR involving various entities, the MaDDPG develops multiple agents to simulate the dynamic competing interactions between the subjects as well as to solve the problem of continuous action control. Furthermore, MaDDPG adopts soft target updates and prioritized experience replay to ensure stable and effective training, and makes the exploration strategy comprehensive by using exploration noise. Simulation studies are conducted to verify the performance of the MaDDPG with the dynamic interaction mechanism in dealing with multilayer multi-agent continuous action control, compared to the double deep Q network (DDQN), deep Q network (DQN) and dueling DQN. Additionally, comparisons of the proposed HDR with price-based DR (PBDR) and incentive-based DR (IBDR) are analyzed to investigate the flexibility of the HDR.
Q-learning is a classical reinforcement learning method with broad applicability. It can respond effectively to environmental changes and provide flexible strategies, making it suitable for solving robot path-planning problems. However, Q-learning faces challenges in search and update efficiency. To address these issues, we propose an improved Q-learning (IQL) algorithm. We use an enhanced Ant Colony Optimization (ACO) algorithm to optimize Q-table initialization. We also introduce the UCH mechanism to refine the reward function and overcome the exploration dilemma. The IQL algorithm is extensively tested in three grid environments of different scales. The results validate the accuracy of the method and demonstrate superior path-planning performance compared to traditional approaches. The algorithm reduces the number of trials required for convergence, improves learning efficiency, and enables faster adaptation to environmental changes. It also enhances stability and accuracy by reducing the standard deviation of trials to zero. On grid maps of different sizes, IQL achieves higher expected returns. Compared with the original Q-learning algorithm, IQL improves performance by 12.95%, 18.28%, and 7.98% on 10×10, 20×20, and 30×30 maps, respectively. The proposed algorithm has promising applications in robotics, path planning, intelligent transportation, aerospace, and game development.
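For reference, the baseline that this abstract improves on is the standard tabular Q-learning update, Q(s, a) ← Q(s, a) + α [r + γ max_a' Q(s', a') − Q(s, a)]. The sketch below shows only that classical rule with a zero-initialized table; the ACO-based initialization and the UCH reward refinement from the paper are not reproduced, and the function name `q_update` is hypothetical.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions,
             alpha=0.1, gamma=0.9):
    """Apply one classical Q-learning step and return the new Q(s, a).

    q is a mapping from (state, action) to value; alpha is the learning
    rate and gamma the discount factor.
    """
    # Greedy estimate of the value of the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    # Move the current estimate a fraction alpha towards the target.
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Zero-initialized table (an assumption; the paper initializes it with ACO).
q_table = defaultdict(float)
```

With `alpha=0.5`, a first reward of 1.0 moves a zero entry to 0.5, which shows why a better-informed initial table can cut the number of trials needed for convergence.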
This paper investigates the consensus tracking control problem for high-order nonlinear multi-agent systems subject to non-affine faults, partially measurable states, uncertain control coefficients, and unknown external disturbances. Under directed topology conditions, an observer-based finite-time control strategy based on adaptive backstepping is proposed, in which a neural network-based state observer is employed to approximate the unmeasurable system state variables. To address the complexity explosion problem associated with the backstepping method, a finite-time command filter is incorporated, with error compensation signals designed to mitigate the filter-induced errors. Additionally, a Butterworth low-pass filter is introduced to avoid the algebraic loop problem in the design of the controller. The finite-time stability of the closed-loop system is rigorously analyzed with the finite-time Lyapunov stability criterion, validating that all closed-loop signals of the system remain bounded within a finite time. Finally, the effectiveness of the proposed control strategy is verified through a simulation example.
Addressing optimal confrontation methods in multi-agent attack-defense scenarios is a complex challenge. Multi-Agent Reinforcement Learning (MARL) provides an effective framework for tackling sequential decision-making problems, significantly enhancing swarm intelligence in maneuvering. However, applying MARL to unmanned swarms presents two primary challenges. First, defensive agents must balance autonomy with collaboration under limited perception while coordinating against adversaries. Second, current algorithms aim to maximize global or individual rewards, making them sensitive to fluctuations in enemy strategies and environmental changes, especially when rewards are sparse. To tackle these issues, we propose an algorithm of Multi-Agent Reinforcement Learning with Layered Autonomy and Collaboration (MARL-LAC) for collaborative confrontations. This algorithm integrates dual twin Critics to mitigate the high variance associated with policy gradients. Furthermore, MARL-LAC employs layered autonomy and collaboration to address multi-objective problems, specifically learning a global reward function for the swarm alongside local reward functions for individual defensive agents. Experimental results demonstrate that MARL-LAC enhances decision-making and collaborative behaviors among agents, outperforming the existing algorithms and emphasizing the importance of layered autonomy and collaboration in multi-agent systems. The observed adversarial behaviors demonstrate that agents using MARL-LAC effectively maintain cohesive formations that conceal their intentions by confusing the offensive agent while successfully encircling the target.
With the expanding applications of unmanned aerial vehicles (UAVs), precise flight evaluation has emerged as a critical enabler for efficient path planning, directly impacting operational performance and safety. Traditional path planning algorithms typically combine Dubins curves with local optimization to minimize trajectory length under 3D spatial constraints. However, these methods often overlook the correlation between pilot control quality and UAV flight dynamics, limiting their adaptability in complex scenarios. In this paper, we propose an intelligent flight evaluation model specifically designed to enhance multi-waypoint trajectory optimization algorithms. Our model leverages a decision tree to integrate attitude parameters and trajectory matching metrics, establishing a quantitative link between pilot control quality and UAV flight states. Experimental results demonstrate that the proposed model not only accurately assesses pilot performance across diverse skill levels but also improves the optimality of generated trajectories. When integrated with our path planning algorithm, it efficiently produces optimal trajectories while strictly adhering to UAV flight constraints. This integrated framework highlights significant potential for real-time UAV training, performance assessment, and adaptive mission planning applications.
Rapidly-exploring Random Tree (RRT) and its variants have become foundational in path-planning research, yet in complex three-dimensional off-road environments their uniform blind sampling and limited safety guarantees lead to slow convergence and force an unfavorable trade-off between path quality and traversal safety. To address these challenges, we introduce HS-APF-RRT*, a novel algorithm that fuses layered sampling, an enhanced Artificial Potential Field (APF), and a dynamic neighborhood-expansion mechanism. First, the workspace is hierarchically partitioned into macro, meso, and micro sampling layers, progressively biasing random samples toward safer, lower-energy regions. Second, we augment the traditional APF by incorporating a slope-dependent repulsive term, enabling stronger avoidance of steep obstacles. Third, a dynamic expansion strategy adaptively switches between 8- and 16-connected neighborhoods based on local obstacle density, striking an effective balance between search efficiency and collision-avoidance precision. In simulated off-road scenarios, HS-APF-RRT* is benchmarked against RRT*, Goal-Biased RRT*, and APF-RRT*, and demonstrates significantly faster convergence, lower path-energy consumption, and enhanced safety margins.
Multimodal dialogue systems often fail to maintain coherent reasoning over extended conversations and suffer from hallucination due to limited context modeling capabilities. Current approaches struggle with cross-modal alignment, temporal consistency, and robust handling of noisy or incomplete inputs across multiple modalities. We propose Multi Agent-Chain of Thought (CoT), a novel multi-agent chain-of-thought reasoning framework where specialized agents for text, vision, and speech modalities collaboratively construct shared reasoning traces through inter-agent message passing and consensus voting mechanisms. Our architecture incorporates self-reflection modules, conflict resolution protocols, and dynamic rationale alignment to enhance consistency, factual accuracy, and user engagement. The framework employs a hierarchical attention mechanism with cross-modal fusion and implements adaptive reasoning depth based on dialogue complexity. Comprehensive evaluations on Situated Interactive Multi-Modal Conversations (SIMMC) 2.0, VisDial v1.0, and newly introduced challenging scenarios demonstrate statistically significant improvements in grounding accuracy (p<0.01), chain-of-thought interpretability, and robustness to adversarial inputs compared to state-of-the-art monolithic transformer baselines and existing multi-agent approaches.
Environmental problems are intensifying due to the rapid growth of the population, industry, and urban infrastructure. This expansion has resulted in increased air and water pollution, intensified urban heat island effects, and greater runoff from parks and other green spaces. Addressing these challenges requires prioritizing green infrastructure and other sustainable urban development strategies. This study introduces a novel Integrated Decision Support System that combines Pythagorean Fuzzy Sets with the Advanced Alternative Ranking Order Method allowing for Two-Step Normalization (AAROM-TN), enhanced by a dual weighting strategy. The weighting approach integrates the Criteria Importance Through Intercriteria Correlation (CRITIC) method with the Criteria Importance through Means and Standard Deviation (CIMAS) technique. The originality of the proposed framework lies in its ability to objectively quantify criteria importance using CRITIC, incorporate decision-makers' preferences through CIMAS, and capture the uncertainty and hesitation inherent in human judgment via Pythagorean Fuzzy Sets. A case study evaluating green infrastructure alternatives in metropolitan regions demonstrates the applicability and effectiveness of the framework. A sensitivity analysis is conducted to examine how variations in criteria weights affect the rankings and to evaluate the robustness of the results. Furthermore, a comparative analysis highlights the practical and financial implications of each alternative by assessing their respective strengths and weaknesses.
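The objective weighting step named in this abstract can be illustrated with the classical crisp form of CRITIC, in which each criterion's information content is its standard deviation multiplied by the sum of (1 − correlation) with every other criterion. The sketch below makes several assumptions: all criteria are benefit-type, min-max normalization and the population standard deviation are used, and no criterion is constant across alternatives. The Pythagorean fuzzy extension and the CIMAS combination from the paper are not reproduced, and the function name `critic_weights` is hypothetical.

```python
import math


def critic_weights(matrix):
    """Classical CRITIC weights for a crisp decision matrix.

    matrix[i][j] is the score of alternative i on benefit-type criterion j.
    Assumes no criterion is constant across the alternatives.
    """
    cols = list(zip(*matrix))
    n, m = len(matrix), len(cols)
    # Min-max normalize each criterion to [0, 1].
    norm = [[(v - min(c)) / (max(c) - min(c)) for v in c] for c in cols]
    means = [sum(c) / n for c in norm]
    # Population standard deviation: contrast intensity of each criterion.
    stds = [math.sqrt(sum((v - mu) ** 2 for v in c) / n)
            for c, mu in zip(norm, means)]

    def corr(a, b):
        """Pearson correlation between normalized criteria a and b."""
        cov = sum((x - means[a]) * (y - means[b])
                  for x, y in zip(norm[a], norm[b])) / n
        return cov / (stds[a] * stds[b])

    # Information content: contrast times conflict with the other criteria.
    info = [stds[j] * sum(1 - corr(j, k) for k in range(m)) for j in range(m)]
    total = sum(info)
    return [c / total for c in info]
```

A criterion that varies strongly across alternatives and disagrees with the other criteria receives a larger weight, which is the sense in which CRITIC quantifies importance objectively.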
This study examines the methods used over the years to plan the development of offshore oilfields and to support the associated decision-making. About 100 papers are analysed and categorised into different groups of main early-stage decisions. The present study stands in contrast to the contributions of the operations research and system engineering review articles, on the one hand, and the petroleum engineering review articles, on the other. This is because it does not focus on one methodological approach, nor does it limit the literature analysis by offshore oilfield characteristics. Consequently, the present analysis may offer valuable insights, for instance, by identifying environmental planning decisions as a recent yet highly significant concern that is currently being imposed on the decision-making process. Thus, it is evident that the incorporation of safety criteria within the technical-economic decision-making process for the design of production systems would be a crucial requirement at the development phase.
Funding: Supported by the National Natural Science Foundation of China (No. U2433214).
Abstract: Shenzhen, a major city in southern China, has experienced rapid advancements in Unmanned Aerial Vehicle (UAV) technology, resulting in extensive logistics networks with thousands of daily flights. However, frequent disruptions due to its subtropical monsoon climate, including typhoons and gusty winds, present ongoing challenges. Despite the growing focus on operational costs and third-party risks, research on low-altitude urban wind fields remains scarce. This study addresses this gap by integrating wind field analysis into UAV path planning, introducing key innovations to the classical model. First, UAV wind resistance and turbulence constraints are analyzed, mapping high-wind-speed and turbulence-prone zones in the airspace. Second, wind dynamics are incorporated into path planning by considering airspeed and groundspeed variation, optimizing waypoint selection and flight speed adjustments to improve overall energy efficiency. Additionally, a wind-aware Theta* algorithm is proposed, leveraging wind vectors to expedite the search process, while Computational Fluid Dynamics (CFD) techniques are employed to calculate wind fields. A case study of Shenzhen, examining wind patterns over the past decade, demonstrates a 6.23% improvement in groundspeed and a 7.69% reduction in energy consumption compared to wind-agnostic models. This framework advances UAV logistics by enhancing route safety and energy efficiency, contributing to more cost-effective operations.
Funding: National Natural Science Foundation of China (32301712); Natural Science Foundation of Jiangsu Province (BK20230548, BK20250876); Project of Faculty of Agricultural Equipment of Jiangsu University (NGXB20240203); A Project Funded by the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD-2023-87); Open Funding Project of the Key Laboratory of Modern Agricultural Equipment and Technology (Jiangsu University), Ministry of Education (MAET202101).
Abstract: Traditional sampling-based path planning algorithms, such as the rapidly-exploring random tree star (RRT*), encounter critical limitations in unstructured orchard environments, including low sampling efficiency in narrow passages, slow convergence, and high computational costs. To address these challenges, this paper proposes a novel hybrid global path planning algorithm integrating Gaussian sampling and quadtree optimization (RRT*-GSQ). The method combines a Gaussian mixture sampling strategy to improve node generation in critical regions, an adaptive step-size and direction optimization mechanism for enhanced obstacle avoidance, a Quadtree-AABB collision detection framework to lower computational complexity, and a dynamic iteration control strategy for more efficient convergence. In obstacle-free and obstructed scenarios, compared with the conventional RRT*, the proposed algorithm reduced the number of node evaluations by 67.57% and 62.72%, and decreased the search time by 79.72% and 78.52%, respectively. In path tracking tests, the proposed algorithm achieved substantial reductions in the RMSE of the final path compared to the conventional RRT*. Specifically, the lateral RMSE was reduced by 41.5% in obstacle-free environments and 59.3% in obstructed environments, while the longitudinal RMSE was reduced by 57.2% and 58.5%, respectively. Furthermore, the maximum absolute errors in both lateral and longitudinal directions were constrained within 0.75 m. Field validation experiments in an operational orchard confirmed the algorithm's practical effectiveness, showing reductions in the mean tracking error of 47.6% (obstacle-free) and 58.3% (obstructed), alongside a 5.1% and 7.2% shortening of the path length compared to the baseline method. The proposed algorithm effectively enhances path planning efficiency and navigation accuracy for robots, presenting a superior solution for high-precision autonomous navigation of agricultural robots in orchard environments and holding significant value for engineering applications.
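Axis-aligned bounding box (AABB) checks are cheap because each axis can be tested independently. The sketch below shows only the basic primitive, a 2D segment-versus-box slab test, and does not reproduce the paper's Quadtree-AABB framework; the function name `segment_hits_aabb` and the tuple-based point format are assumptions for illustration.

```python
def segment_hits_aabb(p0, p1, box_min, box_max):
    """Slab test: True if the segment p0 -> p1 touches the axis-aligned box.

    Hypothetical helper. Points and box corners are (x, y) tuples.
    """
    t_enter, t_exit = 0.0, 1.0           # parameter range of the segment
    for axis in range(2):
        d = p1[axis] - p0[axis]
        if d == 0.0:
            # Segment is parallel to this slab: it must start inside it.
            if p0[axis] < box_min[axis] or p0[axis] > box_max[axis]:
                return False
            continue
        t0 = (box_min[axis] - p0[axis]) / d
        t1 = (box_max[axis] - p0[axis]) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_enter = max(t_enter, t0)
        t_exit = min(t_exit, t1)
        if t_enter > t_exit:             # slab intervals do not overlap
            return False
    return True


print(segment_hits_aabb((0, 0), (4, 4), (1, 1), (2, 2)))
print(segment_hits_aabb((0, 3), (4, 3), (1, 1), (2, 2)))
```

In a tree-based planner, a spatial index such as a quadtree would typically narrow the candidate boxes first, so that this test runs only against obstacles near the new edge.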
Funding: Supported in part by the National Natural Science Foundation of China (grants 62203073 and 62573068) and the Natural Science Foundation of Chongqing, China (grant CSTB2022NSCQMSX0577).
Abstract: Multi-agent reinforcement learning (MARL) has proven its effectiveness in cooperative multi-agent systems (MASs) but still faces challenges with the curse of dimensionality and learning efficiency. The main difficulty is caused by the strong inter-agent coupling embedded in an MARL problem, which is yet to be fully exploited in existing algorithms. In this work, we identify a learning graph characterizing the dependence between individual rewards and individual policies. We then propose a graph-based reward aggregation (GRA) method, which utilizes the inherent coupling relationship among agents to eliminate redundant information. Specifically, GRA passes information among cooperating agents through graph attention networks to obtain aggregated rewards that contribute to the fitting of the value function, so that each agent learns a cooperative policy that can be executed in a decentralized manner. In addition, we propose a variant of GRA, named GRA-decen, which achieves decentralized training and decentralized execution (DTDE) when each agent only has access to information from a subset of agents in the learning process. We conduct experiments in different environments and demonstrate the practicality and scalability of our algorithms.
Funding: Supported by the National Natural Science Foundation of China (Nos. T2121003, U24B20156) and the Open Fund of the National Key Laboratory of Helicopter Aeromechanics (No. 2024-ZSJ-LB-02-06).
Abstract: In dynamic and uncertain reconnaissance missions, effective task assignment and path planning for multiple unmanned aerial vehicles (UAVs) present significant challenges. A stochastic multi-UAV reconnaissance scheduling problem is formulated as a combinatorial optimization task with nonlinear objectives and coupled constraints. To solve this non-deterministic polynomial (NP)-hard problem efficiently, a novel learning-enhanced pigeon-inspired optimization (L-PIO) algorithm is proposed. The algorithm integrates a Q-learning mechanism to dynamically regulate control parameters, enabling adaptive exploration–exploitation trade-offs across different optimization phases. Additionally, geometric abstraction techniques are employed to approximate complex reconnaissance regions using maximum inscribed rectangles and spiral path models, allowing for precise cost modeling of UAV paths. A formal objective function is developed to minimize global flight distance and completion time while maximizing reconnaissance priority and task coverage. A series of simulation experiments is conducted under three scenarios: static task allocation, dynamic task emergence, and UAV failure recovery. Comparative analysis with several recent algorithms demonstrates that L-PIO exhibits superior robustness, adaptability, and computational efficiency. The results verify the algorithm's effectiveness in addressing dynamic reconnaissance task planning in real-time multi-UAV applications.
Funding: Supported by the National Natural Science Foundation of China under Grants 61962023, 61562029 and 62466019.
Abstract: This paper presents an adaptive multi-agent coordination (AMAC) strategy suitable for complex scenarios, which only requires information exchange between neighbouring robots. Unlike traditional multi-agent coordination methods that are solved by neural dynamics, the proposed strategy displays greater flexibility, adaptability and scalability. Furthermore, the proposed AMAC strategy is reformulated as a time-varying complex-valued matrix equation. By introducing a dynamic error function, a fixed-time convergent zeroing neural network (FTCZNN) model is designed for the online solution of the AMAC strategy, with its convergence time upper bound derived theoretically. Finally, the effectiveness and applicability of the coordination control method are demonstrated by numerical simulations and physical experiments. Numerical results indicate that this method can reduce the formation error to the order of 10⁻⁶ within 1.8 s.
Abstract: This paper addresses the synchronization of follower agents' state vectors with that of a leader in high-order nonlinear multi-agent systems. The proposed low-complexity control scheme employs high-gain observers to estimate higher-order synchronization errors, enabling the controller to rely solely on relative output measurements. This approach significantly reduces the dependence on full-state information, which is often infeasible or costly in practical engineering applications. An output feedback control strategy is developed to overcome these limitations while ensuring robust and effective synchronization. Simulation results are provided to demonstrate the effectiveness of the proposed approach and validate the theoretical findings.
Funding: Supported by the National Natural Science Foundation of China Youth Fund (No. 62101579).
Abstract: Multi-Agent Systems (MAS), which consist of multiple interacting agents, are crucial in Cyber-Physical Systems (CPS) because they improve system adaptability, efficiency, and robustness through parallel processing and collaboration. However, most existing unsupervised meta-learning methods are centralized and not suitable for multi-agent systems, where data are stored in a distributed manner and are not accessible to all agents. Meta-GMVAE, based on the Variational Autoencoder (VAE) and set-level variational inference, is a sophisticated unsupervised meta-learning model that improves generative performance by efficiently learning data representations across various tasks, increasing adaptability and reducing sample requirements. Inspired by these advancements, we propose a novel Distributed Unsupervised Meta-Learning (DUML) framework based on Meta-GMVAE and a fusion strategy. Furthermore, we present a DUML algorithm based on the Gaussian Mixture Model (DUMLGMM), where the parameters of the Gaussian mixture are solved by an Expectation-Maximization algorithm. Simulations on the Omniglot and Mini-ImageNet datasets show that DUMLGMM can match the performance of the corresponding centralized algorithm and outperform the non-cooperative algorithm.
Funding: Supported by the National Natural Science Foundation of China (62463007, 62463005); the Natural Science Foundation of Hainan Province (625RC710, 625MS047); the System Control and Information Processing Education Ministry Key Laboratory Open Funding, China (Scip20240119); and the Science Research Funding of Hainan University, China (KYQD(ZR)22180, KYQD(ZR)23180).
Abstract: This paper focuses on the leader-following positive consensus problems of heterogeneous switched multi-agent systems. First, a state-feedback controller with dynamic compensation is introduced to achieve positive consensus under average dwell time switching. Then sufficient conditions are derived to guarantee the positive consensus. The gain matrices of the control protocol are described using a matrix decomposition approach, and the corresponding computational complexity is reduced by resorting to linear programming and co-positive Lyapunov functions. Finally, two numerical examples are provided to illustrate the results obtained.
Abstract: In recent years, researchers have leveraged single-agent reinforcement learning to boost educational outcomes and deliver personalized interventions; yet this paradigm provides no capacity for inter-agent interaction. Multi-agent reinforcement learning (MARL) overcomes this limitation by allowing several agents to learn simultaneously within a shared environment, each choosing actions that maximize its own or the group's rewards. By explicitly modeling and exploiting agent-to-agent dynamics, MARL can align those interactions with pedagogical goals such as peer tutoring, collaborative problem-solving, or gamified competition, thus opening richer avenues for adaptive and socially informed learning experiences. This survey investigates the impact of MARL on educational outcomes by examining evidence of its effectiveness in enhancing learner performance, engagement, and equity, and in reducing teacher workload, compared to single-agent or traditional approaches. It explores the educational domains and pedagogical problems addressed by MARL, identifies the algorithmic families used, and analyzes their influence on learning. The review also assesses experimental settings and evaluation metrics to determine ecological validity, and outlines current challenges and future research directions in applying MARL to education.
Abstract: With the advent of sixth-generation mobile communications (6G), space-air-ground integrated networks have become mainstream. This paper focuses on collaborative scheduling for mobile edge computing (MEC) under a three-tier heterogeneous architecture composed of mobile devices, unmanned aerial vehicles (UAVs), and macro base stations (BSs). This scenario typically faces fast channel fading, dynamic computational loads, and energy constraints, whereas classical queuing-theoretic or convex-optimization approaches struggle to yield robust solutions in highly dynamic settings. To address this issue, we formulate a multi-agent Markov decision process (MDP) for an air-ground-fused MEC system, unify link selection, bandwidth/power allocation, and task offloading into a continuous action space, and propose a joint scheduling strategy based on an improved MATD3 algorithm. The improvements include Alternating Layer Normalization (ALN) in the actor to suppress gradient variance, Residual Orthogonalization (RO) in the critic to reduce the correlation between the twin Q-value estimates, and a dynamic-temperature reward to enable adaptive trade-offs during training. On a multi-user, dual-link simulation platform, we conduct ablation and baseline comparisons. The results reveal that the proposed method has better convergence and stability. Compared with MADDPG, TD3, and DSAC, our algorithm achieves more robust performance across key metrics.
Funding: Supported by the National Natural Science Foundation of China (No. 52477097), the Guangdong Basic and Applied Basic Research Foundation (2023A1515240014), and the State Key Laboratory of Advanced Electromagnetic Technology (Grant No. AET 2024KF005).
Abstract: To maximize the profits of power grid operators (GOs), load aggregators (LAs) and electricity customers (ECs), this paper proposes a hierarchical demand response (HDR) framework that considers competing interaction based on the multi-agent deep deterministic policy gradient (MaDDPG). The ECs are divided into conventional ECs and electric vehicles (EVs), which are managed by an ECs agent (ECA) and an EV agent (EVA), respectively, to exploit the flexibility of the HDR framework. Thus, the HDR is a tri-layer model determined by five types of agents engaging in competing interaction to maximize their own profits. To address the limitations of mathematical expression and participation scale in the Stackelberg game within the HDR model, a dynamic interaction mechanism is adopted. Moreover, to handle an HDR involving various entities, the MaDDPG uses multiple agents to simulate the dynamic competing interactions between the subjects and to solve the problem of continuous action control. Furthermore, MaDDPG adopts soft target updates and prioritized experience replay to ensure stable and effective training, and uses exploration noise to make the exploration strategy comprehensive. Simulation studies are conducted to verify the performance of the MaDDPG with the dynamic interaction mechanism in multilayer multi-agent continuous action control, compared to the double deep Q network (DDQN), deep Q network (DQN) and dueling DQN. Additionally, the proposed HDR is compared with price-based DR (PBDR) and incentive-based DR (IBDR) to investigate the flexibility of the HDR.
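The soft target update used in DDPG-style methods blends the target network slowly toward the online network instead of copying it outright. The sketch below shows the standard rule on plain Python lists; the function name `soft_update` and the value of `tau` are assumptions for illustration and are not taken from the paper.

```python
def soft_update(target, online, tau=0.01):
    """Polyak averaging: target <- (1 - tau) * target + tau * online.

    Hypothetical helper operating on flat lists of parameters. A small
    tau makes the target network change slowly, which stabilises the
    bootstrapped critic targets.
    """
    return [(1.0 - tau) * t + tau * o for t, o in zip(target, online)]


target_params = [0.0, 1.0]
online_params = [1.0, 1.0]
target_params = soft_update(target_params, online_params, tau=0.1)
print(target_params)
```

Repeating the update moves the first parameter geometrically toward the online value, while a parameter that already matches stays where it is.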
Funding: Financial support from the National Natural Science Foundation of China (Grant Nos. 52374123 and 51974144), the Project of Liaoning Provincial Department of Education (Grant No. LJKZ0340), and the Liaoning Revitalization Talents Program (Grant No. XLYC2211085) is gratefully acknowledged.
Abstract: Q-learning is a classical reinforcement learning method with broad applicability. It can respond effectively to environmental changes and provide flexible strategies, making it suitable for solving robot path-planning problems. However, Q-learning faces challenges in search and update efficiency. To address these issues, we propose an improved Q-learning (IQL) algorithm. We use an enhanced Ant Colony Optimization (ACO) algorithm to optimize Q-table initialization. We also introduce the UCH mechanism to refine the reward function and overcome the exploration dilemma. The IQL algorithm is extensively tested in three grid environments of different scales. The results validate the accuracy of the method and demonstrate superior path-planning performance compared to traditional approaches. The algorithm reduces the number of trials required for convergence, improves learning efficiency, and enables faster adaptation to environmental changes. It also enhances stability and accuracy by reducing the standard deviation of trials to zero. On grid maps of different sizes, IQL achieves higher expected returns. Compared with the original Q-learning algorithm, IQL improves performance by 12.95%, 18.28%, and 7.98% on 10×10, 20×20, and 30×30 maps, respectively. The proposed algorithm has promising applications in robotics, path planning, intelligent transportation, aerospace, and game development.
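For reference, the baseline that IQL is compared against rests on the standard tabular Q-learning update, sketched below. The code does not reproduce the ACO-based initialization or the UCH mechanism from the abstract; the function name `q_update`, the state and action labels, and the values of `alpha` and `gamma` are assumptions for illustration.

```python
from collections import defaultdict


def q_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """One tabular step: Q(s,a) += alpha * (r + gamma * max_b Q(s',b) - Q(s,a)).

    Hypothetical helper. Q maps (state, action) pairs to values and
    defaults to 0.0 for unseen pairs.
    """
    best_next = max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
    return Q[(s, a)]


Q = defaultdict(float)
Q[("s1", "right")] = 1.0          # pretend s1 is already known to be good
value = q_update(Q, "s0", "right", 0.0, "s1", ["left", "right"],
                 alpha=0.5, gamma=0.9)
print(value)
```

Because the table here starts at zero everywhere except one entry, early updates carry little information; a heuristic initialization of the table is one way to shorten that phase.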
Funding: Supported in part by the Beijing Natural Science Foundation under Grant 4252050, in part by the National Science Fund for Distinguished Young Scholars under Grant 62425304, and in part by the Basic Science Center Programs of NSFC under Grant 62088101.
Abstract: This paper investigates the consensus tracking control problem for high-order nonlinear multi-agent systems subject to non-affine faults, partially measurable states, uncertain control coefficients, and unknown external disturbances. Under directed topology conditions, an observer-based finite-time control strategy based on adaptive backstepping is proposed, in which a neural network-based state observer is employed to estimate the unmeasurable system state variables. To address the complexity explosion problem associated with the backstepping method, a finite-time command filter is incorporated, with error compensation signals designed to mitigate the filter-induced errors. Additionally, a Butterworth low-pass filter is introduced to avoid the algebraic loop problem in the design of the controller. The finite-time stability of the closed-loop system is rigorously analyzed with the finite-time Lyapunov stability criterion, showing that all closed-loop signals of the system remain bounded within a finite time. Finally, the effectiveness of the proposed control strategy is verified through a simulation example.
Funding: Co-supported by the National Natural Science Foundation of China (Nos. 72371052 and 71871042).
Abstract: Addressing optimal confrontation methods in multi-agent attack-defense scenarios is a complex challenge. Multi-Agent Reinforcement Learning (MARL) provides an effective framework for tackling sequential decision-making problems, significantly enhancing swarm intelligence in maneuvering. However, applying MARL to unmanned swarms presents two primary challenges. First, defensive agents must balance autonomy with collaboration under limited perception while coordinating against adversaries. Second, current algorithms aim to maximize global or individual rewards, making them sensitive to fluctuations in enemy strategies and environmental changes, especially when rewards are sparse. To tackle these issues, we propose Multi-Agent Reinforcement Learning with Layered Autonomy and Collaboration (MARL-LAC), an algorithm for collaborative confrontations. This algorithm integrates dual twin Critics to mitigate the high variance associated with policy gradients. Furthermore, MARL-LAC employs layered autonomy and collaboration to address multi-objective problems, specifically learning a global reward function for the swarm alongside local reward functions for individual defensive agents. Experimental results demonstrate that MARL-LAC enhances decision-making and collaborative behaviors among agents, outperforming the existing algorithms and emphasizing the importance of layered autonomy and collaboration in multi-agent systems. The observed adversarial behaviors demonstrate that agents using MARL-LAC effectively maintain cohesive formations that conceal their intentions by confusing the offensive agent while successfully encircling the target.
Funding: Funded in part by the Fundamental Research Funds for the Central Universities under Grant NS2023052 and in part by the Natural Science Foundation of Jiangsu Province of China under Grants No. BK20231439 and No. BK20222012.
Abstract: With the expanding applications of unmanned aerial vehicles (UAVs), precise flight evaluation has emerged as a critical enabler for efficient path planning, directly impacting operational performance and safety. Traditional path planning algorithms typically combine Dubins curves with local optimization to minimize trajectory length under 3D spatial constraints. However, these methods often overlook the correlation between pilot control quality and UAV flight dynamics, limiting their adaptability in complex scenarios. In this paper, we propose an intelligent flight evaluation model specifically designed to enhance multi-waypoint trajectory optimization algorithms. Our model leverages a decision tree to integrate attitude parameters and trajectory matching metrics, establishing a quantitative link between pilot control quality and UAV flight states. Experimental results demonstrate that the proposed model not only accurately assesses pilot performance across diverse skill levels but also improves the optimality of generated trajectories. When integrated with our path planning algorithm, it efficiently produces optimal trajectories while strictly adhering to UAV flight constraints. This integrated framework highlights significant potential for real-time UAV training, performance assessment, and adaptive mission planning applications.
Funding: Supported in part by the 14th Five-Year National Key R&D Program Project (Project Number: 2023YFB3211001) and the National Natural Science Foundation of China (62273339, U24A201397).
Abstract: Rapidly-exploring Random Tree (RRT) and its variants have become foundational in path-planning research, yet in complex three-dimensional off-road environments their uniform blind sampling and limited safety guarantees lead to slow convergence and force an unfavorable trade-off between path quality and traversal safety. To address these challenges, we introduce HS-APF-RRT*, a novel algorithm that fuses layered sampling, an enhanced Artificial Potential Field (APF), and a dynamic neighborhood-expansion mechanism. First, the workspace is hierarchically partitioned into macro, meso, and micro sampling layers, progressively biasing random samples toward safer, lower-energy regions. Second, we augment the traditional APF by incorporating a slope-dependent repulsive term, enabling stronger avoidance of steep obstacles. Third, a dynamic expansion strategy adaptively switches between 8- and 16-connected neighborhoods based on local obstacle density, striking an effective balance between search efficiency and collision-avoidance precision. In simulated off-road scenarios, HS-APF-RRT* is benchmarked against RRT*, Goal-Biased RRT*, and APF-RRT*, and demonstrates significantly faster convergence, lower path-energy consumption, and enhanced safety margins.
Abstract: Multimodal dialogue systems often fail to maintain coherent reasoning over extended conversations and suffer from hallucination due to limited context modeling capabilities. Current approaches struggle with cross-modal alignment, temporal consistency, and robust handling of noisy or incomplete inputs across multiple modalities. We propose Multi Agent-Chain of Thought (CoT), a novel multi-agent chain-of-thought reasoning framework in which specialized agents for text, vision, and speech modalities collaboratively construct shared reasoning traces through inter-agent message passing and consensus voting mechanisms. Our architecture incorporates self-reflection modules, conflict resolution protocols, and dynamic rationale alignment to enhance consistency, factual accuracy, and user engagement. The framework employs a hierarchical attention mechanism with cross-modal fusion and implements adaptive reasoning depth based on dialogue complexity. Comprehensive evaluations on Situated Interactive Multi-Modal Conversations (SIMMC) 2.0, VisDial v1.0, and newly introduced challenging scenarios demonstrate statistically significant improvements in grounding accuracy (p<0.01), chain-of-thought interpretability, and robustness to adversarial inputs compared to state-of-the-art monolithic transformer baselines and existing multi-agent approaches.
Funding: Supported by the Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2026R259), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia. Ashit Kumar Dutta would like to thank AlMaarefa University for supporting this research under project number MHIRSP2025017.
Abstract: Environmental problems are intensifying due to the rapid growth of the population, industry, and urban infrastructure. This expansion has resulted in increased air and water pollution, intensified urban heat island effects, and greater runoff from parks and other green spaces. Addressing these challenges requires prioritizing green infrastructure and other sustainable urban development strategies. This study introduces a novel Integrated Decision Support System that combines Pythagorean Fuzzy Sets with the Advanced Alternative Ranking Order Method allowing for Two-Step Normalization (AAROM-TN), enhanced by a dual weighting strategy. The weighting approach integrates the Criteria Importance Through Intercriteria Correlation (CRITIC) method with the Criteria Importance through Means and Standard Deviation (CIMAS) technique. The originality of the proposed framework lies in its ability to objectively quantify criteria importance using CRITIC, incorporate decision-makers' preferences through CIMAS, and capture the uncertainty and hesitation inherent in human judgment via Pythagorean Fuzzy Sets. A case study evaluating green infrastructure alternatives in metropolitan regions demonstrates the applicability and effectiveness of the framework. A sensitivity analysis is conducted to examine how variations in criteria weights affect the rankings and to evaluate the robustness of the results. Furthermore, a comparative analysis highlights the practical and financial implications of each alternative by assessing their respective strengths and weaknesses.
Funding: Strategic Research Plan of the Centre for Marine Technology and Ocean Engineering (CENTEC), financed by the Portuguese Foundation for Science and Technology (Fundação para a Ciência e a Tecnologia, FCT) under contract UIDB/UIDP/00134/2020.
Abstract: This study examines the planning methods that have been used over the years to support decision-making on the development of offshore oilfields. About 100 papers are analysed and categorised into groups according to the main early-stage decisions. The present study stands in contrast to the operations research and systems engineering review articles, on the one hand, and the petroleum engineering review articles, on the other, because it neither focuses on one methodological approach nor limits the literature analysis by offshore oilfield characteristics. Consequently, the present analysis may offer valuable insights, for instance, by identifying environmental planning decisions as a recent yet highly significant concern that is currently being imposed on the decision-making process. Thus, it is evident that the incorporation of safety criteria within the technical-economic decision-making process for the design of production systems would be a crucial requirement at the development phase.