Efficient planning of activities is essential for modern industrial assembly lines to uphold manufacturing standards, prevent project constraint violations, and achieve cost-effective operations. While exact solutions to such challenges can be obtained through Integer Programming (IP), the dependence of the search space on input parameters often makes IP computationally infeasible for large-scale scenarios. Heuristic methods, such as Genetic Algorithms, can also be applied, but they frequently produce suboptimal solutions in large instances. This paper introduces a novel mathematical model of a generic industrial assembly line formulated as a Markov Decision Process (MDP), without imposing assumptions on the type of assembly line, a notable distinction from most existing models. The proposed model is employed to create a virtual environment for training Deep Reinforcement Learning (DRL) agents to optimize task and resource scheduling. To enhance the efficiency of agent training, the paper proposes two tools. The first is an action-masking technique, which ensures the agent selects only feasible actions, thereby reducing training time. The second is a multi-agent approach in which each workstation is managed by an individual agent, which reduces the state and action spaces. A centralized training framework with decentralized execution is adopted, offering a scalable learning architecture for optimizing industrial assembly lines. This framework allows the agents to learn offline and subsequently provide real-time solutions during operations by leveraging a neural network that maps the current factory state to the optimal action. The effectiveness of the proposed scheme is validated through numerical simulations, demonstrating significantly faster convergence to the optimal solution compared to a comparable model-based approach.
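The action-masking idea described in the assembly-line abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the function name `masked_softmax`, the example logits, and the feasibility flags are hypothetical, and the sketch only assumes that the policy produces one logit per action and that the environment reports which actions are currently feasible.

```python
import math

def masked_softmax(logits, feasible):
    """Softmax over logits with infeasible actions forced to probability 0."""
    if not any(feasible):
        raise ValueError("at least one action must be feasible")
    # Subtract the largest feasible logit for numerical stability.
    m = max(l for l, ok in zip(logits, feasible) if ok)
    # Infeasible actions contribute exactly zero mass.
    exps = [math.exp(l - m) if ok else 0.0 for l, ok in zip(logits, feasible)]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical example: action 1 is infeasible in the current state.
probs = masked_softmax([2.0, 1.0, 0.5], [True, False, True])
```

Because masked actions receive zero probability, the agent never samples them, so no training steps are spent discovering that they are invalid.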
In this paper, the distributed optimal formation control problem of heterogeneous Euler–Lagrange multi-agent systems with generic formation constraints and inequality constraints is investigated. Based on the primal–dual dynamics and the adaptive control technique, a distributed optimal formation controller consisting of a velocity reference signal generator and a velocity tracking controller is proposed. By using the optimality condition, the relationship between the equilibrium point of the closed-loop system and the optimal solution of the optimization problem is established. Then, by utilizing Lyapunov stability analysis, it is rigorously proved that the optimal formation is reached with the proposed controller. Lastly, simulation examples are provided to substantiate the theoretical results.
This article investigates the time-varying output group formation tracking control (GFTC) problem for heterogeneous multi-agent systems (HMASs) under switching topologies. The objective is to design a distributed control strategy that enables the outputs of the followers to form the desired sub-formations and track the outputs of the leader in each subgroup. Firstly, novel distributed observers are developed to estimate the states of the leaders under switching topologies. Then, GFTC protocols are designed based on the proposed observers. It is shown that with the distributed protocol, the GFTC problem for HMASs under switching topologies is solved if the average dwell time associated with the switching topologies is larger than a fixed threshold. Finally, an example is provided to illustrate the effectiveness of the proposed control strategy.
Formation control in multi-agent systems has become a critical area of interest due to its wide-ranging applications in robotics, autonomous transportation, and surveillance. While various studies have explored distributed cooperative control, this review focuses on the theoretical foundations and recent developments in formation control strategies. The paper categorizes and analyzes key formation types, including formation maintenance, group or cluster formation, bipartite formations, event-triggered formations, finite-time convergence, and constrained formations. A significant portion of the review addresses formation control under constrained dynamics, presenting both model-based and model-free approaches that consider practical limitations such as actuator bounds, communication delays, and nonholonomic constraints. Additionally, the paper discusses emerging trends, including the integration of event-driven mechanisms and AI-enhanced coordination strategies. Comparative evaluations highlight the trade-offs among various methodologies regarding scalability, robustness, and real-world feasibility. Practical implementations are reviewed across diverse platforms, and the review identifies the current achievements and unresolved challenges in the field. The paper concludes by outlining promising research directions, such as adaptive control for dynamic environments, energy-efficient coordination, and learning-based control under uncertainty. This review synthesizes the current state of the art and provides a road map for future investigation, making it a valuable reference for researchers and practitioners aiming to advance formation control in multi-agent systems.
The Internet of Unmanned Aerial Vehicles (I-UAVs) is expected to execute latency-sensitive tasks, but is limited by co-channel interference and malicious jamming. Without prior knowledge of the environment, defending against jamming and interference through spectrum allocation becomes challenging, especially when each UAV pair makes decisions independently. In this paper, we propose a cooperative multi-agent reinforcement learning (MARL)-based anti-jamming framework for I-UAVs, enabling UAV pairs to learn their own policies cooperatively. Specifically, we first model the problem as a model-free multi-agent Markov decision process (MAMDP) to maximize the long-term expected system throughput. Then, to improve the exploration of the optimal policy, we optimize a MARL objective function with a mutual-information (MI) regularizer between states and actions, which can dynamically assign the probability for actions frequently used by the optimal policy. Next, by sharing their current channel selections and local learning experience (their soft Q-values), the UAV pairs can learn their own policies cooperatively, relying only on previously observed information and on predictions of others' actions. Our simulation results show that for both sweep jamming and Markov jamming patterns, the proposed scheme outperforms the benchmarks in terms of throughput, convergence, and stability for different numbers of jammers, channels, and UAV pairs.
This paper investigates the observer-based prescribed-time time-varying output formation-containment (PT-TV-OFC) control problem for heterogeneous multi-agent systems in which different agents have different state dimensions. The system comprises one tracking leader, multiple formation leaders, and followers, where the two types of leaders are used to generate a reference trajectory for movement and to achieve a specific formation, respectively. Firstly, a prescribed-time dynamics observer is constructed for the formation leaders to estimate the tracking leader's dynamic model and state. On this basis, a prescribed-time control protocol is designed for the formation leaders to achieve time-varying output formation. Then, a prescribed-time convex hull observer is designed for the followers to estimate information regarding the convex hull formed by the formation leaders. Using the estimated convex hull information, a prescribed-time containment control protocol is designed to ensure the followers converge into the convex hull. Furthermore, using Lyapunov stability theory, the stability of the system is proved in detail, which implies that the heterogeneous multi-agent systems can achieve PT-TV-OFC control. Finally, numerical simulations validate the feasibility of the theoretical results.
Moving Target Defense (MTD) necessitates scientifically effective decision-making methodologies for defensive technology implementation. While most MTD decision studies focus on accurately identifying optimal strategies, the issue of optimal defense timing remains underexplored. Current default approaches (periodic or overly frequent MTD triggers) lead to suboptimal trade-offs among system security, performance, and cost. The timing of MTD strategy activation critically impacts both defensive efficacy and operational overhead, yet existing frameworks inadequately address this temporal dimension. To bridge this gap, this paper proposes a Stackelberg-FlipIt game model that formalizes asymmetric cyber conflicts as alternating control over attack surfaces, thereby capturing the dynamic security state evolution of MTD systems. We introduce a belief factor to quantify information asymmetry during adversarial interactions, enhancing the precision of MTD trigger timing. Leveraging this game-theoretic foundation, we employ Multi-Agent Reinforcement Learning (MARL) to derive adaptive temporal strategies, optimized via a novel four-dimensional reward function that holistically balances security, performance, cost, and timing. Experimental validation using IP address mutation against scanning attacks demonstrates stable strategy convergence and accelerated defense response, significantly improving cybersecurity affordability and effectiveness.
This paper delves into the problem of optimal placement conditions for a group of agents collaboratively localizing a target using range-only or bearing-only measurements. The challenge in this study stems from the uncertainty associated with the positions of the agents, which may experience drift or disturbances during the target localization process. Initially, we derive the Cramer-Rao lower bound (CRLB) of the target position as the primary analytical metric. Subsequently, we establish the necessary and sufficient conditions for the optimal placement of agents. Based on these conditions, we analyze the maximal allowable agent position error for an expected mean squared error (MSE), providing valuable guidance for the selection of agent positioning sensors. The analytical findings are further validated through simulation experiments.
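To make the role of the CRLB in agent placement concrete, the following sketch evaluates the trace of the CRLB for range-only localization in two dimensions. It rests on simplifying assumptions that are not the paper's setting: agent positions are taken as exactly known (the paper studies uncertain positions), range noise is independent zero-mean Gaussian with a common standard deviation `sigma`, and the function name and test geometries are hypothetical.

```python
import math

def range_only_crlb_trace(agents, target, sigma):
    """Trace of the CRLB for a 2-D target from range-only measurements.

    Assumes exactly known agent positions and i.i.d. Gaussian range noise
    with standard deviation sigma (illustrative simplification).
    """
    g11 = g12 = g22 = 0.0
    for ax, ay in agents:
        dx, dy = target[0] - ax, target[1] - ay
        d = math.hypot(dx, dy)
        if d == 0.0:
            raise ValueError("agent coincides with target")
        ux, uy = dx / d, dy / d      # unit vector from agent to target
        g11 += ux * ux               # geometry part of the Fisher information
        g12 += ux * uy
        g22 += uy * uy
    det = g11 * g22 - g12 * g12
    if det <= 1e-12:
        return math.inf              # collinear geometry: target not observable
    # Fisher information is G / sigma^2, so trace(CRLB) = sigma^2 * trace(G^-1).
    return sigma ** 2 * (g11 + g22) / det

# Two agents seen at right angles from the target versus two collinear agents.
orthogonal = range_only_crlb_trace([(1.0, 0.0), (0.0, 1.0)], (0.0, 0.0), 1.0)
collinear = range_only_crlb_trace([(1.0, 0.0), (2.0, 0.0)], (0.0, 0.0), 1.0)
```

Under these assumptions the orthogonal geometry gives a finite bound while the collinear one does not, which is the kind of geometric dependence that placement conditions formalize.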
This paper addresses the time-varying formation-containment (FC) problem for nonholonomic multi-agent systems with a desired trajectory constraint, where only the leaders can acquire information about the desired trajectory. The leaders are given a fixed time-varying formation template to execute while also tracking the desired trajectory, and the followers need to converge to the convex hull spanned by the leaders. Firstly, the dynamic models of nonholonomic systems are linearized to second-order dynamics. Then, based on the desired trajectory and formation template, the FC control protocols are proposed. Sufficient conditions to achieve FC are introduced, and an algorithm is proposed to determine the control parameters by solving an algebraic Riccati equation. The system is demonstrated to achieve FC, with the average position and velocity of the leaders converging asymptotically to the desired trajectory. Finally, the theoretical results are verified in simulations of a multi-agent system composed of virtual human individuals.
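As a worked illustration of obtaining control parameters from an algebraic Riccati equation for second-order dynamics, the sketch below checks the known closed-form solution for a double integrator. The system matrices, the weights Q = I and R = 1, and the function name `are_residual` are assumptions chosen for illustration; the paper's actual matrices and algorithm are not given in the abstract.

```python
import math

def are_residual(P):
    """Residual A^T P + P A - P B B^T P + Q for the double integrator.

    Assumed (not from the paper): A = [[0, 1], [0, 0]], B = [0, 1]^T,
    Q = identity, R = 1. P is a symmetric 2x2 matrix as nested lists.
    """
    p11, p12, p22 = P[0][0], P[0][1], P[1][1]
    return [
        [1.0 - p12 * p12, p11 - p12 * p22],
        [p11 - p12 * p22, 2.0 * p12 + 1.0 - p22 * p22],
    ]

s3 = math.sqrt(3.0)
P = [[s3, 1.0], [1.0, s3]]     # closed-form stabilizing solution for this case
K = [P[1][0], P[1][1]]         # feedback gain K = R^-1 B^T P = [1, sqrt(3)]
residual = are_residual(P)     # every entry should vanish up to rounding
```

With this gain the closed-loop characteristic polynomial is s^2 + sqrt(3) s + 1, whose roots lie in the left half-plane. For larger systems a numerical solver such as SciPy's `scipy.linalg.solve_continuous_are` would normally replace the closed form.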
This paper presents a novel approach to dynamic pricing and distributed energy management in virtual power plant (VPP) networks using multi-agent reinforcement learning (MARL). As the energy landscape evolves towards greater decentralization and renewable integration, traditional optimization methods struggle to address the inherent complexities and uncertainties. Our proposed MARL framework enables adaptive, decentralized decision-making for both the distribution system operator and individual VPPs, optimizing economic efficiency while maintaining grid stability. We formulate the problem as a Markov decision process and develop a custom MARL algorithm that leverages actor-critic architectures and experience replay. Extensive simulations across diverse scenarios demonstrate that our approach consistently outperforms baseline methods, including Stackelberg game models and model predictive control, achieving an 18.73% reduction in costs and a 22.46% increase in VPP profits. The MARL framework shows particular strength in scenarios with high renewable energy penetration, where it improves system performance by 11.95% compared with traditional methods. Furthermore, our approach demonstrates superior adaptability to unexpected events and mis-predictions, highlighting its potential for real-world implementation.
An Interval Type-2 (IT-2) fuzzy controller design approach is proposed in this research to simultaneously achieve multiple control objectives in Nonlinear Multi-Agent Systems (NMASs), including formation, containment, and collision avoidance. However, inherent nonlinearities and uncertainties present in practical control systems contribute to the challenge of achieving precise control performance. Based on the IT-2 Takagi-Sugeno Fuzzy Model (T-SFM), the fuzzy control approach can offer a more effective solution for NMASs facing uncertainties. Unlike existing control methods for NMASs, the Formation and Containment (F-and-C) control problem with collision avoidance capability under uncertainties based on the IT-2 T-SFM is discussed for the first time. Moreover, an IT-2 fuzzy tracking control approach is proposed to solve the formation task for leaders in NMASs without requiring communication. This control scheme makes the design process of the IT-2 fuzzy Formation Controller (FC) more straightforward and effective. According to the communication interaction protocol, the IT-2 Containment Controller (CC) design approach is proposed for followers to ensure convergence into the region defined by the leaders. Leveraging the IT-2 T-SFM representation, the analysis methods developed for linear Multi-Agent Systems (MASs) are successfully extended to perform containment analysis without requiring the additional assumptions imposed in existing research. Notably, the IT-2 fuzzy tracking controller can also be applied in collision avoidance situations to track the desired trajectories calculated by the avoidance algorithm under the Artificial Potential Field (APF). Benefiting from the combination of vortex and source APFs, the leaders can properly adjust the system dynamics to prevent potential collision risk. Integrating fuzzy theory and the APF avoidance algorithm, an IT-2 fuzzy controller design approach is proposed to achieve the F-and-C purpose while ensuring collision avoidance capability. Finally, a multi-ship simulation is conducted to validate the feasibility and effectiveness of the designed IT-2 fuzzy controller.
Opportunistic mobile crowdsensing (MCS), which non-intrusively exploits human mobility trajectories and uses participants' smart devices as sensors, has become a promising paradigm for various urban data acquisition tasks. However, in practice, opportunistic MCS faces several challenges from the perspectives of both MCS participants and the data platform. On the one hand, participants face uncertainties in conducting MCS tasks, including their mobility and implicit interactions among participants, and the economic returns that participants receive from the MCS data platform are determined not only by their own actions but also by other participants' strategic actions. On the other hand, the platform can only observe the participants' uploaded sensing data, which depends on the effort or action exerted by participants and is unknown to the platform, while, to optimize its overall objective, the platform needs to properly reward certain participants to incentivize them to provide high-quality data. To address the challenge of balancing individual incentives and platform objectives in MCS, this paper proposes MARCS, an online sensing policy based on multi-agent deep reinforcement learning (MADRL) with centralized training and decentralized execution (CTDE). Specifically, the interactions between MCS participants and the data platform are modeled as a partially observable Markov game, where participants, acting as agents, use DRL-based policies to make decisions based on local observations, such as task trajectories and platform payments. To align individual and platform goals effectively, the platform leverages the Shapley value to estimate the contribution of each participant's sensed data, using these estimates as immediate rewards to guide agent training. The experimental results on real mobility trajectory datasets indicate that the revenue of MARCS is almost 35%, 53%, and 100% higher than that of DDPG, Actor-Critic, and model predictive control (MPC), respectively, on the participant side, with similar results on the platform side, showing superior performance compared with the baselines.
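The Shapley-value contribution estimate mentioned in the MARCS abstract can be illustrated with an exact computation on a toy example. Everything specific here is hypothetical: the participants `a`, `b`, `c`, the grid cells they cover, and the coverage-count utility stand in for whatever data-quality measure the platform actually uses. The abstract says the platform estimates these values, whereas this sketch enumerates all orderings, which is only practical for a handful of participants.

```python
from itertools import permutations

def shapley_values(players, value):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            joined = coalition | {p}
            # Marginal gain from adding p to the players that came before it.
            totals[p] += value(joined) - value(coalition)
            coalition = joined
    return {p: t / len(orderings) for p, t in totals.items()}

# Hypothetical data: grid cells covered by each participant's trajectory.
covered = {"a": {1, 2}, "b": {2, 3}, "c": {4}}

def coverage(coalition):
    """Hypothetical platform utility: number of distinct cells sensed."""
    cells = set()
    for p in coalition:
        cells |= covered[p]
    return len(cells)

phi = shapley_values(list(covered), coverage)
```

Participants `a` and `b` overlap on one cell and therefore share the credit for it, while `c` keeps full credit for its unique cell; the values sum to the utility of the full group.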
A Multi-Agent System (MAS) is a promising approach to building complex systems. This paper introduces research on the Inner-Enterprise Credit Rating MAS (IECRMAS). To raise rating accuracy, we consider not only the rating target's information but also the evaluators' feature information, and propose a rational rating-group formation algorithm based on an anti-bias measurement of the group. We also propose the rational rating individual, which consists of the evaluator and an assistant rating agent. A rational group formation protocol is designed to coordinate autonomous agents to perform the rating job.
A multi-agent based manufacturing execution system (MES) model is presented. It is open, modularized, distributed, configurable, integrable, and maintainable. By analyzing the MES domain in manufacturing systems, this paper proposes a multi-agent based MES model, analyzes the partitioned functions of MES in the model using unified modeling language (UML) diagrams, and establishes the MES architecture currently being implemented. This MES can be easily integrated with enterprise resource planning (ERP), the floor control system (FCS), and other manufacturing applications.
Funding (assembly-line scheduling paper): Supported in part by the Natural Sciences and Engineering Research Council of Canada (NSERC) under grant RGPIN-2022-04937.
Funding (Euler–Lagrange optimal formation control paper): Supported in part by the National Key Research and Development Program of China under Grant 2022YFB3303900 and in part by the National Natural Science Foundation of China under Grants 62103277 and 62025305.
Funding (formation control review): Supported in part by the National Natural Science Foundation of China under Grant 6237319 and in part by the Postgraduate Research and Practice Innovation Program of Jiangsu Province under Grant KYCX230479.
Funding (I-UAV anti-jamming paper): Supported in part by the National Natural Science Foundation of China under Grants 62001225, 62071236, 62071234 and U22A2002, in part by the Major Science and Technology Plan of Hainan Province under Grant ZDKJ2021022, in part by the Scientific Research Fund Project of Hainan University under Grant KYQD(ZR)-21008, and in part by the Key Technologies R&D Program of Jiangsu (Prospective and Key Technologies for Industry) under Grants BE2023022 and BE2023022-2.
Funding (PT-TV-OFC paper): Supported in part by the National Natural Science Foundation of China (Grant Nos. 62473135 and 62173121).
Funding (Moving Target Defense paper): Funded by the National Natural Science Foundation of China under Grant No. 62302520.
文摘Moving Target Defense(MTD)necessitates scientifically effective decision-making methodologies for defensive technology implementation.While most MTD decision studies focus on accurately identifying optimal strategies,the issue of optimal defense timing remains underexplored.Current default approaches—periodic or overly frequent MTD triggers—lead to suboptimal trade-offs among system security,performance,and cost.The timing of MTD strategy activation critically impacts both defensive efficacy and operational overhead,yet existing frameworks inadequately address this temporal dimension.To bridge this gap,this paper proposes a Stackelberg-FlipIt game model that formalizes asymmetric cyber conflicts as alternating control over attack surfaces,thereby capturing the dynamic security state evolution of MTD systems.We introduce a belief factor to quantify information asymmetry during adversarial interactions,enhancing the precision of MTD trigger timing.Leveraging this game-theoretic foundation,we employMulti-Agent Reinforcement Learning(MARL)to derive adaptive temporal strategies,optimized via a novel four-dimensional reward function that holistically balances security,performance,cost,and timing.Experimental validation using IP addressmutation against scanning attacks demonstrates stable strategy convergence and accelerated defense response,significantly improving cybersecurity affordability and effectiveness.
文摘This paper delves into the problem of optimal placement conditions for a group of agents collaboratively localizing a target using range-only or bearing-only measurements.The challenge in this study stems from the uncertainty associated with the positions of the agents,which may experience drift or disturbances during the target localization process.Initially,we derive the Cramer-Rao lower bound(CRLB)of the target position as the primary analytical metric.Subsequently,we establish the necessary and sufficient conditions for the optimal placement of agents.Based on these conditions,we analyze the maximal allowable agent position error for an expected mean squared error(MSE),providing valuable guidance for the selection of agent positioning sensors.The analytical findings are further validated through simulation experiments.
Abstract: This paper addresses the time-varying formation-containment (FC) problem for nonholonomic multi-agent systems with a desired trajectory constraint, where only the leaders can acquire information about the desired trajectory. The leaders are given a fixed time-varying formation template and must realize it while tracking the desired trajectory, and the followers must converge to the convex hull spanned by the leaders. First, the dynamic models of the nonholonomic systems are linearized to second-order dynamics. Then, based on the desired trajectory and formation template, the FC control protocols are proposed. Sufficient conditions to achieve FC are introduced, and an algorithm is proposed to determine the control parameters by solving an algebraic Riccati equation. The system is shown to achieve FC, with the average position and velocity of the leaders converging asymptotically to the desired trajectory. Finally, the theoretical results are verified in simulations with a multi-agent system composed of virtual human individuals.
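To make the Riccati step concrete, here is a minimal sketch for a single second-order (double-integrator) agent. The abstract does not state the paper's system matrices, weights, or protocol, so everything here is an assumption: dynamics A = [[0, 1], [0, 0]], B = [0, 1]^T, weights Q = diag(q1, q2) and scalar R = r, and the standard continuous-time equation A^T P + P A - P B R^{-1} B^T P + Q = 0, which has a closed-form solution in this case.

```python
import math

def double_integrator_care(q1, q2, r):
    """Closed-form stabilizing solution P = [[p1, p2], [p2, p3]] of the
    continuous-time algebraic Riccati equation for a double integrator
    (assumed dynamics and weights, not taken from the paper)."""
    p2 = math.sqrt(q1 * r)
    p3 = math.sqrt(r * (2.0 * p2 + q2))
    p1 = p2 * p3 / r
    return p1, p2, p3

def care_residual(p, q1, q2, r):
    """Entries (1,1), (1,2), (2,2) of A^T P + P A - P B B^T P / r + Q."""
    p1, p2, p3 = p
    r11 = q1 - p2 * p2 / r
    r12 = p1 - p2 * p3 / r
    r22 = 2.0 * p2 + q2 - p3 * p3 / r
    return r11, r12, r22

def feedback_gain(p, r):
    """State-feedback gain K = R^{-1} B^T P = [p2 / r, p3 / r]."""
    _, p2, p3 = p
    return p2 / r, p3 / r

P = double_integrator_care(1.0, 1.0, 1.0)
K = feedback_gain(P, 1.0)
residual = care_residual(P, 1.0, 1.0, 1.0)
```

With unit weights this gives P = [[√3, 1], [1, √3]] and K = [1, √3]; both gains are positive, so the closed loop is stable.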
Funding: Supported by the Science and Technology Project of State Grid Sichuan Electric Power Company Chengdu Power Supply Company under Grant No. 521904240005.
Abstract: This paper presents a novel approach to dynamic pricing and distributed energy management in virtual power plant (VPP) networks using multi-agent reinforcement learning (MARL). As the energy landscape evolves towards greater decentralization and renewable integration, traditional optimization methods struggle to address the inherent complexities and uncertainties. Our proposed MARL framework enables adaptive, decentralized decision-making for both the distribution system operator and individual VPPs, optimizing economic efficiency while maintaining grid stability. We formulate the problem as a Markov decision process and develop a custom MARL algorithm that leverages actor-critic architectures and experience replay. Extensive simulations across diverse scenarios demonstrate that our approach consistently outperforms baseline methods, including Stackelberg game models and model predictive control, achieving an 18.73% reduction in costs and a 22.46% increase in VPP profits. The MARL framework shows particular strength in scenarios with high renewable energy penetration, where it improves system performance by 11.95% compared with traditional methods. Furthermore, our approach demonstrates superior adaptability to unexpected events and mis-predictions, highlighting its potential for real-world implementation.
Funding: Funded by the National Science and Technology Council of the Republic of China under contract NSTC113-2221-E-019-032.
Abstract: An Interval Type-2 (IT-2) fuzzy controller design approach is proposed in this research to simultaneously achieve multiple control objectives in Nonlinear Multi-Agent Systems (NMASs), including formation, containment, and collision avoidance. However, inherent nonlinearities and uncertainties present in practical control systems make precise control performance difficult to achieve. Based on the IT-2 Takagi-Sugeno Fuzzy Model (T-SFM), the fuzzy control approach can offer a more effective solution for NMASs facing uncertainties. Unlike existing control methods for NMASs, the Formation and Containment (F-and-C) control problem with collision avoidance capability under uncertainties based on the IT-2 T-SFM is discussed for the first time. Moreover, an IT-2 fuzzy tracking control approach is proposed to solve the formation task for leaders in NMASs without requiring communication. This control scheme makes the design process of the IT-2 fuzzy Formation Controller (FC) more straightforward and effective. According to the communication interaction protocol, the IT-2 Containment Controller (CC) design approach is proposed for followers to ensure convergence into the region defined by the leaders. Leveraging the IT-2 T-SFM representation, the analysis methods developed for linear Multi-Agent Systems (MASs) are successfully extended to perform containment analysis without requiring the additional assumptions imposed in existing research. Notably, the IT-2 fuzzy tracking controller can also be applied in collision avoidance situations to track the desired trajectories calculated by the avoidance algorithm under the Artificial Potential Field (APF). Benefiting from the combination of vortex and source APFs, the leaders can properly adjust the system dynamics to prevent potential collision risk. Integrating fuzzy theory and the APF avoidance algorithm, an IT-2 fuzzy controller design approach is proposed to achieve the F-and-C purpose while ensuring collision avoidance capability. Finally, a multi-ship simulation is conducted to validate the feasibility and effectiveness of the designed IT-2 fuzzy controller.
Funding: Sponsored by the Qinglan Project of Jiangsu Province and the Jiangsu Provincial Key Research and Development Program (No. BE2020084-1).
Abstract: Opportunistic mobile crowdsensing (MCS), which non-intrusively exploits human mobility trajectories and uses participants’ smart devices as sensors, has become a promising paradigm for various urban data acquisition tasks. In practice, however, opportunistic MCS faces several challenges from the perspectives of both the MCS participants and the data platform. On the one hand, participants face uncertainties in conducting MCS tasks, including their mobility and implicit interactions among participants, and the economic returns they receive from the MCS data platform are determined not only by their own actions but also by other participants’ strategic actions. On the other hand, the platform can only observe the participants’ uploaded sensing data, which depends on effort and actions that are unknown to the platform, while, to optimize its overall objective, the platform needs to properly reward certain participants to incentivize them to provide high-quality data. To address the challenge of balancing individual incentives and platform objectives in MCS, this paper proposes MARCS, an online sensing policy based on multi-agent deep reinforcement learning (MADRL) with centralized training and decentralized execution (CTDE). Specifically, the interactions between MCS participants and the data platform are modeled as a partially observable Markov game, where participants, acting as agents, use DRL-based policies to make decisions based on local observations, such as task trajectories and platform payments. To align individual and platform goals effectively, the platform leverages the Shapley value to estimate the contribution of each participant’s sensed data, using these estimates as immediate rewards to guide agent training. Experimental results on real mobility trajectory datasets indicate that the revenue of MARCS is almost 35%, 53%, and 100% higher than that of DDPG, Actor-Critic, and model predictive control (MPC), respectively, on the participant side, with similar results on the platform side, showing superior performance compared to the baselines.
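The idea of using the Shapley value as a per-participant contribution signal can be sketched as follows. This is an illustrative exact computation by enumerating all orderings, feasible only for a handful of participants; the abstract does not say how MARCS estimates the value, and the coverage-based value function and participant data below are hypothetical.

```python
from itertools import permutations

def shapley_values(players, value):
    """Exact Shapley values: average each player's marginal contribution
    over every ordering of the players. Cost grows factorially, so this
    is only meant for small illustrative games."""
    totals = {p: 0.0 for p in players}
    count = 0
    for order in permutations(players):
        coalition = frozenset()
        prev = value(coalition)
        for p in order:
            coalition = coalition | {p}
            cur = value(coalition)
            totals[p] += cur - prev  # marginal contribution of p
            prev = cur
        count += 1
    return {p: t / count for p, t in totals.items()}

# Hypothetical data: grid cells sensed by each participant.
coverage = {"a": {1, 2}, "b": {2, 3}, "c": {4}}

def covered_cells(coalition):
    """Assumed platform value: number of distinct cells covered."""
    cells = set()
    for p in coalition:
        cells |= coverage[p]
    return len(cells)

phi = shapley_values(list(coverage), covered_cells)
```

In this toy game, participants `a` and `b` split the credit for the cell they both cover, so each receives 1.5 while `c` receives 1.0, and the values sum to the total coverage of 4 cells.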
Funding: This paper is supported by the National Science Foundation of China under Grant No. 60542004.
Abstract: A Multi-Agent System (MAS) is a promising approach to building complex systems. This paper introduces research on the Inner-Enterprise Credit Rating MAS (IECRMAS). To raise rating accuracy, we consider not only the rating target's information but also the evaluators' feature information, and we propose a rational rating-group formation algorithm based on an anti-bias measurement of the group. We also propose the rational rating individual, which consists of the evaluator and the assistant rating agent. A rational group formation protocol is designed to coordinate autonomous agents in performing the rating job.
Abstract: A multi-agent based manufacturing execution system (MES) model is presented. It is open, modularized, distributed, configurable, integrable, and maintainable. By analyzing the MES domain in manufacturing systems, this paper proposes a multi-agent based MES model, analyzes the partitioned functions of MES in the model using unified modeling language (UML) diagrams, and establishes the MES architecture currently being implemented. This MES can be readily integrated with enterprise resource planning (ERP), the floor control system (FCS), and other manufacturing applications.