Funding: This work was funded by the State Grid Corporation Science and Technology Project “Research and Application of Key Technologies for Integrated Sensing and Computing for Intelligent Operation of Power Grid” (Grant No. 5700-202318596A-3-2-ZN).
Abstract: With the rapid development of power Internet of Things (IoT) scenarios such as smart factories and smart homes, numerous intelligent terminal devices and real-time interactive applications impose higher demands on computing latency and resource supply efficiency. Multi-access edge computing deploys cloud computing capabilities at the network edge, constructs distributed computing nodes and multi-access systems, and offers infrastructure support for services that require low latency and high reliability. Existing research relies on the strong assumption that the environmental state is fully observable and does not fully account for the continuously time-varying load fluctuations of edge servers, which leaves the models poorly adapted to heterogeneous dynamic environments. This paper therefore establishes an end-edge collaborative task offloading framework based on a partially observable Markov decision process (POMDP) and proposes an end-edge collaborative task offloading method for heterogeneous scenarios. The method models the historical load characteristics of edge servers as a time series, which makes the agent load-aware under dynamic environmental states. Moreover, by dynamically assessing the exploration value of historical trajectories in the central trajectory pool and adjusting the sample weight distribution, it realizes directed exploration and policy optimization over high-value trajectories. Experimental results indicate that the proposed method shows clear advantages over existing methods in terms of average delay and task failure rate, and they also verify the method's robustness in a dynamic environment.
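The abstract does not say how the exploration value of a trajectory is computed or how the sample weights are adjusted. As a minimal sketch only, assuming each trajectory in the pool already carries a scalar exploration value, one common way to bias sampling toward high-value trajectories is a softmax over those values. The function names, the `temperature` parameter, and the softmax choice are all illustrative assumptions, not the paper's method.

```python
import math
import random

def sampling_weights(values, temperature=1.0):
    """Turn per-trajectory exploration values into sampling probabilities.

    Assumption: a softmax is used here for illustration; the source does
    not specify the actual weighting rule.
    """
    m = max(values)  # subtract the max for numerical stability
    exps = [math.exp((v - m) / temperature) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def sample_trajectories(pool, values, k, rng=None, temperature=1.0):
    """Draw k trajectories (with replacement), favouring high-value ones."""
    rng = rng or random.Random()
    weights = sampling_weights(values, temperature)
    return rng.choices(pool, weights=weights, k=k)
```

A lower `temperature` concentrates sampling on the highest-value trajectories, while a higher one moves the distribution toward uniform sampling.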
Funding: Project supported by the National Key R&D Program of China (No. 2020YFB1710900), the National Natural Science Foundation of China (Nos. 62173322, 61803368, and U1908212), the China Postdoctoral Science Foundation (No. 2019M661156), and the Youth Innovation Promotion Association, Chinese Academy of Sciences (No. 2019202).
Abstract: Edge artificial intelligence will empower previously simple industrial wireless networks (IWNs) to support complex and dynamic tasks by collaboratively exploiting the computation and communication resources of both machine-type devices (MTDs) and edge servers. In this paper, we propose a multi-agent deep reinforcement learning based resource allocation (MADRL-RA) algorithm for end-edge orchestrated IWNs to support computation-intensive and delay-sensitive applications. First, we present the system model of IWNs, wherein each MTD is regarded as a self-learning agent. Then, we apply the Markov decision process to formulate a minimum system overhead problem with joint optimization of delay and energy consumption. Next, we employ MADRL to cope with the state-space explosion and learn an effective resource allocation policy with respect to computing decision, computation capacity, and transmission power. To break the time correlation of training data while accelerating the learning process of MADRL-RA, we design a weighted experience replay that stores and samples experiences by category. Furthermore, we propose a step-by-step ε-greedy method to balance exploitation and exploration. Finally, we verify the effectiveness of MADRL-RA by comparing it with several benchmark algorithms in extensive experiments, showing that MADRL-RA converges quickly and learns an effective resource allocation policy that achieves the minimum system overhead.
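The abstract names a step-by-step ε-greedy method without giving its schedule. The sketch below shows one plausible reading, a staircase schedule in which ε drops by a fixed amount at fixed intervals. The parameter names and default values (`eps_start`, `eps_min`, `drop`, `interval`) are hypothetical and are not taken from the paper.

```python
import random

def stepwise_epsilon(step, eps_start=1.0, eps_min=0.05, drop=0.1, interval=1000):
    """Staircase schedule: epsilon falls by `drop` every `interval` steps.

    Assumption: all defaults are illustrative; the source gives no values.
    """
    eps = eps_start - drop * (step // interval)
    return max(eps_min, eps)

def select_action(q_values, step, rng=None):
    """Epsilon-greedy choice over a list of estimated action values."""
    rng = rng or random.Random()
    if rng.random() < stepwise_epsilon(step):
        return rng.randrange(len(q_values))  # explore: uniform random action
    # exploit: index of the largest estimated value
    return max(range(len(q_values)), key=q_values.__getitem__)
```

Early in training the agent explores almost uniformly; once ε reaches `eps_min`, it mostly exploits the learned action values while keeping a small chance of exploring.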