Funding: Fully supported by the GUET Excellent Graduate Thesis Program (Grant No. 19YJPYBS03), the Innovation Project of Guangxi Graduate Education (Grant No. YCBZ2022109), and the New Technology Research University Cooperation Project of the 34th Research Institute of China Electronics Technology Group Corporation, 2021 (Grant No. SF2126007).
Abstract: In Software-Defined Networks (SDNs), determining how to efficiently achieve Quality of Service (QoS)-aware routing is challenging but critical for significantly improving network performance, where the QoS metrics can be defined as, for example, average latency, packet loss ratio, and throughput. The SDN controller can use network statistics and a Deep Reinforcement Learning (DRL) method to resolve this challenge. In this paper, we formulate dynamic routing in an SDN as a Markov decision process and propose a DRL algorithm called the Asynchronous Advantage Actor-Critic QoS-aware Routing Optimization Mechanism (AQROM) to determine routing strategies that balance the traffic loads in the network. AQROM can improve the QoS of the network and reduce the training time via dynamic routing strategy updates; that is, the reward function can be dynamically and promptly altered based on the optimization objective regardless of the network topology and traffic pattern. AQROM can be considered a one-step optimization and a black-box routing mechanism over high-dimensional input and output sets, for both discrete and continuous states and actions, with respect to the operations in the SDN. Extensive simulations were conducted using OMNeT++, and the results demonstrated that AQROM 1) achieved much faster and more stable convergence than the Deep Deterministic Policy Gradient (DDPG) and Advantage Actor-Critic (A2C), 2) incurred a lower packet loss ratio and latency than Open Shortest Path First (OSPF), DDPG, and A2C, and 3) resulted in higher and more stable throughput than OSPF, DDPG, and A2C.
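As a rough illustration of two ideas named in this abstract (a reward that can be re-weighted per optimization objective, and the advantage estimate used by advantage actor-critic methods), the following Python sketch may help. The function names, the weight keys, and the linear form of the reward are assumptions made for illustration; the abstract does not specify AQROM's actual reward function.

```python
def qos_reward(latency, loss_ratio, throughput, weights):
    """Hypothetical linear QoS reward: reward throughput, penalize latency and loss.

    Changing `weights` changes the optimization objective without touching
    the rest of the training loop.
    """
    return (weights["throughput"] * throughput
            - weights["latency"] * latency
            - weights["loss"] * loss_ratio)


def n_step_advantages(rewards, values, bootstrap_value, gamma=0.99):
    """Discounted n-step returns R_t and advantages A_t = R_t - V(s_t) for one rollout."""
    returns = []
    running = bootstrap_value  # critic's estimate for the state after the rollout
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    advantages = [g - v for g, v in zip(returns, values)]
    return returns, advantages
```

In an asynchronous setup, each worker would compute such advantages on its own rollout before sending gradients to the shared model; that part is omitted here.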
Funding: Supported by the National Key R&D Program of China (Grant No. 2022YFB2503203) and the National Natural Science Foundation of China (Grant No. U1964206).
Abstract: Decision-making for connected and automated vehicles (CAVs) comprises a sequence of driving maneuvers that improve safety and efficiency, and it is characterized by complex scenarios, strong uncertainty, and high real-time requirements. Deep reinforcement learning (DRL) exhibits excellent real-time decision-making capability, adaptability to complex scenarios, and generalization ability. However, it is arduous to guarantee complete driving safety and efficiency under the constraints of training samples and costs. This paper proposes a Mixture of Experts (MoE) method based on Soft Actor-Critic (SAC), in which an upper-level discriminator dynamically decides whether to activate the lower-level DRL expert or the heuristic expert based on the features of the input state. To further enhance the performance of the DRL expert, a buffer zone is introduced in the reward function, preemptively applying penalties before unsafe situations occur. To minimize collision and off-road rates, the heuristic expert is built on the Intelligent Driver Model (IDM) and the Minimizing Overall Braking Induced by Lane changes (MOBIL) strategy. Finally, tested in typical simulation scenarios, MoE shows a 13.75% improvement in driving efficiency compared with the traditional DRL method with a continuous action space. It ensures high safety, with zero collision and zero off-road rates, while maintaining high adaptability.
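For readers unfamiliar with the Intelligent Driver Model mentioned above, a minimal Python sketch of its standard car-following acceleration is given here. The default parameter values are illustrative assumptions, not values taken from the paper, and the function name is hypothetical.

```python
import math


def idm_acceleration(v, v_lead, gap, v0=30.0, T=1.5, s0=2.0,
                     a_max=1.0, b=1.5, delta=4.0):
    """Standard IDM acceleration for the following vehicle.

    v       speed of the following vehicle (m/s)
    v_lead  speed of the leading vehicle (m/s)
    gap     bumper-to-bumper distance to the leader (m), must be > 0
    v0, T, s0, a_max, b, delta: desired speed, time headway, minimum gap,
    maximum acceleration, comfortable deceleration, acceleration exponent
    (illustrative defaults).
    """
    dv = v - v_lead  # approach rate; positive when closing in on the leader
    s_star = s0 + v * T + v * dv / (2.0 * math.sqrt(a_max * b))  # desired gap
    return a_max * (1.0 - (v / v0) ** delta - (s_star / gap) ** 2)
```

On an empty road the result approaches `a_max`; when closing quickly on a slow leader it becomes strongly negative, which is the braking behavior a heuristic expert would rely on.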
Funding: Supported by the National Natural Science Foundation of China (NSFC) (61831002, 62001076) and the General Program of the Natural Science Foundation of Chongqing (No. CSTB2023NSCQ-MSX0726, No. cstc2020jcyjmsxmX0878).
Abstract: Wireless Sensor Networks (WSNs) are widely used in large-scale distributed unmanned detection scenarios because of their low cost and flexible installation. However, WSN data collection encounters challenges in scenarios lacking communication infrastructure. Unmanned aerial vehicles (UAVs) offer a novel solution for WSN data collection, leveraging their high mobility. In this paper, we present an efficient UAV-assisted data collection algorithm aimed at minimizing the overall power consumption of the WSN. Firstly, a two-layer UAV-assisted data collection model is introduced, comprising a ground layer and an aerial layer. In the ground layer, the cluster members (CMs) sense environmental data and transmit it to the cluster heads (CHs), which forward the collected data to the UAVs. The aerial layer consists of multiple UAVs that collect, store, and forward data from the CHs to the data center for analysis. Secondly, an improved clustering algorithm based on K-Means++ is proposed to optimize the number and locations of CHs. Moreover, an Actor-Critic based algorithm is introduced to optimize the UAV deployment and the association with CHs. Finally, simulation results verify the effectiveness of the proposed algorithms.
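The clustering step builds on K-Means++. As a sketch only, the standard K-Means++ seeding rule (each new center is drawn with probability proportional to its squared distance from the nearest center chosen so far) can be written as follows. The paper's improvements are not described in the abstract and are not reproduced here; the function name and the 2-D point format are assumptions.

```python
import random


def kmeanspp_seeds(points, k, rng=None):
    """Pick k initial centers from 2-D `points` using standard K-Means++ seeding."""
    rng = rng or random.Random(0)  # fixed seed only to keep the sketch reproducible
    centers = [rng.choice(points)]
    while len(centers) < k:
        # Squared distance from every point to its nearest already-chosen center.
        d2 = [min((px - cx) ** 2 + (py - cy) ** 2 for cx, cy in centers)
              for px, py in points]
        if sum(d2) == 0:
            # All points coincide with existing centers; fall back to a uniform pick.
            centers.append(rng.choice(points))
            continue
        centers.append(rng.choices(points, weights=d2, k=1)[0])
    return centers
```

The returned centers would then serve as the starting cluster-head candidates for the usual K-Means assignment and update iterations.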
Funding: Supported by the National Natural Science Foundation of China (52222215, 52272420, 52072051).
Abstract: Parking in a small parking lot within limited space is a difficult task. It often leads to deviations between the final parking posture and the target posture. These deviations can lead to partial occupancy of adjacent parking lots, which poses a safety threat to vehicles parked in these parking lots. However, previous studies have not addressed this issue. In this paper, we aim to evaluate the impact of parking deviation of existing vehicles next to the target parking lot (PDEVNTPL) on automatic ego vehicle (AEV) parking, in terms of the safety, comfort, accuracy, and efficiency of parking. A segmented parking training framework (SPTF) based on soft actor-critic (SAC) is proposed to improve parking performance. In the proposed method, the SAC algorithm incorporates strategy entropy into the objective function to enable the AEV to learn parking strategies based on a more comprehensive understanding of the environment. Additionally, the SPTF simplifies complex parking tasks to maintain the high performance of deep reinforcement learning (DRL). The experimental results reveal that the PDEVNTPL has a detrimental influence on AEV parking in terms of safety, accuracy, and comfort, leading to reductions of more than 27%, 54%, and 26%, respectively. However, the SAC-based SPTF effectively mitigates this impact, resulting in a considerable increase in the parking success rate from 71% to 93%. Furthermore, the heading angle deviation is significantly reduced from 2.25 degrees to 0.43 degrees.
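To make the entropy term in the SAC objective concrete, the sketch below computes the entropy-regularized ("soft") state value for a discrete policy. This is an illustrative simplification: parking control normally uses continuous actions, and the function name and the temperature symbol `alpha` are assumptions.

```python
import math


def soft_state_value(q_values, probs, alpha):
    """Soft value V(s) = sum_a pi(a|s) * (Q(s, a) - alpha * log pi(a|s)).

    The -alpha * log pi term is the entropy bonus: with alpha = 0 this is the
    ordinary expected Q-value, and larger alpha favors more exploratory policies.
    """
    return sum(p * (q - alpha * math.log(p))
               for q, p in zip(q_values, probs) if p > 0.0)
```

For two equally good actions chosen with probability 0.5 each, the bonus equals `alpha * ln 2`, so a policy that keeps both options open scores higher than a deterministic one.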
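Abstract: Actor-Critic is a reinforcement learning method that learns a policy from samples collected through online trial-and-error interaction with the environment, and it is an effective means of solving sequential perception and decision-making problems. However, this active learning paradigm of online interaction raises cost and safety concerns when samples are collected in complex real-world environments. Offline reinforcement learning, a data-driven reinforcement learning paradigm, emphasizes learning policies from static sample datasets without exploratory interaction with the environment. It offers a feasible solution for real-world deployments such as robotics, autonomous driving, and healthcare, and has become a research hotspot in recent years. Current offline reinforcement learning methods face the challenge of distribution shift between the learned policy and the behavior policy. To address this challenge, they usually adopt policy constraints or value-function regularization to restrict access to Out-Of-Distribution (OOD) actions, which makes learning overly conservative and hinders both the generalization of the value-function network and the improvement of the learned policy. To this end, this paper uses uncertainty estimation and OOD sampling to balance the generalization and conservatism of value-function learning, and proposes an Offline Deterministic Actor-Critic method based on Uncertainty Estimation (ODACUE). First, for deterministic policies, an uncertainty estimation operator for the Q-value function is defined, and it is proved theoretically that the Q-value function learned with this operator is a pessimistic estimate of the optimal Q-value function. Then, the uncertainty estimation operator is applied to the deterministic Actor-Critic framework, and the objective function for Critic learning is constructed through a convex combination of uncertainty estimation operators. Finally, experimental results on tasks from the D4RL benchmark datasets show that, compared with the baseline algorithms, ODACUE improves overall performance across 11 dataset tasks of different quality levels by at least 9.56% and at most 64.92%. In addition, parameter analysis and ablation experiments further verify the stability and generalization ability of ODACUE.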
Abstract: Medical procedures are inherently invasive and carry the risk of inducing pain to the mind and body. Recently, efforts have been made to alleviate the discomfort associated with invasive medical procedures through the use of virtual reality (VR) technology. VR has been demonstrated to be an effective treatment for pain associated with medical procedures, as well as for chronic pain conditions for which no effective treatment has been established. The precise mechanism by which the diversion from reality facilitated by VR contributes to the diminution of pain and anxiety has yet to be elucidated. However, the provision of positive images through VR-based visual stimulation may enhance the functionality of brain networks. The salience network is diminished, while the default mode network is enhanced. Additionally, the medial prefrontal cortex may establish a stronger connection with the default mode network, which could result in a reduction of pain and anxiety. Further research into the potential of VR technology to alleviate pain could lead to a reduction in the number of individuals who overdose on painkillers and contribute to positive change in the medical field.
Funding: Supported by the Sichuan Science and Technology Program (grant number 2022YFG0123).
Abstract: In this study, a novel residential virtual power plant (RVPP) scheduling method that leverages a gated recurrent unit (GRU)-integrated deep reinforcement learning (DRL) algorithm is proposed. In the proposed scheme, the GRU-integrated DRL algorithm guides the RVPP to participate effectively in both the day-ahead and real-time markets, lowering the electricity purchase costs and consumption risks for end-users. The Lagrangian relaxation technique is introduced to transform the constrained Markov decision process (CMDP) into an unconstrained optimization problem, which guarantees that the constraints are strictly satisfied without determining the penalty coefficients. Furthermore, to enhance the scalability of the constrained soft actor-critic (CSAC)-based RVPP scheduling approach, a fully distributed scheduling architecture was designed to enable plug-and-play in the residential distributed energy resources (RDER). Case studies performed on the constructed RVPP scenario validated the performance of the proposed methodology in enhancing the responsiveness of the RDER to power tariffs, balancing the supply and demand of the power grid, and ensuring customer comfort.
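The Lagrangian relaxation mentioned above replaces a hand-tuned penalty coefficient with a multiplier that is adjusted during training. A minimal sketch of the generic dual-ascent update is shown below; the function names, the learning rate, and the scalar cost signal are assumptions, and the abstract does not give the paper's exact update rule.

```python
def lagrangian_reward(reward, cost, lam):
    """Unconstrained surrogate signal: task reward minus the weighted constraint cost."""
    return reward - lam * cost


def update_multiplier(lam, avg_cost, cost_limit, lr=0.01):
    """Dual ascent on the multiplier.

    lam grows while the average constraint cost exceeds its limit and shrinks
    (never below zero) once the constraint is satisfied.
    """
    return max(0.0, lam + lr * (avg_cost - cost_limit))
```

In a training loop, the policy would be updated on `lagrangian_reward` while `update_multiplier` is called periodically with the measured average cost, so the effective penalty weight settles on its own.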
Funding: National Natural Science Foundation of China (11971211, 12171388).
Abstract: Complex network models are frequently employed for simulating and studying diverse real-world complex systems. Among these models, scale-free networks typically exhibit greater fragility to malicious attacks. Consequently, enhancing the robustness of scale-free networks has become a pressing issue. To address this problem, this paper proposes a Multi-Granularity Integration Algorithm (MGIA), which aims to improve the robustness of scale-free networks while keeping the initial degree of each node unchanged, ensuring network connectivity, and avoiding the generation of multiple edges. The algorithm generates a multi-granularity structure from the initial network to be optimized, then uses different optimization strategies to optimize the networks at the various granular layers in this structure, and finally realizes information exchange between the different granular layers, thereby further enhancing the optimization effect. We propose new network refresh, crossover, and mutation operators to ensure that the optimized network satisfies the given constraints. Meanwhile, we propose new network similarity and network dissimilarity evaluation metrics to improve the effectiveness of the optimization operators in the algorithm. In the experiments, the MGIA enhances the robustness of the scale-free network by 67.6%. This improvement is approximately 17.2% higher than the optimization effects achieved by eight existing complex network robustness optimization algorithms.
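The three constraints named in this abstract (unchanged node degrees, preserved connectivity, no multiple edges) can be illustrated with a generic degree-preserving edge swap, a common building block in network robustness optimization. This sketch is not MGIA's refresh, crossover, or mutation operator; the function names and the edge representation (a set of two-node frozensets) are assumptions.

```python
from collections import defaultdict, deque


def is_connected(nodes, edges):
    """Breadth-first search check that every node is reachable from the first one."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    nodes = list(nodes)
    if not nodes:
        return True
    seen = {nodes[0]}
    queue = deque([nodes[0]])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return len(seen) == len(nodes)


def try_swap(nodes, edges, e1, e2):
    """Rewire (a, b), (c, d) into (a, d), (c, b); return the new edge set or None.

    The swap keeps every node's degree. It is rejected if it would create a
    self-loop, duplicate an existing edge, or disconnect the network.
    """
    (a, b), (c, d) = e1, e2
    if len({a, b, c, d}) < 4:
        return None  # shared endpoint would give a self-loop or a no-op
    new1, new2 = frozenset((a, d)), frozenset((c, b))
    if new1 in edges or new2 in edges:
        return None  # would create a multiple edge
    candidate = (edges - {frozenset(e1), frozenset(e2)}) | {new1, new2}
    if not is_connected(nodes, candidate):
        return None
    return candidate
```

An optimizer would call `try_swap` on candidate edge pairs and keep a swap only when it is valid and improves the chosen robustness measure.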
Abstract: Satellite edge computing has garnered significant attention from researchers; however, processing a large volume of tasks within multi-node satellite networks still poses considerable challenges. The sharp increase in user demand for latency-sensitive tasks has inevitably led to offloading bottlenecks and insufficient computational capacity on individual satellite edge servers, making it necessary to implement effective task offloading scheduling to enhance user experience. In this paper, we propose a priority-based task scheduling strategy built on a Software-Defined Network (SDN) framework for satellite-terrestrial integrated networks, which clarifies the execution order of tasks based on their priority. Subsequently, we apply a Dueling-Double Deep Q-Network (DDQN) algorithm enhanced with prioritized experience replay to derive a computation offloading strategy, improving the experience replay mechanism within the Dueling-DDQN framework. Next, we utilize the Deep Deterministic Policy Gradient (DDPG) algorithm to determine the optimal resource allocation strategy to reduce the processing latency of sub-tasks. Simulation results demonstrate that the proposed d3-DDPG algorithm outperforms other approaches, effectively reducing task processing latency and thus improving user experience and system efficiency.
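Prioritized experience replay, as commonly defined, samples transitions with probability proportional to a power of their priority and corrects the resulting bias with importance-sampling weights. The sketch below shows that standard proportional scheme; it is not the paper's modified mechanism, and the function names and default exponents are assumptions.

```python
def per_probabilities(priorities, alpha=0.6):
    """Sampling probabilities P(i) = p_i^alpha / sum_k p_k^alpha.

    alpha = 0 gives uniform sampling; alpha = 1 samples in direct proportion
    to priority (typically the absolute TD error plus a small constant).
    """
    scaled = [p ** alpha for p in priorities]
    total = sum(scaled)
    return [s / total for s in scaled]


def importance_weights(probs, beta=0.4):
    """Importance-sampling weights w_i = (N * P(i))^(-beta), scaled so max(w) = 1."""
    n = len(probs)
    raw = [(n * p) ** (-beta) for p in probs]
    top = max(raw)
    return [w / top for w in raw]
```

Transitions that are sampled more often than uniform receive smaller weights, which keeps the Q-network update approximately unbiased.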
Funding: Supported by the Sichuan Science and Technology Program (2023YFSY0026, 2023YFH0004) and by the Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korean government (MSIT) (No. RS-2022-00155885, Artificial Intelligence Convergence Innovation Human Resources Development (Hanyang University ERICA)).
Abstract: Two-dimensional endoscopic images are susceptible to interference such as specular reflections and monotonous texture illumination, hindering accurate three-dimensional lesion reconstruction by surgical robots. This study proposes a novel end-to-end disparity estimation model to address these challenges. Our approach combines a Pseudo-Siamese neural network architecture with pyramid dilated convolutions, integrating multi-scale image information to enhance robustness against lighting interference. This study introduces a Pseudo-Siamese structure-based disparity regression model that simplifies left-right image comparison, improving accuracy and efficiency. The model was evaluated using a dataset of stereo endoscopic videos captured by the Da Vinci surgical robot, comprising simulated silicone heart sequences and real heart video data. Experimental results demonstrate significant improvement in the network’s resistance to lighting interference without substantially increasing parameters. Moreover, the model exhibited faster convergence during training, contributing to overall performance enhancement. This study advances endoscopic image processing accuracy and has potential implications for surgical robot applications in complex environments.
Abstract: Deep neural networks (DNNs) are effective in solving both forward and inverse problems for nonlinear partial differential equations (PDEs). However, conventional DNNs are not effective in handling problems such as delay differential equations (DDEs) and delay integrodifferential equations (DIDEs) with constant delays, primarily due to their low regularity at delay-induced breaking points. In this paper, a DNN method that incorporates multi-task learning (MTL) is proposed to solve both the forward and inverse problems of DIDEs. The core idea of this approach is to divide the original equation into multiple tasks based on the delay, using auxiliary outputs to represent the integral terms, followed by the use of MTL to seamlessly incorporate the properties at the breaking points into the loss function. Furthermore, given the increased training difficulty associated with multiple tasks and outputs, we employ a sequential training scheme to reduce training complexity and provide reference solutions for subsequent tasks. This approach significantly enhances the approximation accuracy of solving DIDEs with DNNs, as demonstrated by comparisons with traditional DNN methods. We validate the effectiveness of this method through several numerical experiments, test various parameter-sharing structures in MTL, and compare the testing results of these structures. Finally, this method is applied to the inverse problem of a nonlinear DIDE, and the results show that the unknown parameters of the DIDE can be discovered from sparse or noisy data.