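Optimized Design of a Painting Robot Control System Based on the DDPG Algorithm Under the Actor-Critic Framework (Cited by: 1)
1
Authors: 罗子彪, 唐娇. 《自动化与仪器仪表》, 2025, No. 2, pp. 193-197, 202 (6 pages)
The intersection of artificial intelligence and artistic creation has become a new research focus. However, the control performance of robots during drawing tasks often fails to meet precision requirements. This study therefore designs a painting robot control system based on the deep deterministic policy gradient (DDPG) algorithm. Within the framework of an Actor network and a Critic network, the algorithm's reward function and experience pool (replay buffer) are improved and optimized, and a painting robot control system is proposed. Validation shows that the training convergence speed of the proposed control system is on average 38.04% higher than that of control systems based on other algorithms, and that the simulation error of the manipulator's elbow joint is on average 93.74% lower than with other algorithms. The results indicate that improving the reward function and experience pool raises the algorithm's convergence speed and performance. The proposed control system meets the control precision requirements for the robot's drawing process and has positive application value in robot control.
Keywords: Actor network; Critic network; DDPG algorithm; deep reinforcement learning; control system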
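The abstract above refers to the experience pool used by DDPG. As a point of reference only, a minimal sketch of a plain uniform replay buffer, the baseline that such improvements usually start from, might look like the following. The class name `ReplayBuffer` and its fields are illustrative assumptions and do not reproduce the authors' improved design.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of transitions with uniform random sampling.

    Illustrative baseline only; not the improved experience pool of the paper.
    """

    def __init__(self, capacity, seed=None):
        # A deque with maxlen silently drops the oldest transition when full.
        self._storage = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def add(self, state, action, reward, next_state, done):
        self._storage.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement within one batch.
        return self._rng.sample(list(self._storage), batch_size)

    def __len__(self):
        return len(self._storage)
```

Sampling uniformly from old and new transitions breaks the temporal correlation of consecutive steps, which is the usual motivation for a replay buffer in off-policy methods such as DDPG.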
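An Asymmetric Actor-Critic Reinforcement Learning Method for Long-Sequence Autonomous Operation
2
Authors: 任君凯, 瞿宇珂, 罗嘉威, 倪子淇, 卢惠民, 叶益聪. 《国防科技大学学报》 (PKU Core), 2025, No. 4, pp. 111-122 (12 pages)
The capability for long-sequence autonomous operation has become one of the problems limiting the practical application of intelligent robots. To address the diverse long-sequence manipulation skills that robots need in complex scenes, an efficient and robust asymmetric Actor-Critic reinforcement learning method is proposed, aiming to resolve the high learning difficulty of long-sequence tasks and the complexity of reward function design. Multiple Critic networks are integrated to jointly train a single Actor network, and generative adversarial imitation learning is introduced to generate intrinsic rewards for the Critic networks, thereby reducing the learning difficulty of long-sequence tasks. On this basis, a two-stage learning method is designed in which imitation learning provides a high-quality pretrained behavior policy for reinforcement learning, further improving learning efficiency while enhancing the generalization performance of the policy. Simulation results for long-sequence autonomous operation in a chemistry laboratory show that the method significantly improves both the efficiency with which the robot learns long-sequence manipulation skills and the robustness of the behavior policy.
Keywords: autonomous operation robot; reinforcement learning; actor-critic; long-sequence manipulation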
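Research on Real-Time Charging Optimization Scheduling of New Energy Vehicles Based on the Actor-Critic Algorithm
3
Authors: 赖城贤, 杨婷, 苏庆列. 《黑龙江工业学院学报(综合版)》, 2025, No. 5, pp. 128-133 (6 pages)
With the spread of new energy vehicles, the problem of charging scheduling has become increasingly prominent. This study aims to optimize the charging scheduling algorithm to achieve real-time optimization of new energy vehicle charging, improving charging efficiency and reducing cost. It adopts an Actor-Critic charging scheduling algorithm executed in two steps, builds the Actor and Critic networks with multilayer perceptrons, and improves efficiency through parallel computing. The results show that the algorithm's precision rises quickly, reaching 0.9 after about 200 iterations, significantly better than other algorithms. Its running time remains consistently low, indicating high operating efficiency. In charging load management, the algorithm reaches a load of about 45 kW within 50 hours with a charging efficiency close to 90%, and its charging cost is the lowest for every vehicle count tested. The algorithm performs well in new energy vehicle charging scheduling: it improves charging efficiency, lowers charging cost, converges quickly, and runs fast, providing an effective solution for charging scheduling.
Keywords: actor-critic algorithm; new energy vehicles; real-time charging; optimized scheduling; state space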
AQROM: A quality of service aware routing optimization mechanism based on asynchronous advantage actor-critic in software-defined networks (Cited by: 1)
4
Authors: Wei Zhou, Xing Jiang, Qingsong Luo, Bingli Guo, Xiang Sun, Fengyuan Sun, Lingyu Meng. 《Digital Communications and Networks》 (CSCD), 2024, No. 5, pp. 1405-1414 (10 pages)
In Software-Defined Networks (SDNs), determining how to efficiently achieve Quality of Service (QoS)-aware routing is challenging but critical for significantly improving the performance of a network, where the metrics of QoS can be defined as, for example, average latency, packet loss ratio, and throughput. The SDN controller can use network statistics and a Deep Reinforcement Learning (DRL) method to resolve this challenge. In this paper, we formulate dynamic routing in an SDN as a Markov decision process and propose a DRL algorithm called the Asynchronous Advantage Actor-Critic QoS-aware Routing Optimization Mechanism (AQROM) to determine routing strategies that balance the traffic loads in the network. AQROM can improve the QoS of the network and reduce the training time via dynamic routing strategy updates; that is, the reward function can be dynamically and promptly altered based on the optimization objective regardless of the network topology and traffic pattern. AQROM can be considered as one-step optimization and a black-box routing mechanism in high-dimensional input and output sets for both discrete and continuous states, and actions with respect to the operations in the SDN. Extensive simulations were conducted using OMNeT++, and the results demonstrated that AQROM 1) achieved much faster and more stable convergence than the Deep Deterministic Policy Gradient (DDPG) and Advantage Actor-Critic (A2C), 2) incurred a lower packet loss ratio and latency than Open Shortest Path First (OSPF), DDPG, and A2C, and 3) resulted in higher and more stable throughput than OSPF, DDPG, and A2C.
Keywords: Software-defined networks; Asynchronous advantage actor-critic; QoS-aware routing optimization mechanism
Mixture of Experts Framework Based on Soft Actor-Critic Algorithm for Highway Decision-Making of Connected and Automated Vehicles
5
Authors: Fuxing Yao, Chao Sun, Bing Lu, Bo Wang, Haiyang Yu. 《Chinese Journal of Mechanical Engineering》, 2025, No. 1, pp. 382-395 (14 pages)
Decision-making of connected and automated vehicles (CAV) includes a sequence of driving maneuvers that improve safety and efficiency, characterized by complex scenarios, strong uncertainty, and high real-time requirements. Deep reinforcement learning (DRL) exhibits excellent capability for real-time decision-making, adaptability to complex scenarios, and generalization. However, it is arduous to guarantee complete driving safety and efficiency under the constraints of training samples and costs. This paper proposes a Mixture of Experts (MoE) method based on Soft Actor-Critic (SAC), where the upper-level discriminator dynamically decides whether to activate the lower-level DRL expert or the heuristic expert based on the features of the input state. To further enhance the performance of the DRL expert, a buffer zone is introduced in the reward function, preemptively applying penalties before unsafe situations occur. To minimize collision and off-road rates, the heuristic expert is built on the Intelligent Driver Model (IDM) and the Minimizing Overall Braking Induced by Lane changes (MOBIL) strategy. Finally, tested in typical simulation scenarios, MoE shows a 13.75% improvement in driving efficiency compared with the traditional DRL method with continuous action space. It ensures high safety with zero collision and zero off-road rates while maintaining high adaptability.
Keywords: Decision-making; Soft actor-critic; Connected and automated vehicles
Actor-Critic-Based UAV-Assisted Data Collection in the Wireless Sensor Network (Cited by: 1)
6
Authors: Huang Xiaoge, Wang Lingzhi, He Yong, Chen Qianbin. 《China Communications》 (SCIE, CSCD), 2024, No. 4, pp. 163-177 (15 pages)
Wireless Sensor Network (WSN) is widely utilized in large-scale distributed unmanned detection scenarios due to its low cost and flexible installation. However, WSN data collection encounters challenges in scenarios lacking communication infrastructure. Unmanned aerial vehicles (UAVs) offer a novel solution for WSN data collection, leveraging their high mobility. In this paper, we present an efficient UAV-assisted data collection algorithm aimed at minimizing the overall power consumption of the WSN. Firstly, a two-layer UAV-assisted data collection model is introduced, including the ground and aerial layers. In the ground layer, the cluster members (CMs) sense the environmental data and transmit it to the cluster heads (CHs), which forward the collected data to the UAVs. The aerial layer consists of multiple UAVs that collect, store, and forward data from the CHs to the data center for analysis. Secondly, an improved clustering algorithm based on K-Means++ is proposed to optimize the number and locations of CHs. Moreover, an Actor-Critic based algorithm is introduced to optimize the UAV deployment and the association with CHs. Finally, simulation results verify the effectiveness of the proposed algorithms.
Keywords: actor-critic; data collection; deep reinforcement learning; unmanned aerial vehicle; wireless sensor network
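A Balanced Traffic Signal Scheduling Algorithm Based on Improved Deep Q Networks
7
Author: 贺道坤. 《机械设计与制造》 (PKU Core), 2025, No. 4, pp. 135-140 (6 pages)
To further relieve congestion at urban intersections during peak hours and let traffic on each approach road pass in a balanced way, a balanced traffic signal scheduling algorithm based on improved Deep Q Networks is proposed. The features of an intersection most relevant to signal scheduling are extracted, a one-way intersection traffic signal model and a linear two-way intersection traffic signal model are established, and a traffic signal scheduling optimization model is built on them. To address the shortcomings of the Deep Q Networks algorithm in traffic signal scheduling, such as convergence problems and overestimation, Deep Q Networks is improved with a dueling network, a double network, and a modified gradient update strategy, yielding a corresponding balanced scheduling algorithm. Comparison with classic Deep Q Networks in simulation verifies the applicability and superiority of the proposed algorithm for traffic signal scheduling. Simulations based on urban road data for the two scenarios show that the algorithm effectively shortens vehicle queues at intersections, balances traffic throughput across approaches, and relieves congestion in the peak travel direction, which helps improve the benefit of intersection signal scheduling.
Keywords: traffic signal scheduling; intersection; Deep Q networks; deep reinforcement learning; intelligent transportation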
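The dueling and double network improvements named in the abstract above have well-known textbook forms. The sketch below shows those standard formulations on plain Python lists, assuming the paper uses them in their usual shape; the function names are illustrative, and the paper's modified gradient update strategy is not reproduced here.

```python
def dueling_q_values(state_value, advantages):
    """Combine V(s) and A(s, a) into Q(s, a) using the mean-advantage baseline."""
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + a - mean_adv for a in advantages]


def double_dqn_target(reward, done, gamma, next_q_online, next_q_target):
    """Double DQN target: the online network picks the next action,
    the target network supplies the value of that action."""
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=next_q_online.__getitem__)
    return reward + gamma * next_q_target[best_action]
```

Separating action selection from action evaluation is what counters the overestimation mentioned in the abstract: a single network that both picks and scores the maximum tends to pick up its own positive noise.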
Path Planning and Tracking Control for Parking via Soft Actor-Critic Under Non-Ideal Scenarios (Cited by: 3)
8
Authors: Xiaolin Tang, Yuyou Yang, Teng Liu, Xianke Lin, Kai Yang, Shen Li. 《IEEE/CAA Journal of Automatica Sinica》 (SCIE, EI, CSCD), 2024, No. 1, pp. 181-195 (15 pages)
Parking in a small parking lot within limited space poses a difficult task. It often leads to deviations between the final parking posture and the target posture. These deviations can lead to partial occupancy of adjacent parking lots, which poses a safety threat to vehicles parked in these parking lots. However, previous studies have not addressed this issue. In this paper, we aim to evaluate the impact of parking deviation of existing vehicles next to the target parking lot (PDEVNTPL) on the automatic ego vehicle (AEV) parking, in terms of safety, comfort, accuracy, and efficiency of parking. A segmented parking training framework (SPTF) based on soft actor-critic (SAC) is proposed to improve parking performance. In the proposed method, the SAC algorithm incorporates strategy entropy into the objective function, to enable the AEV to learn parking strategies based on a more comprehensive understanding of the environment. Additionally, the SPTF simplifies complex parking tasks to maintain the high performance of deep reinforcement learning (DRL). The experimental results reveal that the PDEVNTPL has a detrimental influence on the AEV parking in terms of safety, accuracy, and comfort, leading to reductions of more than 27%, 54%, and 26%, respectively. However, the SAC-based SPTF effectively mitigates this impact, resulting in a considerable increase in the parking success rate from 71% to 93%. Furthermore, the heading angle deviation is significantly reduced from 2.25 degrees to 0.43 degrees.
Keywords: Automatic parking; control strategy; parking deviation (APS); soft actor-critic (SAC)
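For reference, the entropy-augmented objective that the abstract above alludes to is usually written as follows in the standard SAC formulation. The symbols (temperature \alpha, state-action marginal \rho_\pi, horizon T) follow common convention and are assumptions here, not notation taken from the paper.

```latex
J(\pi) = \sum_{t=0}^{T} \mathbb{E}_{(s_t, a_t) \sim \rho_\pi}
  \left[ r(s_t, a_t) + \alpha \, \mathcal{H}\!\left(\pi(\cdot \mid s_t)\right) \right],
\qquad
\mathcal{H}\!\left(\pi(\cdot \mid s_t)\right)
  = -\mathbb{E}_{a \sim \pi(\cdot \mid s_t)} \left[ \log \pi(a \mid s_t) \right]
```

A larger \alpha rewards more random policies and therefore more exploration; as \alpha approaches 0 the objective reduces to the ordinary expected return.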
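A Method for Solving the Quadratic Assignment Problem Under the Actor-Critic Framework (Cited by: 1)
9
Authors: 李雪源, 韩丛英. 《中国科学院大学学报(中英文)》 (CAS, CSCD, PKU Core), 2024, No. 2, pp. 275-284 (10 pages)
The quadratic assignment problem (QAP) is an NP-hard combinatorial optimization problem with wide real-world application. Relatively mature heuristic algorithms are usually problem-oriented, customized designs that lack transfer and generalization ability. To provide a unified QAP solution strategy, the flow matrix and distance matrix of a QAP instance are abstracted into two undirected complete graphs and the corresponding association graph is constructed, which turns the task of assigning facilities to locations into a node selection task on the association graph. Based on the actor-critic framework, a new solution algorithm, ACQAP, is proposed. First, a policy network is built with a multi-head attention mechanism to process the node representation vectors produced by a graph convolutional neural network. Then, the actor-critic algorithm predicts the probability that each node is output as the optimal node. Finally, based on this probability, an action decision sequence that satisfies the target reward function is output within feasible time. The algorithm requires no manual design and applies to inputs of different sizes, making it more flexible and reliable. Experiments on QAPLIB instances show that the algorithm matches the accuracy of traditional heuristics while transferring and generalizing better. Compared with learning-based algorithms such as NGM, its assignment cost deviates least from the optimal solution, and on most instances the deviation is below 20%.
Keywords: quadratic assignment problem; graph convolutional neural network; deep reinforcement learning; multi-head attention mechanism; actor-critic algorithm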
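To make the assignment cost in the abstract above concrete, the following minimal sketch evaluates the standard QAP objective for one candidate assignment. The function name and the tiny matrices are illustrative assumptions; this is only the cost function, not the ACQAP solver.

```python
def qap_cost(flow, distance, assignment):
    """Total cost when facility i is placed at location assignment[i].

    cost = sum over i, j of flow[i][j] * distance[assignment[i]][assignment[j]]
    """
    n = len(assignment)
    return sum(
        flow[i][j] * distance[assignment[i]][assignment[j]]
        for i in range(n)
        for j in range(n)
    )


# Hypothetical 3-facility instance with symmetric flow and distance matrices.
flow = [[0, 5, 2], [5, 0, 3], [2, 3, 0]]
distance = [[0, 1, 4], [1, 0, 2], [4, 2, 0]]
identity_cost = qap_cost(flow, distance, [0, 1, 2])
swapped_cost = qap_cost(flow, distance, [2, 1, 0])
```

Because there are n! possible assignments, exhaustive evaluation of this function is only feasible for very small n, which is why heuristic and learned solvers are used.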
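Offline Deterministic Actor-Critic Based on Uncertainty Estimation (Cited by: 2)
10
Authors: 冯涣婷, 程玉虎, 王雪松. 《计算机学报》 (EI, CAS, CSCD, PKU Core), 2024, No. 4, pp. 717-732 (16 pages)
Actor-Critic is a reinforcement learning method that learns a policy from samples collected through online trial-and-error interaction with the environment, and it is an effective means of solving sequential perception and decision problems. However, this active learning paradigm of online interaction raises cost and safety issues when collecting samples in some complex real environments. Offline reinforcement learning, a data-driven reinforcement learning paradigm, emphasizes learning a policy from a static sample dataset without exploratory interaction with the environment. It offers a feasible solution for real-world deployment in robotics, autonomous driving, health care, and similar fields, and has been a research hotspot in recent years. Current offline reinforcement learning methods face the challenge of distribution shift between the learned policy and the behavior policy. To counter it, policy constraints or value function regularization are usually used to restrict access to out-of-distribution (OOD) actions, which makes learning overly conservative and hinders both the generalization of the value function network and the improvement of the learned policy. To this end, this paper uses uncertainty estimation and OOD sampling to balance generalization and conservatism in value function learning, and proposes an Offline Deterministic Actor-Critic based on Uncertainty Estimation (ODACUE). First, for deterministic policies, an uncertainty estimation operator for the Q-value function is defined, and it is proven theoretically that the Q-value function learned by this operator is a pessimistic estimate of the optimal Q-value function. Then, the uncertainty estimation operator is applied within the deterministic Actor-Critic framework, and the objective function for Critic learning is constructed as a convex combination of uncertainty estimation operators. Finally, experiments on the D4RL benchmark datasets show that, relative to the compared algorithms, ODACUE improves overall performance on 11 dataset tasks of different quality levels by at least 9.56% and at most 64.92%. In addition, parameter analysis and ablation experiments further verify the stability and generalization ability of ODACUE.
Keywords: offline reinforcement learning; uncertainty estimation; out-of-distribution sampling; convex combination; actor-critic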
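LATITUDES Network: A Library of Validity (Risk of Bias) Assessment Tools for Improving the Robustness of Evidence Synthesis
11
Authors: 廖明雨, 熊益权, 赵芃, 郭金, 陈靖文, 刘春容, 贾玉龙, 任燕, 孙鑫, 谭婧. 《中国循证医学杂志》 (PKU Core), 2025, No. 5, pp. 614-620 (7 pages)
Evidence synthesis is the process of systematically collecting, analyzing, and integrating existing research evidence. Its results depend on the quality of the included primary studies, and validity assessment (also called risk of bias assessment) is an important means of evaluating that quality. Many validity assessment tools exist, but some lack a rigorous development process and evaluation. Using an inappropriate tool to appraise the literature during evidence synthesis may compromise the accuracy of conclusions and mislead clinical practice. To resolve this problem, researchers at the University of Bristol in the UK led the establishment of the LATITUDES Network, a one-stop resource for validity assessment tools, in September 2023. The website is dedicated to collecting, organizing, and promoting validity assessment tools in order to make validity assessment of primary studies more accurate and evidence synthesis more robust and reliable. This article describes in detail the background to the LATITUDES Network, the validity assessment tools it includes, and the training resources for using them, with the aim of helping researchers in China learn more about the LATITUDES Network, apply appropriate validity assessment tools when appraising literature, and develop validity assessment tools.
Keywords: validity assessment; risk of bias; evidence synthesis; LATITUDES network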
Application of virtual reality technology improves the functionality of brain networks in individuals experiencing pain (Cited by: 3)
12
Author: Takahiko Nagamine. 《World Journal of Clinical Cases》 (SCIE), 2025, No. 3, pp. 66-68 (3 pages)
Medical procedures are inherently invasive and carry the risk of inducing pain to the mind and body. Recently, efforts have been made to alleviate the discomfort associated with invasive medical procedures through the use of virtual reality (VR) technology. VR has been demonstrated to be an effective treatment for pain associated with medical procedures, as well as for chronic pain conditions for which no effective treatment has been established. The precise mechanism by which the diversion from reality facilitated by VR contributes to the diminution of pain and anxiety has yet to be elucidated. However, the provision of positive images through VR-based visual stimulation may enhance the functionality of brain networks. The salience network is diminished, while the default mode network is enhanced. Additionally, the medial prefrontal cortex may establish a stronger connection with the default mode network, which could result in a reduction of pain and anxiety. Further research into the potential of VR technology to alleviate pain could lead to a reduction in the number of individuals who overdose on painkillers and contribute to positive change in the medical field.
Keywords: Virtual reality; Pain; Anxiety; Salience network; Default mode network
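A Dynamic Treatment Strategy Generation Model Integrating Dead-ends and Offline Supervised Actor-Critic
13
Authors: 杨莎莎, 于亚新, 王跃茹, 许晶铭, 魏阳杰, 李新华. 《计算机科学》 (CSCD, PKU Core), 2024, No. 7, pp. 80-88 (9 pages)
Reinforcement learning depends little on mathematical models and makes it easy to build and optimize models from experience, which makes it well suited to learning dynamic treatment strategies. Existing studies nonetheless have the following problems: 1) risk is not considered while learning an optimal policy, so the learned policy carries a certain risk; 2) the distribution shift problem is ignored, so the learned policy differs completely from the physician's policy; 3) the patient's historical observations and treatment history are ignored, so the patient state is not captured well and the optimal policy cannot be learned. To address these problems, DOSAC-DTR, a dynamic treatment strategy generation model integrating Dead-ends and offline supervised Actor-Critic, is proposed. First, to account for the risk of the treatment actions recommended by the learned policy, the concept of Dead-ends is integrated into the Actor-Critic framework. Second, to alleviate distribution shift, physician supervision is integrated into the Actor-Critic framework, minimizing the gap between the learned policy and the physician's policy while maximizing expected return. Finally, to obtain a state representation that contains the patient's key historical information, an LSTM-based encoder-decoder model is used to model the patient's historical observations and treatment history. Experimental results show that DOSAC-DTR outperforms the baseline methods, achieving a lower estimated mortality rate and a higher Jaccard coefficient.
Keywords: dynamic treatment strategy; Dead-ends; actor-critic; state representation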
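Research on a Tracking Inspection Control System for Threaded Rebar Ends Based on Actor-Critic Adaptive PID (Cited by: 1)
14
Authors: 秦天为, 冯云剑. 《工业控制计算机》, 2024, No. 2, pp. 75-77 (3 pages)
To keep pace with the production line without disrupting production, and thereby better automate quality inspection and dimensional measurement of threaded rebar ends, a tracking system for threaded rebar end inspection based on a synchronous-belt linear guide is designed, and an Actor-Critic-based adaptive PID control method is proposed in which reinforcement learning automatically tunes the proportional, integral, and derivative parameters of the PID controller according to feedback from the environment. The response performance of this method and of other PID control methods is tested and analyzed. The experimental results show that the method achieves high-precision, fast-response tracking photography and ensures high-precision quality inspection of threaded rebar ends.
Keywords: threaded rebar end inspection; tracking photography; adaptive PID control; actor-critic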
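The idea in the abstract above, a PID controller whose three gains are retuned online by a learning agent, can be sketched as below. This is a minimal illustration under stated assumptions: the class `AdaptivePID` and its `set_gains` hook are hypothetical names, and the Actor-Critic agent that would choose the gains is not shown.

```python
class AdaptivePID:
    """Discrete PID controller whose gains can be replaced between steps."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.dt = dt
        self._integral = 0.0
        self._prev_error = None

    def set_gains(self, kp, ki, kd):
        # Hypothetical hook: an RL actor would call this with its chosen gains.
        self.kp, self.ki, self.kd = kp, ki, kd

    def step(self, setpoint, measurement):
        error = setpoint - measurement
        self._integral += error * self.dt
        # The derivative term is zero on the first sample, where no previous error exists.
        if self._prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self._prev_error) / self.dt
        self._prev_error = error
        return self.kp * error + self.ki * self._integral + self.kd * derivative
```

Keeping the integral and previous error inside the controller means the gains can change between steps without resetting the accumulated state, which is what an online tuner needs.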
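Research on Age of Information Minimization Based on Safe Actor-Critic in the UAV-Assisted Internet of Things
15
Authors: 魏宪鹏, 付芳, 张志才. 《测试技术学报》, 2024, No. 1, pp. 71-78 (8 pages)
As a new kind of communication device, the unmanned aerial vehicle (UAV) is expected to play a key role in Internet of Things (IoT) services such as data collection and monitoring. To ensure the timeliness of collected data, age of information (AoI) is used to measure the freshness of the data the UAV receives from IoT devices. The UAV trajectory and the association policy between the UAV and the IoT devices are jointly optimized to minimize the weighted sum of AoI while keeping the UAV's cumulative flight energy consumption within budget. Because the problem is subject to both short-term and long-term constraints, it is modeled as a constrained Markov decision process (CMDP) and solved with Safe Actor-Critic. Simulation results show that the proposed algorithm effectively guarantees the energy budget while minimizing AoI.
Keywords: unmanned aerial vehicle; age of information; Internet of Things; Safe actor-critic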
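The weighted AoI sum mentioned in the abstract above can be illustrated with a common slotted-time model, sketched below. The reset-to-1 rule and the function names are assumptions chosen for illustration; the paper's exact AoI dynamics are not given in the abstract.

```python
def step_aoi(ages, served):
    """Advance one time slot: devices in `served` reset to age 1, the rest age by 1."""
    return [1 if i in served else age + 1 for i, age in enumerate(ages)]


def weighted_aoi(ages, weights):
    """Weighted sum of per-device ages, the quantity to be minimized."""
    return sum(w * a for w, a in zip(weights, ages))
```

Under this model the age of every unserved device grows linearly, so the association policy has to trade off visiting high-weight devices often against the energy cost of flying to them.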
GRU-integrated constrained soft actor-critic learning enabled fully distributed scheduling strategy for residential virtual power plant
16
Authors: Xiaoyun Deng, Yongdong Chen, Dongchuan Fan, Youbo Liu, Chao Ma. 《Global Energy Interconnection》 (EI, CSCD), 2024, No. 2, pp. 117-129 (13 pages)
In this study, a novel residential virtual power plant (RVPP) scheduling method that leverages a gated recurrent unit (GRU)-integrated deep reinforcement learning (DRL) algorithm is proposed. In the proposed scheme, the GRU-integrated DRL algorithm guides the RVPP to participate effectively in both the day-ahead and real-time markets, lowering the electricity purchase costs and consumption risks for end-users. The Lagrangian relaxation technique is introduced to transform the constrained Markov decision process (CMDP) into an unconstrained optimization problem, which guarantees that the constraints are strictly satisfied without determining the penalty coefficients. Furthermore, to enhance the scalability of the constrained soft actor-critic (CSAC)-based RVPP scheduling approach, a fully distributed scheduling architecture was designed to enable plug-and-play in the residential distributed energy resources (RDER). Case studies performed on the constructed RVPP scenario validated the performance of the proposed methodology in enhancing the responsiveness of the RDER to power tariffs, balancing the supply and demand of the power grid, and ensuring customer comfort.
Keywords: Residential virtual power plant; Residential distributed energy resource; Constrained soft actor-critic; Fully distributed scheduling strategy
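The Lagrangian relaxation step described in the abstract above has a standard generic form for a CMDP with a single cost constraint, sketched below. The symbols J_r (expected return), J_c (expected cost), d (cost budget), and \lambda (multiplier) are conventional notation assumed here, not taken from the paper.

```latex
\max_{\pi} \; J_r(\pi) \quad \text{s.t.} \quad J_c(\pi) \le d
\qquad \Longrightarrow \qquad
\min_{\lambda \ge 0} \; \max_{\pi} \; L(\pi, \lambda)
  = J_r(\pi) - \lambda \left( J_c(\pi) - d \right)
```

The multiplier \lambda is learned alongside the policy: it grows while the constraint is violated and shrinks toward 0 otherwise, which is why no hand-tuned penalty coefficient is needed.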
Robustness Optimization Algorithm with Multi-Granularity Integration for Scale-Free Networks Against Malicious Attacks (Cited by: 1)
17
Authors: ZHANG Yiheng, LI Jinhai. 《昆明理工大学学报(自然科学版)》 (PKU Core), 2025, No. 1, pp. 54-71 (18 pages)
Complex network models are frequently employed for simulating and studying diverse real-world complex systems. Among these models, scale-free networks typically exhibit greater fragility to malicious attacks. Consequently, enhancing the robustness of scale-free networks has become a pressing issue. To address this problem, this paper proposes a Multi-Granularity Integration Algorithm (MGIA), which aims to improve the robustness of scale-free networks while keeping the initial degree of each node unchanged, ensuring network connectivity and avoiding the generation of multiple edges. The algorithm generates a multi-granularity structure from the initial network to be optimized, then uses different optimization strategies to optimize the networks at various granular layers in this structure, and finally realizes the information exchange between different granular layers, thereby further enhancing the optimization effect. We propose new network refresh, crossover, and mutation operators to ensure that the optimized network satisfies the given constraints. Meanwhile, we propose new network similarity and network dissimilarity evaluation metrics to improve the effectiveness of the optimization operators in the algorithm. In the experiments, the MGIA enhances the robustness of the scale-free network by 67.6%. This improvement is approximately 17.2% higher than the optimization effects achieved by eight currently existing complex network robustness optimization algorithms.
Keywords: complex network model; multi-granularity; scale-free networks; robustness; algorithm integration
Offload Strategy for Edge Computing in Satellite Networks Based on Software Defined Network (Cited by: 1)
18
Authors: Zhiguo Liu, Yuqing Gui, Lin Wang, Yingru Jiang. 《Computers, Materials & Continua》 (SCIE, EI), 2025, No. 1, pp. 863-879 (17 pages)
Satellite edge computing has garnered significant attention from researchers; however, processing a large volume of tasks within multi-node satellite networks still poses considerable challenges. The sharp increase in user demand for latency-sensitive tasks has inevitably led to offloading bottlenecks and insufficient computational capacity on individual satellite edge servers, making it necessary to implement effective task offloading scheduling to enhance user experience. In this paper, we propose a priority-based task scheduling strategy based on a Software-Defined Network (SDN) framework for satellite-terrestrial integrated networks, which clarifies the execution order of tasks based on their priority. Subsequently, we apply a Dueling-Double Deep Q-Network (DDQN) algorithm enhanced with prioritized experience replay to derive a computation offloading strategy, improving the experience replay mechanism within the Dueling-DDQN framework. Next, we utilize the Deep Deterministic Policy Gradient (DDPG) algorithm to determine the optimal resource allocation strategy to reduce the processing latency of sub-tasks. Simulation results demonstrate that the proposed d3-DDPG algorithm outperforms other approaches, effectively reducing task processing latency and thus improving user experience and system efficiency.
Keywords: Satellite network; edge computing; task scheduling; computing offloading
A Novel Self-Supervised Learning Network for Binocular Disparity Estimation (Cited by: 1)
19
Authors: Jiawei Tian, Yu Zhou, Xiaobing Chen, Salman A. AlQahtani, Hongrong Chen, Bo Yang, Siyu Lu, Wenfeng Zheng. 《Computer Modeling in Engineering & Sciences》 (SCIE, EI), 2025, No. 1, pp. 209-229 (21 pages)
Two-dimensional endoscopic images are susceptible to interferences such as specular reflections and monotonous texture illumination, hindering accurate three-dimensional lesion reconstruction by surgical robots. This study proposes a novel end-to-end disparity estimation model to address these challenges. Our approach combines a Pseudo-Siamese neural network architecture with pyramid dilated convolutions, integrating multi-scale image information to enhance robustness against lighting interferences. This study introduces a Pseudo-Siamese structure-based disparity regression model that simplifies left-right image comparison, improving accuracy and efficiency. The model was evaluated using a dataset of stereo endoscopic videos captured by the Da Vinci surgical robot, comprising simulated silicone heart sequences and real heart video data. Experimental results demonstrate significant improvement in the network's resistance to lighting interference without substantially increasing parameters. Moreover, the model exhibited faster convergence during training, contributing to overall performance enhancement. This study advances endoscopic image processing accuracy and has potential implications for surgical robot applications in complex environments.
Keywords: Parallax estimation; parallax regression model; self-supervised learning; Pseudo-Siamese neural network; pyramid dilated convolution; binocular disparity estimation
DEEP NEURAL NETWORKS COMBINING MULTI-TASK LEARNING FOR SOLVING DELAY INTEGRO-DIFFERENTIAL EQUATIONS (Cited by: 1)
20
Authors: WANG Chen-yao, SHI Feng. 《数学杂志》, 2025, No. 1, pp. 13-38 (26 pages)
Deep neural networks (DNNs) are effective in solving both forward and inverse problems for nonlinear partial differential equations (PDEs). However, conventional DNNs are not effective in handling problems such as delay differential equations (DDEs) and delay integro-differential equations (DIDEs) with constant delays, primarily due to their low regularity at delay-induced breaking points. In this paper, a DNN method that combines multi-task learning (MTL) is proposed to solve both the forward and inverse problems of DIDEs. The core idea of this approach is to divide the original equation into multiple tasks based on the delay, using auxiliary outputs to represent the integral terms, followed by the use of MTL to seamlessly incorporate the properties at the breaking points into the loss function. Furthermore, given the increased training difficulty associated with multiple tasks and outputs, we employ a sequential training scheme to reduce training complexity and provide reference solutions for subsequent tasks. This approach significantly enhances the approximation accuracy of solving DIDEs with DNNs, as demonstrated by comparisons with traditional DNN methods. We validate the effectiveness of this method through several numerical experiments, test various parameter sharing structures in MTL, and compare the testing results of these structures. Finally, this method is implemented to solve the inverse problem of a nonlinear DIDE, and the results show that the unknown parameters of the DIDE can be discovered with sparse or noisy data.
Keywords: Delay integro-differential equation; Multi-task learning; parameter sharing structure; deep neural network; sequential training scheme