3,064 articles found
Toward Collaborative and Adaptive Learning: A Survey of Multi-agent Reinforcement Learning in Education
1
Authors: Sirine Bouguettaya, Ouarda Zedadra, Francesco Pupo, Giancarlo Fortino. Artificial Intelligence Science and Engineering, 2026, No. 1, pp. 1-19 (19 pages)
In recent years, researchers have leveraged single-agent reinforcement learning to boost educational outcomes and deliver personalized interventions; yet this paradigm provides no capacity for inter-agent interaction. Multi-agent reinforcement learning (MARL) overcomes this limitation by allowing several agents to learn simultaneously within a shared environment, each choosing actions that maximize its own or the group's rewards. By explicitly modeling and exploiting agent-to-agent dynamics, MARL can align those interactions with pedagogical goals such as peer tutoring, collaborative problem-solving, or gamified competition, thus opening richer avenues for adaptive and socially informed learning experiences. This survey investigates the impact of MARL on educational outcomes by examining evidence of its effectiveness in enhancing learner performance, engagement, and equity, and in reducing teacher workload, compared with single-agent or traditional approaches. It explores the educational domains and pedagogical problems addressed by MARL, identifies the algorithmic families used, and analyzes their influence on learning. The review also assesses experimental settings and evaluation metrics to determine ecological validity, and outlines current challenges and future research directions in applying MARL to education.
Keywords: reinforcement learning; multi-agent reinforcement learning; agentic AI; education; generative AI
An Improved Reinforcement Learning-Based 6G UAV Communication for Smart Cities
2
Authors: Vi Hoai Nam, Chu Thi Minh Hue, Dang Van Anh. Computers, Materials & Continua, 2026, No. 1, pp. 2030-2044 (15 pages)
Unmanned Aerial Vehicles (UAVs) have become integral components in smart city infrastructures, supporting applications such as emergency response, surveillance, and data collection. However, the high mobility and dynamic topology of Flying Ad Hoc Networks (FANETs) present significant challenges for maintaining reliable, low-latency communication. Conventional geographic routing protocols often struggle in situations where link quality varies and mobility patterns are unpredictable. To overcome these limitations, this paper proposes an improved routing protocol based on reinforcement learning. This new approach integrates Q-learning with mechanisms that are both link-aware and mobility-aware. The proposed method optimizes the selection of relay nodes by using an adaptive reward function that takes into account energy consumption, delay, and link quality. Additionally, a Kalman filter is integrated to predict UAV mobility, improving the stability of communication links under dynamic network conditions. Simulation experiments were conducted using realistic scenarios, varying the number of UAVs to assess scalability. Key performance metrics were analyzed, including the packet delivery ratio, end-to-end delay, and total energy consumption. The results demonstrate that the proposed approach improves the packet delivery ratio by 12%–15% and reduces delay by up to 25.5% when compared to conventional GEO and QGEO protocols. However, this improvement comes at the cost of higher energy consumption due to additional computations and control overhead. Despite this trade-off, the proposed solution ensures reliable and efficient communication, making it well-suited for large-scale UAV networks operating in complex urban environments.
Keywords: UAV; FANET; smart cities; reinforcement learning; Q-learning
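The abstract above describes Q-learning relay selection driven by a reward that weighs energy consumption, delay, and link quality. A minimal Python sketch of that idea follows. The weights, the normalization of inputs to [0, 1], the learning rate, the discount factor, and all names (`reward`, `q_update`, the `uav_*` labels) are illustrative assumptions, not the authors' protocol; the Kalman-filter mobility prediction is not shown.

```python
from collections import defaultdict

# Hypothetical weights; the abstract does not state the actual values.
W_ENERGY, W_DELAY, W_LINK = 0.3, 0.3, 0.4


def reward(energy, delay, link_quality):
    """Weighted reward; every input is assumed to be normalized to [0, 1]."""
    return W_LINK * link_quality - W_ENERGY * energy - W_DELAY * delay


def q_update(q, state, action, r, next_state, next_actions, alpha=0.5, gamma=0.9):
    """One standard tabular Q-learning step for choosing relay `action` at node `state`."""
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    q[(state, action)] += alpha * (r + gamma * best_next - q[(state, action)])
    return q[(state, action)]


q_table = defaultdict(float)  # unseen (state, action) pairs start at 0.0
r = reward(energy=0.2, delay=0.1, link_quality=0.9)
new_q = q_update(q_table, "uav_a", "uav_b", r, "uav_b", ["uav_c", "uav_d"])
```

Keeping the reward a plain weighted sum makes the trade-off explicit: raising `W_LINK` favors stable links at the expense of energy and delay.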
Beyond Wi-Fi 7: Enhanced Decentralized Wireless Local Area Networks with Federated Reinforcement Learning
3
Authors: Rashid Ali, Alaa Omran Almagrabi. Computers, Materials & Continua, 2026, No. 3, pp. 391-409 (19 pages)
Wi-Fi technology has evolved significantly since its introduction in 1997, advancing to Wi-Fi 6 as the latest standard, with Wi-Fi 7 currently under development. Despite these advancements, integrating machine learning into Wi-Fi networks remains challenging, especially in decentralized environments with multiple access points (mAPs). This paper is a short review that summarizes the potential applications of federated reinforcement learning (FRL) across eight key areas of Wi-Fi functionality: channel access, link adaptation, beamforming, multi-user transmissions, channel bonding, multi-link operation, spatial reuse, and multi-basic service set (multi-BSS) coordination. FRL is highlighted as a promising framework for enabling decentralized training and decision-making while preserving data privacy. To illustrate its role in practice, we present a case study on link activation in a multi-link operation (MLO) environment with multiple APs. Through theoretical discussion and simulation results, the study demonstrates how FRL can improve performance and reliability, paving the way for more adaptive and collaborative Wi-Fi networks in the era of Wi-Fi 7 and beyond.
Keywords: artificial intelligence; reinforcement learning; channels selection; wireless local area networks; 802.11ax; 802.11be; Wi-Fi
Multi-agent reinforcement learning with layered autonomy and collaboration for enhanced collaborative confrontation
4
Authors: Xiaoyu XING, Haoxiang XIA. Chinese Journal of Aeronautics, 2026, No. 2, pp. 370-388 (19 pages)
Addressing optimal confrontation methods in multi-agent attack-defense scenarios is a complex challenge. Multi-Agent Reinforcement Learning (MARL) provides an effective framework for tackling sequential decision-making problems, significantly enhancing swarm intelligence in maneuvering. However, applying MARL to unmanned swarms presents two primary challenges. First, defensive agents must balance autonomy with collaboration under limited perception while coordinating against adversaries. Second, current algorithms aim to maximize global or individual rewards, making them sensitive to fluctuations in enemy strategies and environmental changes, especially when rewards are sparse. To tackle these issues, we propose Multi-Agent Reinforcement Learning with Layered Autonomy and Collaboration (MARL-LAC), an algorithm for collaborative confrontations. This algorithm integrates dual twin Critics to mitigate the high variance associated with policy gradients. Furthermore, MARL-LAC employs layered autonomy and collaboration to address multi-objective problems, specifically learning a global reward function for the swarm alongside local reward functions for individual defensive agents. Experimental results demonstrate that MARL-LAC enhances decision-making and collaborative behaviors among agents, outperforming the existing algorithms and emphasizing the importance of layered autonomy and collaboration in multi-agent systems. The observed adversarial behaviors demonstrate that agents using MARL-LAC effectively maintain cohesive formations that conceal their intentions by confusing the offensive agent while successfully encircling the target.
Keywords: attack-defense confrontation; collaborative confrontation; autonomous agents; multi-agent systems; reinforcement learning; maneuvering decision-making
A Multi-Objective Deep Reinforcement Learning Algorithm for Computation Offloading in Internet of Vehicles
5
Authors: Junjun Ren, Guoqiang Chen, Zheng-Yi Chai, Dong Yuan. Computers, Materials & Continua, 2026, No. 1, pp. 2111-2136 (26 pages)
Vehicle Edge Computing (VEC) and Cloud Computing (CC) significantly enhance the processing efficiency of delay-sensitive and computation-intensive applications by offloading compute-intensive tasks from resource-constrained onboard devices to nearby Roadside Units (RSUs), thereby achieving lower delay and energy consumption. However, due to the limited storage capacity and energy budget of RSUs, it is challenging to meet the demands of the highly dynamic Internet of Vehicles (IoV) environment. Therefore, determining reasonable service caching and computation offloading strategies is crucial. To address this, this paper proposes a joint service caching scheme for cloud-edge collaborative IoV computation offloading. By modeling the dynamic optimization problem as a Markov Decision Process (MDP), the scheme jointly optimizes task delay, energy consumption, load balancing, and privacy entropy to achieve better quality of service. Additionally, a dynamic adaptive multi-objective deep reinforcement learning algorithm is proposed. Each Double Deep Q-Network (DDQN) agent obtains rewards for different objectives based on distinct reward functions and dynamically updates the objective weights by learning the value changes between objectives using Radial Basis Function Networks (RBFN), thereby efficiently approximating the Pareto-optimal decisions for multiple objectives. Extensive experiments demonstrate that the proposed algorithm can better coordinate the three-tier computing resources of cloud, edge, and vehicles. Compared to existing algorithms, the proposed method reduces task delay and energy consumption by 10.64% and 5.1%, respectively.
Keywords: deep reinforcement learning; internet of vehicles; multi-objective optimization; cloud-edge computing; computation offloading; service caching
Implementation of Human-AI Interaction in Reinforcement Learning: Literature Review and Case Studies
6
Authors: Shaoping Xiao, Zhaoan Wang, Junchao Li, Caden Noeller, Jiefeng Jiang, Jun Wang. Computers, Materials & Continua, 2026, No. 2, pp. 1-62 (62 pages)
The integration of human factors into artificial intelligence (AI) systems has emerged as a critical research frontier, particularly in reinforcement learning (RL), where human-AI interaction (HAII) presents both opportunities and challenges. As RL continues to demonstrate remarkable success in model-free and partially observable environments, its real-world deployment increasingly requires effective collaboration with human operators and stakeholders. This article systematically examines HAII techniques in RL through both theoretical analysis and practical case studies. We establish a conceptual framework built upon three fundamental pillars of effective human-AI collaboration: computational trust modeling, system usability, and decision understandability. Our comprehensive review organizes HAII methods into five key categories: (1) learning from human feedback, including various shaping approaches; (2) learning from human demonstration through inverse RL and imitation learning; (3) shared autonomy architectures for dynamic control allocation; (4) human-in-the-loop querying strategies for active learning; and (5) explainable RL techniques for interpretable policy generation. Recent state-of-the-art works are critically reviewed, with particular emphasis on advances incorporating large language models in human-AI interaction research. To illustrate some concepts, we present three detailed case studies: an empirical trust model for farmers adopting AI-driven agricultural management systems, the implementation of ethical constraints in robotic motion planning through human-guided RL, and an experimental investigation of human trust dynamics using a multi-armed bandit paradigm. These applications demonstrate how HAII principles can enhance RL systems' practical utility while bridging the gap between theoretical RL and real-world human-centered applications, ultimately contributing to more deployable and socially beneficial intelligent systems.
Keywords: human-AI interaction; reinforcement learning; partially observable environments; trust model; ethical constraints
Ride-hailing Electric Vehicle Dispatching for Resilience Reserve Enhancement: An Interactive Deep Reinforcement Learning Approach
7
Authors: Ran Tao, Dongmei Zhao, Haoxiang Wang, Yinghui Wang, Xuan Xia. CSEE Journal of Power and Energy Systems, 2026, No. 1, pp. 448-465 (18 pages)
Ride-hailing electric vehicles are mobile resources with dispatch potential to improve resilience. However, they have not been well investigated because their charging and order-serving are affected or managed by the power grid dispatching center and the ride-hailing platform. Effective pre-strategies can improve the prevention ability for high-impact and low-probability (HILP) events and provide the foundation for measures in the response and restoration stages. First, this paper proposes a resilience reserve to expand the existing research on power system resilience. Second, this paper puts forward an interactive deep reinforcement learning method that considers the interests of both the power grid dispatching center and the ride-hailing platform. It improves the resilience reserve through order dispatch, orderly charging management of ride-hailing electric vehicles, and the pricing strategy of charging stations. Finally, this paper uses a practical example covering about 107.32 km² in the center of Chengdu to verify that the proposed method improves the resilience reserve of the power system without obviously damaging the interests of the ride-hailing platform.
Keywords: charging scheduling; electric vehicle; power system resilience; reinforcement learning; ride-hailing
A Regional Distribution Network Coordinated Optimization Strategy for Electric Vehicle Clusters Based on Parametric Deep Reinforcement Learning
8
Authors: Lei Su, Wanli Feng, Cao Kan, Mingjiang Wei, Jihai Wang, Pan Yu, Lingxiao Yang. Energy Engineering, 2026, No. 3, pp. 195-214 (20 pages)
To address the high costs and operational instability of distribution networks caused by the large-scale integration of distributed energy resources (DERs), such as photovoltaic (PV) systems, wind turbines (WT), and energy storage (ES) devices, and the increased grid load fluctuations and safety risks due to uncoordinated electric vehicle (EV) charging, this paper proposes a novel dual-scale hierarchical collaborative optimization strategy. This strategy decouples system-level economic dispatch from distributed EV agent control, effectively resolving the resource coordination conflicts that arise from the high computational complexity and poor scalability of existing centralized optimization, or from the reliance on local-information decision-making in fully decentralized frameworks. At the lower level, an EV charging and discharging model with a hybrid discrete-continuous action space is established and optimized using an improved Parameterized Deep Q-Network (PDQN) algorithm, which directly handles mode selection and power regulation while embedding physical constraints to ensure safety. At the upper level, microgrid (MG) operators adopt a dynamic pricing strategy optimized through Deep Reinforcement Learning (DRL) to maximize economic benefits and achieve peak-valley shaving. Simulation results show that the proposed strategy outperforms traditional methods, reducing the total operating cost of the MG by 21.6%, decreasing the peak-to-valley load difference by 33.7%, reducing the number of voltage limit violations by 88.9%, and lowering the average electricity cost for EV users by 15.2%. This method brings a win-win result for operators and users, providing a reliable and efficient scheduling solution for distribution networks with high renewable energy penetration rates.
Keywords: power system; regional distributed energy; electric vehicle; deep reinforcement learning; collaborative optimization
A Deep Reinforcement Learning-Based Partitioning Method for Power System Parallel Restoration
9
Authors: Changcheng Li, Weimeng Chang, Dahai Zhang, Jinghan He. Energy Engineering, 2026, No. 1, pp. 243-264 (22 pages)
Effective partitioning is crucial for enabling parallel restoration of power systems after blackouts. This paper proposes a novel partitioning method based on deep reinforcement learning. First, the partitioning decision process is formulated as a Markov decision process (MDP) model to maximize the modularity. Corresponding key partitioning constraints on parallel restoration are considered. Second, based on the partitioning objective and constraints, the reward function of the partitioning MDP model is set by adopting a relative deviation normalization scheme to reduce mutual interference between the reward and penalty in the reward function. The soft bonus scaling mechanism is introduced to mitigate overestimation caused by abrupt jumps in the reward. Then, the deep Q network method is applied to solve the partitioning MDP model and generate partitioning schemes. Two experience replay buffers are employed to speed up the training process of the method. Finally, case studies on the IEEE 39-bus test system demonstrate that the proposed method can generate a high-modularity partitioning result that meets all key partitioning constraints, thereby improving the parallelism and reliability of the restoration process. Moreover, simulation results demonstrate that an appropriate discount factor is crucial for ensuring both the convergence speed and the stability of the partitioning training.
Keywords: partitioning method; parallel restoration; deep reinforcement learning; experience replay buffer; partitioning modularity
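The abstract above takes modularity as the partitioning objective. As a rough illustration of what that quantity measures, the following Python sketch computes the standard Newman modularity of a partition on an undirected, unweighted graph. The toy six-node graph and the block labels are hypothetical; the paper's reward shaping, restoration constraints, and deep Q network are not reproduced here.

```python
def modularity(edges, partition):
    """Newman modularity of an undirected, unweighted graph.

    edges: list of (u, v) pairs; partition: dict mapping node -> block label.
    """
    m = len(edges)
    degree = {}
    internal = {}  # number of edges with both endpoints in the same block
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
        if partition[u] == partition[v]:
            internal[partition[u]] = internal.get(partition[u], 0) + 1
    block_degree = {}  # sum of node degrees per block
    for node, d in degree.items():
        block_degree[partition[node]] = block_degree.get(partition[node], 0) + d
    return sum(
        internal.get(b, 0) / m - (block_degree[b] / (2 * m)) ** 2
        for b in block_degree
    )


# Toy graph (hypothetical): two triangles joined by a single tie line.
edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
partition = {1: "A", 2: "A", 3: "A", 4: "B", 5: "B", 6: "B"}
q = modularity(edges, partition)  # equals 5/14 for this graph
```

Cutting the single tie line scores higher than keeping all six nodes in one block, whose modularity is zero, which is the sense in which a high-modularity partition has dense blocks and few links between them.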
Reinforcement learning for muon scattering tomography enhancement
10
Authors: Yi-Ni Wu, Yuan-Yuan Liu, Li Wang, Jian-Jie Zhang, Ning Su, Wen-Wan Ding, Xin Zhao, Zhi Zhou, Peng Zheng, Jian-Ping Cheng. Nuclear Science and Techniques, 2026, No. 5, pp. 182-198 (17 pages)
Muon scattering tomography (MST) is a powerful noninvasive imaging technique with significant applications in nuclear material detection and security screening. Traditional MST usually relies on the point of closest approach (PoCA) algorithm to reconstruct images from muon scattering data; however, PoCA often suffers from suboptimal image clarity and resolution. To overcome these challenges, we propose a novel approach that leverages reinforcement learning (RL) to enhance MST reconstruction, termed the μRL-enhanced method. By framing the MST optimization task as an RL problem, we developed an intelligent agent capable of dynamically adjusting the key PoCA parameters. The agent is trained using a multi-objective reward function that guides the optimization toward higher-quality reconstructions. Our experimental results show that the μRL-enhanced method significantly outperforms the traditional PoCA baseline across multiple benchmark metrics. Specifically, the proposed approach on average attains a 307% improvement in the intersection over union (IoU), a 79% increase in the structural similarity index measure (SSIM), and an 8.4% enhancement in the peak signal-to-noise ratio (PSNR) across four experiments. Furthermore, when benchmarked against the maximum likelihood scattering and displacement (MLSD) algorithm, the μRL-enhanced method offers modest gains in PSNR and IoU, together with a one-third increase in SSIM. These improvements demonstrate the enhanced reconstruction accuracy and structural fidelity of the μRL-enhanced method, highlighting its potential to advance MST technologies and their applications.
Keywords: muon scattering tomography; reinforcement learning; Q-learning; PoCA
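The abstract above reports gains in intersection over union (IoU), including a relative improvement above 100%. The short Python sketch below shows how IoU is computed on binary masks and how a relative improvement is derived from two scores. The 3×3 masks and the baseline value of 0.15 are made-up illustrations, not data from the paper, and the paper's exact evaluation procedure is not specified in the abstract.

```python
def iou(pred, truth):
    """Intersection over union of two equally shaped binary masks (nested lists of 0/1)."""
    inter = union = 0
    for row_p, row_t in zip(pred, truth):
        for p, t in zip(row_p, row_t):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Two empty masks are treated as a perfect match by convention (an assumption).
    return inter / union if union else 1.0


def relative_gain(baseline, improved):
    """Relative improvement of `improved` over `baseline`, as a fraction."""
    return (improved - baseline) / baseline


# Hypothetical 3x3 masks: 3 cells overlap, 5 cells are covered by either mask.
truth = [[0, 1, 1], [0, 1, 1], [0, 0, 0]]
pred = [[0, 1, 0], [0, 1, 1], [0, 1, 0]]
score = iou(pred, truth)
gain = relative_gain(0.15, score)  # a made-up baseline IoU of 0.15
```

Because the gain is measured relative to the baseline, a low baseline IoU can yield an improvement of several hundred percent while the absolute IoU stays below 1.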
DRAGON-MINE: Deep Reinforcement Adaptive Gradient Optimization Network for Mining Rare Events in Healthcare
11
Authors: Mohammed Abdullah Alsuwaiket. Computer Modeling in Engineering & Sciences, 2026, No. 3, pp. 967-996 (30 pages)
The healthcare field is fraught with challenges associated with severe class imbalance, wherein critical conditions such as sepsis, cardiac arrest, and adverse drug reactions are rare but have dire clinical consequences. This paper presents a new framework, Deep Reinforcement Adaptive Gradient Optimization Network for Mining Rare Events (DRAGON-MINE), to demonstrate how deep reinforcement learning can be used synergistically with adaptive gradient optimization to address the inherent weaknesses of current methods in the prediction of rare health events. The suggested architecture uses a dual pathway consisting of a reinforcement learning agent to dynamically reweigh samples and an adaptive gradient optimizer to follow novel learning rates. In extensive experiments on the MIMIC-IV and eICU-CRD datasets, DRAGON-MINE consistently outperforms recent state-of-the-art methods for sepsis, cardiac arrest, and adverse drug reaction prediction, achieving AUROC values of 92.3% and 91.6% for sepsis prediction on MIMIC-IV and eICU-CRD, respectively. It also outperforms Transformer-, CNN-RNN-, and Fed-Ensemble-based methods across all evaluated tasks and datasets, with particularly strong gains in precision-recall performance under severe class imbalance. With its high sensitivity (88.4%) and specificity (90.2%), DRAGON-MINE enables reliable early warning of rare clinical events in critical care settings while minimizing false alarms, supporting safer clinical decision support systems and demonstrating strong potential for scalable deployment across multi-institutional intensive care environments through federated learning.
Keywords: deep reinforcement learning; rare event prediction; class imbalance; healthcare AI; adaptive gradient optimization; sepsis detection; federated learning
A State-of-the-Art Survey of Adversarial Reinforcement Learning for IoT Intrusion Detection
12
Authors: Qasem Abu Al-Haija, Shahad Al Tamimi. Computers, Materials & Continua, 2026, No. 4, pp. 26-94 (69 pages)
Adversarial Reinforcement Learning (ARL) models for intelligent devices and Network Intrusion Detection Systems (NIDS) improve system resilience against sophisticated cyber-attacks. As a core component of ARL, Adversarial Training (AT) enables NIDS agents to discover and prevent new attack paths by exposing them to competing examples, thereby increasing detection accuracy, reducing False Positives (FPs), and enhancing network security. To develop robust decision-making capabilities for real-world network disruptions and hostile activity, NIDS agents are trained in adversarial scenarios to monitor the current state and notify management of any abnormal or malicious activity. The accuracy and timeliness of the IDS are crucial to the network's availability and reliability. This paper analyzes ARL applications in NIDS, revealing State-of-The-Art (SoTA) methodology, issues, and future research prospects. This includes Reinforcement Machine Learning (RML)-based NIDS, which enables an agent to interact with the environment to achieve a goal, and Deep Reinforcement Learning (DRL)-based NIDS, which can solve complex decision-making problems. Additionally, this survey addresses adversarial circumstances in cybersecurity and their importance for ARL and NIDS. Architectural design, RL algorithms, feature representation, and training methodologies are examined in the ARL-NIDS study. This comprehensive study evaluates ARL for intelligent NIDS research, benefiting cybersecurity researchers, practitioners, and policymakers. The report promotes cybersecurity defense research and innovation.
Keywords: reinforcement learning; network intrusion detection; adversarial training; deep learning; cybersecurity defense; intrusion detection system; machine learning
Safe Deep Reinforcement Learning for Real-time AC Optimal Power Flow: A Near-optimal Solution
13
Authors: Bin Feng, Jiayue Zhao, Gang Huang, Yijie Hu, Huating Xu, Changxin Guo, Zhe Chen. CSEE Journal of Power and Energy Systems, 2026, No. 1, pp. 99-111 (13 pages)
The real-time AC optimal power flow (OPF) problem is a key issue in making fast and accurate decisions to ensure the safety and economy of power systems. With the rapid development of renewable energy, fluctuations have grown more pronounced; a novel approach called safe deep reinforcement learning is therefore proposed in this paper. Herein, the real-time AC OPF problem is modeled as a constrained Markov decision process, and primal-dual optimization (PDO) based proximal policy optimization (PPO) is used to learn the optimal generator outputs in the primal domain and security constraints in the dual domain, which avoids manually selecting a trade-off between penalties for constraint violations and rewards for the economy. Before training, behavior cloning transfers the expert experience into the initial weights of the neural networks. Moreover, multiprocessing is utilized to accelerate training. Case studies are conducted on the IEEE 118-bus system and the modified IEEE 118-bus system. Compared with other methods, the experimental results show that the proposed method can achieve security and near-optimal economic goals by rapidly solving the real-time AC OPF problem.
Keywords: behavior cloning; deep reinforcement learning; multiprocessing training; optimal power flow; primal-dual optimization; proximal policy optimization
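The abstract above relies on primal-dual optimization so that the penalty on constraint violations is learned rather than hand-tuned. The Python sketch below shows only the generic dual side of such a scheme: a Lagrange multiplier that rises while the average violation exceeds its limit, and the penalized reward that a policy update would then see. The step size, the limit, the violation values, and the function names are illustrative assumptions; the PPO policy update and the power-flow model are omitted.

```python
def dual_update(lmbda, avg_violation, limit=0.0, step=0.1):
    """Projected dual ascent: raise the multiplier while the constraint is violated."""
    return max(0.0, lmbda + step * (avg_violation - limit))


def lagrangian_reward(economic_reward, violation, lmbda):
    """Reward seen by the primal (policy) update: economy minus priced violation."""
    return economic_reward - lmbda * violation


lmbda = 0.0
# Hypothetical average constraint violations observed over four training iterations.
for avg_violation in [0.5, 0.3, 0.0, 0.0]:
    lmbda = dual_update(lmbda, avg_violation)

shaped = lagrangian_reward(economic_reward=1.0, violation=0.5, lmbda=lmbda)
```

The multiplier stops growing once violations vanish, so the penalty weight settles at a level set by the data rather than by a manually chosen constant.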
A Novel Evolutionary Optimized Transformer-Deep Reinforcement Learning Framework for False Data Injection Detection in Industry 4.0 Smart Water Infrastructures
14
Authors: Ahmad Salehiyan, Nuria Serrano, Francisco Hernando-Gallego, Diego Martín, José Vicente Álvarez-Bravo. Computers, Materials & Continua, 2026, No. 5, pp. 1588-1624 (37 pages)
The increasing integration of cyber-physical components in Industry 4.0 water infrastructures has heightened the risk of false data injection (FDI) attacks, posing critical threats to operational integrity, resource management, and public safety. Traditional detection mechanisms often struggle to generalize across heterogeneous environments or adapt to sophisticated, stealthy threats. To address these challenges, we propose a novel evolutionary optimized transformer-based deep reinforcement learning framework (Evo-Transformer-DRL) designed for robust and adaptive FDI detection in smart water infrastructures. The proposed architecture integrates three powerful paradigms: a transformer encoder for modeling complex temporal dependencies in multivariate time series, a DRL agent for learning optimal decision policies in dynamic environments, and an evolutionary optimizer to fine-tune model hyper-parameters. This synergy enhances detection performance while maintaining adaptability across varying data distributions. Specifically, hyper-parameters of both the transformer and DRL modules are optimized using an improved grey wolf optimizer (IGWO), ensuring a balanced trade-off between detection accuracy and computational efficiency. The model is trained and evaluated on three realistic Industry 4.0 water datasets: secure water treatment (SWaT), water distribution (WADI), and battle of the attack detection algorithms (BATADAL), which capture diverse attack scenarios in smart treatment and distribution systems. Comparative analysis against state-of-the-art baselines, including Transformer, DRL, bidirectional encoder representations from transformers (BERT), convolutional neural network (CNN), long short-term memory (LSTM), and support vector machines (SVM), demonstrates that our proposed Evo-Transformer-DRL framework consistently outperforms others in key metrics such as accuracy, recall, area under the curve (AUC), and execution time. Notably, it achieves a maximum detection accuracy of 99.19%, highlighting its strong generalization capability across different testbeds. These results confirm the suitability of our hybrid framework for real-world Industry 4.0 deployment, where rapid adaptation, scalability, and reliability are paramount for securing critical infrastructure systems.
Keywords: Industry 4.0; smart water systems; false data injection detection; cyber-physical security; transformer; deep reinforcement learning; grey wolf optimizer
Computer Modeling of Pipeline Repair Reinforcement with Composite Bandages
15
Authors: Maria Tanase, Gennadiy Lvov. Computer Modeling in Engineering & Sciences, 2026, No. 2, pp. 296-315 (20 pages)
The increasing occurrence of corrosion-related damage in steel pipelines has led to the growing use of composite-based repair techniques as an efficient alternative to traditional replacement methods. Computer modeling and structural analysis were performed for the repair reinforcement of a steel pipeline with a composite bandage. A preliminary analysis of possible contact interaction schemes was implemented based on the theory of cylindrical shells, taking into account transverse shear deformations. The finite element method was used for a detailed study of the stress state of the composite bandage and the reinforced section of the pipeline. The limit state of the reinforced section was assessed based on the von Mises criterion for steel and the Tsai-Wu criterion for composites. The effectiveness of the repair was demonstrated on a pipeline whose wall thickness had decreased by 20% as a result of corrosion damage. At a nominal pressure of P = 6 MPa, the maximum normal stress in the weakened area reached 381 MPa. The installation of a composite bandage reduced this stress to 312 MPa, making the repaired section virtually as strong as the undamaged pipeline. Due to the linearity of the problem, the results obtained can be easily used to find critical internal pressure values.
Keywords: numerical analysis; pipeline repair; reinforcement; composite bandages
Energy Optimization for Autonomous Mobile Robot Path Planning Based on Deep Reinforcement Learning
16
Authors: Longfei Gao, Weidong Wang, Dieyun Ke. Computers, Materials & Continua, 2026, No. 1, pp. 984-998 (15 pages)
At present, energy consumption is one of the main bottlenecks in autonomous mobile robot development. To address the challenge of high energy consumption in path planning for autonomous mobile robots navigating unknown and complex environments, this paper proposes an Attention-Enhanced Dueling Deep Q-Network (AD-Dueling DQN), which integrates a multi-head attention mechanism and a prioritized experience replay strategy into a Dueling-DQN reinforcement learning framework. A multi-objective reward function, centered on energy efficiency, is designed to comprehensively consider path length, terrain slope, motion smoothness, and obstacle avoidance, enabling optimal low-energy trajectory generation in 3D space from the source. The incorporation of a multi-head attention mechanism allows the model to dynamically focus on energy-critical state features, such as slope gradients and obstacle density, thereby significantly improving its ability to recognize and avoid energy-intensive paths. Additionally, the prioritized experience replay mechanism accelerates learning from key decision-making experiences, suppressing inefficient exploration and guiding the policy toward low-energy solutions more rapidly. The effectiveness of the proposed path planning algorithm is validated through simulation experiments conducted in multiple off-road scenarios. Results demonstrate that AD-Dueling DQN consistently achieves the lowest average energy consumption across all tested environments. Moreover, the proposed method exhibits faster convergence and greater training stability compared to baseline algorithms, highlighting its global optimization capability under energy-aware objectives in complex terrains. This study offers an efficient and scalable intelligent control strategy for the development of energy-conscious autonomous navigation systems.
Keywords: Autonomous mobile robot deep reinforcement learning energy optimization multi-attention mechanism prioritized experience replay dueling deep Q-Network
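Two of the building blocks named in this abstract have well-known textbook forms, sketched below in plain Python. The first is the standard dueling aggregation, Q(s, a) = V(s) + A(s, a) - mean(A(s, ·)). The second is the proportional sampling rule of prioritized experience replay, P(i) = p_i^alpha / sum_k p_k^alpha. This is a generic illustration and not the authors' AD-Dueling DQN: the function names, the example numbers, and the default `alpha` are assumptions, and the attention layer and energy-centered reward are omitted.

```python
def dueling_q_values(state_value, advantages):
    """Combine a state value and per-action advantages into Q-values.

    Subtracting the mean advantage keeps V and A identifiable, as in
    the standard dueling network formulation.
    """
    if not advantages:
        raise ValueError("at least one action is required")
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + a - mean_adv for a in advantages]


def replay_probabilities(priorities, alpha=0.6):
    """Proportional prioritized-replay sampling probabilities.

    alpha = 0 gives uniform sampling; larger alpha favors transitions
    with higher priority (typically larger TD error).
    """
    if not priorities:
        raise ValueError("at least one transition is required")
    scaled = [p ** alpha for p in priorities]
    total = sum(scaled)
    return [s / total for s in scaled]


# Hypothetical numbers, for illustration only.
q_values = dueling_q_values(2.0, [1.0, 0.0, -1.0])
probs = replay_probabilities([1.0, 4.0], alpha=0.5)
print(q_values)
print(probs)
```

Because the same constant is subtracted from every action, the greedy action under the Q-values is the action with the largest advantage.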
Enhanced multi-agent deep reinforcement learning for efficient task offloading and resource allocation in vehicular networks
17
Authors: Long Xu, Jiale Tan, Hongcheng Zhuang. Digital Communications and Networks, 2026, No. 1, pp. 66-75 (10 pages)
In response to the rising demand for low-latency, computation-intensive applications in vehicular networks, this paper proposes an adaptive task offloading approach for Vehicle-to-Everything (V2X) environments. Leveraging an enhanced Multi-Agent Deep Deterministic Policy Gradient (MADDPG) algorithm with an attention mechanism, the proposed approach optimizes computation offloading and resource allocation, aiming to minimize energy consumption and service delay. In this paper, vehicles dynamically offload computing-intensive tasks to both nearby vehicles through V2V links and roadside units through V2I links. The adaptive attention mechanism enables the system to prioritize relevant state information, leading to faster convergence. Simulations conducted in a realistic urban V2X scenario demonstrate that the proposed Attention-enhanced MADDPG (AT-MADDPG) algorithm significantly improves performance, achieving notable reductions in both energy consumption and latency compared to baseline algorithms, especially in high-demand, dynamic scenarios.
Keywords: Computation offloading Vehicular networks Deep reinforcement learning Adaptive offloading Spectrum and power allocation
Robust Voltage Control for Active Distribution Networks via Safe Deep Reinforcement Learning Against State Perturbations
18
Authors: Meng Tian, Xiaoxu Li, Ziyang Zhu, Zhengcheng Dong, Li Gong, Jingang Lai. Protection and Control of Modern Power Systems, 2026, No. 1, pp. 192-207 (16 pages)
With the prevalence of renewable distributed energy resources (DERs) such as photovoltaics (PVs), modern active distribution networks (ADNs) suffer from voltage deviation and power quality issues. However, traditional voltage control methods often face a trade-off between efficiency and effectiveness, and rarely ensure robust voltage safety under typical state perturbations in practical distribution grids. In this paper, a robust model-free voltage regulation approach is proposed that simultaneously takes security and robustness into account. In this context, the voltage control problem is formulated as a constrained Markov decision process (CMDP). A safety-augmented multi-agent deep deterministic policy gradient (MADDPG) algorithm is then trained to enable real-time collaborative optimization of ADNs, aiming to maintain nodal voltages within safe operational limits while minimizing total line losses. Moreover, a robust regulation loss is introduced to ensure reliable performance under various state perturbations in practical voltage controls. The proposed regulation algorithm effectively balances efficiency, safety, and robustness, and also demonstrates potential for generalizing these characteristics to other applications. Numerical studies validate the robustness of the proposed method under varying state perturbations on the IEEE test cases and the optimal integrated control performance when compared to other benchmarks.
Keywords: Active distribution network robust voltage control state perturbation model-free safe deep reinforcement learning
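One common way to handle a CMDP of the kind described in this abstract is Lagrangian relaxation: the constraint cost is folded into the reward through a multiplier that grows while the constraint is violated. The sketch below shows that generic idea only. The abstract does not say how the safety augmentation is implemented, so every function name, the 0.95 to 1.05 per-unit voltage band, and the step size are assumptions made for illustration.

```python
def voltage_violation_cost(voltages_pu, v_min=0.95, v_max=1.05):
    """Total distance of nodal voltages outside the assumed safe band (p.u.)."""
    return sum(max(0.0, v_min - v) + max(0.0, v - v_max) for v in voltages_pu)


def lagrangian_reward(line_loss, cost, multiplier):
    """Penalized reward: minimize line loss, pay `multiplier` per unit of cost."""
    return -line_loss - multiplier * cost


def update_multiplier(multiplier, cost, cost_limit=0.0, step=0.1):
    """Dual ascent step: raise the multiplier while cost exceeds its limit."""
    return max(0.0, multiplier + step * (cost - cost_limit))


# Hypothetical snapshot of three nodes; one is high and one is low.
voltages = [1.00, 1.08, 0.93]
cost = voltage_violation_cost(voltages)
reward = lagrangian_reward(line_loss=0.2, cost=cost, multiplier=2.0)
next_multiplier = update_multiplier(2.0, cost)
print(cost, reward, next_multiplier)
```

The multiplier never drops below zero, so a policy that satisfies the constraint is judged on line losses alone.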
Research on UAV-MEC Cooperative Scheduling Algorithms Based on Multi-Agent Deep Reinforcement Learning
19
Authors: Yonghua Huo, Ying Liu, Anni Jiang, Yang Yang. Computers, Materials & Continua, 2026, No. 3, pp. 1823-1850 (28 pages)
With the advent of sixth-generation mobile communications (6G), space-air-ground integrated networks have become mainstream. This paper focuses on collaborative scheduling for mobile edge computing (MEC) under a three-tier heterogeneous architecture composed of mobile devices, unmanned aerial vehicles (UAVs), and macro base stations (BSs). This scenario typically faces fast channel fading, dynamic computational loads, and energy constraints, whereas classical queuing-theoretic or convex-optimization approaches struggle to yield robust solutions in highly dynamic settings. To address this issue, we formulate a multi-agent Markov decision process (MDP) for an air-ground-fused MEC system, unify link selection, bandwidth/power allocation, and task offloading into a continuous action space, and propose a joint scheduling strategy that is based on an improved MATD3 algorithm. The improvements include Alternating Layer Normalization (ALN) in the actor to suppress gradient variance, Residual Orthogonalization (RO) in the critic to reduce the correlation between the twin Q-value estimates, and a dynamic-temperature reward to enable adaptive trade-offs during training. On a multi-user, dual-link simulation platform, we conduct ablation and baseline comparisons. The results reveal that the proposed method has better convergence and stability. Compared with MADDPG, TD3, and DSAC, our algorithm achieves more robust performance across key metrics.
Keywords: UAV-MEC networks multi-agent deep reinforcement learning MATD3 task offloading
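The "twin Q-value estimates" mentioned in this abstract refer to the two critics that TD3-style methods keep. In standard TD3 the bootstrap target takes the smaller of the two target-critic estimates to curb overestimation. The sketch below shows only that baseline target; it does not include the paper's Alternating Layer Normalization, Residual Orthogonalization, or dynamic-temperature reward, and the function name and example numbers are hypothetical.

```python
def td3_target(reward, done, q1_next, q2_next, gamma=0.99):
    """Clipped double-Q target used by standard TD3.

    q1_next and q2_next are the two target critics' estimates for the
    next state and the (smoothed) target action. No bootstrapping is
    applied when the episode has ended.
    """
    not_done = 0.0 if done else 1.0
    return reward + gamma * not_done * min(q1_next, q2_next)


# Hypothetical transition: the critics disagree, so the lower estimate is used.
y_running = td3_target(reward=1.0, done=False, q1_next=10.0, q2_next=8.0, gamma=0.5)
y_terminal = td3_target(reward=1.0, done=True, q1_next=10.0, q2_next=8.0, gamma=0.5)
print(y_running, y_terminal)
```

The minimum is most effective when the two critics' errors are not strongly correlated, which is the property the abstract's critic-side modification is said to target.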
Robust Reinforcement Learning:Methods,Benchmarks and Challenges
20
Authors: Jinlei Gu, Mengchu Zhou, Xiwang Guo, Yebin Wang. Artificial Intelligence Science and Engineering, 2026, No. 1, pp. 20-35 (16 pages)
Reinforcement learning (RL), as an important branch of machine learning, has recently achieved extensive attention and success in many applications. Its main idea is to enable agents to continuously learn to make optimal decisions by trying to maximize a reward function for their actions and interactions with the environment. However, making high-quality decisions in complex and uncertain real-world scenarios is a challenging task. The interference and attacks in such scenarios tend to destroy the existing strategies. Maintaining RL's optimal performance in various cases and adapting to changing environments remains an important challenge. This article presents a comprehensive review of recent advancements in robust reinforcement learning (RRL), and analyzes them from the perspectives of challenges, methodologies, and applications. It systematically evaluates current progress in RRL and summarizes the commonly used benchmark platforms. Finally, several open challenges are discussed to stimulate further research and guide future developments in this area.
Keywords: robust reinforcement learning robust enhancement environment randomization adversarial training