Abstract: Test case prioritization and ranking play a crucial role in software testing by improving fault detection efficiency and ensuring software reliability. While prioritization selects the most relevant test cases for optimal coverage, ranking further refines their execution order to detect critical faults earlier. This study investigates machine learning techniques to enhance both prioritization and ranking, contributing to more effective and efficient testing processes. We first employ advanced feature engineering alongside ensemble models, including Gradient Boosting, Support Vector Machine, Random Forest, and Naive Bayes classifiers, to optimize test case prioritization, achieving an accuracy score of 0.98847 and significantly improving the Average Percentage of Fault Detection (APFD). Subsequently, we introduce a deep Q-learning framework combined with a Genetic Algorithm (GA) to refine test case ranking within priority levels. This approach achieves a rank accuracy of 0.9172, demonstrating robust performance despite the increasing computational demands of specialized variation operators. Our findings highlight the effectiveness of stacked ensemble learning and reinforcement learning in optimizing test case prioritization and ranking. This integrated approach improves testing efficiency, reduces late-stage defects, and enhances overall software stability. The study provides valuable information for AI-driven testing frameworks, paving the way for more intelligent and adaptive software quality assurance methodologies.
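To make the APFD metric mentioned in this abstract concrete, the following is a minimal sketch of the commonly used formula APFD = 1 - (TF_1 + ... + TF_m) / (n * m) + 1 / (2n), where n is the number of test cases, m the number of faults, and TF_i the position of the first test that reveals fault i. The function name `apfd`, the `detects` mapping, and the toy data are illustrative assumptions and are not taken from the study.

```python
def apfd(order, detects):
    """Compute APFD for one test execution order.

    order   -- list of test ids in execution order (hypothetical input format)
    detects -- dict mapping test id -> iterable of fault ids that test reveals
    """
    n = len(order)
    first_position = {}
    # Record the 1-based position of the first test that reveals each fault.
    for position, test in enumerate(order, start=1):
        for fault in detects.get(test, ()):
            first_position.setdefault(fault, position)
    m = len(first_position)
    if n == 0 or m == 0:
        raise ValueError("need at least one test and one detected fault")
    return 1 - sum(first_position.values()) / (n * m) + 1 / (2 * n)


# Toy data (assumed): t3 reveals both faults, t2 reveals one, t1 and t4 none.
detects = {"t1": [], "t2": ["f1"], "t3": ["f1", "f2"], "t4": []}
good_order = ["t3", "t2", "t1", "t4"]
bad_order = ["t1", "t4", "t2", "t3"]
print(apfd(good_order, detects))  # 0.875
print(apfd(bad_order, detects))   # 0.25
```

An order that runs the fault-revealing tests first scores closer to 1, which is the sense in which a better prioritization improves APFD.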
Funding: Supported by the Princess Nourah bint Abdulrahman University Researchers Supporting Project, number PNURSP2025R757, Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.
Abstract: The integration of High-Altitude Platform Stations (HAPS) with Reconfigurable Intelligent Surfaces (RIS) represents a critical advancement for next-generation wireless networks, offering unprecedented opportunities for ubiquitous connectivity. However, existing research reveals significant gaps in dynamic resource allocation, joint optimization, and equitable service provisioning under varying channel conditions, limiting practical deployment of these technologies. This paper addresses these challenges by proposing a novel Fairness-Aware Deep Q-Learning (FAIR-DQL) framework for joint resource management and phase configuration in HAPS-RIS systems. Our methodology employs a comprehensive three-tier algorithmic architecture integrating adaptive power control, priority-based user scheduling, and dynamic learning mechanisms. The FAIR-DQL approach utilizes advanced reinforcement learning with experience replay and fairness-aware reward functions to balance competing objectives while adapting to dynamic environments. Key findings demonstrate substantial improvements: 9.15 dB SINR gain, 12.5 bps/Hz capacity, 78% power efficiency, and 0.82 fairness index. The framework achieves rapid 40-episode convergence with consistent delay performance. These contributions establish new benchmarks for fairness-aware resource allocation in aerial communications, enabling practical HAPS-RIS deployments in rural connectivity, emergency communications, and urban networks.
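As an illustration of what a fairness-aware reward could look like, the sketch below combines the mean user rate with Jain's fairness index. The abstract does not say which fairness index or reward shape FAIR-DQL uses, so Jain's index, the additive form, and the `fairness_weight` parameter are all assumptions made here for illustration.

```python
def jain_index(rates):
    """Jain's fairness index: 1.0 for equal rates, 1/len(rates) if one user gets everything."""
    total = sum(rates)
    squares = sum(r * r for r in rates)
    if squares == 0:
        return 0.0  # no user is served at all
    return total * total / (len(rates) * squares)


def fairness_aware_reward(rates, fairness_weight=0.5):
    """Hypothetical reward: mean rate plus a weighted fairness bonus."""
    mean_rate = sum(rates) / len(rates)
    return mean_rate + fairness_weight * jain_index(rates)


# Two allocations with the same mean rate; the even one earns the higher reward.
print(fairness_aware_reward([2.0, 2.0]))  # 2.5
print(fairness_aware_reward([4.0, 0.0]))  # 2.25
```

With a reward of this kind, an agent is no longer indifferent between allocations that deliver the same total throughput, which is the balancing act the abstract describes.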
Funding: Supported by the National Natural Science Foundation of China (61751210, 61572441).
Abstract: Path planning and obstacle avoidance are two challenging problems in the study of intelligent robots. In this paper, we develop a new method to alleviate these problems based on deep Q-learning with experience replay and heuristic knowledge. In this method, a neural network is used to resolve the "curse of dimensionality" issue of the Q-table in reinforcement learning. When a robot walks in an unknown environment, it collects experience data that is used to train the neural network; this process is called experience replay. Heuristic knowledge helps the robot avoid blind exploration and provides more effective data for training the neural network. The simulation results show that, in comparison with existing methods, our method converges to an optimal action strategy in less time and explores a path in an unknown environment with fewer steps and a larger average reward.
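The experience replay idea described in this abstract can be sketched as a fixed-size buffer of transitions from which training batches are drawn at random. This is a generic minimal version; the class name `ReplayBuffer`, the transition layout, and the toy grid states are assumptions, not the paper's implementation, and the heuristic-knowledge component is not modelled.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of (state, action, reward, next_state, done) tuples."""

    def __init__(self, capacity):
        # A deque with maxlen silently discards the oldest transition when full.
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement breaks the temporal correlation
        # between consecutive steps of the robot's walk.
        k = min(batch_size, len(self.buffer))
        return random.sample(list(self.buffer), k)

    def __len__(self):
        return len(self.buffer)


# Toy usage: states are grid cells, actions are integer move ids (assumed).
buffer = ReplayBuffer(capacity=3)
for step in range(5):
    buffer.push((step, 0), 1, -1.0, (step + 1, 0), False)
batch = buffer.sample(2)
```

In a full training loop, each sampled batch would be fed to the neural network that approximates the Q-values.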
Funding: The authors would like to thank the Deanship of Scientific Research at Shaqra University for supporting this work under Project No. g01/n04.
Abstract: Deep Reinforcement Learning (DRL) is a class of Machine Learning (ML) that combines Deep Learning with Reinforcement Learning and provides a framework by which a system can learn from its previous actions in an environment to choose its future actions efficiently. DRL has been used in many application fields, including games, robots, and networks, to create autonomous systems that improve themselves with experience. It is well acknowledged that DRL is well suited to solving optimization problems in distributed systems in general and in network routing in particular. Therefore, a novel query routing approach called Deep Reinforcement Learning based Route Selection (DRLRS) is proposed for unstructured P2P networks based on a Deep Q-Learning algorithm. The main objective of this approach is to achieve better retrieval effectiveness at a reduced search cost, with fewer connected peers, fewer exchanged messages, and less time. The simulation results show a significant improvement in resource searching in comparison with k-Random Walker and Directed BFS. Here, retrieval effectiveness, search cost in terms of connected peers, and average overhead are 1.28, 106, and 149, respectively.
Funding: This work was supported by the Fundamental Research Funds for the Central Universities of China under grant no. PA2019GDQT0012, by the National Natural Science Foundation of China (Grant No. 61971176), and by the Applied Basic Research Program of Wuhan City, China, under grant 2017010201010117.
Abstract: To support dramatically increased traffic loads, communication networks are becoming ultra-dense. Traditional cell association (CA) schemes are time-consuming, forcing researchers to seek fast schemes. This paper proposes a deep Q-learning based scheme, whose main idea is to train a deep neural network (DNN) to calculate the Q values of all the state-action pairs; the cell holding the maximum Q value is then associated. In the training stage, the intelligent agent continuously generates samples through the trial-and-error method to train the DNN until convergence. In the application stage, the state vectors of all the users are input to the trained DNN to quickly obtain a satisfactory CA result for a scenario with the same BS locations and user distribution. Simulations demonstrate that the proposed scheme provides satisfactory CA results in a computational time several orders of magnitude shorter than traditional schemes. Meanwhile, performance metrics such as capacity and fairness can be guaranteed.
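The application stage described above, in which each user is associated with the cell holding the maximum Q value, reduces to an argmax over the network's outputs. The sketch below uses a stand-in scoring function in place of a trained DNN; `toy_q_function`, the one-dimensional user positions, and the base-station coordinates are hypothetical placeholders.

```python
def associate_cells(user_states, q_function):
    """Return, for every user state, the index of the cell with the largest Q value."""
    assignment = []
    for state in user_states:
        q_values = q_function(state)  # one Q value per candidate cell
        best_cell = max(range(len(q_values)), key=lambda cell: q_values[cell])
        assignment.append(best_cell)
    return assignment


# Stand-in for the trained DNN (assumption): score each cell by negative distance.
BS_POSITIONS = [0.0, 10.0, 20.0]


def toy_q_function(user_position):
    return [-abs(user_position - bs) for bs in BS_POSITIONS]


assignment = associate_cells([1.0, 12.0, 19.0], toy_q_function)
print(assignment)  # [0, 1, 2]
```

Because inference is a single forward pass plus an argmax per user, this stage is cheap compared with iterative association schemes, which is consistent with the speed-up the abstract reports.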
Abstract: Beamforming is significant for millimeter wave multi-user massive multi-input multi-output systems. Meanwhile, the overhead cost of channel state information and beam training is considerable, especially in dynamic environments. To reduce the overhead cost, we propose a multi-user beam tracking algorithm using a distributed deep Q-learning method. With online learning of users' moving trajectories, the proposed algorithm learns to scan a beam subspace to maximize the average effective sum rate. Considering practical implementation, we model the continuous beam tracking problem as a non-Markov decision process and thus develop a simplified training scheme of deep Q-learning to reduce the training complexity. Furthermore, we propose a scalable state-action-reward design for scenarios with different numbers of users and antennas. Simulation results verify the effectiveness of the designed method.
Funding: Supported by the Researcher Supporting Project number (RSPD2024R582), King Saud University, Riyadh, Saudi Arabia.
Abstract: The controller is a main component in the Software-Defined Networking (SDN) framework, which plays a significant role in enabling programmability and orchestration for 5G and next-generation networks. In SDN, frequent communication occurs between network switches and the controller, which manages and directs traffic flows. If the controller is not strategically placed within the network, this communication can experience increased delays, negatively affecting network performance. Specifically, an improperly placed controller can lead to higher end-to-end (E2E) delay, as switches must traverse more hops or encounter greater propagation delays when communicating with the controller. This paper introduces a novel approach using Deep Q-Learning (DQL) to dynamically place controllers in Software-Defined Internet of Things (SD-IoT) environments, with the goal of minimizing E2E delay between switches and controllers. E2E delay, a crucial metric for network performance, is influenced by two key factors: hop count, which measures the number of network nodes data must traverse, and propagation delay, which accounts for the physical distance between nodes. Our approach models the controller placement problem as a Markov Decision Process (MDP). In this model, the network configuration at any given time is represented as a "state," while "actions" correspond to potential decisions regarding the placement of controllers or the reassignment of switches to controllers. Using a Deep Q-Network (DQN) to approximate the Q-function, the system learns the optimal controller placement by maximizing the cumulative reward, which is defined as the negative of the E2E delay. Essentially, the lower the delay, the higher the reward the system receives, enabling it to continuously improve its controller placement strategy. The experimental results show that our DQL-based method significantly reduces E2E delay when compared to traditional benchmark placement strategies. By dynamically learning from the network's real-time conditions, the proposed method ensures that controller placement remains efficient and responsive, reducing communication delays and enhancing overall network performance.
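The reward definition in this abstract (the negative of the E2E delay, which depends on hop count and propagation delay) can be illustrated with a small sketch. The additive delay model, the per-hop cost of 0.5 ms, and the propagation speed of 200 km/ms are assumptions chosen for illustration; the abstract does not give the exact delay formula.

```python
def e2e_delay_ms(hops, distance_km, per_hop_ms=0.5, speed_km_per_ms=200.0):
    """Hypothetical E2E delay: per-hop forwarding cost plus propagation delay."""
    return hops * per_hop_ms + distance_km / speed_km_per_ms


def placement_reward(switch_links):
    """Reward for one placement: negative mean E2E delay over all switches.

    switch_links -- list of (hop_count, distance_km) pairs, one per switch,
                    measured to the controller that switch is assigned to.
    """
    delays = [e2e_delay_ms(hops, dist) for hops, dist in switch_links]
    return -sum(delays) / len(delays)


# Two candidate placements for the same two switches (toy numbers).
far_placement = [(4, 400.0), (6, 800.0)]
near_placement = [(2, 400.0), (4, 400.0)]
print(placement_reward(far_placement))   # -5.5
print(placement_reward(near_placement))  # -3.5
```

The placement with fewer hops and shorter distances receives the higher (less negative) reward, so maximizing cumulative reward pushes the agent toward low-delay placements.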
Funding: This work was supported by the National Natural Science Foundation of China (61571059 and 61871058).
Abstract: To reduce the transmission latency and mitigate the backhaul burden of centralized cloud-based network services, mobile edge computing (MEC) has recently been drawing increasing attention from both industry and academia. This paper focuses on the computation offloading problem of mobile users in wireless cellular networks with mobile edge computing, with the aim of optimizing the computation offloading decision-making policy. Since wireless network states and computing requests have stochastic properties and the environment's dynamics are unknown, we use the model-free reinforcement learning (RL) framework to formulate and tackle the computation offloading problem. Each mobile user learns through interactions with the environment and through an estimate of its performance in the form of a value function, and then chooses the overhead-aware optimal computation offloading action (local computing or edge computing) based on its state. The state spaces in our work are high-dimensional, and the value function is impractical to estimate directly. Consequently, we use a deep reinforcement learning algorithm, which combines the RL method Q-learning with a deep neural network (DNN) to approximate the value functions for complicated control applications; the optimal policy is obtained when the value function converges. Simulation results show the effectiveness of the proposed method in comparison with baseline methods in terms of the total overhead of all mobile users.
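For reference, the Q-learning rule that the deep variant approximates is Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a)). The sketch below applies it in tabular form to the two offloading actions named in the abstract, with the reward taken as the negative overhead. The tabular simplification, the state labels, and the values of `alpha` and `gamma` are assumptions; the paper itself replaces the table with a DNN.

```python
from collections import defaultdict

ACTIONS = ("local", "edge")  # the two offloading choices named in the abstract


def q_update(q_table, state, action, overhead, next_state, alpha=0.1, gamma=0.9):
    """Apply one Q-learning step and return the updated Q(state, action)."""
    reward = -overhead  # lower overhead means higher reward (assumed reward shape)
    best_next = max(q_table[(next_state, a)] for a in ACTIONS)
    td_target = reward + gamma * best_next
    q_table[(state, action)] += alpha * (td_target - q_table[(state, action)])
    return q_table[(state, action)]


def greedy_action(q_table, state):
    """Pick the action with the largest current Q value in this state."""
    return max(ACTIONS, key=lambda a: q_table[(state, a)])


# Toy run: in state "busy_channel", local computing costs less than offloading.
q_table = defaultdict(float)
q_update(q_table, "busy_channel", "edge", overhead=5.0, next_state="idle")
q_update(q_table, "busy_channel", "local", overhead=2.0, next_state="idle")
print(greedy_action(q_table, "busy_channel"))  # local
```

When the state space is high-dimensional, the table lookup `q_table[(state, a)]` is what the DNN replaces with a learned function of the state.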
Funding: Supported by the National Natural Science Foundation of China (Grant No. 52374156).
Abstract: To address low learning efficiency and inadequate path safety in spraying robot navigation within complex obstacle-rich environments, where dense, dynamic, and unpredictable obstacles challenge conventional methods, this paper proposes a hybrid algorithm integrating Q-learning and an improved A*-Artificial Potential Field (A-APF). Centered on the Q-learning framework, the algorithm leverages safety-oriented guidance generated by A-APF and employs a dynamic coordination mechanism that adaptively balances exploration and exploitation. The proposed system comprises four core modules: (1) an environment modeling module that constructs grid-based obstacle maps; (2) an A-APF module that combines heuristic search from the A* algorithm with repulsive force strategies from APF to generate guidance; (3) a Q-learning module that learns optimal state-action values (Q-values) through interaction between the spraying robot and its environment and a reward function emphasizing path optimality and safety; and (4) a dynamic optimization module that ensures adaptive cooperation between Q-learning and A-APF through exploration rate control and environment-aware constraints. Simulation results demonstrate that the proposed method significantly enhances path safety in complex underground mining environments. Quantitative results indicate that, compared to the traditional Q-learning algorithm, the proposed method shortens training time by 42.95% and reduces training failures from 78 to just 3. Compared to the static fusion algorithm, it further reduces both training time (by 10.78%) and training failures (by 50%), thereby improving overall training efficiency.
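One way the cooperation between Q-learning and A-APF guidance could work is to let the exploration branch of an epsilon-greedy policy follow the planner's suggestion instead of a random move, while the exploration rate decays over training. This is only a sketch of that idea: the abstract does not specify the coordination mechanism, and `select_action`, `decayed_epsilon`, and their parameters are hypothetical.

```python
import random


def select_action(q_row, guidance_action, epsilon, rng=random):
    """Guided epsilon-greedy choice over integer action ids.

    q_row           -- Q values of the current state, one per action
    guidance_action -- action suggested by the planner (A-APF in the abstract)
    epsilon         -- probability of following the guidance instead of the Q-table
    """
    if rng.random() < epsilon:
        return guidance_action  # explore, but along a safety-oriented suggestion
    return max(range(len(q_row)), key=lambda a: q_row[a])  # exploit learned values


def decayed_epsilon(episode, start=1.0, end=0.05, decay=0.99):
    """Exponentially decaying exploration rate with a lower bound (assumed schedule)."""
    return max(end, start * decay ** episode)


q_row = [0.1, 0.9, 0.2]
print(select_action(q_row, guidance_action=2, epsilon=0.0))  # 1
print(select_action(q_row, guidance_action=2, epsilon=1.0))  # 2
```

Early in training the agent mostly follows the planner, which limits unsafe exploratory moves; as epsilon decays, control shifts to the learned Q-values.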
Funding: Supported by the National Natural Science Foundation of China (Nos. U1813215 and 62273203).
Abstract: This paper focuses on the problem of active object detection (AOD). AOD is important for service robots to complete tasks in the family environment, and leads robots to approach the target object by taking appropriate moving actions. Most current AOD methods are based on reinforcement learning with low training efficiency and testing accuracy. Therefore, an AOD model based on a deep Q-learning network (DQN) with a novel training algorithm is proposed in this paper. The DQN model is designed to fit the Q-values of various actions, and includes state space, feature extraction, and a multilayer perceptron. In contrast to existing research, a novel memory-based training algorithm is designed for the proposed DQN model to improve training efficiency and testing accuracy. In addition, a method of generating the end state is presented to judge when to stop the AOD task during the training process. Extensive comparison experiments and ablation studies are performed on an AOD dataset, showing that the presented method performs better than the comparable methods and that the proposed training algorithm is more effective than the raw training algorithm.