Intelligent penetration testing is of great significance for improving the security of information systems, and its critical issue is the planning of penetration test paths. Because attackers can rarely obtain complete network information in realistic network scenarios, Reinforcement Learning (RL) is a promising solution for discovering the optimal penetration path under incomplete information about the target network. Existing RL-based methods are challenged by the sizeable discrete action space, which makes convergence difficult. Moreover, most methods still rely on expert knowledge. To address these issues, this paper proposes a penetration path planning method based on reinforcement learning with episodic memory. First, the penetration testing problem is formally described in terms of reinforcement learning. To speed up training without specific prior knowledge, the proposed algorithm introduces, for the first time, episodic memory to store advantageous strategies that the agent has experienced. Furthermore, the method offers an exploration strategy based on episodic memory to guide the agent's learning. The design makes full use of historical experience to reduce blind exploration and improve planning efficiency. Finally, comparison experiments are carried out against existing RL-based methods. The results show that the proposed method converges better and reduces running time by more than 20%.
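The abstract does not give the algorithm's details, so the following is only a minimal sketch of how an episodic memory could steer exploration, under stated assumptions: the memory keeps the best discounted return seen for each (state, action) pair, and the exploration step replays the best remembered action instead of sampling uniformly. The names `EpisodicMemory`, `choose_action`, and the string states and actions are hypothetical and are not taken from the paper.

```python
import random
from collections import defaultdict


class EpisodicMemory:
    """Hypothetical memory: best discounted return seen per (state, action)."""

    def __init__(self):
        self.best_return = defaultdict(dict)  # state -> {action: best return}

    def update(self, trajectory, gamma=0.99):
        """Record one finished episode given as [(state, action, reward), ...]."""
        g = 0.0
        # Walk backwards so g is the discounted return from each step onward.
        for state, action, reward in reversed(trajectory):
            g = reward + gamma * g
            prev = self.best_return[state].get(action, float("-inf"))
            self.best_return[state][action] = max(prev, g)

    def best_action(self, state):
        """Return the remembered action with the highest return, or None."""
        actions = self.best_return.get(state)
        if not actions:
            return None
        return max(actions, key=actions.get)


def choose_action(state, q_values, actions, memory, epsilon, rng=random):
    """Epsilon-greedy selection whose exploration branch consults the memory."""
    if rng.random() < epsilon:
        remembered = memory.best_action(state)
        # Fall back to blind exploration only when the state is unseen.
        return remembered if remembered is not None else rng.choice(actions)
    # Exploit: greedy with respect to the current Q estimates (default 0.0).
    return max(actions, key=lambda a: q_values.get((state, a), 0.0))
```

In this sketch the memory only replaces the uniformly random choice; the learner that produces `q_values` is left unspecified and could be any tabular or deep RL method.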
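Path planning algorithms for intelligent agents aim to plan an agent's trajectory so that it travels safely and efficiently from a start point to a target point without hitting obstacles. These algorithms are now widely used in many important cyber-physical systems, so it is essential to test an algorithm before deployment to assess whether its performance meets requirements. However, the threat obstacles in the task space, which form the input to a path planning algorithm, are distributed in complex and diverse ways. In addition, a path planning algorithm usually incurs a high running cost when planning a path for each test case. To improve the efficiency of testing path planning algorithms, this work introduces the idea of dynamic random testing into the testing of path planning algorithms and proposes a dynamic random testing approach for intelligent agent path planning algorithms (DRT-PP). Specifically, DRT-PP partitions the path planning task space into discrete subregions and assigns each subregion a threat generation probability, thereby constructing a test profile that serves as the testing strategy during test case generation. DRT-PP also dynamically adjusts the test profile during testing so that it is gradually optimized, which improves testing efficiency. Experimental results show that, compared with random testing and adaptive random testing, DRT-PP generates more test cases that expose performance defects in the algorithm under test while preserving test case diversity.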
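The profile-based generation described above can be sketched roughly as follows. This is an illustrative assumption, not the DRT-PP update rule itself, which the abstract does not state: the task space is a grid of cells, each cell holds a threat generation probability, and the probabilities of the cells that held threats are nudged up when a test case exposes a defect and down otherwise. The function names, the step size `delta`, and the clamping bounds are all hypothetical.

```python
import random


def make_profile(rows, cols, p0=0.2):
    """Test profile: one threat generation probability per grid cell."""
    return {(r, c): p0 for r in range(rows) for c in range(cols)}


def generate_test_case(profile, rng):
    """Sample a test case: the set of cells in which a threat is placed."""
    return {cell for cell, p in profile.items() if rng.random() < p}


def adjust_profile(profile, threats, revealed_defect,
                   delta=0.05, lo=0.05, hi=0.95):
    """Raise or lower the probabilities of the cells used in the last test.

    Assumed rule: reward cells whose threats exposed a defect, penalize
    them otherwise, and clamp so that no cell is ever ruled in or out.
    """
    step = delta if revealed_defect else -delta
    for cell in threats:
        profile[cell] = min(hi, max(lo, profile[cell] + step))


# Usage: one testing iteration with a stand-in oracle (always "no defect").
rng = random.Random(0)
profile = make_profile(4, 4)
threats = generate_test_case(profile, rng)
adjust_profile(profile, threats, revealed_defect=False)
```

Clamping keeps every subregion reachable, which is one simple way to preserve test case diversity while the profile drifts toward defect-revealing regions.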