Abstract: The paper presents a fuzzy Q-learning (FQL) and optical flow-based autonomous navigation approach. The FQL method makes decisions in an unknown environment without mapping, using motion information and a reinforcement signal fed into an evolutionary algorithm. The reinforcement signal is calculated by estimating the optical flow densities in areas of the camera image to determine whether they are "dense" or "thin", which relates to the proximity of objects. The results obtained show that the present approach improves the rate of learning compared with a method using a simple reward system and lacking the evolutionary component. The proposed system was implemented in a virtual robotics environment using the CoppeliaSim software in communication with Python.
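The density-based reinforcement signal described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the left/center/right split, the `DENSE_THRESHOLD` value, and the reward shaping are all assumptions.

```python
import numpy as np

# Hypothetical sketch: derive a reinforcement signal from optical-flow
# density per camera region; a "dense" region suggests a nearby object.
DENSE_THRESHOLD = 1.0  # assumed mean flow magnitude marking a region "dense"

def flow_reward(flow_mag: np.ndarray) -> float:
    """Split the flow-magnitude image into left/center/right thirds and
    penalize dense regions; reward free space when all regions are thin."""
    h, w = flow_mag.shape
    regions = [flow_mag[:, :w // 3],
               flow_mag[:, w // 3:2 * w // 3],
               flow_mag[:, 2 * w // 3:]]
    dense = [float(r.mean()) > DENSE_THRESHOLD for r in regions]
    # +1 when every region is "thin"; otherwise -1 per dense region.
    return 1.0 if not any(dense) else -float(sum(dense))

print(flow_reward(np.full((60, 90), 0.2)))  # -> 1.0 (all regions thin)
print(flow_reward(np.full((60, 90), 2.0)))  # -> -3.0 (all regions dense)
```

In the actual system the flow-magnitude array would come from an optical-flow estimator on consecutive camera frames; here a constant array stands in for it.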
Abstract: For the congestion problems in high-speed networks, a genetic-based fuzzy Q-learning flow controller is proposed. Because of the uncertainties and highly time-varying dynamics of high-speed networks, it is not easy to accurately obtain their complete information. In this case, Q-learning, which is independent of a mathematical model and prior knowledge, performs well. Fuzzy inference is introduced to facilitate generalization in a large state space, and genetic operators are used to obtain the consequent parts of the fuzzy rules. Simulation results show that the proposed controller can learn to take the best action to regulate source flow, achieving high throughput and a low packet loss ratio, and can effectively avoid the occurrence of congestion.
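The model-free learning core that the fuzzy and genetic components build on is the tabular Q-learning update. A minimal sketch for source-rate control follows; the queue-level states, the three rate actions, and the hyperparameter values are illustrative assumptions, not the paper's settings.

```python
import random

# Tabular Q-learning sketch for source-rate control.
# States: discretized queue levels 0..4 (assumption);
# actions: decrease / hold / increase the send rate.
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1
actions = [-1, 0, +1]
Q = {(s, a): 0.0 for s in range(5) for a in actions}

def choose(state):
    """Epsilon-greedy action selection over the Q-table."""
    if random.random() < EPSILON:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(state, a)])

def update(s, a, r, s_next):
    """Standard Q-learning temporal-difference update."""
    best_next = max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])

# One update: queue rose after increasing the rate, so reward is -1.
update(2, +1, -1.0, 3)
print(round(Q[(2, +1)], 3))  # -> -0.1
```

In the fuzzy variant, the discrete Q-table is replaced by fuzzy rules whose consequents (here, the values being learned) are tuned by the genetic operators.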
Abstract: To address the shortcomings of traditional network anomaly diagnosis algorithms in unsupervised environments, such as low accuracy in anomaly localization and anomalous-data classification, a wireless network anomaly diagnosis method based on an improved Q-learning algorithm is designed. First, wireless network data streams are collected via ADU (Asynchronous Data Unit) units and packet features are extracted. Then, a Q-learning model is built to explore the balance between state values and reward values, and the SA (Simulated Annealing) algorithm is used to accurately identify the next state from a global perspective. Finally, the joint distribution probability of the training samples is determined to improve the approximation of the output values and balance exploration against cost. Test results show that the improved Q-learning algorithm achieves a mean network anomaly localization accuracy of 99.4% and also outperforms three traditional network anomaly diagnosis methods in classification accuracy and efficiency across different types of network anomalies.
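One common way to combine Q-learning with simulated annealing, consistent with the exploration/cost balance described above, is to use the Metropolis acceptance rule when deciding whether to take a non-greedy action. The sketch below is an assumption about that combination; the temperature value and the toy Q-row are illustrative.

```python
import math
import random

def sa_select(q_row, temperature):
    """SA-style action selection over one Q-table row ({action: Q-value}).
    A randomly drawn candidate replaces the greedy action with
    probability exp(dQ / T), the Metropolis acceptance criterion."""
    greedy = max(q_row, key=q_row.get)
    candidate = random.choice(list(q_row))
    if candidate == greedy:
        return greedy
    d_q = q_row[candidate] - q_row[greedy]   # <= 0 by construction
    if random.random() < math.exp(d_q / max(temperature, 1e-9)):
        return candidate                      # exploratory move accepted
    return greedy

q = {"locate": 0.8, "reclassify": 0.5}  # hypothetical action values
# As the temperature is annealed toward zero, exploration vanishes and
# the rule degenerates to pure greedy selection.
print(sa_select(q, temperature=1e-9))  # -> 'locate'
```

At high temperature the rule accepts worse actions often (global exploration); cooling the temperature gradually shifts the policy toward exploitation.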
Fund: Supported by the National Natural Science Foundation of China (61771095, 62031007).
Abstract: The dwell scheduling problem for a multifunctional radar system is cast as a corresponding optimization problem. To solve it, the dwell scheduling process within a scheduling interval (SI) is formulated as a Markov decision process (MDP), where the state, action, and reward are specified for this dwell scheduling problem. In particular, the action is defined as scheduling a task on the left side, the right side, or in the middle of the radar idle timeline, which effectively reduces the action space and accelerates training convergence. Through the above process, a model-free reinforcement learning framework is established. Then, an adaptive dwell scheduling method based on Q-learning is proposed, in which the converged Q-value table obtained after training guides the scheduling process. Simulation results demonstrate that, compared with existing dwell scheduling algorithms, the proposed one achieves better scheduling performance when the urgency criterion, the importance criterion, and the desired execution time criterion are considered comprehensively. The average running time shows that the proposed algorithm offers real-time performance.
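The three-way action space (left / right / middle of an idle interval) can be sketched as a placement function. The interval representation and the function below are illustrative assumptions, not the paper's formulation.

```python
def place_task(idle_start, idle_end, dwell, action):
    """Place a task of length `dwell` inside the idle interval
    [idle_start, idle_end] according to one of the three MDP actions.
    Returns the (start, end) slot, or None when the task cannot fit."""
    if idle_end - idle_start < dwell:
        return None
    if action == "left":
        start = idle_start
    elif action == "right":
        start = idle_end - dwell
    else:  # "middle": center the task in the idle interval
        start = idle_start + (idle_end - idle_start - dwell) / 2
    return (start, start + dwell)

print(place_task(10.0, 20.0, 4.0, "middle"))  # -> (13.0, 17.0)
print(place_task(10.0, 20.0, 4.0, "right"))   # -> (16.0, 20.0)
```

Restricting actions to these three anchor positions is what keeps the action space small: the agent never chooses an arbitrary start time, only one of three placements per idle interval.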
Abstract: With the development of economic globalization, distributed manufacturing is becoming more and more prevalent. Recently, the integrated scheduling of distributed production and assembly has attracted much attention. This research studies a distributed flexible job shop scheduling problem with assembly operations. First, a mixed integer programming model is formulated to minimize the maximum completion time. Second, a Q-learning-assisted coevolutionary algorithm is presented to solve the model: (1) multiple populations are developed to seek the required decisions simultaneously; (2) an encoding and decoding method based on problem features is applied to represent individuals; (3) a hybrid approach of heuristic rules and random methods is employed to acquire a high-quality population; (4) three evolutionary strategies with crossover and mutation methods are adopted to enhance exploration capabilities; (5) three neighborhood structures based on problem features are constructed, and a Q-learning-based iterative local search method is devised to improve exploitation abilities. The Q-learning approach is applied to intelligently select the better neighborhood structures. Finally, a group of instances is constructed for comparison experiments. The effectiveness of the Q-learning approach is verified by comparing the developed algorithm with a variant without the Q-learning method. Three renowned meta-heuristic algorithms are also compared against the developed algorithm. The comparison results demonstrate that the designed method performs better on the formulated problem.
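The idea of using Q-learning to pick among the three neighborhood structures during local search can be sketched as a small bandit-style learner. The neighborhood names, the single-state formulation, and the makespan-improvement reward below are assumptions for illustration only.

```python
import random

# Q-learning sketch for neighborhood-structure selection in local search.
# Each neighborhood accumulates a value estimate from the makespan
# improvement it produced; selection is epsilon-greedy over those values.
ALPHA, EPSILON = 0.2, 0.1
neighborhoods = ["swap_ops", "move_job", "change_factory"]  # hypothetical
q_values = {n: 0.0 for n in neighborhoods}

def pick_neighborhood():
    """Epsilon-greedy choice of which neighborhood to explore next."""
    if random.random() < EPSILON:
        return random.choice(neighborhoods)
    return max(q_values, key=q_values.get)

def reward_update(name, makespan_before, makespan_after):
    """Reward is the makespan reduction achieved by the chosen move."""
    r = makespan_before - makespan_after  # positive if the search improved
    q_values[name] += ALPHA * (r - q_values[name])

# A swap move that cut the makespan from 100 to 96 earns reward 4.
reward_update("swap_ops", 100.0, 96.0)
print(round(q_values["swap_ops"], 3))  # -> 0.8
```

Over many local-search iterations, neighborhoods that keep improving the incumbent schedule are selected more often, which is the intelligent-selection behavior the abstract attributes to the Q-learning component.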