Journal literature: 20 articles found
Dynamic Integration of Q-Learning and A-APF for Efficient Path Planning in Complex Underground Mining Environments
1
Authors: Chang Su, Liangliang Zhao, Dongbing Xiang. Computers, Materials & Continua, 2026, No. 2, pp. 1017-1040 (24 pages).
To address low learning efficiency and inadequate path safety in spraying robot navigation within complex, obstacle-rich environments, where dense, dynamic, and unpredictable obstacles challenge conventional methods, this paper proposes a hybrid algorithm integrating Q-learning and an improved A*-Artificial Potential Field (A-APF) method. Centered on the Q-learning framework, the algorithm leverages safety-oriented guidance generated by A-APF and employs a dynamic coordination mechanism that adaptively balances exploration and exploitation. The proposed system comprises four core modules: (1) an environment modeling module that constructs grid-based obstacle maps; (2) an A-APF module that combines heuristic search from the A* algorithm with repulsive force strategies from APF to generate guidance; (3) a Q-learning module that learns optimal state-action values (Q-values) through interaction between the spraying robot and its environment, with a reward function emphasizing path optimality and safety; and (4) a dynamic optimization module that ensures adaptive cooperation between Q-learning and A-APF through exploration rate control and environment-aware constraints. Simulation results demonstrate that the proposed method significantly enhances path safety in complex underground mining environments. Quantitative results indicate that, compared to the traditional Q-learning algorithm, the proposed method shortens training time by 42.95% and reduces training failures from 78 to just 3. Compared to the static fusion algorithm, it further reduces both training time (by 10.78%) and training failures (by 50%), thereby improving overall training efficiency.
Keywords: Q-learning; A* algorithm; artificial potential field; path planning; hybrid algorithm
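The Q-learning module and exploration rate control mentioned in the abstract above can be illustrated with a generic tabular sketch. This is not the authors' implementation: the function names, the list-of-lists Q-table, and the default `alpha` and `gamma` values are assumptions chosen for illustration, and the A-APF guidance is omitted entirely.

```python
import random

def epsilon_greedy(q_row, epsilon, rng):
    """Return a random action with probability epsilon, else the greedy one.

    q_row is the list of Q-values for the current state (hypothetical layout).
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return max(range(len(q_row)), key=lambda a: q_row[a])

def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step: move Q(s, a) toward the TD target."""
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])
    return q[state][action]
```

A dynamic coordination scheme of the kind the abstract describes would presumably vary `epsilon` during training; how the paper does so is not stated in the abstract.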
Energy learning hyper-heuristic algorithm for cooperative task assignment of heterogeneous UAVs under complex constraints
2
Authors: Mengshun Yuan, Mou Chen, Tongle Zhou, Zengliang Han. Defence Technology (防务技术), 2025, No. 12, pp. 1-14 (14 pages).
Cooperative task assignment is one of the key research focuses in the field of unmanned aerial vehicles (UAVs). In this paper, an energy learning hyper-heuristic (EL-HH) algorithm is proposed to address the cooperative task assignment problem of heterogeneous UAVs under complex constraints. First, a mathematical model is designed to define the scenario, complex constraints, and objective function of the problem. Then, the scheme encoding, the EL-HH strategy, multiple optimization operators, and the task sequence and time adjustment strategies are designed in the EL-HH algorithm. The scheme encoding is designed with three layers: task sequence, UAV sequence, and waiting time. The EL-HH strategy applies an energy learning method to adaptively adjust the energies of operators, thereby facilitating the selection and application of operators. Multiple optimization operators can update schemes in different ways, enabling the algorithm to fully explore the solution space. Afterward, the task sequence and time adjustment strategies are applied to adjust the task order and insert waiting time. Through the iterative optimization process, a satisfactory assignment scheme is ultimately produced. Finally, simulation and experiment verify the effectiveness of the proposed algorithm.
Keywords: Unmanned aerial vehicle; Cooperative task assignment; Energy learning hyper-heuristic algorithm
A Q-Learning-Assisted Co-Evolutionary Algorithm for Distributed Assembly Flexible Job Shop Scheduling Problems
3
Authors: Song Gao, Shixin Liu. Computers, Materials & Continua, 2025, No. 6, pp. 5623-5641 (19 pages).
With the development of economic globalization, distributed manufacturing is becoming more and more prevalent. Recently, integrated scheduling of distributed production and assembly has attracted much attention. This research studies a distributed flexible job shop scheduling problem with assembly operations. Firstly, a mixed integer programming model is formulated to minimize the maximum completion time. Secondly, a Q-learning-assisted co-evolutionary algorithm is presented to solve the model: (1) multiple populations are developed to seek the required decisions simultaneously; (2) an encoding and decoding method based on problem features is applied to represent individuals; (3) a hybrid approach of heuristic rules and random methods is employed to acquire a high-quality population; (4) three evolutionary strategies with crossover and mutation methods are adopted to enhance exploration capabilities; (5) three neighborhood structures based on problem features are constructed, and a Q-learning-based iterative local search method is devised to improve exploitation abilities. The Q-learning approach is applied to intelligently select better neighborhood structures. Finally, a group of instances is constructed to perform comparison experiments. The effectiveness of the Q-learning approach is verified by comparing the developed algorithm with its variant without the Q-learning method. Three renowned meta-heuristic algorithms are used in comparison with the developed algorithm. The comparison results demonstrate that the designed method exhibits better performance in coping with the formulated problem.
Keywords: Distributed manufacturing; flexible job shop scheduling problem; assembly operation; co-evolutionary algorithm; Q-learning method
Design for a Novel Framework of Hyper-Heuristic Algorithm (Cited by: 1)
4
Authors: 郭为安, 汪镭, 陈明, 刘晋飞, 吴启迪. Journal of Donghua University (English Edition), 2014, No. 2, pp. 109-112 (4 pages). Indexed in: EI, CAS.
A novel framework of hyper-heuristic algorithm was proposed to improve the adaption of evolutionary algorithms (EAs) in optimization. The algorithm in use could be changed during the evolutionary process according to the performance of the candidate algorithms. In addition, a large number of elite individuals were employed, and these elite individuals helped the algorithm achieve better performance, whereas such a number of elite individuals would stall global convergence in a conventional single algorithm. The time complexity was analyzed to demonstrate that the novel framework did not increase the time complexity. The simulation results indicate that the proposed framework outperforms any single algorithm that composes the framework.
Keywords: hyper-heuristic algorithm; adaption; elite individuals; evolutionary algorithm; time complexity
Hyper-Heuristic Task Scheduling Algorithm Based on Reinforcement Learning in Cloud Computing (Cited by: 1)
5
Authors: Lei Yin, Chang Sun, Ming Gao, Yadong Fang, Ming Li, Fengyu Zhou. Intelligent Automation & Soft Computing, 2023, No. 8, pp. 1587-1608 (22 pages). Indexed in: SCIE.
The solution strategy of a heuristic algorithm is pre-set and performs well in the conventional cloud resource scheduling process. However, for complex and dynamic cloud service scheduling tasks, the solution efficiency of a single strategy is low because service attributes differ. In this paper, we present a hyper-heuristic algorithm based on reinforcement learning (HHRL) to optimize the completion time of the task sequence. Firstly, in the reward table setting stage of HHRL, we introduce population diversity and integrate maximum time to comprehensively determine the task scheduling and the selection of low-level heuristic strategies. Secondly, a task computational complexity estimation method integrated with linear regression is proposed to influence task scheduling priorities. Besides, we propose a high-quality candidate solution migration method to ensure the continuity and diversity of the solving process. Compared with HHSA, ACO, GA, F-PSO, etc., HHRL can quickly obtain task complexity, select appropriate heuristic strategies for task scheduling, search for the best makespan, and has a stronger disturbance detection ability for population diversity.
Keywords: Task scheduling; cloud computing; hyper-heuristic algorithm; makespan optimization
QMCR: A Q-Learning-Based Multi-Hop Cooperative Routing Protocol for Underwater Acoustic Sensor Networks (Cited by: 3)
6
Authors: Yougan Chen, Kaitong Zheng, Xing Fang, Lei Wan, Xiaomei Xu. China Communications, 2021, No. 8, pp. 224-236 (13 pages). Indexed in: SCIE, CSCD.
Routing plays a critical role in data transmission for underwater acoustic sensor networks (UWSNs) in the internet of underwater things (IoUT). Traditional routing methods suffer from high end-to-end delay, limited bandwidth, and high energy consumption. With the development of artificial intelligence and machine learning algorithms, many researchers apply these new methods to improve the quality of routing. In this paper, we propose a Q-learning-based multi-hop cooperative routing protocol (QMCR) for UWSNs. Our protocol can automatically choose nodes with the maximum Q-value as forwarders based on distance information. Moreover, we combine cooperative communications with the Q-learning algorithm to reduce network energy consumption and improve communication efficiency. Experimental results show that the running time of QMCR is less than one-tenth of that of the artificial fish-swarm algorithm (AFSA), while the routing energy consumption is kept at the same level. Owing to its extremely fast speed, QMCR is a promising method of routing design for UWSNs, especially when the network suffers from the extremely dynamic underwater acoustic channels of the real ocean environment.
Keywords: Q-learning algorithm; routing; internet of underwater things; underwater acoustic communication; multi-hop cooperative communication
A Vision-based Robotic Navigation Method Using an Evolutionary and Fuzzy Q-Learning Approach
7
Authors: Roberto Cuesta-Solano, Ernesto Moya-Albor, Jorge Brieva, Hiram Ponce. Journal of Artificial Intelligence and Technology, 2024, No. 4, pp. 363-369 (7 pages).
The paper presents a fuzzy Q-learning (FQL) and optical flow-based autonomous navigation approach. The FQL method makes decisions in an unknown environment without mapping, using motion information and a reinforcement signal fed into an evolutionary algorithm. The reinforcement signal is calculated by estimating the optical flow densities in areas of the camera image to determine whether they are "dense" or "thin", which relates to the proximity of objects. The results show that the present approach improves the rate of learning compared with a method that uses a simple reward system and no evolutionary component. The proposed system was implemented in a virtual robotics system using the CoppeliaSim software in communication with Python.
Keywords: CoppeliaSim; evolutionary algorithm; fuzzy Q-learning; optical flow; reinforced learning; vision-based control navigation
Unveiling Effective Heuristic Strategies: A Review of Cross-Domain Heuristic Search Challenge Algorithms
8
Authors: Mohamad Khairulamirin Md Razali, Masri Ayob, Abdul Hadi Abd Rahman, Razman Jarmin, Chian Yong Liu, Muhammad Maaya, Azarinah Izaham, Graham Kendall. Computer Modeling in Engineering & Sciences, 2025, No. 2, pp. 1233-1288 (56 pages).
The Cross-domain Heuristic Search Challenge (CHeSC) is a competition focused on creating efficient search algorithms adaptable to diverse problem domains. Selection hyper-heuristics are a class of algorithms that dynamically choose heuristics during the search process. Numerous selection hyper-heuristics have different implementation strategies. However, comparisons between them are lacking in the literature, and previous works have not highlighted the beneficial and detrimental implementation methods of different components. The question is how to effectively employ them to produce an efficient search heuristic. Furthermore, the algorithms that competed in the inaugural CHeSC have not been collectively reviewed. This work conducts a review analysis of the top twenty competitors from this competition to identify effective and ineffective strategies influencing algorithmic performance. A summary of the main characteristics and classification of the algorithms is presented. The analysis underlines efficient and inefficient methods in eight key components: search points, search phases, heuristic selection, move acceptance, feedback, Tabu mechanism, restart mechanism, and low-level heuristic parameter control. This review analyzes the components with reference to the competition's final leaderboard and discusses future research directions for these components. The effective approaches, identified as having the highest quality index, are mixed search point, iterated search phases, relay hybridization selection, threshold acceptance, mixed learning, Tabu heuristics, stochastic restart, and dynamic parameters. Findings are also compared with recent trends in hyper-heuristics. This work enhances the understanding of selection hyper-heuristics, offering valuable insights for researchers and practitioners aiming to develop effective search algorithms for diverse problem domains.
Keywords: hyper-heuristics; search algorithms; optimization; heuristic selection; move acceptance; learning; diversification; parameter control
Design and Test Verification of Energy Consumption Perception AI Algorithm for Terminal Access to Smart Grid
9
Authors: Sheng Bi, Jiayan Wang, Dong Su, Hui Lu, Yu Zhang. Energy Engineering, 2025, No. 10, pp. 4135-4151 (17 pages).
Owing to the growth of deregulated retail power markets, end users with smart meters and controllers may optimize their energy use cost portfolios by comparing price plans offered by several retail energy firms. To help smart grid end users decrease power payments and usage dissatisfaction, this article proposes a decision system based on reinforcement learning to aid with electricity price plan selection. An enhanced state-based Markov decision process (MDP) without transition probabilities models the decision problem. A kernel approximation-integrated batch Q-learning approach is used to tackle the problem. Several adjustments to the sampling and data representation are made to increase the computational and prediction performance. Using a continuous high-dimensional state space, the proposed approach can uncover the underlying characteristics of time-varying pricing schemes. Without any prior knowledge of the market environment, the best decision-making policy may be learned via case studies that use data from actual historical price plans. Experiments show that the proposed decision approach may reduce cost and energy usage dissatisfaction by using user data to build an accurate prediction strategy. This research also looks at how smart city energy planners rely on precise load forecasts. It presents a hybrid method that extracts associated characteristics to improve accuracy in residential power consumption forecasts using machine learning (ML). The precision of forecasts can be measured with loss functions such as the RMSE. In response to the growing interest in explainable artificial intelligence (XAI), this research presents a methodology for estimating smart home energy usage. Using Shapley Additive Explanations (SHAP) approaches, this strategy makes it easy for consumers to comprehend their energy use trends. To predict future energy use, the study employs gradient boosting in conjunction with long short-term memory neural networks.
Keywords: Energy consumption perception; terminal access; smart grid; AI model; SHAP; Q-learning algorithm
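For reference, the RMSE named in the abstract above is conventionally defined as follows. The symbols are assumptions introduced here, not notation from the paper: n is the number of forecast points, y_i the observed consumption, and \hat{y}_i the forecast.

```latex
\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^{2}}
```

A lower value indicates forecasts closer to the observations, in the same units as the consumption data.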
Ensemble Artificial Bee Colony Algorithm and Q-Learning for Multi-Objective Distributed Heterogeneous Flowshop Scheduling Problems with Sequence-Dependent Setup Time
10
Authors: Fubin Liu, Kaizhou Gao, Adam Slowik, Ponnuthurai Nagaratnam Suganthan. Complex System Modeling and Simulation, 2025, No. 3, pp. 221-235 (15 pages).
As the global economy develops and people's awareness of environmental protection increases, the efficient scheduling of production lines in workshops has received more and more attention. However, there is very little research focusing on distributed scheduling for heterogeneous factories. This study addresses a multi-objective distributed heterogeneous permutation flow shop scheduling problem with sequence-dependent setup times (DHPFSP-SDST). The objective is to optimize the trade-off between the maximum completion time (makespan) and total energy consumption. First, we establish a mathematical model to describe the problem. Second, we use the artificial bee colony (ABC) algorithm to optimize the two objectives, incorporating five local search strategies tailored to the problem characteristics to enhance the algorithm's performance. Third, to improve the convergence speed of the algorithm, a Q-learning-based strategy is designed to select the appropriate local search operator during iterations. Finally, based on experiments conducted on 72 instances, statistical analysis and discussion show that the Q-learning-based ABC algorithm solves the problem more effectively than its peers.
Keywords: artificial bee colony algorithm; Q-learning; flowshop scheduling; sequence-dependent setup time
A memetic algorithm based on hyper-heuristics for examination timetabling problems
11
Authors: Yu Lei, Maoguo Gong, Licheng Jiao, Yi Zuo. International Journal of Intelligent Computing and Cybernetics, 2015, No. 2, pp. 139-151 (13 pages). Indexed in: EI.
Purpose: The examination timetabling problem is an NP-hard problem. A large number of approaches have been developed for this problem to find more appropriate search strategies. Hyper-heuristics are one representative class of methods. In a hyper-heuristic, the high-level search is executed to construct heuristic lists by traditional methods (such as Tabu search, variable neighborhoods and so on). The purpose of this paper is to apply an evolutionary strategy instead of traditional methods for the high-level search to improve the capability of global search. Design/methodology/approach: This paper combines a hyper-heuristic with an evolutionary strategy to solve examination timetabling problems. First, four graph coloring heuristics are employed to construct heuristic lists. Within the evolutionary algorithm framework, iterative initialization is utilized to increase the number of feasible solutions in the population; meanwhile, the crossover and mutation operators are applied to find potential heuristic lists in the heuristic space (high-level search). Finally, two local search methods are combined to optimize the feasible solutions in the solution space (low-level search). Findings: Experimental results demonstrate that the proposed approach obtains competitive results and outperforms the compared approaches on some benchmark instances. Originality/value: The contribution of this paper is the development of a framework which combines an evolutionary algorithm and a hyper-heuristic for examination timetabling problems.
Keywords: Evolutionary computation; Examination timetabling problem; Hyper-heuristic; Memetic algorithm
Dynamic Scheduling and Path Planning of Automated Guided Vehicles in Automatic Container Terminal (Cited by: 17)
12
Authors: Lijun Yue, Houming Fan. IEEE/CAA Journal of Automatica Sinica, 2022, No. 11, pp. 2005-2019 (15 pages). Indexed in: SCIE, EI, CSCD.
The uninterrupted operation of the quay crane (QC) ensures that a large container ship can depart port within laytime, which effectively reduces the handling cost for the container terminal and ship owners. The QC waiting caused by delays of automated guided vehicles (AGVs) in an uncertain environment can be alleviated by dynamic scheduling optimization. A dynamic scheduling process is introduced in this paper to solve the AGV scheduling and path planning problems, in which the scheduling scheme determines the starting and ending nodes of paths, and the choice of paths between nodes affects the scheduling of subsequent AGVs. This work proposes a two-stage mixed integer optimization model to minimize the transportation cost of AGVs under the constraint of laytime. A dynamic optimization algorithm, including an improved rule-based heuristic algorithm and the integration of the Dijkstra algorithm and the Q-learning algorithm, is designed to solve for the optimal AGV scheduling and path schemes. A new conflict avoidance strategy based on graph theory is also proposed to reduce the probability of path conflicts between AGVs. Numerical experiments are conducted to demonstrate the effectiveness of the proposed model and algorithm over existing methods.
Keywords: Automated container terminal; dynamic scheduling; path planning; Q-learning algorithm; rule-based heuristic algorithm
A Novel Cooperative Multi-Stage Hyper-Heuristic for Combination Optimization Problems (Cited by: 12)
13
Authors: Fuqing Zhao, Shilu Di, Jie Cao, Jianxin Tang, Jonrinaldi. Complex System Modeling and Simulation, 2021, No. 2, pp. 91-108 (18 pages).
A hyper-heuristic algorithm is a general solution framework that adaptively selects the optimizer to address complex problems. A classical hyper-heuristic framework consists of two levels: the high-level heuristic and a set of low-level heuristics. The low-level heuristics to be used in the optimization process are chosen by the high-level tactics in the hyper-heuristic. In this study, a Cooperative Multi-Stage Hyper-Heuristic (CMS-HH) algorithm is proposed to address certain combinatorial optimization problems. In the CMS-HH, a genetic algorithm is introduced to perturb the initial solution to increase the diversity of the solution. In the search phase, an online learning mechanism based on multi-armed bandits and relay hybridization technology is proposed to improve the quality of the solution. In addition, a multi-point search is introduced to search cooperatively with a single-point search when the state of the solution does not change over a continuous period. The performance of the CMS-HH algorithm is assessed on six specific combinatorial optimization problems: Boolean satisfiability problems, one-dimensional packing problems, permutation flow-shop scheduling problems, personnel scheduling problems, traveling salesman problems, and vehicle routing problems. The experimental results demonstrate the efficiency and significance of the proposed CMS-HH algorithm.
Keywords: hyper-heuristic algorithm; Multi-Armed Bandits (MAB); relay hybridization technology; combinatorial optimization
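As an illustration of how a multi-armed bandit can choose among low-level heuristics online, the sketch below uses the standard UCB1 rule. The abstract does not say which bandit policy CMS-HH uses, so UCB1, the class name `UCB1Selector`, and the reward convention are all assumptions made for this example.

```python
import math

class UCB1Selector:
    """Pick a low-level heuristic index with the UCB1 bandit rule (illustrative)."""

    def __init__(self, n_heuristics, c=math.sqrt(2)):
        self.counts = [0] * n_heuristics   # times each heuristic was applied
        self.means = [0.0] * n_heuristics  # running mean reward per heuristic
        self.c = c                         # exploration weight

    def select(self):
        # Apply every heuristic once before using the confidence bound.
        for i, n in enumerate(self.counts):
            if n == 0:
                return i
        total = sum(self.counts)
        scores = [m + self.c * math.sqrt(math.log(total) / n)
                  for m, n in zip(self.means, self.counts)]
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, i, reward):
        # Incremental mean update for the heuristic that was just applied.
        self.counts[i] += 1
        self.means[i] += (reward - self.means[i]) / self.counts[i]
```

In a hyper-heuristic loop, `reward` would typically reflect the improvement in solution quality after applying heuristic `i`; that choice is problem-specific and not taken from the paper.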
Q-Learning-Based Teaching-Learning Optimization for Distributed Two-Stage Hybrid Flow Shop Scheduling with Fuzzy Processing Time (Cited by: 10)
14
Authors: Bingjie Xi, Deming Lei. Complex System Modeling and Simulation, 2022, No. 2, pp. 113-129 (17 pages).
Two-stage hybrid flow shop scheduling has been extensively considered in single-factory settings. However, the distributed two-stage hybrid flow shop scheduling problem (DTHFSP) with fuzzy processing time is seldom investigated in multiple factories. Furthermore, the integration of reinforcement learning and metaheuristics is seldom applied to solve DTHFSP. In the current study, DTHFSP with fuzzy processing time was investigated, and a novel Q-learning-based teaching-learning based optimization (QTLBO) was constructed to minimize makespan. Several teachers were recruited for this study. The teacher phase, learner phase, teacher's self-learning phase, and learner's self-learning phase were designed. The Q-learning algorithm was implemented with 9 states, 4 actions defined as combinations of the above phases, a reward, and an adaptive action selection, which were applied to dynamically adjust the algorithm structure. A number of experiments were conducted. The computational results demonstrate that the new strategies of QTLBO are effective; furthermore, it presents promising results on the considered DTHFSP.
Keywords: teaching-learning based optimization; Q-learning algorithm; two-stage hybrid flow shop scheduling; fuzzy processing time
Dynamic plugging regulating strategy of pipeline robot based on reinforcement learning (Cited by: 1)
15
Authors: Xing-Yuan Miao, Hong Zhao. Petroleum Science, 2024, No. 1, pp. 597-608 (12 pages). Indexed in: SCIE, EI, CAS, CSCD.
The pipeline isolation plugging robot (PIPR) is an important tool in pipeline maintenance operations. During the plugging process, the flow field can induce violent vibration, which can cause serious damage to the pipeline and the PIPR. In this paper, we propose a dynamic regulating strategy to reduce the plugging-induced vibration by regulating the spoiler angle and plugging velocity. Firstly, dynamic plugging simulation and experiment are performed to study the flow field changes during dynamic plugging, and the pressure difference is proposed to evaluate the degree of flow field vibration. Secondly, mathematical models of the pressure difference with plugging states and spoiler angles are established based on the extreme learning machine (ELM) optimized by an improved sparrow search algorithm (ISSA). Finally, a modified Q-learning algorithm based on simulated annealing is applied to determine the optimal strategy for the spoiler angle and plugging velocity in real time. The results show that the proposed method can reduce the plugging-induced vibration by 19.9% and 32.7% on average, compared with single-regulating methods. This study can effectively ensure the stability of the plugging process.
Keywords: Pipeline isolation plugging robot; Plugging-induced vibration; Dynamic regulating strategy; Extreme learning machine; Improved sparrow search algorithm; Modified Q-learning algorithm
Optimized Trajectory Design in UAV Based Cellular Networks for 3D Users: A Double Q-Learning Approach
16
Authors: Xuanlin Liu, Mingzhe Chen, Changchuan Yin. Journal of Communications and Information Networks, 2019, No. 1, pp. 24-32 (9 pages). Indexed in: CSCD.
In this paper, the problem of trajectory design of unmanned aerial vehicles (UAVs) for maximizing the number of satisfied users is studied in a UAV-based cellular network, where the UAV works as a flying base station that serves users, and each user indicates its satisfaction in terms of completion of its data request within an allowable maximum waiting time. The trajectory design is formulated as an optimization problem whose goal is to maximize the number of satisfied users. To solve this problem, a machine learning framework based on the double Q-learning algorithm is proposed. The algorithm enables the UAV to find the optimal trajectory that maximizes the number of satisfied users. Compared to traditional learning algorithms, such as Q-learning, which selects and evaluates the action using the same Q-table, the proposed algorithm decouples the selection from the evaluation and therefore avoids the overestimation that leads to sub-optimal policies. Simulation results show that the proposed algorithm can achieve up to 19.4% and 14.1% gains in terms of the number of satisfied users compared to the random algorithm and the Q-learning algorithm, respectively.
Keywords: UAV communication; trajectory design; double Q-learning algorithm; user satisfaction; cellular network
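The decoupling of action selection from action evaluation described in the abstract above can be sketched with the generic tabular double Q-learning update. This is not the paper's UAV trajectory model: the function name, the list-of-lists tables `qa` and `qb`, and the default hyperparameters are assumptions for illustration.

```python
import random

def double_q_update(qa, qb, s, a, r, s_next, alpha=0.1, gamma=0.9, rng=random):
    """One double Q-learning step on two tables (illustrative sketch).

    A coin flip decides which table is updated. That table selects the
    greedy next action, while the other table evaluates it, so a single
    table's overestimated value is not used as its own target.
    """
    if rng.random() < 0.5:
        select, evaluate = qa, qb
    else:
        select, evaluate = qb, qa
    best = max(range(len(select[s_next])), key=lambda x: select[s_next][x])
    target = r + gamma * evaluate[s_next][best]
    select[s][a] += alpha * (target - select[s][a])
```

By contrast, plain Q-learning would use the maximum of the same table for both steps, which is the source of the overestimation the abstract mentions.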
Reinforcement Learning-Based Control for Resilient Community Microgrid Applications
17
Authors: Md Mahmudul Hasan, Ishtiaque Zaman, Miao He, Michael Giesselmann. Journal of Power and Energy Engineering, 2022, No. 9, pp. 1-13 (13 pages).
A novel microgrid control strategy is presented in this paper. A resilient community microgrid model, which is equipped with solar PV generation, electric vehicles (EVs), and an improved inverter control system, is considered. To fully exploit the capability of the community microgrid to operate in either grid-connected mode or islanded mode, as well as to achieve improved stability of the microgrid system, universal droop control, virtual inertia control, and a reinforcement learning-based control mechanism are combined in a cohesive manner, in which adaptive control parameters are determined online to tune the influence of the controllers. The microgrid model and control mechanisms are implemented in MATLAB/Simulink and set up in real-time simulation to test the feasibility and effectiveness of the proposed model. Experiment results reveal the effectiveness of regulating the controller's frequency and voltage for various operating conditions and scenarios of a microgrid.
Keywords: Microgrid; Reinforcement Learning; Q-learning algorithm; Vehicle-to-Grid (V2G)
Biased Bi-Population Evolutionary Algorithm for Energy-Efficient Fuzzy Flexible Job Shop Scheduling with Deteriorating Jobs (Cited by: 1)
18
作者 Libao Deng Yingjian Zhu +1 位作者 Yuanzhu Di Lili Zhang 《Complex System Modeling and Simulation》 EI 2024年第1期15-32,共18页
There are many studies about the flexible job shop scheduling problem with fuzzy processing time and about deteriorating scheduling, but most scholars neglect the connection between them, namely that the purpose of both models is to simulate a more realistic factory environment. From this perspective, the solutions can be more precise and practical if both issues are considered simultaneously. Therefore, the deterioration effect is treated as a part of the fuzzy job shop scheduling problem in this paper, which means the linear increase of a certain processing time is transformed into an internal linear shift of a triangular fuzzy processing time. The other contributions are as follows. A new algorithm called the reinforcement learning based biased bi-population evolutionary algorithm (RB2EA) is proposed, which utilizes the Q-learning algorithm to adjust the size of the two populations and the interaction frequency according to the quality of the population. A local enhancement method that combines multiple local search strategies is presented. An interaction mechanism is designed to promote the convergence of the bi-population. Extensive experiments are designed to evaluate the efficacy of RB2EA, and the conclusion can be drawn that RB2EA is able to solve the energy-efficient fuzzy flexible job shop scheduling problem with deteriorating jobs (EFFJSPD) efficiently.
Keywords: bi-population evolutionary algorithm; Q-learning algorithm; fuzzy; deteriorating effect; energy; flexible job shop scheduling
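One way to picture a linear deterioration applied to a triangular fuzzy processing time is sketched below. The abstract does not give the paper's actual transformation, so everything here is an assumption for illustration: the `TFN` type, the `deteriorate` function, the `rate` parameter, and the rule that each vertex grows in proportion to the matching vertex of a fuzzy start time.

```python
from typing import NamedTuple


class TFN(NamedTuple):
    """Triangular fuzzy number given by its lower bound, mode, and upper bound."""
    low: float
    mode: float
    high: float


def deteriorate(base: TFN, start: TFN, rate: float) -> TFN:
    """Shift a fuzzy processing time by an assumed linear deterioration.

    Hypothetical rule: each vertex of the base time grows by `rate` times
    the matching vertex of the fuzzy start time. This is not taken from
    the paper; it only illustrates a vertex-wise linear shift.
    """
    return TFN(
        base.low + rate * start.low,
        base.mode + rate * start.mode,
        base.high + rate * start.high,
    )


# Usage: a job with base time (2, 3, 4) that starts at fuzzy time (10, 12, 14).
shifted = deteriorate(TFN(2.0, 3.0, 4.0), TFN(10.0, 12.0, 14.0), rate=0.1)
```

With a non-negative `rate` and an ordered start time, the shifted vertices stay ordered, so the result is again a valid triangular fuzzy number under this assumed rule.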
Dynamic value iteration networks for the planning of rapidly changing UAV swarms (cited by: 3)
19
Authors: Wei LI, Bowei YANG, Guanghua SONG, Xiaohong JIANG. Frontiers of Information Technology & Electronic Engineering (indexed in SCIE, EI, CSCD), 2021, Issue 5, pp. 687-696 (10 pages)
In an unmanned aerial vehicle ad-hoc network (UANET), sparse and rapidly mobile unmanned aerial vehicles (UAVs)/nodes can dynamically change the UANET topology. This may lead to UANET service performance issues. In this study, for planning rapidly changing UAV swarms, we propose a dynamic value iteration network (DVIN) model trained using the episodic Q-learning method with the connection information of UANETs to generate a state value spread function, which enables UAVs/nodes to adapt to novel physical locations. We then evaluate the performance of the DVIN model and compare it with the non-dominated sorting genetic algorithm II and the exhaustive method. Simulation results demonstrate that the proposed model significantly reduces the decision-making time for UAV/node path planning with a high average success rate.
Keywords: dynamic value iteration networks; episodic Q-learning; unmanned aerial vehicle (UAV) ad-hoc network; non-dominated sorting genetic algorithm II (NSGA-II); path planning
Decision-making Method for Pumped Storage Power Stations in the Electricity Energy and Frequency Regulation Markets
20
Authors: Man Chen, Hongtao Zhu, Yumin Peng, Xuan Wang, Xuefeng Zhang, Yijun Xiong, Lianfu Chen, Yikai Li, Bushi Zhao. Chinese Journal of Electrical Engineering (indexed in CSCD), 2024, Issue 4, pp. 60-72 (13 pages)
With the establishment of “carbon peaking and carbon neutrality” goals in China, along with the development of new power systems and ongoing electricity market reforms, pumped-storage power stations (PSPSs) will increasingly play a significant role in power systems. Therefore, this study focuses on trading and bidding strategies for PSPSs in the electricity market. First, a comprehensive framework for PSPSs participating in the electricity energy and frequency regulation (FR) ancillary service markets is proposed. Subsequently, a two-layer trading model is developed to achieve joint clearing in the energy and frequency regulation markets. The upper-layer model aims to maximize the revenue of the power station by optimizing the bidding strategies using a Q-learning algorithm. The lower-layer model minimizes the total electricity purchasing cost of the system. Finally, the proposed two-layer trading model is validated by studying an actual case in which data are obtained from a provincial power system in China. The results indicate that through this decision-making method, PSPSs can achieve higher economic revenue in the market, which will provide a reference for the planning and operation of PSPSs.
Keywords: pumped storage power stations (PSPSs); electricity energy market; frequency regulation market; bidding strategy; Q-learning algorithm