期刊文献+

基于增强学习的平行机调度研究 被引量:3

Parallel machines scheduling with reinforcement learning
在线阅读 下载PDF
导出
摘要 尝试运用增强学习方法来研究平行机调度问题,通过定义系统状态、行为和报酬函数,把调度问题转化为平均报酬型半马尔可夫决策过程,并使用结合函数泛化器的R-Learning算法来解决。提出排名算法,并利用它和两种常用的调度规则(最短期望加工时间规则和先进先出规则)来定义增强学习的行为。实验结果表明,R-Learning算法通过仿真实验学习较优的调度策略,在不同的决策状态下选择最优或次优的行为,对每个测试问题的效果都优于以上任何一条调度规则。 A Reinforcement Learning (RL) method, R-Learning, was used to study parallel machines scheduling problems which was aimed to minimize mean flow time of jobs. The scheduling problem was converted into Semi-Markov Decision Process(SMDP) by defining system state, actions and reward function. It was solved by R- Learning functions. A heuristic, Ranking Algorithm (RA) was proposed and defined as RL as well as two commonly used dispatching rules: Shortest Expected Processing Time (SEPT) and First In First Out (FIFO). Experiment results demonstrated that R-Learning could learn a near-optimal scheduling policy through simulation, i.e. to select optimal or sub-optimal actions at different states. The conclusion was that R-Learning was superior to the above heuristic rules in all test problems.
出处 《计算机集成制造系统》 EI CSCD 北大核心 2007年第1期110-116,共7页 Computer Integrated Manufacturing Systems
基金 国家自然科学基金资助项目(50375082)。~~
关键词 调度 平行机 增强学习 马尔可夫决策过程 scheduling parallel machines reinforcement learning Markov decision process
  • 相关文献

参考文献17

  • 1PINEDO M.Scheduling:theory,algorithms,and systems[M].2nd ed.Upper Saddle River,N.J.,USA:Prentice Hall,2002.
  • 2CHEN T C E,SIN C C S.A state-of-the-art review of parallel-machine scheduling research[J].European Journal of Operational Research,1990,47(3):271-292.
  • 3FRANGIONI A,NECCIARI E,SCUTELLA M G.A multi-exchange neighborhood for minimum makespan parallel machine scheduling problems[J].Journal of Combinatorial Optimization,2004,8(2):195-220.
  • 4AZIZOGLU M,KIRCA O.Scheduling jobs on unrelated parallel machines to minimize regular total cost functions[J].IIE Transactions,1999,31(2):153-159.
  • 5MOSHEIOV G.Parallel machine scheduling with a learning effect[J].Journal of the Operational Research Society,2001,52(10):1165-1169.
  • 6MOSHEIOV G,SIDNEY J B.Scheduling with general jobdependent learning curves[J].European Journal of Operational Research,2003,147(3):665-670.
  • 7YU L,SHIH H M,PFUND M,et al.Scheduling of unrelated parallel machines:an application to PWB manufacturing[J].IIE Transactions,2002,34(11):921-931.
  • 8PUTERMAN M L.Markov decision processes[M].New York,N.Y.,USA:Wiley Interscience,1994.
  • 9SUTTON R S,BARTO A G.Reinforcement learning:an introduction[M].Cambridge,Mass.,USA:MIT Press,1998.
  • 10SCHWARTZ A.A reinforcement learning method for maximizing undiscounted rewards[C]//Proceedings of the 10th International Conference on Machine Learning.San Francisco,Cal.USA:Morgan Kaufmann Publishers,1993:298-305.

二级参考文献18

  • 1郑应平 赵丽娜.离散事件与混杂系统的调度控制[J].控制理论与应用,1999,16:82-86.
  • 2Kumar P R. Reentrant queueing networks [ J ]. Special Issue Queuing Systems. 1993,13:87 - 110.
  • 3Kumar P R. Scheduling manufacturing systems of re-entrant lines[ A]. Yao D D. Stochastic Modeling and Analysis of Manufacturing Systems [ M ]. New York : Springer-Verlag, 1994. 325 - 360.
  • 4Meyn S P. Stability and optimization of multiclass queueing networks and their fluid models [ A ]. Proceedings of the Summer Seminar on "The mathematics of Stochastic Manufacturing Systems" [C]. American Mathematical Society, 1997.
  • 5Harrison J M, Kumar P R. Scheduling network of queues: heavy traffic analysis of a two-station closed network[ J]. Operations Research, "1990,38(6) :1052 - 1064.
  • 6Tsitsiklis J N, Roy B V. Average cost temporal difference leaming[ J ]. Automatica, 1999,35 ( 11 ) : 1799 - 1808.
  • 7Bertsekas D P, Tsitsiklis J N. Neuro-Dynamic Programming [M].Athena Scientific, 1996.
  • 8Jin H, Ou J, Kumar P R. The throughput of irreducible closed markovian queueing networks : functional bounds, asymptotic loss,efficiency, and Harrison-wei conjectures [ J]. Mathematics of Operations Research, 1997, 22(4) :886-920.
  • 9郑应平,控制理论与应用,1999年,16卷,增刊,82页
  • 10Jin H,Math Oper Res,1997年,22卷,4期,886页

共引文献5

同被引文献22

引证文献3

二级引证文献33

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部