
Maintenance policies for flow line based on multi-agent reinforcement learning (cited by: 5)
Abstract: In manufacturing systems, the deterioration of machine states affects product quality: a machine may keep running while its yield level gradually declines. This paper studies preventive maintenance policies for deteriorating machines in a 2M1B flow line, a system consisting of two machines with deteriorating quality states and one finite inventory buffer between them. Each machine is treated as an agent. Its preventive maintenance problem is modeled as a semi-Markov decision process that is coupled with the maintenance model of the other machine. On the premise of considering the global immediate cost of the system, a distributed multi-agent reinforcement learning method is proposed to obtain the maintenance policies of the two machines at each buffer inventory level. The learned policy has a typical control-limit form: for a given inventory level, a maintenance action is triggered once the machine has deteriorated to a state equal to or worse than its corresponding control limit.
Source: Journal of Systems Engineering (系统工程学报), CSCD, Peking University Core Journal (北大核心), 2013, No. 5, pp. 702-708 (7 pages)
Funding: National Natural Science Foundation of China (60904075); National Science Fund for Distinguished Young Scholars (71125001)
Keywords: preventive maintenance; 2M1B flow line; multi-agent reinforcement learning; semi-Markov decision process; deteriorating quality states
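
The control-limit form described in the abstract can be illustrated with a minimal Python sketch. This is not the paper's algorithm or its learned policy. The integer state encoding (larger means more deteriorated), the function name `should_maintain`, and the limit values are all hypothetical assumptions chosen only to show the shape of such a rule.

```python
from typing import Dict


def should_maintain(state: int, buffer_level: int,
                    control_limits: Dict[int, int]) -> bool:
    """Control-limit rule: trigger maintenance when the machine state is
    equal to or worse than the limit for the current buffer level.

    Assumption: states are integers and a larger value means a more
    deteriorated machine.
    """
    return state >= control_limits[buffer_level]


# Hypothetical control limits indexed by buffer level (not from the paper).
control_limits = {0: 4, 1: 3, 2: 3, 3: 2}

# At buffer level 1 the assumed limit is 3, so state 3 triggers
# maintenance while state 2 does not.
maintain_at_limit = should_maintain(3, 1, control_limits)
maintain_below_limit = should_maintain(2, 1, control_limits)
```

A lookup table keyed by buffer level mirrors the abstract's statement that each inventory level has its own control limit. How those limits are learned is the subject of the paper and is not reproduced here.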

References (16)

  • [1] Van Der Duyn Schouten F A, Vanneste S G. Maintenance optimization of a production system with buffer capacity[J]. European Journal of Operational Research, 1995, 82(2): 323-338.
  • [2] Kyriakidis E G, Dimitrakos T D. Optimal preventive maintenance of a production system with an intermediate buffer[J]. European Journal of Operational Research, 2006, 168(1): 86-99.
  • [3] Pavitsos A, Kyriakidis E G. Markov decision models for the optimal maintenance of a production unit with an upstream buffer[J]. Computers and Operations Research, 2009, 36(6): 1993-2006.
  • [4] Karamatsoukis C C, Kyriakidis E G. Optimal maintenance of two stochastically deteriorating machines with an intermediate buffer[J]. European Journal of Operational Research, 2010, 207(1): 297-308.
  • [5] Kim J. Integrated Quality and Quantity Modeling of a Production Line[D]. Cambridge: Massachusetts Institute of Technology, 2005.
  • [6] Inman R R, Blumenfeld D E, Huang N, et al. Designing production systems for quality: Research opportunities from an automobile industry perspective[J]. International Journal of Production Research, 2003, 41(9): 1953-1971.
  • [7] Meerkov S M, Zhang L. Bernoulli production lines with quality-quantity coupling machines: Monotonicity properties and bottlenecks[J]. Annals of Operations Research, 2011, 182(1): 119-131.
  • [8] Pham H, Wang H. Imperfect maintenance[J]. European Journal of Operational Research, 1996, 94(3): 425-438.
  • [9] Das T K, Gosavi A, Mahadevan S, et al. Solving semi-Markov decision problems using average reward reinforcement learning[J]. Management Science, 1999, 45(4): 560-574.
  • [10] Gosavi A. Simulation-based Optimization: Parametric Optimization Techniques and Reinforcement Learning[M]. Norwell: Kluwer Academic Publishers, 2003.

Co-cited literature (48)

  • [1] 浦徐进, 吴亚, 路璐, 蒋力. Evolutionary game model and simulation analysis of enterprise production behavior and official regulatory behavior[J]. Chinese Journal of Management Science (中国管理科学), 2013, 21(S1): 390-396. Cited by: 52
  • [2] 赵喜林, 许兴华. Research on maintenance strategies for modern manufacturing systems[J]. New Technology & New Process (新技术新工艺), 2006(1): 49-51. Cited by: 1
  • [3] 吕文元, 朱清香, 方淑芬. Building a maintenance optimization model using failure data and estimated inspection data[J]. Mathematics in Practice and Theory (数学的实践与认识), 2007, 37(10): 83-89. Cited by: 5
  • [4] Santner T J, Williams B J, Notz W I. The Design and Analysis of Computer Experiments. New York: Springer-Verlag, 2003.
  • [5] Kleijnen J P C. Design and Analysis of Simulation Experiments. New York: Springer US, 2008.
  • [6] Kleijnen J P C, Sanchez S M, Lucas T W, et al. State-of-the-art review: A user's guide to the brave new world of designing simulation experiments. INFORMS Journal on Computing, 2005, 17(3): 263-289.
  • [7] Chen V C P, Tsui K L, Barton R R, et al. A review on design, modeling and applications of computer experiments. IIE Transactions, 2006, 38(4): 273-291.
  • [8] Montgomery D C. Design and Analysis of Experiments. 6th ed. New York: John Wiley & Sons, 2007.
  • [9] Shi W, Kleijnen J P C, Liu Z X. Factor screening for simulation with multiple responses: Sequential bifurcation. European Journal of Operational Research, 2014, 237(1): 136-147.
  • [10] Kleijnen J P C, Bettonvil B W, Persson F. Screening for the Important Factors in Large Discrete-event Simulation Models: Sequential Bifurcation and Its Applications. New York: Springer, 2006: 287-307.

Citing literature (5)

Second-level citing literature (29)
