Abstract
To address the low efficiency and poor stability of current mainstream meta-heuristic algorithms in solving the disassembly sequence planning (DSP) problem for the disassembly and maintenance of large-scale electromechanical equipment, reinforcement learning was introduced and combined with a hierarchical strategy, and a quality learning (QL) algorithm suitable for multi-part DSP was proposed. First, a DSP data model and a spatial constraint model were constructed, and, based on the hierarchical strategy, the equipment components were decomposed into multiple subsets, each containing a small number of parts. Then, an initial R table was constructed for each subset from the direct assembly constraints between each pair of parts; a sequence-evaluation reward and penalty function was built from a disassembly workload index and used to update the initial R table into the final R table. The QL algorithm was then trained iteratively on each subset according to the final R table until the results converged, generating a Q table for optimal path decision-making. Finally, a hydropower station spherical valve, a bushing extraction device, and a main servomotor were selected as virtual disassembly test objects to verify the effectiveness of the method. The results show that, compared with the genetic algorithm (GA) and the gravitational search algorithm (GSA), the QL algorithm has advantages in convergence speed, optimization efficiency, and stability. Relative to GA and GSA, its running time was reduced by 97.3% and 98.4%, 87.1% and 94.9%, and 93.4% and 95.0% for the three test objects, and it produced high-quality disassembly sequences that met expectations, which verifies the effectiveness of the proposed method.
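The R-table and Q-table workflow summarized in the abstract can be illustrated with a minimal tabular Q-learning sketch for a single small subset of parts. It is not the authors' implementation: the part names, precedence constraints, tool assignments, workload index (one unit per operation plus two per tool change), and the choice of state (set of removed parts plus the last removed part) are all assumptions made here for illustration. The hierarchical decomposition into subsets is not shown.

```python
import random

# Hypothetical 5-part subset; names, constraints, and tools are illustrative only.
PARTS = ["bolt_a", "bolt_b", "cover", "shaft", "pin"]
# PRED[p]: parts that must already be removed before p can be removed.
PRED = {"bolt_a": set(), "bolt_b": set(), "cover": {"bolt_a", "bolt_b"},
        "shaft": {"cover"}, "pin": set()}
TOOL = {"bolt_a": "wrench", "bolt_b": "wrench", "cover": "hoist",
        "shaft": "hoist", "pin": "punch"}
START = "start"  # virtual state before any part is removed


def build_r_table():
    """Initial R table from pairwise constraints, filled with workload rewards."""
    r = {}
    for i in [START] + PARTS:
        for j in PARTS:
            if i == j or (i != START and j in PRED[i]):
                r[i, j] = None  # j can never be removed directly after i
            else:
                # Assumed workload index: 1 per operation, +2 for a tool change.
                tool_change = i != START and TOOL[i] != TOOL[j]
                r[i, j] = -(1 + (2 if tool_change else 0))
    return r


def feasible(removed, last, r):
    """Parts that may be removed next, given what has been removed so far."""
    return [p for p in PARTS
            if p not in removed and PRED[p] <= removed and r[last, p] is not None]


def train(r, episodes=2000, alpha=1.0, gamma=1.0, seed=0):
    """Tabular Q-learning with purely random exploration (off-policy)."""
    rng = random.Random(seed)
    q = {}
    for _ in range(episodes):
        removed, last = frozenset(), START
        while len(removed) < len(PARTS):
            action = rng.choice(feasible(removed, last, r))
            nxt = removed | {action}
            # Best known value of the successor state (0 once everything is removed).
            future = max((q.get((nxt, action, b), 0.0)
                          for b in feasible(nxt, action, r)), default=0.0)
            key = (removed, last, action)
            old = q.get(key, 0.0)
            q[key] = old + alpha * (r[last, action] + gamma * future - old)
            removed, last = nxt, action
    return q


def greedy_sequence(q, r):
    """Read a disassembly sequence off the Q table by always taking the best action."""
    removed, last, seq, total = frozenset(), START, [], 0
    while len(removed) < len(PARTS):
        action = max(feasible(removed, last, r),
                     key=lambda p: q.get((removed, last, p), float("-inf")))
        total += r[last, action]
        seq.append(action)
        removed, last = removed | {action}, action
    return seq, total


if __name__ == "__main__":
    r_table = build_r_table()
    q_table = train(r_table)
    print(greedy_sequence(q_table, r_table))
```

Because this toy problem is deterministic, a learning rate of 1 is enough for the Q values to settle; a real subset with noisy or richer workload terms would likely need a smaller learning rate and an explicit convergence check.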
Authors
杨贵程
刘海涛
王克远
吴月超
苏佶智
王卓瑜
YANG Guicheng; LIU Haitao; WANG Keyuan; WU Yuechao; SU Jizhi; WANG Zhuoyu (Power China Huadong Engineering Corporation Limited, Hangzhou 310014, China; State Grid Xinyuan Group Co., Ltd., Beijing 100052, China; State Grid Electric Power Engineering Research Institute, Beijing 100073, China; Economic and Technological Research Institute of State Grid Hebei Electric Power Co., Ltd., Shijiazhuang 050000, China)
Source
Journal of Mechanical & Electrical Engineering (《机电工程》)
Peking University Core Journal (北大核心)
2025, Issue 10, pp. 2001-2009 (9 pages)
Funding
Science and Technology Project of State Grid Corporation of China Headquarters (5200-202356477A-3-2-ZN).
Keywords
equipment maintenance
disassembly sequence planning (DSP)
quality learning (QL) algorithm (reinforcement learning)
genetic algorithm (GA)
gravitational search algorithm (GSA)
hierarchical strategy