Abstract
An approximate algorithm is proposed for solving interactive dynamic influence diagrams (I-DIDs) segment by segment, based on the principle of behavioral equivalence. The computational load is reduced by decomposing the I-DID model into several segments and by compressing the space of the other agents' candidate models. First, the bottom-level I-DID or DID model is split into sub-segments, each containing several time slices. The first segment is solved to obtain a policy tree for each initial model, and the policy trees are merged according to behavioral equivalence to form a policy graph. This result serves as the set of initial models for the next segment, which is then solved in the same way. The process is repeated until the last segment has been solved, yielding the complete policy graph, which is used to decide whether the agent needs to update its models. Finally, experiments and algorithm comparisons on the multi-agent tiger problem verify the effectiveness of the proposed algorithm in terms of both solution quality and model-space size.
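The merging step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the tuple encoding of a policy tree, the function names `canonical`, `group_behaviorally_equivalent`, and `merge_into_policy_graph`, and the reading of behavioral equivalence as "identical solved policy trees" are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Assumed encoding: a policy tree is (action, {observation: subtree});
# a leaf has an empty dict of children.
PolicyTree = Tuple[str, Dict[str, "PolicyTree"]]


def canonical(tree: PolicyTree) -> tuple:
    """Return a hashable form of a policy tree, independent of dict order."""
    action, children = tree
    return (action, tuple(sorted((obs, canonical(sub))
                                 for obs, sub in children.items())))


def group_behaviorally_equivalent(policies: Dict[str, PolicyTree]) -> List[List[str]]:
    """Group model names whose solved policy trees are identical."""
    groups: Dict[tuple, List[str]] = {}
    for name, tree in policies.items():
        groups.setdefault(canonical(tree), []).append(name)
    return sorted(sorted(group) for group in groups.values())


def merge_into_policy_graph(trees: List[PolicyTree]):
    """Merge trees so that identical subtrees share a single graph node.

    Returns (nodes, roots), where nodes maps a node id to
    (action, {observation: child id}) and roots lists one id per input tree.
    """
    ids: Dict[tuple, int] = {}
    nodes: Dict[int, Tuple[str, Dict[str, int]]] = {}

    def intern(tree: PolicyTree) -> int:
        key = canonical(tree)
        if key not in ids:
            action, children = tree
            # Children are interned first, so they receive smaller ids.
            edges = {obs: intern(sub) for obs, sub in children.items()}
            ids[key] = len(ids)
            nodes[ids[key]] = (action, edges)
        return ids[key]

    roots = [intern(tree) for tree in trees]
    return nodes, roots
```

In this sketch, one representative per group would be carried into the next segment as an initial model, which is how the candidate model space stays small. The sketch leaves out solving each segment, which the abstract does not detail.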
Source
《华中科技大学学报(自然科学版)》
EI
CAS
CSCD
PKU Core
2011, No. 10, pp. 64-68 (5 pages)
Journal of Huazhong University of Science and Technology(Natural Science Edition)
Funding
Supported by the National Natural Science Foundation of China (60975052)
Keywords
multi-agent system
agent modeling
dynamic decision-making
interactive dynamic influence diagram (I-DID)
behavioral equivalence
minimal model set