An approximate solution algorithm for interactive dynamic influence diagrams (交互式动态影响图的一种近似求解算法)

Cited by: 3

Approximate solving-solution of interactive dynamic influence diagrams


Abstract: An approximate algorithm is proposed that solves interactive dynamic influence diagrams (I-DIDs) segment by segment, based on the principle of behavioral equivalence. Computation is reduced by decomposing the I-DID model into several segments and compressing the space of the other agents' candidate models. First, the lowest-level I-DID (or DID) model is split into sub-segments, each containing several time slices. The first segment is solved to obtain a policy tree for each model, and the policy trees are merged according to the principle of behavioral equivalence to form a policy graph. This result serves as the initial model for the next segment, which is then solved in turn. The process repeats until the last segment has been solved, yielding the complete policy graph, which is used to decide whether the agent needs to update a model. Finally, experiments and algorithm comparisons on the multi-agent tiger problem verify the effectiveness of the proposed algorithm in terms of solution quality and model-space size.
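The merging step described in the abstract can be illustrated with a minimal sketch. It assumes the simplest possible reading of behavioral equivalence, namely that two candidate models are equivalent when their policy trees are identical; the abstract does not spell out the paper's actual procedure. The function names, the tuple representation of a policy tree, and the tiger-problem labels (`L` for listen, `OL`/`OR` for opening the left/right door, `GL`/`GR` for a growl heard on the left/right) are all hypothetical choices made for illustration.

```python
from typing import Dict, Tuple

# Hypothetical representation: a policy tree is (action, ((observation, subtree), ...)).
# A leaf is an action with no children.
PolicyTree = Tuple[str, Tuple]


def tree(action: str, **children: PolicyTree) -> PolicyTree:
    """Build a hashable policy tree; children are keyed by observation name."""
    return (action, tuple(sorted(children.items())))


def merge_behaviorally_equivalent(trees: Dict[str, PolicyTree]):
    """Merge identical (sub)trees into shared nodes of a policy graph.

    Assumption: "behaviorally equivalent" is approximated by exact equality
    of policy trees. Returns the root node id of each model and the graph as
    {node_id: (action, {observation: child_node_id})}.
    """
    node_ids: Dict[PolicyTree, int] = {}
    graph: Dict[int, Tuple[str, Dict[str, int]]] = {}

    def intern(t: PolicyTree) -> int:
        action, children = t
        # Intern children first so that shared subtrees map to one node.
        edges = {obs: intern(sub) for obs, sub in children}
        if t not in node_ids:
            node_ids[t] = len(node_ids)
            graph[node_ids[t]] = (action, edges)
        return node_ids[t]

    roots = {model: intern(t) for model, t in trees.items()}
    return roots, graph


# Three candidate models of the other agent; m1 and m2 prescribe the same behavior.
models = {
    "m1": tree("L", GL=tree("OR"), GR=tree("OL")),
    "m2": tree("L", GL=tree("OR"), GR=tree("OL")),
    "m3": tree("L", GL=tree("L"), GR=tree("OL")),
}

roots, graph = merge_behaviorally_equivalent(models)
distinct_models = len(set(roots.values()))
```

In this toy setting the three candidate models collapse to two equivalence classes, and the shared `OL` leaf is stored only once. In the segmented scheme the abstract describes, the merged roots would then be the initial models handed to the next segment.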

Authors: Li Bo (李波), Luo Jian (罗键), Zhuang Jinfa (庄进发), Yin Huayi (尹华一)

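Affiliations: Department of Automation, Xiamen University; Postdoctoral Research Workstation, Xiamen Dongnan Rongtong System Engineering Co., Ltd. (厦门东南融通系统工程有限公司); Department of Communication and Information, PLA Information Engineering University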

Source: Journal of Huazhong University of Science and Technology (Natural Science Edition) (《华中科技大学学报（自然科学版）》), 2011, No. 10, pp. 64-68 (5 pages). Indexed in: EI, CAS, CSCD, Peking University Core Journals (北大核心).

Funding: National Natural Science Foundation of China (Grant No. 60975052)

Keywords: multi-agent system; agent modeling; dynamic decision-making; interactive dynamic influence diagrams (I-DID); behavioral equivalence; minimal model set

Classification number: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]

