1. Large sequence models for sequential decision-making: a survey (cited by: 1)
Authors: Muning WEN, Runji LIN, Hanjing WANG, Yaodong YANG, Ying WEN, Luo MAI, Jun WANG, Haifeng ZHANG, Weinan ZHANG. Frontiers of Computer Science (SCIE, EI, CSCD), 2023, Issue 6, pp. 25-42 (18 pages)
Transformer architectures have facilitated the development of large-scale and general-purpose sequence models for prediction tasks in natural language processing and computer vision, e.g., GPT-3 and Swin Transformer. Although originally designed for prediction problems, it is natural to inquire about their suitability for sequential decision-making and reinforcement learning problems, which are typically beset by long-standing issues involving sample efficiency, credit assignment, and partial observability. In recent years, sequence models, especially the Transformer, have attracted increasing interest in the RL communities, spawning numerous approaches with notable effectiveness and generalizability. This survey presents a comprehensive overview of recent works aimed at solving sequential decision-making tasks with sequence models such as the Transformer, by discussing the connection between sequential decision-making and sequence modeling, and categorizing them based on the way they utilize the Transformer. Moreover, this paper puts forth various potential avenues for future research intending to improve the effectiveness of large sequence models for sequential decision-making, encompassing theoretical foundations, network architectures, algorithms, and efficient training systems.
Keywords: sequential decision-making; sequence modeling; the Transformer; training system
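2. Fuzz testing of sequential decision-making models by interrupting inert sequences (original title: 干扰惰性序列的连续决策模型模糊测试)
Authors: 吴泊逾, 王凯锐, 王亚文, 王俊杰. Journal of Software (《软件学报》; PKU Core), 2025, Issue 10, pp. 4645-4659 (15 pages)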
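Applications of artificial intelligence have extended from relatively static tasks such as classification, translation, and question answering to relatively dynamic tasks such as autonomous driving, robot control, and game playing, which can only be completed through a series of "interaction-action" steps with the environment. The core of the models that perform such tasks is a sequential decision-making algorithm. Because these models face greater uncertainty in both the environment and the interaction, and because the tasks often belong to safety-critical systems, testing them is extremely challenging. Existing testing techniques for intelligent algorithm models focus mainly on the reliability of a single model, the generation of diverse test scenarios for complex tasks, and simulation testing. They pay no attention to the "interaction-action" decision sequences of sequential decision-making models, so they either cannot be applied or are not cost-effective. This paper proposes IIFuzzing, a fuzz testing method that intervenes in the execution of inert "interaction-action" decision sequences. Within a fuzz testing framework, it learns the patterns of "interaction-action" decision sequences, predicts the inert sequences that will not trigger failures, and aborts the test execution of such sequences in order to improve testing efficiency. An experimental evaluation under four common test configurations shows that, compared with the latest fuzz testing approach for sequential decision-making models, IIFuzzing detects 16.7%–54.5% more failures in the same amount of time, and that the diversity of the detected failures is also better than that of the baseline methods.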
Keywords: sequential decision-making model; Markov decision process; fuzz testing
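The abstract of the second record describes a fuzzing loop that stops executing a test case once its decision sequence is predicted to be inert. The following is only a minimal sketch of that general idea, not the authors' IIFuzzing implementation: the abstract does not specify the predictor, its features, or when it is queried, so `toy_step`, `sum_is_inert`, `check_every`, and the failure threshold are all hypothetical stand-ins. In particular, the hand-written rule `sum_is_inert` replaces the learned sequence-pattern model.

```python
import random
from typing import Callable, List, Tuple

Action = int
Prefix = Tuple[Action, ...]


def toy_step(state: int, action: Action) -> Tuple[int, bool]:
    """Hypothetical environment: a failure occurs once the state reaches 8."""
    new_state = state + action
    return new_state, new_state >= 8


def never_inert(prefix: Prefix) -> bool:
    """Baseline predictor: never aborts a test case."""
    return False


def sum_is_inert(prefix: Prefix) -> bool:
    """Hand-written stand-in for a learned predictor (an assumption):
    a prefix that has made no net progress is treated as inert."""
    return sum(prefix) <= 0


def run_episode(
    actions: List[Action],
    step: Callable[[int, Action], Tuple[int, bool]],
    is_inert: Callable[[Prefix], bool],
    check_every: int = 5,
) -> Tuple[bool, int]:
    """Execute one test case, aborting early if the prefix is predicted inert.

    Returns (failure_found, steps_executed).
    """
    state = 0
    for i, action in enumerate(actions, start=1):
        state, failed = step(state, action)
        if failed:
            return True, i
        # Every `check_every` steps, ask the predictor whether to abort.
        if i % check_every == 0 and is_inert(tuple(actions[:i])):
            return False, i
    return False, len(actions)


def fuzz(
    budget_steps: int,
    is_inert: Callable[[Prefix], bool],
    rng: random.Random,
    episode_len: int = 20,
) -> int:
    """Generate random test cases until the step budget is spent.

    Returns the number of failures found within the budget.
    """
    failures = 0
    used = 0
    while used < budget_steps:
        actions = [rng.choice((-1, 0, 1)) for _ in range(episode_len)]
        failed, steps = run_episode(actions, toy_step, is_inert)
        used += steps
        failures += int(failed)
    return failures
```

Calling `fuzz(10_000, never_inert, random.Random(0))` and `fuzz(10_000, sum_is_inert, random.Random(0))` runs the two variants under the same step budget. Steps saved by aborting inert test cases are spent on additional test cases, which is the efficiency argument the abstract makes. Whether the toy predictor actually finds more failures depends entirely on these made-up dynamics.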