Abstract
Responses in open-domain dialogue generation are often highly repetitive or meaningless, weakly related to the context, and lacking in empathy. To address these problems, an improved pre-trained model based on DialoGPT was used to build a generation model and a ranking model. DialoGPT served as the backbone network, and emotion classification and sentence classification modules were added to support multi-task training. First, the extracted dialogues with specific emotions were concatenated with special separator tokens and fed into the generation model, which had been pre-trained on large-scale data. By training on the specific dataset and encoding the context, the model can generate fluent, emotionally guided candidate texts over multiple dialogue turns. During generation, the nucleus sampling algorithm was used to improve response diversity. Finally, the ranking model selected the response most relevant to the context as the output. In addition, to further improve the generalization ability of the generation model and the convergence speed of the algorithm, AdamW was used in place of Adam for gradient updates. Experimental results show that the designed generation and ranking model improves on the baseline models to some extent on metrics such as Context and Fluency. The dialogue examples show that the model can sustain back-and-forth multi-turn conversation, and that the generated responses are fluent and highly diverse.
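The generate-then-rank procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the separator string `<|endoftext|>`, the function names, the `top_p` values, and the toy scoring function are all assumptions made for illustration, and a real system would take token probabilities from the generation model and scores from the trained ranking model.

```python
import random

# Assumption: a DialoGPT-style end-of-text token is used as the turn separator.
SEP = "<|endoftext|>"


def build_context(turns, sep=SEP):
    """Concatenate dialogue turns into one input string, ending each turn with the separator."""
    return "".join(turn + sep for turn in turns)


def nucleus_filter(probs, top_p=0.9):
    """Keep the smallest set of most probable tokens whose cumulative
    probability reaches top_p, then renormalize over that set."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, total = [], 0.0
    for token, p in ranked:
        kept.append((token, p))
        total += p
        if total >= top_p:
            break
    return {token: p / total for token, p in kept}


def sample_token(probs, top_p=0.9, rng=random):
    """Draw one token from the renormalized nucleus distribution."""
    filtered = nucleus_filter(probs, top_p)
    tokens = list(filtered)
    weights = [filtered[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


def pick_best(candidates, score):
    """Return the candidate with the highest score (stand-in for the ranking model)."""
    return max(candidates, key=score)


if __name__ == "__main__":
    # Hypothetical next-token distribution; real values come from the generator.
    toy_probs = {"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}
    print(build_context(["I failed my exam.", "I'm sorry to hear that."]))
    print(nucleus_filter(toy_probs, top_p=0.75))
    print(sample_token(toy_probs, top_p=0.75))
    # Toy ranking criterion (length); a trained ranker would score relevance.
    print(pick_best(["ok", "That sounds hard.", "I see"], score=len))
```

Truncating the distribution to the nucleus removes the low-probability tail that tends to produce incoherent tokens, while sampling within it keeps more variety than greedy decoding.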
Authors
王浩然
李国勇
徐传淇
胡智翔
WANG Haoran; LI Guoyong; XU Chuanqi; HU Zhixiang (Chengdu Institute of Computer Application, Chinese Academy of Sciences, Chengdu Sichuan 610041, China; School of Computer and Control, University of Chinese Academy of Sciences, Beijing 100049, China)
Source
《计算机应用》
CSCD
Peking University Core Journal
2021, Issue S02, pp. 66-70 (5 pages in total)
Journal of Computer Applications
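Funding
Sichuan Province Major Science and Technology Special Project (2019ZDZX0005)
Sichuan Province Science and Technology Program Project (2020YFG0009).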