Abstract: GuessWhat?! is a goal-oriented visual dialog task in which the Guesser infers the target object in an image by asking a series of questions, and the Answerer provides answers. The quality of question generation is vital to the task, but existing methods do not account for the redundant objects introduced by Faster RCNN object detection, which leads to meaningless, repetitive questions. To address this, we propose Question Improvement and Redundancy Elimination (QIRE), which enhances question generation by filtering redundant object features. To overcome the problem that Faster RCNN must use a fixed number of objects, which results in poor-quality detections, we design a new module that captures visual representations of a variable number of objects. In addition, we introduce the Target Category Learner (TCL) module to simulate the way humans reason when asking questions, and apply a penalty mechanism to reduce repetition. Experimental results on the GuessWhat?! dataset show that QIRE is competitive with existing methods in question quality and dialog effectiveness.