Abstract
Most current text-to-image generation algorithms based on generative adversarial networks (GANs) focus on designing different attention mechanisms for the generator in order to improve how image details are rendered. They overlook, however, the discriminator's perception of key local semantics, so the generator can produce poor image details that still "fool" the discriminator. This paper proposes a discrimination-enhanced generative adversarial network (DE-GAN), which adds a word-image discriminative attention module to the discriminator. The module strengthens the discriminator's ability to perceive and capture key semantics, which in turn drives the generator to produce high-quality image details. Experimental results show that, on the CUB-Bird dataset, DE-GAN achieves an Inception Score (IS) of 4.70, a 4.2% improvement over the baseline model.
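The abstract reports its result as an IS value, which in text-to-image work conventionally denotes the Inception Score: the exponential of the mean KL divergence between each image's predicted class distribution p(y|x) and the marginal p(y). The following is a minimal sketch of that standard formula only, not the authors' evaluation code. The function name `inception_score` and the toy probability rows are assumptions made for illustration; a real evaluation would take the rows from a pretrained Inception classifier applied to generated images.

```python
import math


def inception_score(probs):
    """Inception Score from per-image class probabilities p(y|x).

    probs: list of rows, each row a probability distribution over classes.
    Returns exp(mean over images of KL(p(y|x) || p(y))), where p(y) is the
    average of all rows.
    """
    n = len(probs)
    num_classes = len(probs[0])
    # Marginal class distribution p(y), averaged over all images.
    marginal = [sum(row[k] for row in probs) / n for k in range(num_classes)]
    # Accumulate the KL divergence of each conditional from the marginal.
    total_kl = 0.0
    for row in probs:
        total_kl += sum(
            p * math.log(p / marginal[k]) for k, p in enumerate(row) if p > 0.0
        )
    return math.exp(total_kl / n)


# Toy inputs (hypothetical, not data from the paper).
confident_and_diverse = [[1.0, 0.0], [0.0, 1.0]]
uninformative = [[0.5, 0.5], [0.5, 0.5]]
print(inception_score(confident_and_diverse))
print(inception_score(uninformative))
```

The score rises when each image is classified confidently and the set of images covers many classes. Its upper bound is the number of classes, and its lower bound is 1, reached when every prediction equals the marginal.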
Authors
TAN Hong-chen (谭红臣)
HUANG Shi-hua (黄世华)
XIAO He-wen (肖贺文)
YU Bing-bing (于冰冰)
LIU Xiu-ping (刘秀平)
(School of Artificial Intelligence and Automation, Beijing University of Technology, Beijing 100124; Department of Computer Science, The Hong Kong Polytechnic University, Hong Kong 999077; School of Mathematical Sciences, Dalian University of Technology, Dalian 116024, China)
Source
Computer Engineering & Science (《计算机工程与科学》)
CSCD
Peking University Core Journals (北大核心)
2022, Issue 5, pp. 855-861 (7 pages)
Funding
National Natural Science Foundation of China (61976040, 62172073)
China Postdoctoral Science Foundation, 70th batch, General Program (2021M700303)
Keywords
text-to-image generation
generative adversarial network
attention mechanism
discrimination model