Abstract
Text-guided editing of real images is highly challenging when the only inputs are an image and a target text prompt. Previous approaches that fine-tune large pre-trained diffusion models often guide image generation with a simple interpolation of source and target text features, which limits their editing capability; fine-tuning large diffusion models is also time-consuming and prone to overfitting. This paper proposes a text-guided image editing method based on a diffusion model with mapping-fusion embedding (MFE-Diffusion). The method consists of two components: (1) a joint learning framework for the large pre-trained diffusion model and the source text feature vectors, which enables the model to quickly learn to reconstruct the given original image; and (2) a feature mapping-fusion module, which deeply fuses the feature information of the target text and the original image to generate a conditional embedding that guides the image editing process. Experiments on the challenging text-guided image editing benchmark TEdBench show that the proposed method has advantages in image editing performance.
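The simple interpolation that the abstract attributes to earlier fine-tuning-based methods can be pictured with a small sketch. This is illustrative only: the function name `interpolate_embeddings`, the flat-list representation of the text features, and the mixing weight `eta` are assumptions made here, not details taken from the paper, and the sketch does not represent the proposed mapping-fusion module.

```python
from typing import List


def interpolate_embeddings(source: List[float],
                           target: List[float],
                           eta: float) -> List[float]:
    """Linearly mix two text feature vectors (hypothetical helper).

    Returns eta * target + (1 - eta) * source, element by element.
    eta = 0 keeps the source features; eta = 1 uses the target features.
    """
    if len(source) != len(target):
        raise ValueError("source and target must have the same length")
    if not 0.0 <= eta <= 1.0:
        raise ValueError("eta must lie in [0, 1]")
    return [eta * t + (1.0 - eta) * s for s, t in zip(source, target)]


if __name__ == "__main__":
    # Toy 2-dimensional features standing in for real text embeddings.
    src = [0.0, 2.0]
    tgt = [1.0, 4.0]
    print(interpolate_embeddings(src, tgt, 0.5))
```

Because a single scalar weight is the only control in such a scheme, the image content itself never enters the conditioning signal, which is the limitation the abstract says the fused conditional embedding is meant to address.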
Authors
吴飞
马永恒
邓哲颖
王银杰
季一木
荆晓远
WU Fei; MA Yongheng; DENG Zheying; WANG Yinjie; JI Yimu; JING Xiaoyuan (College of Artificial Intelligence, Nanjing University of Posts and Telecommunications, Nanjing 210023, China; School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing 210023, China; School of Computer Science, Wuhan University, Wuhan 430072, China)
Source
《数据采集与处理》
PKU Core Journal
2025, Issue 4, pp. 1035-1045 (11 pages)
Journal of Data Acquisition and Processing
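Funding
National Natural Science Foundation of China (62076139)
Open Fund of the National Key Laboratory of Information Systems Engineering (05202305)
1311 Talent Program of Nanjing University of Posts and Telecommunications.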