Funding: the National Basic Research Program (973) of China (No. 2012CB725305) and the National Key Technology R&D Program of China (No. 2012BAH03F02).
Abstract: Dunhuang murals are gems of traditional Chinese art. This paper demonstrates a simple yet powerful method to automatically identify the aesthetic visual style of Dunhuang murals. Drawing on art knowledge about Dunhuang murals, the method explicitly predicts some of the image attributes a human might use to judge the aesthetic visual style of a mural. These cues fall into three broad types: ① composition attributes related to mural layout or configuration; ② color attributes related to the color types depicted; ③ brightness attributes related to lighting conditions. We show that a classifier trained on these attributes provides an efficient way to predict the aesthetic visual style of Dunhuang murals.
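The attribute-then-classify pipeline described above can be sketched in a few lines. This is a minimal pure-Python illustration, not the paper's method: the attribute definitions (mean brightness, a colorfulness proxy, a crude left/right composition cue) and the nearest-centroid classifier are assumptions chosen only to make the two-stage idea concrete.

```python
# Sketch of "predict interpretable attributes, then classify style from them".
# Attribute definitions and the classifier are illustrative assumptions,
# not the feature set used in the paper.

def extract_attributes(pixels):
    """Map an image (a flat list of (r, g, b) tuples) to a small attribute
    vector: mean brightness, a colorfulness proxy, and a left/right
    balance cue standing in for a composition attribute."""
    n = len(pixels)
    brightness = sum((r + g + b) / 3 for r, g, b in pixels) / n
    colorfulness = sum(max(r, g, b) - min(r, g, b) for r, g, b in pixels) / n
    left = sum(sum(p) for p in pixels[: n // 2])
    right = sum(sum(p) for p in pixels[n // 2:])
    balance = 1.0 - abs(left - right) / max(left + right, 1)
    return [brightness, colorfulness, balance]

def nearest_centroid(train, labels, query):
    """Tiny stand-in for the classifier trained on attribute vectors:
    assign the query to the label whose mean attribute vector is closest."""
    groups = {}
    for vec, lab in zip(train, labels):
        groups.setdefault(lab, []).append(vec)
    means = {lab: [sum(col) / len(col) for col in zip(*vecs)]
             for lab, vecs in groups.items()}
    dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b))
    return min(means, key=lambda lab: dist(means[lab], query))
```

In use, each mural image would be reduced to its attribute vector once, and the classifier would operate only on those low-dimensional, human-interpretable cues.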
Funding: the National Key Research and Development Program of China under Grant No. 2020AAA0106200; the National Natural Science Foundation of China under Grant Nos. 62102162, 61832016, U20B2070, and 62100709; the CASIA-Tencent Youtu Joint Research Project; and the Open Projects Program of the National Laboratory of Pattern Recognition.
Abstract: Vision Transformers have shown impressive performance on image classification tasks. Observing that most existing visual style transfer (VST) algorithms are based on texture-biased convolutional neural networks (CNNs), this raises the question of whether the shape-biased Vision Transformer can perform style transfer as CNNs do. In this work, we compare and analyze the shape bias of CNN- and transformer-based models from the viewpoint of VST tasks. For comprehensive comparison, we propose three kinds of transformer-based visual style transfer (Tr-VST) methods: Tr-NST for optimization-based VST, Tr-WCT for reconstruction-based VST, and Tr-AdaIN for perceptual-based VST. By engaging three mainstream VST methods in the transformer pipeline, we show that transformer-based models pre-trained on ImageNet are not well suited to style transfer. Due to their strong shape bias, these Tr-VST methods cannot render style patterns. We further analyze the shape bias by considering the influence of the learned parameters and the structural design. The results show that, with proper style supervision, the transformer can learn texture-biased features similar to those of a CNN. With the reduced shape bias in the transformer encoder, Tr-VST methods generate higher-quality results than state-of-the-art VST methods.
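The AdaIN operation underlying the Tr-AdaIN variant can be illustrated per channel: content features are re-normalized to take on the style features' mean and standard deviation. A minimal sketch on plain lists, assuming one list per feature channel; a real implementation would apply this to deep feature tensors inside the encoder-decoder pipeline.

```python
# Per-channel adaptive instance normalization (AdaIN): shift and scale
# the content features so their statistics match the style features.
from statistics import mean, pstdev

def adain(content_channel, style_channel, eps=1e-5):
    """Return content features re-normalized to the style channel's
    mean and standard deviation (the core AdaIN transform)."""
    mu_c, sd_c = mean(content_channel), pstdev(content_channel)
    mu_s, sd_s = mean(style_channel), pstdev(style_channel)
    return [sd_s * (x - mu_c) / (sd_c + eps) + mu_s for x in content_channel]
```

Because the transform only matches first- and second-order statistics, whatever texture information the encoder's features carry determines how much style actually transfers, which is why the abstract's shape-vs-texture bias matters here.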
Abstract: Graphical representation of hierarchical clustering results is of vital importance in hierarchical cluster analysis of data. Unfortunately, most mathematical and statistical software is weak at showcasing such clustering results; in particular, most clustering results or trees cannot be drawn as a dendrogram in a resizable, rescalable, free-style fashion. By using "dynamic" rather than "static" drawing, this research works around these weaknesses, which otherwise restrict visualization of clustering results in an arbitrary manner. It introduces an algorithmic solution that adopts seamless pixel rearrangements to resize and rescale dendrograms or tree diagrams. The results showed that the developed algorithm makes representing clustering outcomes a truly free visualization for hierarchical clustering and bioinformatics analysis. In particular, it can selectively visualize and/or save results in a specific size, scale, and style (different views).
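One way to make a dendrogram freely resizable is to keep its layout in abstract coordinates and map to pixels only at draw time, so a resize is a pure coordinate transform rather than a redraw. The sketch below is an assumption about how such a "dynamic" drawing could be organized, not the paper's pixel-rearrangement algorithm: a tree is a nested tuple, `layout` produces line segments, and `rescale` resizes them.

```python
# Toy resizable dendrogram: layout in abstract coordinates, then rescale.
# The tree encoding and functions are illustrative, not the paper's algorithm.

def layout(tree):
    """Compute (x, y) line segments for a dendrogram. A tree is either a
    leaf label (str) or a tuple (left, right, merge_height)."""
    segments = []
    counter = [0]                      # next free leaf slot on the baseline
    def place(node):
        if isinstance(node, str):      # leaves sit at y = 0
            x = counter[0]
            counter[0] += 1
            return x, 0.0
        lx, lh = place(node[0])
        rx, rh = place(node[1])
        h = node[2]
        segments.append(((lx, lh), (lx, h)))  # left riser
        segments.append(((rx, rh), (rx, h)))  # right riser
        segments.append(((lx, h), (rx, h)))   # horizontal merge bar
        return (lx + rx) / 2, h
    place(tree)
    return segments

def rescale(segments, sx, sy):
    """Resize the drawing by remapping coordinates; no layout is recomputed,
    which is what makes arbitrary resizing cheap."""
    return [((x1 * sx, y1 * sy), (x2 * sx, y2 * sy))
            for (x1, y1), (x2, y2) in segments]
```

Saving the same tree at several sizes or scales then amounts to calling `rescale` with different factors before rendering the segments.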
Abstract: The goal of Chinese landscape painting style transfer is to introduce the characteristics of traditional Chinese painting into real landscape photographs while preserving their original scene content, thereby generating images with the artistic features of Chinese landscape painting. In recent years, driven by the rapid development of deep learning, convolutional neural networks (CNNs) and generative adversarial networks (GANs) have come to dominate most image generation tasks, including style transfer, but they suffer from several problems: real scenes tend to lose semantics during style transfer, GAN training is prone to mode collapse, and CNN-based style transfer methods exhibit checkerboard artifacts. Vision Transformer models offer a new solution for image processing tasks, but they require large amounts of training data and are computationally expensive. To address the low image quality and loss of detail features caused by the above factors when generating Chinese paintings, this paper proposes a Chinese landscape painting style transfer network based on detail feature extraction and fusion, named SSTR (swin style transfer transformer). Building on the StyTr^(2) network, SSTR introduces the Swin-Transformer model: the strong semantic capability of the Vision Transformer preserves the features of the landscape scene, while the hierarchical architecture and shifted-window attention of the Swin-Transformer extract more of the artistic style details of landscape painting and reduce training complexity; finally, a CNN decoder is introduced to refine the generated target image. The network is trained, validated, and tested on the public COCO 2014 vision dataset and a public landscape painting dataset, and the results are compared with baseline methods. The results show that, on the Chinese landscape painting style transfer task, SSTR achieves a style loss of 1.35 and a content loss of 1.88, outperforming StyTr^(2) in style loss and demonstrating excellent feature extraction and image generation capabilities.
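The Swin-Transformer mechanics the abstract leans on, non-overlapping window partitioning plus a cyclic shift between layers, can be shown on a plain 2D grid. This is a minimal sketch of those two operations on lists of lists; the real model applies them to feature maps with learned attention inside each window.

```python
# Swin-style windowing on a toy H x W grid (list of lists of values).
# Illustrative only: the actual model windows feature tensors, not raw values.

def window_partition(grid, w):
    """Split an H x W grid into non-overlapping w x w windows (row-major).
    Attention computed per window is what keeps cost linear in image size."""
    H, W = len(grid), len(grid[0])
    return [[row[j:j + w] for row in grid[i:i + w]]
            for i in range(0, H, w) for j in range(0, W, w)]

def cyclic_shift(grid, s):
    """Roll the grid by s along both axes -- the 'shifted window' trick that
    lets information cross window borders in the next layer."""
    shifted = [row[-s:] + row[:-s] for row in grid]
    return shifted[-s:] + shifted[:-s]
```

Alternating plain and shifted windows is what gives the hierarchical model cross-window connections without ever paying for global attention, which is the training-cost reduction the abstract attributes to SSTR's Swin backbone.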
Abstract: Audio-visual style has a complex multi-functionality, and an important part of this concerns how the character's body is visualized and how body language is implemented in the moving image. With a number of examples from contemporary film and television, the article lines up key issues of body language in the moving image. It describes two important aspects of body language in visual media: how visual style mediates the bodily expressions of fictional characters and of real persons in television news, and how aspects of visual style always represent bodily presence in the moving image.