Funding: Supported by the National Natural Science Foundation of China under Grant No. 62502040 and the ZTE Industry-University-Institute Cooperation Funds under Grant No. IA20230700001.
Abstract: We propose the Mixed-Precision Multibranch Network (M+MNet) to compensate for the neglect of background information in image aesthetics assessment (IAA) while providing strategies for overcoming the dilemma between training costs and performance. First, two exponentially weighted pooling methods are used to selectively boost the extraction of background and salient information during downsampling. Second, we propose Corner Grid, an unsupervised data augmentation method that leverages the diffusive characteristics of convolution to force the network to seek more relevant background information. Third, we perform mixed-precision training by switching the precision format, thus significantly reducing the time and memory consumption of data representation and transmission. Most of our methods, although designed specifically for IAA tasks, have demonstrated generalizability to other IAA works. For performance verification, we develop a large-scale benchmark (the most comprehensive thus far) by comparing 17 methods with M+MNet on two representative datasets: the Aesthetic Visual Analysis (AVA) dataset and the FLICKR-Aesthetic Evaluation Subset (FLICKR-AES). M+MNet achieves state-of-the-art performance on all tasks.
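The abstract does not give the formula for its two exponentially weighted pooling methods, so the following is only a minimal sketch of one common way to weight a pooling window exponentially, not the authors' formulation. The function names `exp_weighted_pool` and `pool2d` and the sharpness parameter `beta` are assumptions introduced for illustration. A positive `beta` pulls the pooled value toward the strongest activations (salient content), a negative `beta` pulls it toward the weakest ones (a rough stand-in for background), and `beta = 0` reduces to average pooling.

```python
import math


def exp_weighted_pool(window, beta):
    """Pool one window with weights proportional to exp(beta * value).

    `beta` is a hypothetical sharpness parameter, not taken from the paper.
    """
    # Subtract the largest exponent for numerical stability.
    shift = max(beta * v for v in window)
    weights = [math.exp(beta * v - shift) for v in window]
    total = sum(weights)
    return sum(w * v for w, v in zip(weights, window)) / total


def pool2d(feature_map, beta, size=2):
    """Downsample a 2D list using non-overlapping size x size windows."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - size + 1, size):
        pooled_row = []
        for c in range(0, cols - size + 1, size):
            window = [
                feature_map[r + i][c + j]
                for i in range(size)
                for j in range(size)
            ]
            pooled_row.append(exp_weighted_pool(window, beta))
        pooled.append(pooled_row)
    return pooled


# Toy feature map: one strong activation in each 2 x 2 window.
toy_map = [
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 4.0, 1.0, 3.0],
]
salient_branch = pool2d(toy_map, beta=5.0)      # favors high activations
background_branch = pool2d(toy_map, beta=-5.0)  # favors low activations
average_branch = pool2d(toy_map, beta=0.0)      # plain average pooling
```

Running the two signed variants side by side mirrors the idea of separate branches that emphasize salient and background information during downsampling, although the actual branch design of M+MNet is not described in the abstract.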
Funding: Supported by the National Natural Science Foundation of China (61701036).
Abstract: This paper presents a method for inferring global depth ordering in monocular images. First, a distance metric combining color, compactness, entropy, and edge features is defined to estimate the difference between pixels and seeds, which helps the superpixels follow object contours more accurately. To infer local depth relationships correctly, a weighting descriptor is designed that combines edge, T-junction, and saliency features, avoiding the wrong local inferences that a single feature can cause. Based on the weighting descriptor, a global inference strategy is presented that not only improves the performance of global depth ordering but also correctly infers the depth relationships between two non-adjacent regions. Simulation results on the BSDS500, Cornell, and NYU 2 datasets demonstrate the effectiveness of the approach.
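The step from local relations to a global ordering can be illustrated with a small sketch. It is not the paper's algorithm: the abstract does not specify how the weighting descriptor or the global inference strategy is computed, so the cue names, the weights, the sign convention (positive means the first region is in front), and the use of a topological sort are all assumptions made here for illustration.

```python
from collections import defaultdict, deque


def local_relation(cues, weights):
    """Weighted vote over cues; positive means region a is in front of b."""
    return sum(weights[name] * value for name, value in cues.items())


def global_depth_order(regions, pairwise_cues, weights):
    """Return regions sorted front to back from adjacent-pair evidence."""
    behind = defaultdict(set)  # behind[a] = regions directly behind a
    indegree = {region: 0 for region in regions}
    for (a, b), cues in pairwise_cues.items():
        score = local_relation(cues, weights)
        if score == 0:
            continue  # no usable evidence for this pair
        front, back = (a, b) if score > 0 else (b, a)
        if back not in behind[front]:
            behind[front].add(back)
            indegree[back] += 1

    # Kahn's topological sort: regions with nothing in front come first.
    queue = deque(region for region in regions if indegree[region] == 0)
    order = []
    while queue:
        region = queue.popleft()
        order.append(region)
        for nxt in sorted(behind[region]):
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                queue.append(nxt)
    if len(order) != len(regions):
        raise ValueError("contradictory local relations (cycle)")
    return order


# Hypothetical weights and cue values, chosen only for this example.
weights = {"edge": 0.3, "t_junction": 0.5, "saliency": 0.2}
pairwise_cues = {
    ("person", "tree"): {"edge": 0.6, "t_junction": 1.0, "saliency": 0.8},
    # Saliency alone would put the sky in front; the combined vote does not.
    ("sky", "tree"): {"edge": -0.2, "t_junction": -1.0, "saliency": 0.3},
}
order = global_depth_order(["sky", "tree", "person"], pairwise_cues, weights)
```

In this toy scene `person` and `sky` share no boundary, yet the ordering places them relative to each other through `tree`, which is the kind of non-adjacent inference the abstract refers to. The `sky`/`tree` pair also shows how a combined vote can override a single misleading cue.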