Abstract
Images and speech have become common data types in daily life and scientific research, and image clustering is one of the important tasks in data mining and image processing. Deep clustering methods based on auto-encoders have limited representation ability, and they generate features and assign clusters in separate steps. To address both problems, a deep Softmax clustering algorithm based on a novel convolutional auto-encoder (ASCAE-Softmax) is proposed. First, an asymmetric convolutional auto-encoder network structure (ASCAE) is designed; optimizing the convolution operations and adding fully connected layers makes the whole network asymmetric. Next, a Softmax clusterer maps the features to a clustering probability distribution, and an auxiliary target probability distribution is constructed so that feature learning and cluster assignment are carried out jointly. Iteratively minimizing the KL (Kullback-Leibler) divergence loss then yields a clear cluster partition. Experimental results show that the method learns feature representations in which samples of the same cluster are more compact and samples of different clusters are more dispersed, and that its clustering results are better than those of classical deep clustering algorithms.
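The clustering step described in the abstract (Softmax probabilities, an auxiliary target distribution, and a KL divergence loss) can be illustrated with a minimal sketch. The abstract does not state how the target distribution is built, so the sketch assumes the sharpening rule commonly used in deep embedded clustering (square each probability and normalize by cluster frequency); the function names and the toy scores are hypothetical, and the auto-encoder that would produce the scores is omitted.

```python
import math

def softmax(logits):
    """Row-wise softmax: turn each sample's cluster scores into probabilities."""
    out = []
    for row in logits:
        m = max(row)  # subtract the row maximum for numerical stability
        exps = [math.exp(v - m) for v in row]
        s = sum(exps)
        out.append([e / s for e in exps])
    return out

def target_distribution(q):
    """Assumed auxiliary target (not taken from the abstract):
    p_ij is proportional to q_ij^2 / f_j, where f_j = sum_i q_ij."""
    n_clusters = len(q[0])
    freq = [sum(row[j] for row in q) for j in range(n_clusters)]
    p = []
    for row in q:
        w = [row[j] ** 2 / freq[j] for j in range(n_clusters)]
        s = sum(w)
        p.append([v / s for v in w])
    return p

def kl_divergence(p, q, eps=1e-12):
    """KL(P || Q) summed over all samples and clusters."""
    return sum(
        pij * math.log((pij + eps) / (qij + eps))
        for prow, qrow in zip(p, q)
        for pij, qij in zip(prow, qrow)
    )

# Hypothetical cluster scores for three samples and three clusters.
logits = [[2.0, 0.5, 0.1], [0.2, 1.5, 0.3], [0.1, 0.4, 1.8]]
q = softmax(logits)          # soft cluster assignments
p = target_distribution(q)   # sharpened auxiliary targets
loss = kl_divergence(p, q)   # the quantity to minimize iteratively
```

In a full training loop the loss would be minimized with respect to the network parameters that produce `logits`, with `p` recomputed periodically; under the assumed rule, `p` places more mass on each sample's most likely cluster than `q` does, which is what pushes the assignments toward a clearer partition.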
Authors
Chen Junfen (陈俊芬)
Zhao Jiacheng (赵佳成)
Han Jie (韩洁)
Zhai Junhai (翟俊海)
Affiliation: Hebei Key Laboratory of Machine Learning and Computational Intelligence, College of Mathematics and Information Science, Hebei University, Baoding 071002, China
Source
Journal of Nanjing University (Natural Science) (《南京大学学报(自然科学版)》)
Indexed in: CAS, CSCD, Peking University Core Journals (北大核心)
2020, No. 4, pp. 533-540 (8 pages)
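Funding
Key Science and Technology Research and Development Project of Hebei Province (19210310D)
Research Start-up Fund for High-Level Innovative Talents of Hebei University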
Keywords
unsupervised learning
feature representation
convolutional auto-encoder
image clustering
Softmax classifier