Funding: The National Natural Science Foundation of China (No. 61105048, 60972165); the Doctoral Fund of Ministry of Education of China (No. 20110092120034); the Natural Science Foundation of Jiangsu Province (No. BK2010240); the Technology Foundation for Selected Overseas Chinese Scholar, Ministry of Human Resources and Social Security of China (No. 6722000008); the Open Fund of Jiangsu Province Key Laboratory for Remote Measuring and Control (No. YCCK201005).
Abstract: An improved Gaussian mixture model (GMM)-based clustering method is proposed for the difficult case where the true data distribution deviates from the assumed GMM. First, an improved model selection criterion, the completed likelihood minimum message length criterion, is derived; it measures both the goodness-of-fit of the candidate GMM to the data and the goodness-of-partition of the data. Second, using the proposed criterion as the clustering objective function, an improved expectation-maximization (EM) algorithm is developed that avoids the poor local optima the standard EM algorithm can fall into when estimating the model parameters. Experimental results demonstrate that the proposed method rectifies the over-fitting tendency of representative GMM-based clustering approaches and robustly provides more accurate clustering results.
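The abstract does not give the form of the completed likelihood minimum message length criterion, so the following is only a minimal sketch of the general workflow it describes: fit candidate GMMs with EM over a range of component counts, score each with a model selection criterion, and keep the best-scoring model. BIC is used here purely as a stand-in criterion, and the function name `select_gmm`, the search range, and the multiple-restart setting are illustrative assumptions, not the paper's method.

```python
# Sketch: criterion-driven GMM model selection (BIC as a stand-in criterion).
import numpy as np
from sklearn.mixture import GaussianMixture

def select_gmm(X, k_max=10, n_init=5, random_state=0):
    """Fit GMMs with 1..k_max components via EM and keep the best-scoring one."""
    best_model, best_score = None, np.inf
    for k in range(1, k_max + 1):
        gmm = GaussianMixture(n_components=k, covariance_type="full",
                              n_init=n_init, random_state=random_state)
        gmm.fit(X)
        score = gmm.bic(X)  # stand-in for the completed likelihood MML criterion
        if score < best_score:
            best_model, best_score = gmm, score
    return best_model, best_score

# Usage on synthetic data drawn from three Gaussian components.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=m, scale=0.5, size=(200, 2)) for m in (-3.0, 0.0, 3.0)])
model, score = select_gmm(X)
labels = model.predict(X)
print("selected components:", model.n_components)
```

Multiple EM restarts per candidate (`n_init`) are used here as a generic guard against poor local optima; the paper instead addresses this with an improved EM algorithm driven by its own criterion.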
Funding: The work described in this paper was supported by a grant from the General Research Fund (GRF) of the Research Grants Council of Hong Kong SAR (Project No. CUHK418011E).
Abstract: Three Bayesian-related approaches, namely variational Bayesian (VB), minimum message length (MML) and Bayesian Ying-Yang (BYY) harmony learning, have been applied to automatically determining an appropriate number of components when learning a Gaussian mixture model (GMM). This paper provides a comparative investigation of these approaches under both a Jeffreys prior and a conjugate Dirichlet-Normal-Wishart (DNW) prior on the GMM. In addition to adopting the existing algorithms either directly or with some modifications, the algorithm for VB with the Jeffreys prior and the algorithm for BYY with the DNW prior are developed in this paper to fill this gap. The performance of automatic model selection is evaluated through extensive experiments, with several empirical findings: 1) When priors are placed merely on the mixing weights, each of the three approaches makes biased mistakes, whereas placing priors on all the GMM parameters reduces the bias of each approach and improves its performance. 2) When the Jeffreys prior is replaced by the DNW prior, all three approaches improve their performance; moreover, the Jeffreys prior makes MML slightly better than VB, while the DNW prior makes VB better than MML. 3) When the hyperparameters of the DNW prior are further optimized by each approach's own learning principle, BYY improves its performance, while VB and MML deteriorate when there are too many free hyperparameters; in fact, VB and MML lack a good guide for optimizing the hyperparameters of the DNW prior. 4) BYY considerably outperforms both VB and MML for either type of prior, whether or not the hyperparameters are optimized. Unlike VB and MML, which rely on appropriate priors to perform model selection, BYY does not depend heavily on the type of prior: it has model selection ability even without priors, already performs very well with the Jeffreys prior, and improves incrementally as the Jeffreys prior is replaced by the DNW prior. Finally, all the algorithms are applied to the Berkeley segmentation database of real-world images. Again, BYY considerably outperforms both VB and MML, especially in detecting the objects of interest against a confusing background.
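As a rough illustration of the VB approach compared in this paper (the MML and BYY algorithms are not available in standard libraries and are not sketched here), the snippet below fits a variational Bayesian GMM with deliberately too many components and lets the Dirichlet prior on the mixing weights suppress the redundant ones. The prior settings, the 1e-2 weight threshold for counting "active" components, and the synthetic data are illustrative assumptions, not the experimental setup of the paper.

```python
# Sketch: automatic component-number selection with a variational Bayesian GMM.
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.default_rng(0)
# Synthetic data: three well-separated 2-D Gaussian clusters.
X = np.vstack([rng.normal(loc=c, scale=0.4, size=(300, 2))
               for c in ((-4.0, 0.0), (0.0, 3.0), (4.0, -1.0))])

# Start with more components than needed; VB inference under the Dirichlet
# prior on the mixing weights drives redundant components toward zero weight.
vb_gmm = BayesianGaussianMixture(
    n_components=10,
    weight_concentration_prior_type="dirichlet_distribution",
    weight_concentration_prior=1e-2,  # small value favors fewer active components
    covariance_type="full",
    max_iter=500,
    random_state=0,
).fit(X)

# Count components that retain non-negligible weight after fitting.
active = np.sum(vb_gmm.weights_ > 1e-2)
print("active components:", active)
print("weights:", np.round(vb_gmm.weights_, 3))
```

This only places a prior on the mixing weights and the component parameters in the generic scikit-learn parameterization; it does not reproduce the Jeffreys or DNW priors, the hyperparameter optimization, or the BYY harmony learning discussed in the abstract.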