Abstract
The outlier detection method based on Particle Swarm Optimization (PSO) recently proposed by Mohemmed et al. (MOHEMMED A, ZHANG M, BROWNE W. Particle swarm optimisation for outlier detection [C]// GECCO'10: Proceedings of the 12th Annual Conference on Genetic and Evolutionary Computation. Portland, Oregon: ACM, 2010: 83-84) can produce an unreasonable result: the fitness of a particle does not necessarily match the outlying degree of the corresponding data object. This paper analyzes the cause of that mismatch and proposes an improved fitness function. The new fitness function adjusts the penalty imposed on unreasonable neighborhood radius estimates, which narrows the deviation between a particle's fitness and the outlying degree of its object. The algorithm searches the solution space for an approximately optimal particle to determine a suitable neighborhood radius estimate, and then uses that radius to measure the outlying degree of each data object. Experiments on several UCI datasets show that the outlier detection algorithm with the new fitness function outperforms both the original method and the LOF method. The proposed algorithm resolves the mismatch described above and also detects outliers more effectively, which indicates that a reasonably defined fitness function helps improve detection quality.
Source
《计算机应用》
CSCD
Peking University Core Journals (北大核心)
2012, Issue A01, pp. 139-143 (5 pages)
Journal of Computer Applications
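Funding
Natural Science Foundation of Fujian Province (2010J01329)
Major Industry-University-Research Project for Universities of Fujian Province (2010H6012)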
Keywords
data mining
outlier detection
Particle Swarm Optimization (PSO)
outlying degree
fitness function