
Outlier Detection Method Based on Improved Distance

Cited by: 12
Abstract The local tangent space alignment (LTSA) algorithm is an effective manifold-learning method, but it is highly sensitive to outliers. To make LTSA more robust to outliers, this paper proposes an outlier detection method based on an improved distance. The method uses the improved distance to measure the distance between sample points, which reduces the influence of a nonuniform sample distribution on outlier detection. Experimental results show that this data preprocessing method effectively improves the robustness of LTSA, better uncovers the intrinsic characteristics of the dataset, and gives better data visualization.
Source Journal of South China University of Technology (Natural Science Edition), 2008, No. 9, pp. 25-30 (6 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.
Funding Supported by the Natural Science Foundation of Guangdong Province (07006474)
Keywords data preprocessing; outlier detection; improved distance; manifold learning; local tangent space alignment
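
The abstract does not define the improved distance, so it cannot be reproduced here. As a rough illustration of the preprocessing idea only, the sketch below scores each sample by its mean Euclidean distance to its k nearest neighbours and drops samples whose score is unusually large. Plain Euclidean distance is a stand-in for the paper's improved distance, and the function names, `k`, and the `n_sigma` threshold are all assumptions made for this example.

```python
import numpy as np


def knn_distance_scores(X, k=5):
    """Mean Euclidean distance from each sample to its k nearest neighbours.

    Plain Euclidean distance is an assumption; it is NOT the paper's
    improved distance, which the abstract does not specify.
    """
    X = np.asarray(X, dtype=float)
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a sample is not its own neighbour
    nearest = np.sort(dist, axis=1)[:, :k]
    return nearest.mean(axis=1)


def remove_outliers(X, k=5, n_sigma=3.0):
    """Drop samples whose score exceeds mean + n_sigma * std (hypothetical rule)."""
    X = np.asarray(X, dtype=float)
    scores = knn_distance_scores(X, k)
    threshold = scores.mean() + n_sigma * scores.std()
    keep = scores <= threshold
    return X[keep], keep


# Toy data: 200 points on a 2-D spiral plus two far-away points.
rng = np.random.default_rng(0)
t = rng.uniform(0.0, 3.0 * np.pi, 200)
curve = np.column_stack([t * np.cos(t), t * np.sin(t)])
far_points = np.array([[40.0, 40.0], [-45.0, 30.0]])
X = np.vstack([curve, far_points])

X_clean, keep = remove_outliers(X, k=5)
```

In a pipeline of the kind the abstract describes, the filtered array `X_clean` would then be passed to an LTSA implementation, for example scikit-learn's `LocallyLinearEmbedding` with `method="ltsa"`. Because a global threshold on Euclidean k-NN distance is sensitive to uneven sampling density, which is the weakness the paper's improved distance is meant to address, this sketch should be read as a baseline rather than as the proposed method.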

