LLE dimensionality reduction based on kernel-induced distance measurement and its application in clustering with outliers (cited by: 5)
Abstract Locally linear embedding (LLE) is a manifold method for dimensionality reduction. In high-dimensional sparse data spaces, standard LLE is poorly suited to sparse sampling and depends on the Euclidean distance formula. To address these shortcomings, this paper studies an extension of the algorithm that introduces a kernel function: the samples are mapped into a high-dimensional feature space and classified there. The kernel mapping improves the spatial distribution of the samples, so that nonlinear large-scale data become linear in the feature space, and a nonlinear data transformation then reduces the data dimension. With an appropriate choice of the number K of nearest neighbors, the improved LLE gives good results. For the low-dimensional dataset recovered from the high-dimensional samples, the outlier data hypothesis proposed in the paper is combined with its clustering-with-outliers method to decide whether each low-dimensional point is an outlier. Simulation experiments show that the method effectively finds outliers in high-dimensional datasets. The algorithm also has the advantages of simple parameter estimation and low parameter sensitivity, and it offers a new machine-learning approach to the outlier detection problem.
Source Chinese Journal of Scientific Instrument (《仪器仪表学报》; indexed in EI, CAS, CSCD, Peking University Core), 2008, No. 9, pp. 1996-2000 (5 pages)
Keywords kernel function; dimensionality reduction; nonlinear datasets; outliers; clustering
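
The abstract describes replacing the Euclidean distance in LLE with a distance induced by a kernel function. The sketch below is one possible reading of that idea and not the authors' implementation: it assumes an RBF kernel (the record does not say which kernel the paper uses), measures squared feature-space distance as K(x,x) - 2K(x,y) + K(y,y), and computes the reconstruction weights from kernel values as well. The function names and the parameters gamma and reg are hypothetical. The outlier hypothesis and the clustering step are not specified in this record and are left out.

```python
import numpy as np

def rbf_kernel(X, gamma=1.0):
    """Gram matrix of an RBF kernel (an assumed choice of kernel)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def kernel_distance_sq(K):
    """Squared feature-space distance: K(x,x) - 2K(x,y) + K(y,y)."""
    diag = np.diag(K)
    return np.maximum(diag[:, None] + diag[None, :] - 2.0 * K, 0.0)

def kernel_lle(X, n_neighbors=5, n_components=2, gamma=1.0, reg=1e-3):
    """LLE with neighbors and weights computed from kernel values."""
    n = X.shape[0]
    K = rbf_kernel(X, gamma)
    D2 = kernel_distance_sq(K)
    W = np.zeros((n, n))
    for i in range(n):
        # Nearest neighbors under the kernel-induced distance, excluding i.
        order = np.argsort(D2[i])
        nbrs = order[order != i][:n_neighbors]
        # Local Gram matrix: G[j,k] = <phi(x_i)-phi(x_j), phi(x_i)-phi(x_k)>.
        k_i = K[i, nbrs]
        G = K[i, i] - k_i[:, None] - k_i[None, :] + K[np.ix_(nbrs, nbrs)]
        # Regularize so the linear system stays well conditioned.
        G = G + reg * max(np.trace(G), 1e-12) * np.eye(n_neighbors)
        w = np.linalg.solve(G, np.ones(n_neighbors))
        W[i, nbrs] = w / w.sum()  # weights sum to one
    # Embedding: bottom eigenvectors of (I - W)^T (I - W), skipping the first.
    I_minus_W = np.eye(n) - W
    M = I_minus_W.T @ I_minus_W
    _, vecs = np.linalg.eigh(M)
    return vecs[:, 1:n_components + 1]
```

Calling kernel_lle(X, n_neighbors=6) on an (n_samples, n_features) array returns an (n_samples, 2) embedding. As the abstract notes, the number of neighbors K has to be chosen appropriately, and with an RBF kernel the width gamma needs tuning as well.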