An Efficient Self-organizing Based Data Reduction Algorithm (cited by: 1)
Abstract: Data reduction techniques produce a reduced representation of a data set that is much smaller in volume, so that mining the reduced set is more efficient yet yields the same, or almost the same, analytical results. This paper proposes Sodra, a self-organizing data reduction algorithm that iterates between unsupervised and supervised learning to generate a reduced data set suited to classification. Classification experiments on real and artificial data sets show that the reduced set generated by Sodra achieves the same generalization accuracy as the original while requiring much less storage. Applying Relif-P, a feature relevance analysis algorithm, to the reduced data further increases the tolerance for irrelevant features.
Source: Computer Engineering and Applications (《计算机工程与应用》), CSCD, Peking University Core Journal, 2003, No. 24, pp. 189-192 (4 pages)
Keywords: data reduction, data mining, self-organizing learning, supervised learning, Sodra algorithm
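
The abstract does not give Sodra's update rules, so the following is only a minimal sketch of the general idea it describes, not the authors' algorithm: an unsupervised competitive-learning pass places a few prototypes, a supervised pass labels each prototype by majority vote, and the labeled prototypes then stand in for the original data in nearest-neighbor classification. The names `reduce_dataset`, `classify`, `n_protos`, `epochs`, and `lr`, and the toy data, are all assumptions made for illustration.

```python
import random
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(protos, x):
    """Index of the prototype closest to x."""
    return min(range(len(protos)), key=lambda i: sq_dist(protos[i], x))


def reduce_dataset(X, y, n_protos=4, epochs=20, lr=0.3, seed=0):
    """Return a small list of (prototype, label) pairs summarizing (X, y).

    Illustrative only: winner-take-all competitive learning followed by
    majority-vote labeling. This is NOT the published Sodra procedure.
    """
    rng = random.Random(seed)
    protos = [list(p) for p in rng.sample(X, n_protos)]
    order = list(range(len(X)))

    # Unsupervised phase: move the winning prototype toward each sample,
    # with a learning rate that decays linearly over the epochs.
    for epoch in range(epochs):
        rate = lr * (1 - epoch / epochs)
        rng.shuffle(order)
        for i in order:
            w = nearest(protos, X[i])
            protos[w] = [p + rate * (v - p) for p, v in zip(protos[w], X[i])]

    # Supervised phase: label each prototype by the majority class of the
    # samples it wins; prototypes that win no samples are dropped.
    votes = [Counter() for _ in protos]
    for xi, yi in zip(X, y):
        votes[nearest(protos, xi)][yi] += 1
    return [(p, v.most_common(1)[0][0]) for p, v in zip(protos, votes) if v]


def classify(reduced, x):
    """1-nearest-neighbor classification against the reduced set."""
    return min(reduced, key=lambda pl: sq_dist(pl[0], x))[1]


# Toy data (assumed): two well-separated clusters of 20 points each.
cluster_a = [(i * 0.25, j * 0.25) for i in range(5) for j in range(4)]
cluster_b = [(px + 10.0, py + 10.0) for px, py in cluster_a]
X = cluster_a + cluster_b
y = ["a"] * len(cluster_a) + ["b"] * len(cluster_b)

reduced = reduce_dataset(X, y)
print(len(X), "samples reduced to", len(reduced), "prototypes")
print(classify(reduced, (0.2, 0.3)), classify(reduced, (10.4, 10.1)))
```

In this sketch the storage saving comes from keeping at most `n_protos` labeled vectors instead of every sample; whether accuracy is preserved on real data depends on the number of prototypes and on how the classes overlap.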