
An Extended Knowledge Discovery Framework for Outlier Data Set (cited by: 2)
Abstract Existing research on outlier data focuses mainly on outlier detection. To analyze the origin, classification, meaning, behavioral characteristics, and outlying trends of outlier data comprehensively, this paper builds on existing outlier mining techniques and on previously proposed concepts, such as outlying reduction and the key attribute subspace, together with their search algorithms. It defines the nearest outlying neighbor, the atomic outlier class, and the outlying mutation class, proposes methods for outlier cluster analysis and outlying trend analysis, and establishes an integrated framework for describing the characteristics of an outlier data set and discovering extended knowledge from it. A detailed discussion of outlier analysis on mobile communication service data shows that the framework is effective in practical applications.
Source Journal of South China University of Technology (Natural Science Edition), 2008, No. 9, pp. 31-36 (6 pages). Indexed in: EI, CAS, CSCD, PKU Core.
Funding Natural Science Foundation of Chongqing (2005BB2224); Doctoral Program Fund for Higher Education of the Ministry of Education (20050611027)
Keywords data mining; outlier analysis; key attribute subspace; knowledge discovery framework

References (10)

  • 1 Angiulli F, Pizzuti C. Outlier mining in large high dimensional data sets [J]. IEEE Trans on Knowledge and Data Engineering, 2005, 17(2): 203-215.
  • 2 邓健爽, 郑启伦, 彭宏, 邓维维. A clustering algorithm based on dynamic splitting of connected graphs [J]. Journal of South China University of Technology (Natural Science Edition), 2007, 35(1): 118-122. Cited by: 5
  • 3 Hodge V J, Austin J. A survey of outlier detection methodologies [J]. Artificial Intelligence Review, 2004, 22: 85-126.
  • 4 Petrovskiy M I. Outlier detection algorithms in data mining systems [J]. Programming and Computer Software, 2003, 29(4): 228-237.
  • 5 Filzmoser P, Maronna R, Werner M. Outlier identification in high dimensions [J]. Computational Statistics & Data Analysis, 2008, 52(3): 1694-1711.
  • 6 金义富, 朱庆生, 邢永康. An outlier data clustering algorithm based on key attribute subspace [J]. Journal of Computer Research and Development, 2007, 44(4): 651-659. Cited by: 8
  • 7 Knorr E M, Ng R T. Finding intensional knowledge of distance-based outliers [C]//Proc of the 25th International Conference on Very Large Data Bases. New York: Morgan Kaufmann, 1999: 211-222.
  • 8 Chen Z, Tang J, Fu A. Modeling and efficient mining of intentional knowledge of outliers [C]//Proc of the 7th International Database Engineering and Applications Symposium. Hong Kong: IEEE, 2003: 1-10.
  • 9 Nakamura T, Kamidoi Y, Wakabayashi S, et al. A decision method of attribute importance for classification by outlier detection [C]//Proc of the 22nd International Conference on Data Engineering Workshops. Atlanta: IEEE, 2006: 45-50.
  • 10 Caroni C, Karioti V. Detecting an innovative outlier in a set of time series [J]. Computational Statistics & Data Analysis, 2004, 46: 561-570.

Second-level references (21)


Co-citing documents (11)

Co-cited documents (20)


Citing documents (2)

Second-level citing documents (67)
