Research of Data Mining Based on Incomplete Data (cited by: 2)


Abstract: Incomplete data are divided into two classes: missing attribute values and hidden (latent) attribute values. Data mining methods for each class are examined, corresponding processing methods are presented, and these methods and their applications are discussed. Missing attribute values are handled mainly by a series of fill-in (imputation) methods that turn the data into a complete data set. Hidden attribute values are handled by using the EM algorithm to optimize the model parameters, which compensates for the incompleteness of the data.
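The two cases named in the abstract can be illustrated with a minimal Python sketch. The record does not say which fill-in methods or which model the paper uses, so mean imputation and a two-component one-dimensional Gaussian mixture are assumptions chosen only for illustration, and the function names `mean_impute` and `em_two_gaussians` are hypothetical.

```python
import math
import random


def mean_impute(column):
    """Fill missing entries (None) with the mean of the observed entries."""
    observed = [x for x in column if x is not None]
    mean = sum(observed) / len(observed)
    return [mean if x is None else x for x in column]


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std. dev. sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def em_two_gaussians(data, iterations=50):
    """EM for a 1-D mixture of two Gaussians (an assumed model).

    The component that generated each point plays the role of the
    hidden attribute value; EM estimates the parameters without it.
    """
    mu = [min(data), max(data)]  # crude initial guesses
    sigma = [1.0, 1.0]
    weight = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: posterior probability of each component for each point.
        resp = []
        for x in data:
            p = [weight[k] * normal_pdf(x, mu[k], sigma[k]) for k in range(2)]
            total = (p[0] + p[1]) or 1e-300  # guard against underflow
            resp.append([p[0] / total, p[1] / total])
        # M-step: re-estimate parameters from the weighted data.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            mu[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, data)) / nk
            sigma[k] = max(math.sqrt(var), 1e-6)
            weight[k] = nk / len(data)
    return mu, sigma, weight


# Synthetic demonstration data (not from the paper).
rng = random.Random(0)
sample = [rng.gauss(0.0, 1.0) for _ in range(300)]
sample += [rng.gauss(5.0, 1.0) for _ in range(300)]
mu, sigma, weight = em_two_gaussians(sample)
filled = mean_impute([1.0, None, 3.0])
```

Mean imputation is only the simplest fill-in strategy. The EM loop alternates between estimating the hidden labels under the current parameters and re-fitting the parameters to those estimates.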

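Author: 吴新玲 WU Xin-ling (Department of Information Engineering, Guangdong Polytechnic Normal University, Guangzhou 510262, China; State Key Laboratory of Software Engineering, Wuhan University, Wuhan 430072, China)

Affiliations: Department of Information Engineering, Guangdong Polytechnic Normal University; State Key Laboratory of Software Engineering, Wuhan University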

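Source: Computer Engineering and Design (《计算机工程与设计》), CSCD, Peking University Core Journal, 2006, No. 9, pp. 1557-1559 (3 pages)

Funding: Open Fund of the State Key Laboratory of Software Engineering, Wuhan University (SKLSE05-09)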

Keywords: data mining; data processing; expectation-maximization (EM) algorithm; data model; parameter estimation

Classification: TP311.13 [Automation and Computer Technology: Computer Software and Theory]

Citation network

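References (5)

1. Jiawei Han, Micheline Kamber. Data Mining: Concepts and Techniques [M]. San Francisco, CA: Morgan Kaufmann, 2000.
2. David Hand. Principles of Data Mining [M]. Beijing: China Machine Press, 2003.
3. Trevor Hastie, Robert Tibshirani, Jerome Friedman. The Elements of Statistical Learning: Data Mining, Inference, and Prediction [M]. Beijing: Publishing House of Electronics Industry, 2004.
4. Olivia Parr Rud. Data Mining Cookbook: Modeling Data for Marketing, Risk, and Customer Relationship Management [M]. Beijing: China Machine Press, 2003.
5. Wu Xinling, Xu Renzuo. Application of the EM algorithm to parameter estimation in non-homogeneous Poisson process models [J]. Journal of Wuhan University (Natural Science Edition), 1992(4): 17-22. Cited by: 1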

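Co-citing literature (2)

1. Liu Rong, Chen Xiaohong. Using data mining technology to improve management decision-making in telecom enterprises [J]. Computer Applications and Software, 2005, 22(12): 66-68. Cited by: 1
2. Wang Yanping, Yue Chunxia. Data mining analysis and implementation for the worst cells in mobile networks [J]. Computer Engineering and Design, 2007, 28(17): 4165-4168. Cited by: 3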

Co-cited literature (14)

1. Zhang Hongyun, Miao Duoqian, Li Daoguo. Fuzzy classification method for similar characters based on principal curves [J]. Pattern Recognition and Artificial Intelligence, 2005, 18(6): 758-762. Cited by: 2
2. Yu Ruikang, Shi Runshen. Application of the clustering idea in the Bayesian algorithm [J]. Computer Engineering and Applications, 2006, 42(28): 159-160. Cited by: 10
3. Ha Jincai. Evaluation criteria and methods for data mining algorithms [J]. Microelectronics & Computer, 2006, 23(12): 195-196. Cited by: 3
4. Wang Hongchun, Peng Hong. Incremental clustering algorithm based on fuzzy C-means [J]. Microelectronics & Computer, 2007, 24(6): 156-157. Cited by: 22
5. Quinlan J R. C4.5: Programs for Machine Learning [M]. San Mateo, CA: Morgan Kaufmann Publishers, 1993.
6. Kononenko I, Bratko I. Experiment in automatic learning of medical diagnostic rules [R]. Ljubljana, Yugoslavia: Jozef Stefan Institute, 1984.
7. Eberhart R C, Kennedy J. A new optimizer using particle swarm theory [C]. Proceedings of the Sixth International Symposium on Micro Machine and Human Science, Nagoya, Japan, 1995: 39-43.
8. Shi Y H, Eberhart R C. A modified particle swarm optimizer [C]. IEEE International Conference on Evolutionary Computation, Anchorage, Alaska, May 4-9, 1998: 69-73.
9. Kennedy J, Eberhart R. Particle swarm optimization [A]. IEEE International Conference on Neural Networks [C]. Perth, 1995: 1942-1948.
10. Shi Y H, Eberhart R. Parameter selection in particle swarm optimization [A]. Proceedings of the 7th Annual Conf on Evolutionary Programming [C]. Washington DC, 1998: 591-600.

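Citing literature (2)

1. Wang Hongchun. Principal curve recovery method for missing data [J]. Microelectronics & Computer, 2008, 25(11): 160-161. Cited by: 1
2. Gao Shang. Processing of incomplete data based on particle swarm optimization [J]. Aeronautical Computing Technique, 2009, 39(1): 68-70.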

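Second-level citing literature (1)

1. Shen Qi, Wang Chishe. Research on a Bayesian model for processing missing biological data [J]. Microelectronics & Computer, 2011, 28(7): 110-112. Cited by: 2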

