Mining maximal frequent itemsets in uncertain data

Cited by: 2
Abstract: To address the low efficiency and limited accuracy of frequent itemset mining on uncertain data, this paper proposes UFPGA (uncertain frequent pattern genetic algorithm), a method that combines an improved frequent pattern tree (FP-tree) with a genetic algorithm (GA) to mine probabilistic frequent itemsets from uncertain data. Based on the structural characteristics of uncertain data, the FP-tree method is adapted to mine frequent itemsets from such data. A genetic algorithm with a reduced mutation space and an added breeding operator then searches for the maximal frequent itemsets, which narrows the search range and improves mining efficiency. Experimental results show that the method has a clear advantage in terms of time complexity and offers an effective technique for mining large-scale uncertain data.
Source: Journal of Huazhong University of Science and Technology (Natural Science Edition) (《华中科技大学学报(自然科学版)》), 2015, No. 9, pp. 29-34 (6 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list (北大核心).
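Funding: National Key Technology R&D Program of China (2012BAF12B14); Guizhou Provincial Major Science and Technology Special Project ([2012]6018, [2013]6019); Guizhou Provincial Science and Technology Fund ([2011]2196); Guizhou Provincial Industrial Research Project ([2014]3004).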
Keywords: data mining; uncertain data; frequent itemsets; maximal frequent itemsets; frequent pattern tree; genetic algorithm
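
The abstract names two concepts that a small example may make more concrete: frequent itemsets over uncertain data, and maximal frequent itemsets. The sketch below is not UFPGA; it uses neither an FP-tree nor a genetic algorithm. It is a brute-force illustration under stated assumptions: each item in a transaction carries an independent existence probability, and frequency is measured by expected support rather than by the probabilistic frequent itemset definition the paper targets. The toy data, the threshold, and all function names are hypothetical.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps an item to the
# probability that the item is actually present in that transaction.
TRANSACTIONS = [
    {"a": 0.9, "b": 0.8, "c": 0.5},
    {"a": 0.8, "b": 0.7},
    {"a": 0.6, "c": 0.9},
    {"b": 0.9, "c": 0.4},
]


def expected_support(itemset, transactions):
    """Expected support of an itemset, assuming item probabilities are independent.

    For each transaction, the probability that it contains the whole itemset
    is the product of the item probabilities (0 if an item is absent).
    """
    total = 0.0
    for transaction in transactions:
        prob = 1.0
        for item in itemset:
            prob *= transaction.get(item, 0.0)
        total += prob
    return total


def frequent_itemsets(transactions, min_esup):
    """Enumerate every itemset whose expected support reaches min_esup.

    Exhaustive enumeration is exponential in the number of items; it is
    used here only because the toy database is tiny.
    """
    items = sorted({item for t in transactions for item in t})
    result = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            esup = expected_support(combo, transactions)
            if esup >= min_esup:
                result[frozenset(combo)] = esup
    return result


def maximal_itemsets(frequent):
    """Keep only frequent itemsets that have no frequent proper superset."""
    return [s for s in frequent if not any(s < other for other in frequent)]


if __name__ == "__main__":
    frequent = frequent_itemsets(TRANSACTIONS, min_esup=1.0)
    for itemset in maximal_itemsets(frequent):
        print(sorted(itemset), round(frequent[itemset], 2))
```

With a threshold of 1.0 on this toy data, the frequent itemsets are {a}, {b}, {c}, and {a, b}, so the maximal ones are {a, b} and {c}. The exhaustive search is what a method such as UFPGA aims to avoid on large databases.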


