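A Maximal Frequent Itemset Mining Algorithm Based on a Quad-Linked-List Storage Structure for Graphs

Published English title: The maximum frequent item set mining algorithm based on the four-fork linked storage structure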


Abstract: Although existing maximal frequent itemset mining algorithms have seen many improvements in structure and technique, they remain slow and inefficient. To address this, we propose a quad-linked-list (four-pointer linked list) storage structure for graphs and a maximal frequent itemset mining algorithm built on it. The structure is generated once and reused many times, and it requires no additional storage space. The algorithm exploits these properties together with the properties of frequent extension sets to reduce the generation of redundant candidate sets and the redundant storage of strings, and it replaces comparisons between sets of strings with comparisons between integer arrays. As a result, its mining efficiency is markedly higher than that of existing maximal frequent itemset mining algorithms, which the experiments confirm.
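The abstract gives no implementation details, so the following is only a minimal sketch of two ideas it names: what a maximal frequent itemset is (a frequent itemset with no frequent proper superset), and how comparing itemsets as sorted integer arrays can stand in for comparing sets of strings. It is a brute-force illustration and does not reproduce the paper's quad-linked-list structure or its use of frequent extension sets. The function names `encode`, `is_subset`, and `maximal_frequent_itemsets` are hypothetical and chosen here for illustration.

```python
from itertools import combinations


def encode(transactions):
    """Assign each distinct item an integer id and re-express every
    transaction as a sorted tuple of ids (assumed encoding, not the paper's)."""
    items = sorted({x for t in transactions for x in t})
    ids = {item: i for i, item in enumerate(items)}
    encoded = [tuple(sorted(ids[x] for x in t)) for t in transactions]
    return ids, encoded


def is_subset(a, b):
    """Return True if sorted int tuple `a` is contained in sorted int tuple `b`.
    A single merge-style pass over integers replaces string-set comparison."""
    j = 0
    for x in a:
        while j < len(b) and b[j] < x:
            j += 1
        if j == len(b) or b[j] != x:
            return False
        j += 1
    return True


def maximal_frequent_itemsets(transactions, min_support):
    """Brute-force search, largest candidates first. A frequent candidate that
    is not contained in an already-found maximal itemset is itself maximal."""
    ids, encoded = encode(transactions)
    names = {i: item for item, i in ids.items()}
    maximal = []
    for k in range(len(ids), 0, -1):
        for cand in combinations(range(len(ids)), k):
            # Skip candidates covered by a known maximal itemset.
            if any(is_subset(cand, m) for m in maximal):
                continue
            support = sum(is_subset(cand, t) for t in encoded)
            if support >= min_support:
                maximal.append(cand)
    return [frozenset(names[i] for i in m) for m in maximal]


if __name__ == "__main__":
    txs = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c"]]
    print(maximal_frequent_itemsets(txs, 3))
```

The exhaustive enumeration is exponential in the number of items and is meant only to make the definitions concrete; the efficiency gains reported in the abstract come from the authors' storage structure and pruning, which are not shown here.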

Authors: WANG Chunhua (王春华), NING Hui (宁慧), ZOU Yun (邹韵), GUO Jianghong (郭江鸿)

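Affiliations: College of Computer Science and Technology, Harbin Engineering University; School of Economics and Management, Harbin Engineering University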

Source: Applied Science and Technology (《应用科技》), CAS, 2013, No. 1, pp. 76-79 (4 pages)

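Funding: National Natural Science Foundation of China (60975071); Science and Technology Research Project of the Heilongjiang Provincial Department of Education (12513055)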

Keywords: quad-linked list (four-fork link); frequent itemset; storage structure; mining algorithm

Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]


