Funding: The National Natural Science Foundation of China (No. 60603047), the Natural Science Foundation of Liaoning Province, Liaoning Higher Education Research Foundation (No. 2008341)
Abstract: A new algorithm based on an FC-tree (frequent closed pattern tree) and a max-FCIA (maximal frequent closed itemsets algorithm) is presented for mining frequent closed itemsets while reducing memory and time consumption. The algorithm maps the transaction database into a hash table, obtains the support of all frequent itemsets by operating on the hash table, and forms a lexicographic subset tree containing the frequent itemsets. Efficient pruning methods are then applied to the lexicographic subset tree to obtain the FC-tree, which contains all the minimum frequent closed itemsets. Finally, the frequent closed itemsets are generated from the minimum frequent closed itemsets. Experimental results show that mapping the transaction database reduces time consumption and improves the efficiency of the program. Furthermore, the pruning strategy restrains the number of candidates, which saves space. The results show that the algorithm is effective.
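The first two steps of the abstract above (mapping the transaction database into a hash table and reading supports from it) together with the definition of a closed itemset can be illustrated with a small brute-force sketch. This is not the FC-tree or max-FCIA procedure, whose details the abstract does not give; the function names and the toy data are assumptions made for illustration only.

```python
from itertools import combinations

def build_tid_table(transactions):
    """Map each item to the set of transaction ids that contain it (a hash table)."""
    table = {}
    for tid, items in enumerate(transactions):
        for item in items:
            table.setdefault(item, set()).add(tid)
    return table

def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every frequent itemset (brute force, level by level)."""
    table = build_tid_table(transactions)
    frequent_items = sorted(i for i, tids in table.items() if len(tids) >= min_support)
    result = {}
    for size in range(1, len(frequent_items) + 1):
        found_any = False
        for combo in combinations(frequent_items, size):
            # Support of an itemset = size of the intersection of its items' tid sets.
            tids = set.intersection(*(table[i] for i in combo))
            if len(tids) >= min_support:
                result[frozenset(combo)] = len(tids)
                found_any = True
        if not found_any:
            break  # no frequent itemset of this size, so none of any larger size
    return result

def closed_itemsets(frequent):
    """Keep only itemsets that have no proper superset with the same support."""
    return {
        s: sup
        for s, sup in frequent.items()
        if not any(s < t and sup == sup_t for t, sup_t in frequent.items())
    }

# Hypothetical toy database of four transactions.
transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"a"}]
freq = frequent_itemsets(transactions, min_support=2)
closed = closed_itemsets(freq)
```

With this toy data, `{b}` and `{c}` are frequent but not closed, because `{a, b}` and `{a, c}` have the same support. A tree-based method with pruning, as the abstract describes, would avoid enumerating every candidate combination the way this sketch does.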
Abstract: The large volumes of data generated every day necessitate new technologies capable of handling massive amounts of data efficiently. Association rules are one such tool: an unsupervised data mining technique that extracts information in the form of IF-THEN patterns. Although various approaches have been presented for extracting frequent itemsets (the step that precedes mining association rules) from extremely large databases, high computational cost and shortage of memory remain key issues when processing enormous data. The objective of this research is to discover frequent itemsets by using clustering for preprocessing and adopting the linear prefix tree algorithm for mining the maximal frequent itemsets. The performance of the proposed CL-LP-MAX-tree was evaluated by comparing it with the existing FP-max algorithm. Experiments on three different standard datasets provide evidence that the proposed CL-LP-MAX-tree algorithm outperforms the existing FP-max algorithm in terms of runtime and memory consumption.
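As a reminder of what "maximal frequent itemset" means in the abstract above, the following minimal sketch finds frequent itemsets that have no frequent proper superset. It is a brute-force illustration only, not the CL-LP-MAX-tree or FP-max algorithm, and it omits the clustering step; the function names and the toy data are assumptions.

```python
from itertools import combinations

def support(itemset, transactions):
    """Count the transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if itemset <= t)

def maximal_frequent_itemsets(transactions, min_support):
    """Return the frequent itemsets that have no frequent proper superset."""
    items = sorted(set().union(*transactions))
    found = []
    # Visit candidates from largest to smallest, so every frequent superset
    # of a candidate is already covered by an entry in `found`.
    for size in range(len(items), 0, -1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            if any(candidate < m for m in found):
                continue  # contained in a known maximal itemset, so not maximal
            if support(candidate, transactions) >= min_support:
                found.append(candidate)
    return found

# Hypothetical toy database of five transactions.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c"},
]
maximal = maximal_frequent_itemsets(transactions, min_support=3)
```

Here `{a, b, c}` appears in only two transactions and is not frequent at a minimum support of 3, so the three pairs are the maximal frequent itemsets. Every frequent itemset is a subset of some maximal one, which is why maximal itemsets serve as a compact summary.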