Abstract: Developing an efficient algorithm that can maintain discovered information as a database changes is important in data mining. Many proposed algorithms focus on a single concept level and do not reuse previously mined information in incrementally growing databases. In earlier work, we proposed an incremental mining algorithm for maintaining multiple-level association rules as new transactions are inserted. Deletion of records is, however, also common in real-world applications. In this paper, we therefore extend our previous approach to handle deletions. The concept of pre-large itemsets is used to reduce the need to rescan the original database and to save maintenance costs. A pre-large itemset is not truly large, but may become large in the future. It is defined by two user-specified thresholds, a lower support threshold and an upper support threshold. The pre-large itemsets act as a buffer between small and large itemsets, so that a small itemset cannot become large directly in the updated database when transactions are deleted. Based on this concept, a new algorithm is proposed to maintain discovered multiple-level association rules under deletion of records. The proposed algorithm does not need to rescan the original database until a certain number of records have been deleted, and can thus save considerable maintenance time.
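The role of the two thresholds can be illustrated with a minimal sketch. It is not the paper's maintenance algorithm: the function name `classify`, the threshold values 0.3 and 0.5, and the transaction counts are all assumptions chosen for illustration, and the multiple-level (taxonomy) aspect is left out entirely.

```python
from fractions import Fraction


def classify(count, total, lower, upper):
    """Label an itemset by its support ratio count/total.

    Hypothetical helper: 'large' if the ratio reaches the upper threshold,
    'pre-large' if it only reaches the lower one, otherwise 'small'.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not lower <= upper:
        raise ValueError("lower threshold must not exceed upper threshold")
    support = Fraction(count, total)
    if support >= upper:
        return "large"
    if support >= lower:
        return "pre-large"
    return "small"


# Illustrative thresholds (assumed values, not taken from the paper).
LOWER = Fraction(3, 10)
UPPER = Fraction(1, 2)

if __name__ == "__main__":
    # 100 transactions; 10 are then deleted, none of which contain the itemsets,
    # so the counts stay the same while the database shrinks.
    for count in (45, 25):
        before = classify(count, 100, LOWER, UPPER)
        after = classify(count, 90, LOWER, UPPER)
        print(count, before, "->", after)
```

With these assumed numbers, the itemset with count 45 moves from pre-large to large after the deletions, while the itemset with count 25 stays small: it would need the database to shrink to 50 transactions before it could reach the upper threshold. This is the sense in which the pre-large band absorbs a limited number of deletions before a rescan becomes necessary.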
Funding: This work was supported in part by the National '863' High-Tech Programme of China (No. 863-306-ZD06-2).
Abstract: In this paper, the problem of discovering association rules between items in a large database of sales transactions is discussed, and a novel algorithm, BitMatrix, is proposed. The proposed algorithm is fundamentally different from the known algorithms Apriori and AprioriTid. Empirical evaluation shows that the algorithm outperforms the known ones for large databases. Scale-up experiments show that the algorithm scales linearly with the number of transactions.
Abstract: This paper introduces a new algorithm for mining association rules. The algorithm, RP, counts itemsets of different sizes in the same pass over the database by dividing the database into m partitions. The total number of passes over the database is only (k + 2m - 2)/m, where k is the size of the longest itemset, which is much less than k.
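As a quick arithmetic check of the pass count stated above, the following sketch simply evaluates (k + 2m - 2)/m for a few values. The function name `rp_passes` and the sample values of k and m are assumptions for illustration; the code does not implement the RP algorithm itself.

```python
from fractions import Fraction


def rp_passes(k, m):
    """Evaluate the stated pass count (k + 2m - 2)/m exactly.

    k: size of the longest itemset, m: number of partitions.
    Hypothetical helper for checking the formula only.
    """
    if k < 1 or m < 1:
        raise ValueError("k and m must be positive integers")
    return Fraction(k + 2 * m - 2, m)


if __name__ == "__main__":
    k = 10  # assumed longest itemset size
    for m in (1, 2, 4, 5):
        print(f"k={k}, m={m}: passes = {rp_passes(k, m)}")
```

For m = 1 the expression reduces to k, which matches an unpartitioned level-by-level scan. More generally, k - (k + 2m - 2)/m = (k - 2)(m - 1)/m, so the count is strictly below k whenever k > 2 and m > 1, and it approaches 2 as m grows.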