Abstract: Generating maximum frequent patterns from a large database of transactions and items is an important research topic in association rule mining. Association rule mining aims to discover interesting correlations, frequent patterns, associations, or causal structures among items hidden in a large database. We propose an efficient quantum search algorithm for discovering the maximum frequent patterns. We modify Grover's search algorithm so that it searches a subspace of arbitrary symmetric states instead of the whole search space. We present a novel quantum oracle design that uses a quantum counter to count the maximum frequent items and a quantum comparator to check the count against a minimum support threshold. Because the search is confined to a subspace, the derived algorithm increases the rate of correct solutions. Furthermore, the algorithm scales well and optimizes the number of qubits the design requires, which improves performance. The proposed design can accommodate more transactions and items while still performing well with a small number of qubits.
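For orientation, the count-then-compare check that the abstract assigns to the quantum oracle has a simple classical analogue. The sketch below is a brute-force classical baseline, not the quantum algorithm of the paper. It assumes that a "maximum frequent pattern" means a frequent itemset with no frequent proper superset, and the function names `support` and `maximal_frequent_itemsets` are hypothetical.

```python
from itertools import combinations


def support(itemset, transactions):
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def maximal_frequent_itemsets(transactions, min_support):
    """Brute-force classical baseline (assumption: not the paper's method).

    Enumerates every non-empty itemset, keeps those whose support count
    meets the threshold, then keeps only itemsets that have no frequent
    proper superset.
    """
    items = sorted(set().union(*transactions))
    frequent = []
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            # Classical stand-in for "counter + comparator".
            if support(candidate, transactions) >= min_support:
                frequent.append(candidate)
    return [s for s in frequent if not any(s < other for other in frequent)]


if __name__ == "__main__":
    toy = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
    for itemset in maximal_frequent_itemsets(toy, min_support=3):
        print(sorted(itemset))
```

The enumeration visits all 2^n - 1 non-empty itemsets over n items, which is the search space that a Grover-style search aims to cover with fewer oracle calls.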
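Abstract: Objective: To systematically collate and summarize the distribution of traditional Chinese medicine (TCM) syndromes in breast cancer and the patterns of clinical formula composition and medication, as a reference for selecting formulas and medicinals when treating breast cancer with TCM. Methods: Literature on the TCM diagnosis and treatment of breast cancer was retrieved from the China National Knowledge Infrastructure (CNKI), WANFANG MED ONLINE, VIP, and China Biomedical Literature Service System (SinoMed) databases, from database inception to June 30, 2024. A database was built, and Excel, SPSS Statistics, and SPSS Modeler were used to perform frequency statistics, hierarchical clustering, and Apriori association rule analysis on syndrome types, symptoms, formulas, and medicinals. Results: A total of 164 articles were included, covering 36 TCM syndrome types, of which dual deficiency of qi and blood, liver depression with spleen deficiency, liver depression with qi stagnation, dual deficiency of qi and yin, liver-kidney yin deficiency, and intermingled stasis and toxin were the common types. The articles contained 257 formulas, 47 of them established formulas, chiefly Xiaoyao San, Bazhen Tang, and Liujunzi Tang. A total of 281 Chinese medicinals were involved; high-frequency medicinals included Baizhu, Fuling, Gancao, Huangqi, Danggui, Chaihu, Baishao, Chenpi, and Baihuasheshecao. By efficacy these were mainly tonifying, heat-clearing, blood-activating and stasis-resolving, and qi-regulating medicinals, and they mainly enter the liver, spleen, and kidney meridians. Hierarchical clustering of the 45 high-frequency medicinals yielded 4 cluster formulas, and association rule analysis yielded 20 medicinal combinations. Conclusion: Breast cancer patients present mainly with dual deficiency of qi and blood, liver depression with spleen deficiency, and liver depression with qi stagnation. In prescribing, replenishing qi and fortifying the spleen and nourishing yin and blood are the basic treatment methods, supplemented by medicinals that move qi and activate blood, clear heat, resolve toxin, and dissipate masses, according to syndrome differentiation.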
Funding: Supported by the National Natural Science Foundation of China (60472099) and the Ningbo Natural Science Foundation (2006A610017).
Abstract: Because a data warehouse changes frequently, incremental data can invalidate previously mined knowledge. To maintain discovered knowledge and patterns dynamically, this study presents IPARUC, a novel algorithm for updating global frequent patterns. IPARUC first applies a rapid clustering method to divide the database into n parts so that the data within each part are similar. Then, during insertion, the nodes in the tree are adjusted dynamically by "pruning and laying back" to keep them in descending order of frequency, so that nodes are shared in a near-optimal way. Finally, the local frequent itemsets mined from each local dataset are merged into global frequent itemsets. The experimental results are encouraging: IPARUC is more effective and efficient than the two methods it was compared with. Furthermore, the algorithm has significant application potential in a prototype Web log analyzer for web usage mining, where it can help discover useful knowledge and support managers' decision making.
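The final step, merging local frequent itemsets into global ones, can be illustrated with a generic partition-style sketch. This is not IPARUC itself: the clustering step and the tree adjustment are omitted, local mining is done by brute force, and the names `frequent_itemsets` and `merge_local_into_global` are hypothetical. The sketch relies on the fact that an itemset meeting a relative support threshold over the whole database must meet it in at least one part, so the union of local results is a complete candidate set that one recount over all the data can filter.

```python
from itertools import combinations
from math import ceil


def frequent_itemsets(transactions, min_count):
    """Brute-force local mining (stand-in for a tree-based miner)."""
    items = sorted(set().union(*transactions)) if transactions else []
    result = set()
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            if sum(1 for t in transactions if candidate <= t) >= min_count:
                result.add(candidate)
    return result


def merge_local_into_global(partitions, min_ratio):
    """Union the local frequent itemsets, then recount them globally."""
    candidates = set()
    for part in partitions:
        # Same relative threshold, scaled to the size of each part.
        candidates |= frequent_itemsets(part, ceil(min_ratio * len(part)))
    all_transactions = [t for part in partitions for t in part]
    global_min = ceil(min_ratio * len(all_transactions))
    return {
        c for c in candidates
        if sum(1 for t in all_transactions if c <= t) >= global_min
    }


if __name__ == "__main__":
    parts = [
        [{"a", "b"}, {"a", "b"}, {"c"}],
        [{"a"}, {"b", "c"}, {"a", "b"}],
    ]
    for itemset in sorted(merge_local_into_global(parts, 0.5), key=sorted):
        print(sorted(itemset))
```

The global recount is needed because an itemset that is frequent in one part may fall below the threshold once the other parts are taken into account.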
Abstract: Traditional pattern representations in information extraction lack the ability to represent domain-specific concepts and are therefore inflexible. To overcome these restrictions, an enhanced pattern representation is designed that includes ontological concepts, neighboring-tree structures, and soft constraints. An information-extraction inference engine based on hypothesis generation and conflict resolution is implemented. The proposed technique is successfully applied to an information extraction system for the Chinese-language query front end of a job-recruitment search engine.
Abstract: An invitation to exhibitors is a typical exhibition document. The General-Particular pattern is the common text pattern found in invitations to exhibitors. This paper analyzes how that pattern is organized and expressed from the perspective of text macrostructure, and aims to summarize its general laws and features as a basis for mastering the essentials of other exhibition documents.
Abstract: The choice of a fuzzy partitioning is crucial to the performance of a fuzzy system based on if-then rules. However, most existing methods are complicated or lead to too many subspaces, which makes them unsuitable for pattern classification applications. This paper proposes a simple but effective clustering approach that obtains a set of compact subspaces and is applicable to classification problems with higher-dimensional features. Experimental results demonstrate its effectiveness.
Funding: This work is supported by the National Natural Science Foundation of China (No. 60205007), the Natural Science Foundation of Guangdong Province (No. 031558, No. 04300462), the Research Foundation of the National Science and Technology Plan Project (No. 2004BA721A02), the Research Foundation of the Science and Technology Plan Project in Guangdong Province (No. 2003C50118), and the Research Foundation of the Science and Technology Plan Project in Guangzhou City (No. 2002Z3-E0017).