By analyzing the correlation between courses in students’grades,we can provide a decision-making basis for the revision of courses and syllabi,rationally optimize courses,and further improve teaching effects.With the...By analyzing the correlation between courses in students’grades,we can provide a decision-making basis for the revision of courses and syllabi,rationally optimize courses,and further improve teaching effects.With the help of IBM SPSS Modeler data mining software,this paper uses Apriori algorithm for association rule mining to conduct an in-depth analysis of the grades of nursing students in Shandong College of Traditional Chinese Medicine,and to explore the correlation between professional basic courses and professional core courses.Lastly,according to the detailed analysis of the mining results,valuable curriculum information will be found from the actual teaching data.展开更多
With the gradual acceleration of information construction in colleges and universities,digital campus and smart campus have gradually become important means for colleges and universities to scientifically manage the c...With the gradual acceleration of information construction in colleges and universities,digital campus and smart campus have gradually become important means for colleges and universities to scientifically manage the campus.They have been applied to teaching,scientific research,student management,and other fields,improving the quality and efficiency of management.This paper mainly studies the intelligent educational administration management system based on data mining technology.Firstly,this paper introduces the application process of data mining technology,and builds an intelligent educational administration management system based on data mining technology.Then,this paper optimizes the application of the Apriori algorithm in educational administration management through transaction compression and frequent sampling.Compared with the traditional Apriori algorithm,the optimized Apriori algorithm in this paper has a shorter execution time under the same minimum support.展开更多
The assembly process of aerospace products such as satellites and rockets has the characteristics of single-or small-batch production,a long development period,high reliability,and frequent disturbances.How to predict...The assembly process of aerospace products such as satellites and rockets has the characteristics of single-or small-batch production,a long development period,high reliability,and frequent disturbances.How to predict and avoid quality abnormalities,quickly locate their causes,and improve product assembly quality and efficiency are urgent engineering issues.As the core technology to realize the integration of virtual and physical space,digital twin(DT)technology can make full use of the low cost,high efficiency,and predictable advantages of digital space to provide a feasible solution to such problems.Hence,a quality management method for the assembly process of aerospace products based on DT is proposed.Given that traditional quality control methods for the assembly process of aerospace products are mostly post-inspection,the Grey-Markov model and T-K control chart are used with a small sample of assembly quality data to predict the value of quality data and the status of an assembly system.The Apriori algorithm is applied to mine the strong association rules related to quality data anomalies and uncontrolled assembly systems so as to solve the issue that the causes of abnormal quality are complicated and difficult to trace.The implementation of the proposed approach is described,taking the collected centroid data of an aerospace product’s cabin,one of the key quality data in the assembly process of aerospace products,as an example.A DT-based quality management system for the assembly process of aerospace products is developed,which can effectively improve the efficiency of quality management for the assembly process of aerospace products and reduce quality abnormalities.展开更多
A feature extraction, which means extracting the representative words from a text, is an important issue in text mining field. This paper presented a new Apriori and N-gram based Chinese text feature extraction method...A feature extraction, which means extracting the representative words from a text, is an important issue in text mining field. This paper presented a new Apriori and N-gram based Chinese text feature extraction method, and analyzed its correctness and performance. Our method solves the question that the exist extraction methods cannot find the frequent words with arbitrary length in Chinese texts. The experimental results show this method is feasible.展开更多
A method for mining frequent itemsets by evaluating their probability of supports based on asso-ciation analysis is presented.This paper obtains the probability of every 1-itemset by scanning the database,then evaluat...A method for mining frequent itemsets by evaluating their probability of supports based on asso-ciation analysis is presented.This paper obtains the probability of every 1-itemset by scanning the database,then evaluates the probability of every 2-itemset,every 3-itemset,every k-itemset from the frequent 1-itemsets and gains all the candidate frequent itemsets.This paper also scans the database for verifying the support of the candidate frequent itemsets.Last,the frequent itemsets are mined.The method reduces a lot of time of scanning database and shortens the computation time of the algorithm.展开更多
It is a key challenge to exploit the label coupling relationship in multi-label classification(MLC)problems.Most previous work focused on label pairwise relations,in which generally only global statistical informati...It is a key challenge to exploit the label coupling relationship in multi-label classification(MLC)problems.Most previous work focused on label pairwise relations,in which generally only global statistical information is used to analyze the coupled label relationship.In this work,firstly Bayesian and hypothesis testing methods are applied to predict the label set size of testing samples within their k nearest neighbor samples,which combines global and local statistical information,and then apriori algorithm is used to mine the label coupling relationship among multiple labels rather than pairwise labels,which can exploit the label coupling relations more accurately and comprehensively.The experimental results on text,biology and audio datasets shown that,compared with the state-of-the-art algorithm,the proposed algorithm can obtain better performance on 5 common criteria.展开更多
Objective:To analyze misdiagnosis features in clinical cases of“Classified Medical Cases of Famous Physicians”and“Supplement to Classified Case Records of Celebrated Physicians.”Materials and Methods:Two hundred a...Objective:To analyze misdiagnosis features in clinical cases of“Classified Medical Cases of Famous Physicians”and“Supplement to Classified Case Records of Celebrated Physicians.”Materials and Methods:Two hundred and five ancient misdiagnosed cases were analyzed in aspects of locations(exterior-interior type,qi-blood type and Zang‑Fu organs type)and patterns(heat-cold type and deficiency-excess type)by Apriori Algorithm Method.Results:The main types of misdiagnosis in those medical casesare as follows::Zang‑Fu location misjudgment,misjudging the interior as the exterior,misjudging deficiency pattern as excess pattern,and misjudging cold pattern as heat pattern.Among them,the most outstanding type is the misjudgment of deficiency–cold pattern as excess–heat pattern.Conclusions:(1)Accurate judgment of location and differentiation of deficiency and excess patterns are the key points in diagnosing the diseases correctly.The confusion of true deficiency–cold and pseudo‑excess–heat pattern should be taken seriously.(2)Data mining on ancient clinical cases offers a new methodology for assisting clinical diagnosis of traditional Chinese medicine.展开更多
This paper aims to mine the knowledge and rules on compatibility of drugs from the prescriptions for curing arrhythmia in the Chinese traditional medicine database by Apriori algorithm. For data preparation, 1 113 pre...This paper aims to mine the knowledge and rules on compatibility of drugs from the prescriptions for curing arrhythmia in the Chinese traditional medicine database by Apriori algorithm. For data preparation, 1 113 prescriptions for arrhythmia, including 535 herbs ( totally 10884 counts of herbs) were collected into the database. The prescription data were preprocessed through redundancy reduction, normalized storage, and knowledge induction according to the pretreatment demands of data mining. Then the Apriori algorithm was used to analyze the data and form the related technical rules and treatment procedures. The experimental result of compatibility of drugs for curing arrhythmia from the Chinese traditional medicine database shows that the prescription compatibility obtained by Apriori algorithm generally accords with the basic law of traditional Chinese medicine for arrhythmia. Some special compatibilities unreported were also discovered in the experiment, which may be used as the basis for developing new prescriptions for arrhythmia.展开更多
Investigations towards studying terrorist activities have recently attracted a great amount of research interest. In this paper, we investigate the use of the Apriori algorithm on the Global Terrorism Database (GTD) f...Investigations towards studying terrorist activities have recently attracted a great amount of research interest. In this paper, we investigate the use of the Apriori algorithm on the Global Terrorism Database (GTD) for forensic investigation purposes. Recently, the Apriori algorithm, which could be considered a forensic tool</span><span style="font-family:Verdana;">,</span><span style="font-family:Verdana;"> has been used to study terrorist activities and patterns across the world. As such, our motivation is to utilise the Apriori algorithm approach on the GTD to study terrorist activities and the areas/states in Nigeria with high frequencies of terrorist activities. We observe that the most preferred method of terrorist attacks in Nigeria is through armed assault. Again, our experiment shows that attacks in Nigeria are mostly successful. Also, we observe from our investigations that most terrorists in Nigeria are not suicidal. The main application of this work can be used by forensic experts to assist law enforcement agencies in decision making when handling terrorist attacks in Nigeria</span><span style="font-family:Verdana;">. </p>展开更多
Traditional distribution network planning relies on the professional knowledge of planners,especially when analyzing the correlations between the problems existing in the network and the crucial influencing factors.Th...Traditional distribution network planning relies on the professional knowledge of planners,especially when analyzing the correlations between the problems existing in the network and the crucial influencing factors.The inherent laws reflected by the historical data of the distribution network are ignored,which affects the objectivity of the planning scheme.In this study,to improve the efficiency and accuracy of distribution network planning,the characteristics of distribution network data were extracted using a data-mining technique,and correlation knowledge of existing problems in the network was obtained.A data-mining model based on correlation rules was established.The inputs of the model were the electrical characteristic indices screened using the gray correlation method.The Apriori algorithm was used to extract correlation knowledge from the operational data of the distribution network and obtain strong correlation rules.Degree of promotion and chi-square tests were used to verify the rationality of the strong correlation rules of the model output.In this study,the correlation relationship between heavy load or overload problems of distribution network feeders in different regions and related characteristic indices was determined,and the confidence of the correlation rules was obtained.These results can provide an effective basis for the formulation of a distribution network planning scheme.展开更多
Maximum frequent pattern generation from a large database of transactions and items for association rule mining is an important research topic in data mining. Association rule mining aims to discover interesting corre...Maximum frequent pattern generation from a large database of transactions and items for association rule mining is an important research topic in data mining. Association rule mining aims to discover interesting correlations, frequent patterns, associations, or causal structures between items hidden in a large database. By exploiting quantum computing, we propose an efficient quantum search algorithm design to discover the maximum frequent patterns. We modified Grover’s search algorithm so that a subspace of arbitrary symmetric states is used instead of the whole search space. We presented a novel quantum oracle design that employs a quantum counter to count the maximum frequent items and a quantum comparator to check with a minimum support threshold. The proposed derived algorithm increases the rate of the correct solutions since the search is only in a subspace. Furthermore, our algorithm significantly scales and optimizes the required number of qubits in design, which directly reflected positively on the performance. Our proposed design can accommodate more transactions and items and still have a good performance with a small number of qubits.展开更多
The market trends rapidly changed over the last two decades.The primary reason is the newly created opportunities and the increased number of competitors competing to grasp market share using business analysis techniq...The market trends rapidly changed over the last two decades.The primary reason is the newly created opportunities and the increased number of competitors competing to grasp market share using business analysis techniques.Market Basket Analysis has a tangible effect in facilitating current change in the market.Market Basket Analysis is one of the famous fields that deal with Big Data and Data Mining applications.MBA initially uses Association Rule Learning(ARL)as a mean for realization.ARL has a beneficial effect in providing a plenty benefit in analyzing the market data and understanding customers’behavior.An important motive of using such techniques is maximizing the business profit as well as matching the exact customer needs as closely as possible.In this survey paper,we discussed several applications and methods of MBA based on ARL.Also,we reviewed some association rule learning measurements including trust,lift,leverage,and others.Furthermore,we discuss some open issues and future topics in the area of market basket analysis and association rule learning.展开更多
Datamining plays a crucial role in extractingmeaningful knowledge fromlarge-scale data repositories,such as data warehouses and databases.Association rule mining,a fundamental process in data mining,involves discoveri...Datamining plays a crucial role in extractingmeaningful knowledge fromlarge-scale data repositories,such as data warehouses and databases.Association rule mining,a fundamental process in data mining,involves discovering correlations,patterns,and causal structures within datasets.In the healthcare domain,association rules offer valuable opportunities for building knowledge bases,enabling intelligent diagnoses,and extracting invaluable information rapidly.This paper presents a novel approach called the Machine Learning based Association Rule Mining and Classification for Healthcare Data Management System(MLARMC-HDMS).The MLARMC-HDMS technique integrates classification and association rule mining(ARM)processes.Initially,the chimp optimization algorithm-based feature selection(COAFS)technique is employed within MLARMC-HDMS to select relevant attributes.Inspired by the foraging behavior of chimpanzees,the COA algorithm mimics their search strategy for food.Subsequently,the classification process utilizes stochastic gradient descent with a multilayer perceptron(SGD-MLP)model,while the Apriori algorithm determines attribute relationships.We propose a COA-based feature selection approach for medical data classification using machine learning techniques.This approach involves selecting pertinent features from medical datasets through COA and training machine learning models using the reduced feature set.We evaluate the performance of our approach on various medical datasets employing diverse machine learning classifiers.Experimental results demonstrate that our proposed approach surpasses alternative feature selection methods,achieving higher accuracy and precision rates in medical data classification tasks.The study showcases the effectiveness and efficiency of the COA-based feature selection approach in identifying relevant features,thereby enhancing the diagnosis and treatment of various diseases.To provide further validation,we conduct detailed experiments on a benchmark medical dataset,revealing the superiority of the MLARMCHDMS model over other methods,with a maximum accuracy of 99.75%.Therefore,this research contributes to the advancement of feature selection techniques in medical data classification and highlights the potential for improving healthcare outcomes through accurate and efficient data analysis.The presented MLARMC-HDMS framework and COA-based feature selection approach offer valuable insights for researchers and practitioners working in the field of healthcare data mining and machine learning.展开更多
Data mining techniques offer great opportunities for developing ethics lines whose main aim is to ensure improvements and compliance with the values, conduct and commitments making up the code of ethics. The aim of th...Data mining techniques offer great opportunities for developing ethics lines whose main aim is to ensure improvements and compliance with the values, conduct and commitments making up the code of ethics. The aim of this study is to suggest a process for exploiting the data generated by the data generated and collected from an ethics line by extracting rules of association and applying the Apriori algorithm. This makes it possible to identify anomalies and behaviour patterns requiring action to review, correct, promote or expand them, as appropriate.展开更多
In this paper,association rule mining algorithm is utilized to analyze the correlations of various factors of causing traffic accidents,from which the relationship model of dangerous driving behaviors is established.I...In this paper,association rule mining algorithm is utilized to analyze the correlations of various factors of causing traffic accidents,from which the relationship model of dangerous driving behaviors is established.In this model,the factors and their correlations include:ability of risk control,ability of driving self-confidence,individual characteristics,and incorrect driving operations.By selecting the drivers in the city of Chengdu to be the objects of investigation,a group of valid sample data is obtained.Based on these data,the Support and Confidence for association rules are analyzed.In the analysis,the two stage computing of Apriori algorithm programming is simulated,and from which some important rules are obtained.With these rules,departments of traffic administration can focus on these key factors in their processing of traffic transactions.By the training of drivers’skills and their physical and mental behaviors,the incorrect driving operations can be greatly reduced and the traffic safety can be effectively guaranteed.展开更多
With the increasing of data on the internet, data analysis has become inescapable to gain time and efficiency, especially in bibliographic information retrieval systems. We can estimate the number of actual scientific...With the increasing of data on the internet, data analysis has become inescapable to gain time and efficiency, especially in bibliographic information retrieval systems. We can estimate the number of actual scientific journals points to around 40</span></span><span style="font-family:Verdana;"><span style="font-family:Verdana;"><span style="font-family:Verdana;">,</span></span></span><span><span><span style="font-family:""><span style="font-family:Verdana;">000 with about four million articles published each year. Machine learning and deep learning applied to recommender systems had become unavoidable whether in industry or in research. In this current, we propose an optimized interface for bibliographic information retrieval as a </span><span style="font-family:Verdana;">running example, which allows different kind of researchers to find their</span><span style="font-family:Verdana;"> needs following some relevant criteria through natural language understanding. Papers indexed in Web of Science and Scopus are in high demand. Natural language including text and linguistic-based techniques, such as tokenization, named entity recognition, syntactic and semantic analysis, are used to express natural language queries. Our Interface uses association rules to find more related papers for recommendation. Spanning trees are challenged to optimize the search process of the system.展开更多
In order to make effective use a large amount of graduate data in colleges and universities that accumulate by teaching management of work, the paper study the data mining for higher vocational graduates database usin...In order to make effective use a large amount of graduate data in colleges and universities that accumulate by teaching management of work, the paper study the data mining for higher vocational graduates database using the data mining technology. Using a variety of data preprocessing methods for the original data, and the paper put forward to mining algorithm based on commonly association rule Apriori algorithm, then according to the actual needs of the design and implementation of association rule mining system, has been beneficial to the employment guidance of college teaching management decision and graduates of the mining results.展开更多
Since the implementation of the transportation power strategy, China’s transportation industry has developed rapidly, yet the number of road traffic accidents has remained high in recent years. Many scholars have inv...Since the implementation of the transportation power strategy, China’s transportation industry has developed rapidly, yet the number of road traffic accidents has remained high in recent years. Many scholars have investigated the factors influencing traffic accidents to find the underlying mechanisms, thereby enhancing road traffic safety. Compared to general accidents, the factors influencing major road traffic accidents are more complex. This study focuses on examining the relationships between factors affecting major road traffic accidents. Data on 968 major road traffic accidents from 2012 to 2018 in China were collected and organized. The accident information fields were analyzed to identify seven attributes: accident province, accident region, accident quarter, accident time, accident form, accident vehicle, and weather condition. The Apriori association rule algorithm was employed to mine and solve the strong association rules between accident attribute values. The associations between different influencing factors and the form of accident results were analyzed, with a deeper exploration of three-factor and four-factor rules. The results indicate that certain causal factors jointly contribute to major accidents, particularly in the western region, represented by Guangxi. These accidents mainly involved trucks and occurred in rainy and snowy weather during the first quarter. The conclusions of this research can provide the transportation management department with measures to improve urban road traffic safety and reduce the occurrence of traffic accidents.展开更多
Today,the customer’s requirements are entirely transformed.Many big retail organizations are facing sudden decline in the sales and revenues caused due to indecisive and erratic purchasing habits of recent generation...Today,the customer’s requirements are entirely transformed.Many big retail organizations are facing sudden decline in the sales and revenues caused due to indecisive and erratic purchasing habits of recent generation of users,as they get abundant preferred information such as cheaper rates,amazing offers,discounts,comparison of similar products,etc.over their smartphones or laptops hence they straightaway place order instead of walking down to showroom.As a result,large companies such as Tesco,Wal-Mart,Target,etc.have realized that it is requisite to shake hands with startup firms which already supports platform to retain customers either via deep exploration of transactional data or by offering lucrative offers in the benefit of customer and to promote market basket.The data which are generated from consumer purchase pattern,Big Data is a concern for companies as a result various big retail organizations are applying advanced and scalable data mining algorithms to precisely store and evaluate data in real-time manner to boost market basket analysis.This research work discusses various improved association rule mining(ARM)algorithms.The objective of this study is to identify gaps,providing opportunities for new research,to recognize expansion of Big Data analytics with retail environment and its future directions.This paper assimilates various aspects of parallel ARM algorithm for market basket analysis against sequential and distributed nature which are further escalated to Hadoop and MapReduce computing platform.Further various use cases highlighting the need of‘Big Data Retail Analytics’are discussed for emerging trends to promote sales and revenues,to keep check on competitor’s websites,comparison of various brands,enticing new customers.展开更多
With the development of wireless networks, the amount of multiple services increased sharply in recent years. High quality multiple services with low price are urgently needed especially in new generation mobile commu...With the development of wireless networks, the amount of multiple services increased sharply in recent years. High quality multiple services with low price are urgently needed especially in new generation mobile communication systems, e.g., 3G/LTE networks. It is important to enhance the availability of data service resources. Services have strong association which are used by clients with similar behavior habits in networks. Such feature results in service behavior convergence (SBC) and its utilization will enhance resource efficiency. This paper proposes two applications of service behavior: service prediction and a scheduling algorithm which enhances bandwidth efficiency. Convergence cells are classified according to SBC and hot-spot services are broadcasted separately in each convergence cell. It is demonstrated by stimulation that the bandwidth is saved 80% more than classical cellular system and nearly 20% more than traditional broadcasting system.展开更多
文摘By analyzing the correlation between courses in students’grades,we can provide a decision-making basis for the revision of courses and syllabi,rationally optimize courses,and further improve teaching effects.With the help of IBM SPSS Modeler data mining software,this paper uses Apriori algorithm for association rule mining to conduct an in-depth analysis of the grades of nursing students in Shandong College of Traditional Chinese Medicine,and to explore the correlation between professional basic courses and professional core courses.Lastly,according to the detailed analysis of the mining results,valuable curriculum information will be found from the actual teaching data.
文摘With the gradual acceleration of information construction in colleges and universities,digital campus and smart campus have gradually become important means for colleges and universities to scientifically manage the campus.They have been applied to teaching,scientific research,student management,and other fields,improving the quality and efficiency of management.This paper mainly studies the intelligent educational administration management system based on data mining technology.Firstly,this paper introduces the application process of data mining technology,and builds an intelligent educational administration management system based on data mining technology.Then,this paper optimizes the application of the Apriori algorithm in educational administration management through transaction compression and frequent sampling.Compared with the traditional Apriori algorithm,the optimized Apriori algorithm in this paper has a shorter execution time under the same minimum support.
基金National Key Research and Development Program of China(Grant No.2020YFB1710300)National Natural Science Foundation of China(Grant No.52005042)+2 种基金National Defense Fundamental Research Foundation of China(Grant No.JCKY2020203B039)Equipment Pre-research Foundation of China(Grant No.80923010101)Beijing Institute of Technology Research Fund Program for Young Scholars.
文摘The assembly process of aerospace products such as satellites and rockets has the characteristics of single-or small-batch production,a long development period,high reliability,and frequent disturbances.How to predict and avoid quality abnormalities,quickly locate their causes,and improve product assembly quality and efficiency are urgent engineering issues.As the core technology to realize the integration of virtual and physical space,digital twin(DT)technology can make full use of the low cost,high efficiency,and predictable advantages of digital space to provide a feasible solution to such problems.Hence,a quality management method for the assembly process of aerospace products based on DT is proposed.Given that traditional quality control methods for the assembly process of aerospace products are mostly post-inspection,the Grey-Markov model and T-K control chart are used with a small sample of assembly quality data to predict the value of quality data and the status of an assembly system.The Apriori algorithm is applied to mine the strong association rules related to quality data anomalies and uncontrolled assembly systems so as to solve the issue that the causes of abnormal quality are complicated and difficult to trace.The implementation of the proposed approach is described,taking the collected centroid data of an aerospace product’s cabin,one of the key quality data in the assembly process of aerospace products,as an example.A DT-based quality management system for the assembly process of aerospace products is developed,which can effectively improve the efficiency of quality management for the assembly process of aerospace products and reduce quality abnormalities.
文摘A feature extraction, which means extracting the representative words from a text, is an important issue in text mining field. This paper presented a new Apriori and N-gram based Chinese text feature extraction method, and analyzed its correctness and performance. Our method solves the question that the exist extraction methods cannot find the frequent words with arbitrary length in Chinese texts. The experimental results show this method is feasible.
基金Funded by the National 973 Project(No.2003CB415205).
文摘A method for mining frequent itemsets by evaluating their probability of supports based on asso-ciation analysis is presented.This paper obtains the probability of every 1-itemset by scanning the database,then evaluates the probability of every 2-itemset,every 3-itemset,every k-itemset from the frequent 1-itemsets and gains all the candidate frequent itemsets.This paper also scans the database for verifying the support of the candidate frequent itemsets.Last,the frequent itemsets are mined.The method reduces a lot of time of scanning database and shortens the computation time of the algorithm.
基金Supported by Australian Research Council Discovery(DP130102691)the National Science Foundation of China(61302157)+1 种基金China National 863 Project(2012AA12A308)China Pre-research Project of Nuclear Industry(FZ1402-08)
文摘It is a key challenge to exploit the label coupling relationship in multi-label classification(MLC)problems.Most previous work focused on label pairwise relations,in which generally only global statistical information is used to analyze the coupled label relationship.In this work,firstly Bayesian and hypothesis testing methods are applied to predict the label set size of testing samples within their k nearest neighbor samples,which combines global and local statistical information,and then apriori algorithm is used to mine the label coupling relationship among multiple labels rather than pairwise labels,which can exploit the label coupling relations more accurately and comprehensively.The experimental results on text,biology and audio datasets shown that,compared with the state-of-the-art algorithm,the proposed algorithm can obtain better performance on 5 common criteria.
基金Budget Foundation of Shanghai University of TCM(A1-GY010130)Philosophy and Social Science Foundation of Shanghai(2019BTQ005)。
文摘Objective:To analyze misdiagnosis features in clinical cases of“Classified Medical Cases of Famous Physicians”and“Supplement to Classified Case Records of Celebrated Physicians.”Materials and Methods:Two hundred and five ancient misdiagnosed cases were analyzed in aspects of locations(exterior-interior type,qi-blood type and Zang‑Fu organs type)and patterns(heat-cold type and deficiency-excess type)by Apriori Algorithm Method.Results:The main types of misdiagnosis in those medical casesare as follows::Zang‑Fu location misjudgment,misjudging the interior as the exterior,misjudging deficiency pattern as excess pattern,and misjudging cold pattern as heat pattern.Among them,the most outstanding type is the misjudgment of deficiency–cold pattern as excess–heat pattern.Conclusions:(1)Accurate judgment of location and differentiation of deficiency and excess patterns are the key points in diagnosing the diseases correctly.The confusion of true deficiency–cold and pseudo‑excess–heat pattern should be taken seriously.(2)Data mining on ancient clinical cases offers a new methodology for assisting clinical diagnosis of traditional Chinese medicine.
文摘This paper aims to mine the knowledge and rules on compatibility of drugs from the prescriptions for curing arrhythmia in the Chinese traditional medicine database by Apriori algorithm. For data preparation, 1 113 prescriptions for arrhythmia, including 535 herbs ( totally 10884 counts of herbs) were collected into the database. The prescription data were preprocessed through redundancy reduction, normalized storage, and knowledge induction according to the pretreatment demands of data mining. Then the Apriori algorithm was used to analyze the data and form the related technical rules and treatment procedures. The experimental result of compatibility of drugs for curing arrhythmia from the Chinese traditional medicine database shows that the prescription compatibility obtained by Apriori algorithm generally accords with the basic law of traditional Chinese medicine for arrhythmia. Some special compatibilities unreported were also discovered in the experiment, which may be used as the basis for developing new prescriptions for arrhythmia.
文摘Investigations towards studying terrorist activities have recently attracted a great amount of research interest. In this paper, we investigate the use of the Apriori algorithm on the Global Terrorism Database (GTD) for forensic investigation purposes. Recently, the Apriori algorithm, which could be considered a forensic tool</span><span style="font-family:Verdana;">,</span><span style="font-family:Verdana;"> has been used to study terrorist activities and patterns across the world. As such, our motivation is to utilise the Apriori algorithm approach on the GTD to study terrorist activities and the areas/states in Nigeria with high frequencies of terrorist activities. We observe that the most preferred method of terrorist attacks in Nigeria is through armed assault. Again, our experiment shows that attacks in Nigeria are mostly successful. Also, we observe from our investigations that most terrorists in Nigeria are not suicidal. The main application of this work can be used by forensic experts to assist law enforcement agencies in decision making when handling terrorist attacks in Nigeria</span><span style="font-family:Verdana;">. </p>
基金supported by the Science and Technology Project of China Southern Power Grid(GZHKJXM20210043-080041KK52210002).
文摘Traditional distribution network planning relies on the professional knowledge of planners,especially when analyzing the correlations between the problems existing in the network and the crucial influencing factors.The inherent laws reflected by the historical data of the distribution network are ignored,which affects the objectivity of the planning scheme.In this study,to improve the efficiency and accuracy of distribution network planning,the characteristics of distribution network data were extracted using a data-mining technique,and correlation knowledge of existing problems in the network was obtained.A data-mining model based on correlation rules was established.The inputs of the model were the electrical characteristic indices screened using the gray correlation method.The Apriori algorithm was used to extract correlation knowledge from the operational data of the distribution network and obtain strong correlation rules.Degree of promotion and chi-square tests were used to verify the rationality of the strong correlation rules of the model output.In this study,the correlation relationship between heavy load or overload problems of distribution network feeders in different regions and related characteristic indices was determined,and the confidence of the correlation rules was obtained.These results can provide an effective basis for the formulation of a distribution network planning scheme.
文摘Maximum frequent pattern generation from a large database of transactions and items for association rule mining is an important research topic in data mining. Association rule mining aims to discover interesting correlations, frequent patterns, associations, or causal structures between items hidden in a large database. By exploiting quantum computing, we propose an efficient quantum search algorithm design to discover the maximum frequent patterns. We modified Grover’s search algorithm so that a subspace of arbitrary symmetric states is used instead of the whole search space. We presented a novel quantum oracle design that employs a quantum counter to count the maximum frequent items and a quantum comparator to check with a minimum support threshold. The proposed derived algorithm increases the rate of the correct solutions since the search is only in a subspace. Furthermore, our algorithm significantly scales and optimizes the required number of qubits in design, which directly reflected positively on the performance. Our proposed design can accommodate more transactions and items and still have a good performance with a small number of qubits.
文摘The market trends rapidly changed over the last two decades.The primary reason is the newly created opportunities and the increased number of competitors competing to grasp market share using business analysis techniques.Market Basket Analysis has a tangible effect in facilitating current change in the market.Market Basket Analysis is one of the famous fields that deal with Big Data and Data Mining applications.MBA initially uses Association Rule Learning(ARL)as a mean for realization.ARL has a beneficial effect in providing a plenty benefit in analyzing the market data and understanding customers’behavior.An important motive of using such techniques is maximizing the business profit as well as matching the exact customer needs as closely as possible.In this survey paper,we discussed several applications and methods of MBA based on ARL.Also,we reviewed some association rule learning measurements including trust,lift,leverage,and others.Furthermore,we discuss some open issues and future topics in the area of market basket analysis and association rule learning.
基金Deputyship for Research&Innovation,Ministry of Education in Saudi Arabia for funding this research work through the Project Number RI-44-0444.
文摘Datamining plays a crucial role in extractingmeaningful knowledge fromlarge-scale data repositories,such as data warehouses and databases.Association rule mining,a fundamental process in data mining,involves discovering correlations,patterns,and causal structures within datasets.In the healthcare domain,association rules offer valuable opportunities for building knowledge bases,enabling intelligent diagnoses,and extracting invaluable information rapidly.This paper presents a novel approach called the Machine Learning based Association Rule Mining and Classification for Healthcare Data Management System(MLARMC-HDMS).The MLARMC-HDMS technique integrates classification and association rule mining(ARM)processes.Initially,the chimp optimization algorithm-based feature selection(COAFS)technique is employed within MLARMC-HDMS to select relevant attributes.Inspired by the foraging behavior of chimpanzees,the COA algorithm mimics their search strategy for food.Subsequently,the classification process utilizes stochastic gradient descent with a multilayer perceptron(SGD-MLP)model,while the Apriori algorithm determines attribute relationships.We propose a COA-based feature selection approach for medical data classification using machine learning techniques.This approach involves selecting pertinent features from medical datasets through COA and training machine learning models using the reduced feature set.We evaluate the performance of our approach on various medical datasets employing diverse machine learning classifiers.Experimental results demonstrate that our proposed approach surpasses alternative feature selection methods,achieving higher accuracy and precision rates in medical data classification tasks.The study showcases the effectiveness and efficiency of the COA-based feature selection approach in identifying relevant features,thereby enhancing the diagnosis and treatment of various diseases.To provide further validation,we conduct detailed experiments on a benchmark medical dataset,revealing the superiority of the MLARMCHDMS model over other methods,with a maximum accuracy of 99.75%.Therefore,this research contributes to the advancement of feature selection techniques in medical data classification and highlights the potential for improving healthcare outcomes through accurate and efficient data analysis.The presented MLARMC-HDMS framework and COA-based feature selection approach offer valuable insights for researchers and practitioners working in the field of healthcare data mining and machine learning.
文摘Data mining techniques offer great opportunities for developing ethics lines whose main aim is to ensure improvements and compliance with the values, conduct and commitments making up the code of ethics. The aim of this study is to suggest a process for exploiting the data generated by the data generated and collected from an ethics line by extracting rules of association and applying the Apriori algorithm. This makes it possible to identify anomalies and behaviour patterns requiring action to review, correct, promote or expand them, as appropriate.
文摘In this paper,association rule mining algorithm is utilized to analyze the correlations of various factors of causing traffic accidents,from which the relationship model of dangerous driving behaviors is established.In this model,the factors and their correlations include:ability of risk control,ability of driving self-confidence,individual characteristics,and incorrect driving operations.By selecting the drivers in the city of Chengdu to be the objects of investigation,a group of valid sample data is obtained.Based on these data,the Support and Confidence for association rules are analyzed.In the analysis,the two stage computing of Apriori algorithm programming is simulated,and from which some important rules are obtained.With these rules,departments of traffic administration can focus on these key factors in their processing of traffic transactions.By the training of drivers’skills and their physical and mental behaviors,the incorrect driving operations can be greatly reduced and the traffic safety can be effectively guaranteed.
文摘With the increasing of data on the internet, data analysis has become inescapable to gain time and efficiency, especially in bibliographic information retrieval systems. We can estimate the number of actual scientific journals points to around 40</span></span><span style="font-family:Verdana;"><span style="font-family:Verdana;"><span style="font-family:Verdana;">,</span></span></span><span><span><span style="font-family:""><span style="font-family:Verdana;">000 with about four million articles published each year. Machine learning and deep learning applied to recommender systems had become unavoidable whether in industry or in research. In this current, we propose an optimized interface for bibliographic information retrieval as a </span><span style="font-family:Verdana;">running example, which allows different kind of researchers to find their</span><span style="font-family:Verdana;"> needs following some relevant criteria through natural language understanding. Papers indexed in Web of Science and Scopus are in high demand. Natural language including text and linguistic-based techniques, such as tokenization, named entity recognition, syntactic and semantic analysis, are used to express natural language queries. Our Interface uses association rules to find more related papers for recommendation. Spanning trees are challenged to optimize the search process of the system.
文摘In order to make effective use a large amount of graduate data in colleges and universities that accumulate by teaching management of work, the paper study the data mining for higher vocational graduates database using the data mining technology. Using a variety of data preprocessing methods for the original data, and the paper put forward to mining algorithm based on commonly association rule Apriori algorithm, then according to the actual needs of the design and implementation of association rule mining system, has been beneficial to the employment guidance of college teaching management decision and graduates of the mining results.
基金supported by the National Natural Science Foundation of China(Grant Nos.72288101,72331001,72361137003)the Talent Fund of Beijing Jiaotong University(GrantNo.2023XKRC036).
文摘Since the implementation of the transportation power strategy, China’s transportation industry has developed rapidly, yet the number of road traffic accidents has remained high in recent years. Many scholars have investigated the factors influencing traffic accidents to find the underlying mechanisms, thereby enhancing road traffic safety. Compared to general accidents, the factors influencing major road traffic accidents are more complex. This study focuses on examining the relationships between factors affecting major road traffic accidents. Data on 968 major road traffic accidents from 2012 to 2018 in China were collected and organized. The accident information fields were analyzed to identify seven attributes: accident province, accident region, accident quarter, accident time, accident form, accident vehicle, and weather condition. The Apriori association rule algorithm was employed to mine and solve the strong association rules between accident attribute values. The associations between different influencing factors and the form of accident results were analyzed, with a deeper exploration of three-factor and four-factor rules. The results indicate that certain causal factors jointly contribute to major accidents, particularly in the western region, represented by Guangxi. These accidents mainly involved trucks and occurred in rainy and snowy weather during the first quarter. The conclusions of this research can provide the transportation management department with measures to improve urban road traffic safety and reduce the occurrence of traffic accidents.
文摘Today,the customer’s requirements are entirely transformed.Many big retail organizations are facing sudden decline in the sales and revenues caused due to indecisive and erratic purchasing habits of recent generation of users,as they get abundant preferred information such as cheaper rates,amazing offers,discounts,comparison of similar products,etc.over their smartphones or laptops hence they straightaway place order instead of walking down to showroom.As a result,large companies such as Tesco,Wal-Mart,Target,etc.have realized that it is requisite to shake hands with startup firms which already supports platform to retain customers either via deep exploration of transactional data or by offering lucrative offers in the benefit of customer and to promote market basket.The data which are generated from consumer purchase pattern,Big Data is a concern for companies as a result various big retail organizations are applying advanced and scalable data mining algorithms to precisely store and evaluate data in real-time manner to boost market basket analysis.This research work discusses various improved association rule mining(ARM)algorithms.The objective of this study is to identify gaps,providing opportunities for new research,to recognize expansion of Big Data analytics with retail environment and its future directions.This paper assimilates various aspects of parallel ARM algorithm for market basket analysis against sequential and distributed nature which are further escalated to Hadoop and MapReduce computing platform.Further various use cases highlighting the need of‘Big Data Retail Analytics’are discussed for emerging trends to promote sales and revenues,to keep check on competitor’s websites,comparison of various brands,enticing new customers.
基金supported by the Joint Funds of NSFC Guangdong (U1035001)the National Natural Science Foundation of China (61001117)+2 种基金the National Basic Research Program of China (2007CB310602)the State Major Science and Technology Special Projects (2010ZX03005-003, 2009ZX03007-004,2010ZX03003-001-01)the Specialized Research Fund for the Doctoral Program of Higher Education (200800131015)
文摘With the development of wireless networks, the amount of multiple services increased sharply in recent years. High quality multiple services with low price are urgently needed especially in new generation mobile communication systems, e.g., 3G/LTE networks. It is important to enhance the availability of data service resources. Services have strong association which are used by clients with similar behavior habits in networks. Such feature results in service behavior convergence (SBC) and its utilization will enhance resource efficiency. This paper proposes two applications of service behavior: service prediction and a scheduling algorithm which enhances bandwidth efficiency. Convergence cells are classified according to SBC and hot-spot services are broadcasted separately in each convergence cell. It is demonstrated by stimulation that the bandwidth is saved 80% more than classical cellular system and nearly 20% more than traditional broadcasting system.