Objective: To explore the core acupoints and pattern-adapted acupoint combination rules for autism spectrum disorder (ASD) complicated with sleep disorder using clinical data mining technology. Methods: A retrospective analysis was conducted on the diagnosis and treatment data of 104 children with ASD complicated with sleep disorder admitted to Xi’an Traditional Chinese Medicine (TCM) Encephalopathy Hospital from January 2022 to December 2024. Cross-pattern main acupoints were screened via frequency statistics, chi-square test, and factor analysis; pattern-specific auxiliary acupoints were extracted by combining multiple correspondence analysis, cluster analysis, and association rule mining. Results: Ten cross-pattern main acupoints (Baihui, Sishenzhen, Language Area 1, Language Area 2, Neiguan, Shenmen, Yongquan, Xuanzhong) were identified, and acupoint combination schemes for four major TCM patterns (Hyperactivity of Liver and Heart Fire, Deficiency of Kidney Essence, Deficiency of Both Heart and Spleen, Hyperactivity of Liver with Spleen Deficiency) were established. Conclusion: Acupuncture treatment should follow the principle of “regulating spirit and calming the brain as the root, and dredging collaterals based on pattern differentiation as the branch”. The synergy between main and auxiliary acupoints can accurately regulate the disease, providing a basis for precise clinical treatment.
Introduction: Neurosurgical emergencies such as spontaneous intracerebral hemorrhage (ICH), traumatic brain injury (TBI), and acute brain herniation are among the most time-sensitive and high-stakes conditions in modern medicine. Clinical decisions often must be made within minutes, yet these decisions are traditionally guided by limited information, heuristic reasoning, and past experience. In this context, the rise of medical data mining and real-time analytics offers a transformative opportunity: to extract actionable intelligence from the flood of clinical, imaging, and physiological data already being collected, and to use this intelligence to guide care in real time [1–3] (Figure 1).
Previous weighted frequent pattern (WFP) mining algorithms are not suitable for data streams because they need multiple database scans. In this paper, we present an efficient algorithm, SWFP-Miner, to mine weighted frequent patterns over data streams. SWFP-Miner is based on a sliding window and can discover important frequent patterns from recent data. A new refined weight definition is proposed to keep the downward closure property, and two pruning strategies are presented to prune weighted infrequent patterns. Experimental studies are performed to evaluate the effectiveness and efficiency of SWFP-Miner.
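As a rough illustration of the sliding-window idea, the sketch below keeps only the most recent transactions and scores each pattern by a weighted support. It is not SWFP-Miner: the abstract does not give the refined weight definition or the two pruning strategies, so the average-item-weight score, the brute-force enumeration, and all names (`weighted_frequent_patterns`, `min_wsup`, `max_len`) are assumptions made for this example.

```python
from collections import deque
from itertools import combinations


def weighted_frequent_patterns(stream, weights, window_size, min_wsup, max_len=2):
    """Return {pattern: weighted support} over the last `window_size` transactions.

    Assumption for this sketch: pattern weight = mean of its item weights,
    weighted support = pattern weight * (occurrences / window length).
    """
    # A bounded deque drops the oldest transaction automatically,
    # which is what makes this a sliding window.
    window = deque(maxlen=window_size)
    for transaction in stream:
        window.append(frozenset(transaction))

    # Brute-force count of every pattern up to max_len items (no pruning).
    counts = {}
    for transaction in window:
        for k in range(1, max_len + 1):
            for pattern in combinations(sorted(transaction), k):
                counts[pattern] = counts.get(pattern, 0) + 1

    result = {}
    for pattern, count in counts.items():
        pattern_weight = sum(weights[item] for item in pattern) / len(pattern)
        wsup = pattern_weight * count / len(window)
        if wsup >= min_wsup:
            result[pattern] = wsup
    return result


# Toy stream with hypothetical items and weights.
stream = [["a", "b"], ["a", "c"], ["a", "b"], ["b", "c"]]
weights = {"a": 1.0, "b": 0.5, "c": 0.2}
found = weighted_frequent_patterns(stream, weights, window_size=3, min_wsup=0.24)
print(sorted(found))
```

With an average-weight score, a superset can score higher than one of its subsets, so plain Apriori-style pruning is not safe here; that is the kind of problem a weight definition preserving downward closure is meant to address.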
Different acupuncture-moxibustion therapies can produce different clinical effects; that is, the effect is specific to the therapy, which is important for achieving acupuncture-moxibustion efficacy. In this study, based on the results of data mining, the clinical application laws of fire needle, acupoint injection, catgut embedment, acupoint application, moxibustion therapy, and filiform needle acupuncture were summarized in terms of disease category, efficacy, and related prescriptions (such as medication and acupoint selection), and the generally applicable disease categories of each acupuncture-moxibustion treatment method were further screened, so as to guide clinical application and achieve the best efficacy.
Knowledge Discovery in Databases is gaining attention and raising new hopes for traditional Chinese medicine (TCM) researchers. It is a useful tool for understanding and deciphering TCM theories. Aiming for a better understanding of Chinese herbal property theory (CHPT), this paper applied an improved association rule learning method to analyze semistructured text in the book entitled Shennong's Classic of Materia Medica. The text was first annotated and transformed into well-structured multidimensional data. Subsequently, an Apriori algorithm was employed to produce association rules after a sensitivity analysis of the parameters. From the 120 confirmed rules that described the intrinsic relationships between herbal property (qi, flavor, and their combinations) and herbal efficacy, two novel fundamental principles underlying CHPT were acquired and further elucidated: (1) the many-to-one mapping of herbal efficacy to herbal property; (2) the nonrandom overlap between the related efficacy of qi and flavor. This work provides new knowledge about CHPT, which should be helpful for its modern research.
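The quantities that an Apriori run thresholds on can be sketched as follows. This is a minimal illustration of support, confidence, and lift for a single rule, not the paper's improved method; the function name `rule_metrics` and the toy records are hypothetical and are not taken from Shennong's Classic of Materia Medica.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Support, confidence and lift of the rule antecedent -> consequent."""
    antecedent, consequent = set(antecedent), set(consequent)
    n = len(transactions)
    # Count records containing the antecedent, the consequent, and both.
    n_ante = sum(1 for t in transactions if antecedent <= set(t))
    n_cons = sum(1 for t in transactions if consequent <= set(t))
    n_both = sum(1 for t in transactions if (antecedent | consequent) <= set(t))

    support = n_both / n
    confidence = n_both / n_ante if n_ante else 0.0
    lift = confidence / (n_cons / n) if n_cons else 0.0
    return support, confidence, lift


# Toy annotated records (illustrative labels only): property tags plus one efficacy tag.
records = [
    {"qi:warm", "flavor:pungent", "efficacy:E1"},
    {"qi:warm", "flavor:sweet", "efficacy:E2"},
    {"qi:cold", "flavor:bitter", "efficacy:E3"},
    {"qi:warm", "flavor:pungent", "efficacy:E1"},
]
print(rule_metrics(records, {"qi:warm", "flavor:pungent"}, {"efficacy:E1"}))
```

In a full Apriori pipeline, minimum support and minimum confidence are the parameters one would typically vary in a sensitivity analysis like the one the abstract mentions.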
Unstable angina (UA) is the most dangerous type of coronary heart disease (CHD), causing increasing mortality and morbidity worldwide. Identification of biomarkers for UA at the level of proteomics and metabolomics is a better avenue to understand its inner mechanism. Feature-selection-based data mining is well suited to identifying biomarkers of UA. In this study, we carried out a clinical epidemiological study to collect plasma from UA in-patients and controls. Proteomics and metabolomics data were obtained via two-dimensional difference gel electrophoresis and gas chromatography techniques. We present a novel computational strategy to select as few biomarkers as possible for UA in the two groups of data. First, a decision tree was used to select biomarkers for UA, and 3-fold cross validation was used to evaluate computational performances for the three methods. Alternatively, we combined the independent t test and a classification-based data mining method with a backward elimination technique to select as few protein and metabolite biomarkers as possible with the best classification performance. With this method, we selected 6 proteins and 5 metabolites for UA. The novel method presented here provides better insight into the pathology of a disease.
OBJECTIVE: We applied data mining techniques to the study of acupuncture as a treatment for juvenile myopia, with the aim of identifying hidden patterns in the data. METHODS: Fifty patients with juvenile myopia were selected and treated with acupuncture, and data mining was used to analyze the effects of treatment and the influence of behavioral variables. Clustering analysis was used to divide myopia patients into two classes before acupuncture treatment. An artificial neural network with the BP algorithm was adopted to analyze the roles of different factors in changes in diopters. An association algorithm was used to analyze factors associated with the subjective experience of acupuncture and average diopter. RESULTS: The two-class clustering results were fully consistent with the understanding of the ophthalmic community. The daily duration of using the Internet and watching TV was the main factor that affected vision. Acupuncture sensation and therapeutic effect were strongly correlated. An acupuncture experience score of good or above could slow the progression of juvenile myopia. CONCLUSION: Applying data mining to data collected from patients with juvenile myopia can extract hidden rules and knowledge from the research evidence. Decision support can be provided to improve the effects of the doctor's clinical acupuncture treatment.
For multi-mode radar working in the modern electronic battlefield, different working states of one single radar are prone to being classified as multiple emitters when traditional classification methods are adopted to process intercepted signals, which has a negative effect on signal classification. A classification method based on spatial data mining is presented to address the above challenge. Inspired by the idea of spatial data mining, the classification method applies a nuclear field to depict the distribution information of pulse samples in feature space, and digs out the hidden cluster information by analyzing distribution characteristics. In addition, a membership-degree criterion to quantify the correlation among all classes is established, which ensures the classification accuracy of signal samples. Numerical experiments show that the presented method can effectively prevent different working states of a multi-mode emitter from being classified as several emitters, and achieves higher classification accuracy.
Tanshinone IIA is a pharmacologically active compound isolated from Danshen (Salvia miltiorrhiza), a traditional Chinese herbal medicine for the management of cardiac diseases and other disorders. However, its underlying molecular mechanisms of action are still unclear. The present investigation utilized a data mining approach based on network pharmacology to uncover the potential protein targets of Tanshinone IIA. Network pharmacology, an integrated multidisciplinary study, incorporates systems biology, network analysis, connectivity, redundancy, and pleiotropy, providing powerful new tools and insights into elucidating the fine details of drug-target interactions. In the present study, two separate drug-target networks for Tanshinone IIA were constructed using the Agilent Literature Search (ALS) and STITCH (search tool for interactions of chemicals) methods. Analysis of the ALS-constructed network revealed a target network with a scale-free topology and five top nodes (protein targets) corresponding to Fos, Jun, Src, phosphatidylinositol-4,5-bisphosphate 3-kinase, catalytic subunit alpha (PIK3CA), and mitogen-activated protein kinase kinase 1 (MAP2K1), whereas analysis of the STITCH-constructed network revealed three top nodes corresponding to cytochrome P450 3A4 (CYP3A4), cytochrome P450 1A1 (CYP1A1), and nuclear factor kappa B1 (NFκB1). The discrepancies were probably due to differences in the computer mining tools and databases employed by the two methods. However, it is conceivable that all eight proteins mediate important biological functions of Tanshinone IIA, contributing to its overall drug-target network. In conclusion, the current results may assist in developing a comprehensive understanding of the molecular mechanisms and signaling pathways of Tanshinone IIA in a simple, compact, and visual manner.
Objective: To analyze the component law of Chinese patent medicines for influenza and develop new anti-influenza prescriptions by unsupervised data mining methods. Methods: Chinese patent medicine recipes for influenza were collected and recorded in a database, and then the correlation coefficients between herbs, the core combinations of herbs, and new prescriptions were analyzed using modified mutual information, complex system entropy clustering, and unsupervised hierarchical clustering, respectively. Results: Based on analysis of 126 Chinese patent medicine recipes, the frequency of occurrence of each herb in these recipes, 54 frequently used herb pairs, and 34 core combinations were determined, and 4 new recipes for influenza were developed. Conclusion: Unsupervised data mining methods are able to mine the component law quickly and develop new prescriptions.
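To give a feel for how a herb-herb correlation can be scored from recipe data, the sketch below computes ordinary mutual information between the presence of two herbs across recipes. The abstract's "modified" mutual information is not specified, so this plain version, the function name `herb_mutual_information`, and the placeholder herb labels are assumptions for illustration only.

```python
import math


def herb_mutual_information(recipes, herb_a, herb_b):
    """Mutual information (in bits) between presence of herb_a and herb_b."""
    n = len(recipes)
    mi = 0.0
    for a_present in (True, False):
        for b_present in (True, False):
            # Empirical joint and marginal probabilities for this cell.
            p_joint = sum(
                1 for r in recipes
                if (herb_a in r) == a_present and (herb_b in r) == b_present
            ) / n
            p_a = sum(1 for r in recipes if (herb_a in r) == a_present) / n
            p_b = sum(1 for r in recipes if (herb_b in r) == b_present) / n
            if p_joint > 0:
                mi += p_joint * math.log2(p_joint / (p_a * p_b))
    return mi


# Placeholder recipes: herbs X and Y always appear together, Z appears alone.
paired = [{"X", "Y"}, {"X", "Y"}, {"Z"}, {"Z"}]
# Here X and Y occur independently of each other.
independent = [{"X", "Y"}, {"X"}, {"Y"}, {"Z"}]
print(herb_mutual_information(paired, "X", "Y"))
print(herb_mutual_information(independent, "X", "Y"))
```

Pairs with high scores would be candidates for the frequently used herb pairs mentioned in the results, subject to whatever threshold the analyst chooses.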
Background: Traditional Chinese medicine (TCM) is becoming a popular complementary approach in pediatric oncology. However, few or no meta-analyses have focused on clinical studies of the use of TCM in pediatric oncology. Objective: We explored the patterns of TCM use and its efficacy in children with cancer, using a systematic review, meta-analysis, and data mining study. Search strategy: We conducted a search of five English (Allied and Complementary Medicine Database, Embase, PubMed, Cochrane Central Register of Controlled Trials, and ClinicalTrials.gov) and four Chinese databases (Wanfang Data, China National Knowledge Infrastructure, Chinese Biomedical Literature Database, and VIP Chinese Science and Technology Periodicals Database) for clinical studies published before October 2021, using keywords related to “pediatric,” “cancer,” and “TCM.” Inclusion criteria: We included studies which were randomized controlled trials (RCTs) or observational clinical studies, focused on patients aged <19 years old who had been diagnosed with cancer, and included at least one group of subjects receiving TCM treatment. Data extraction and analysis: The methodological quality of RCTs and observational studies was assessed using the six-item Jadad scale and the Effective Public Healthcare Panacea Project Quality Assessment Tool, respectively. Meta-analysis was used to evaluate the efficacy of combining TCM with chemotherapy. Study outcomes included the treatment response rate and occurrence of cancer-related symptoms. Association rule mining (ARM) was used to investigate the associations among medicinal herbs and patient symptoms. Results: The fifty-four studies included in this analysis comprised RCTs (63.0%) and observational studies (37.0%). Most RCTs focused on hematological malignancies (41.2%). The study outcomes included chemotherapy-induced toxicities (76.5%), infection rate (35.3%), and response, survival, or relapse rate (23.5%). The methodological quality of most of the RCTs (82.4%) and observational studies (80.0%) was rated as “moderate.” In studies of leukemia patients, adding TCM to conventional treatment significantly improved the clinical response rate (odds ratio [OR] = 2.55; 95% confidence interval [CI] = 1.49-4.36), lowered infection rate (OR = 0.23; 95% CI = 0.13-0.40), and reduced nausea and vomiting (OR = 0.13; 95% CI = 0.08-0.23). ARM showed that Radix Astragali, the most commonly used medicinal herb (58.0%), was associated with treating myelosuppression, gastrointestinal complications, and infection. Conclusion: There is growing evidence that TCM is an effective adjuvant therapy for children with cancer. We proposed a checklist to improve the quality of TCM trials in pediatric oncology. Future work will examine the use of ARM techniques on real-world data to evaluate the efficacy of medicinal herbs and drug-herb interactions in children receiving TCM as a part of integrated cancer therapy.
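For readers unfamiliar with how an odds ratio and its 95% confidence interval are obtained, the sketch below applies the standard log-odds (Woolf) formula to a single 2×2 table. It is not the pooled meta-analytic estimate reported above, which combines several studies; the function name `odds_ratio_ci` and the counts are hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a: treated with event, b: treated without event,
    c: control with event, d: control without event.
    All four cells must be non-zero for this formula.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, not taken from any study in the review.
print(odds_ratio_ci(20, 10, 10, 20))
```

An interval that excludes 1, as in the three results quoted in the abstract, is what supports describing an effect as statistically significant at the 5% level.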
A complex repairable system is composed of thousands of components. Some maintenance management and decision problems require classifying a set of components into several classes based on data mining. Furthermore, as the complexity of industrial equipment increases, it is important for managers to pay more attention to the key components and carry out lean management. Therefore, the idea of "customer segmentation" from "precise marketing" can be used in the maintenance management of multi-component systems. Following the idea of segmentation, the components of a multi-component system are subdivided into groups based on specific attributes relevant to maintenance, such as maintenance cost, mean time between failures, and failure frequency. For each target group of parts, the optimal maintenance policy, health assessment, and maintenance scheduling can be determined. The proposed analysis framework is presented, and a numerical example is given to illustrate the effectiveness of the method.
Privacy is a critical requirement in distributed data mining. Cryptography-based secure multiparty computation is a main approach for privacy preservation. However, it shows poor performance in large-scale distributed systems. Meanwhile, data perturbation techniques are comparatively efficient but are mainly used in centralized privacy-preserving data mining (PPDM). In this paper, we propose a light-weight anonymous data perturbation method for efficient privacy preservation in distributed data mining. We first define the privacy constraints for data-perturbation-based PPDM in a semi-honest distributed environment. Two protocols are proposed to address these constraints and protect data statistics and the randomization process against collusion attacks: the adaptive privacy-preserving summary protocol and the anonymous exchange protocol. Finally, a distributed data perturbation framework based on these protocols is proposed to realize distributed PPDM. Experimental results show that our approach achieves a high security level and is very efficient in a large-scale distributed environment.
Although big data are widely used in various fields, their application is still rare in the study of mining subsidence prediction (MSP) caused by underground mining. Traditional research in MSP has the problem of oversimplifying geological mining conditions, ignoring the spatial fluctuation of rock layers. In the context of geospatial big data, a data-intensive FLAC3D (Fast Lagrangian Analysis of Continua in 3 Dimensions) model based on borehole logs is proposed in this paper. In the modeling process, we developed a method to handle geospatial big data and were able to make full use of borehole logs. The effectiveness of the proposed method was verified by comparing the results of the traditional method, the proposed method, and field observation. The findings show that the proposed method has obvious advantages over the traditional prediction method. The relative error of the maximum surface subsidence predicted by the proposed method decreased by 93.7% and the standard deviation of the prediction results (which was 70 points) decreased by 39.4%, on average. The data-intensive modeling method is of great significance for improving the accuracy of mining subsidence predictions.
Bioinformatic analysis of large and complex omics datasets has become increasingly useful in modern-day biology by providing a great depth of information, with its application to neuroscience termed neuroinformatics. Data mining of omics datasets has enabled the generation of new hypotheses based on differentially regulated biological molecules associated with disease mechanisms, which can be tested experimentally for improved diagnostic and therapeutic targeting of neurodegenerative diseases. Importantly, integrating multi-omics data using a systems bioinformatics approach will advance the understanding of the layered and interactive network of biological regulation that exchanges systemic knowledge to facilitate the development of a comprehensive human brain profile. In this review, we first summarize data mining studies utilizing datasets from individual types of omics analysis, including epigenetics/epigenomics, transcriptomics, proteomics, metabolomics, lipidomics, and spatial omics, pertaining to Alzheimer's disease, Parkinson's disease, and multiple sclerosis. We then discuss multi-omics integration approaches, including independent biological integration and unsupervised integration methods, for more intuitive and informative interpretation of the biological data obtained across different omics layers. We further assess studies that integrate multi-omics in data mining which provide convoluted biological insights and offer proof-of-concept propositions towards systems bioinformatics in the reconstruction of brain networks. Finally, we recommend a combination of high-dimensional bioinformatics analysis with experimental validation to achieve translational neuroscience applications including biomarker discovery, therapeutic development, and elucidation of disease mechanisms. We conclude by providing future perspectives and opportunities in applying integrative multi-omics and systems bioinformatics to achieve precision phenotyping of neurodegenerative diseases and towards personalized medicine.
Anomaly detection has been an active research topic in the field of network intrusion detection for many years. A novel method is presented for anomaly detection based on system calls into the kernels of Unix or Linux systems. The method uses a data mining technique to model the normal behavior of a privileged program and uses a variable-length pattern matching algorithm to compare the current behavior with historic normal behavior, which is more suitable for this problem than the fixed-length pattern matching algorithm proposed by Forrest et al. At the detection stage, the particularity of the audit data is taken into account, and two alternative schemes can be used to distinguish between normal behavior and intrusions. The method gives attention to both computational efficiency and detection accuracy and is especially applicable to on-line detection. The performance of the method is evaluated using a typical testing data set, and the results show that it is significantly better than the anomaly detection method based on hidden Markov models proposed by Yan et al. and the method based on fixed-length patterns proposed by Forrest and Hofmeyr. The novel method has been applied to practical host-based intrusion detection systems and achieved high detection performance.
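The general shape of variable-length matching against a library of normal system-call sequences might look like the sketch below. The abstract does not describe how patterns are mined or how the two detection schemes work, so the n-gram library, the greedy longest-match scan, the function names, and the toy traces are all assumptions for illustration rather than the paper's algorithm.

```python
def build_pattern_library(normal_traces, min_len=1, max_len=4):
    """Collect every call subsequence of length min_len..max_len seen in normal runs."""
    library = set()
    for trace in normal_traces:
        for n in range(min_len, max_len + 1):
            for i in range(len(trace) - n + 1):
                library.add(tuple(trace[i:i + n]))
    return library


def mismatch_rate(trace, library, min_len=1, max_len=4):
    """Fraction of calls that cannot be covered by any known pattern.

    Scans left to right, always trying the longest pattern first
    (greedy variable-length matching).
    """
    if not trace:
        return 0.0
    i = 0
    unmatched = 0
    while i < len(trace):
        for n in range(max_len, min_len - 1, -1):
            if i + n <= len(trace) and tuple(trace[i:i + n]) in library:
                i += n  # consume the whole matched pattern
                break
        else:
            unmatched += 1  # no pattern starts here; skip one call
            i += 1
    return unmatched / len(trace)


# Toy traces with illustrative call names.
library = build_pattern_library([["open", "read", "write", "close"]])
print(mismatch_rate(["open", "read", "write", "close"], library))
print(mismatch_rate(["open", "exec", "socket", "close"], library))
```

A detector built on this idea would flag a trace when its mismatch rate exceeds a threshold chosen from normal-behavior data.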
OBJECTIVE: To analyze the component law of Chinese medicines in fuming-washing therapy for knee osteoarthritis (KOA), and develop new fuming-washing prescriptions for KOA through unsupervised data mining methods. METHODS: Chinese medicine recipes for fuming-washing therapy for KOA were collected and recorded in a database. The correlation coefficient among herbs, core combinations of herbs, and new prescriptions were analyzed using modified mutual information, complex system entropy cluster, and unsupervised hierarchical clustering, respectively. RESULTS: Based on analysis of 345 Chinese medicine recipes for fuming-washing therapy, 68 herbs occurred frequently, 33 herb pairs occurred frequently, and 12 core combinations were found. Five new fuming-washing recipes for KOA were developed. CONCLUSION: Chinese medicines for fuming-washing therapy of KOA mainly consist of wind-dampness-dispelling and cold-dispersing herbs, blood-activating and stasis-resolving herbs, and wind-dampness-dispelling and heat-clearing herbs. The treatment of fuming-washing therapy for KOA also includes dispelling wind-dampness and dispersing cold, activating blood and resolving stasis, and dispelling wind-dampness and clearing heat. Zhenzhutougucao (Herba Speranskiae Tuberculatae), Honghua (Flos Carthami), Niuxi (Radix Achyranthis Bidentatae), Shenjincao (Herba Lycopodii Japonici), Weilingxian (Radix et Rhizoma Clematidis Chinensis), Chuanwu (Radix Aconiti), Haitongpi (Cortex Erythrinae Variegatae), Ruxiang (Olibanum), Danggui (Radix Angelicae Sinensis), Caowu (Radix Aconiti Kusnezoffii), Moyao (Myrrha), and Aiye (Folium Artemisiae Argyi) are the main herbs used in the fuming-washing treatment for KOA.
Data Mining (DM) methods are being increasingly used in prediction with time series data, in addition to traditional statistical approaches. This paper presents a literature review of the use of DM with time series data, focusing on short-term stock prediction. This is an area that has been attracting a great deal of attention from researchers in the field. The main contribution of this paper is to provide an outline of the use of DM with time series data, mainly using examples related to short-term stock prediction. This is important for a better understanding of the field. Some of the main trends and open issues are also introduced.
This paper considers the problem of applying data mining techniques to the aeronautical field. The truncation method, which is one of the techniques in aeronautical data mining, can be used to efficiently handle air-combat behavior data. A technique for air-combat behavior data mining based on the truncation method is proposed to discover air-combat rules or patterns. A simulation platform for air-combat behavior data mining that supports two fighters is implemented. The simulation results show that the proposed air-combat behavior data mining technique based on the truncation method is feasible in terms of both efficiency and effectiveness.
A data mining technique was applied to predict gas disasters in view of the characteristics of coal mine gas disasters and the feature knowledge based on them. Rough set theory was used to establish a data mining model for gas disaster prediction, and the relations among rough set attributes in the prediction model were discussed in order to compensate for the shortcomings of the rough set reduction method by using information entropy criteria. The effectiveness and practicality of data mining technology in the prediction of gas disasters was confirmed through practical application.
Funding (ASD with sleep disorder acupoint study): Song Hujie’s Inheritance Studio of National Renowned Traditional Chinese Medicine Experts.
Funding (acupuncture-moxibustion therapy study): National Natural Science Foundation of China: 81072883, 81173342, 81473773; Scientific Research Project of Hebei Education Department: Z 2014145; Planned Project of Young Talents in Colleges and Universities in Hebei Province: BJ 2014047.
文摘Different acupuncture-moxibustion therapies can produce different clinical effects, that is, the effect has specificity, which is significantly important in obtaining acupuncture-moxibustion efficacy. In this study, the clinical application laws of fire needle, acupoint injection, catgut embedment, acupoint application, moxibustion therapy and filiform needle acupuncture were summarized in the aspects of category of disease, efficacy and related prescriptions (such as medication and acupoint selection) based on the result of data mining, and the general applicable categories of disease of acupuncture-moxibustion treatment methods were further screened, so as to guide the clinical application and give play to the best efficacy.
Abstract: Knowledge Discovery in Databases is gaining attention and raising new hopes for traditional Chinese medicine (TCM) researchers. It is a useful tool for understanding and deciphering TCM theories. Aiming for a better understanding of Chinese herbal property theory (CHPT), this paper applied an improved association rule learning method to analyze semistructured text in the book entitled Shennong's Classic of Materia Medica. The text was first annotated and transformed into well-structured multidimensional data. Subsequently, an Apriori algorithm was employed to produce association rules after a sensitivity analysis of the parameters. From the 120 confirmed rules that described the intrinsic relationships between herbal property (qi, flavor, and their combinations) and herbal efficacy, two novel fundamental principles underlying CHPT were acquired and further elucidated: (1) the many-to-one mapping of herbal efficacy to herbal property; (2) the nonrandom overlap between the related efficacy of qi and flavor. This work provides new knowledge about CHPT, which should be helpful for its modern research.
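The support and confidence thresholds that drive association rule mining can be illustrated with a short sketch. This is a simplification, not the paper's improved method: it mines only one-item-to-one-item rules rather than running the full level-wise Apriori search, and the records, item labels, and thresholds are invented for illustration.

```python
from itertools import permutations


def mine_rules(records, min_support, min_confidence):
    """Return sorted (antecedent, consequent, support, confidence) tuples
    for single-item rules A -> B that pass both thresholds."""
    n = len(records)
    item_count = {}
    pair_count = {}
    for rec in records:
        items = set(rec)
        for item in items:
            item_count[item] = item_count.get(item, 0) + 1
        # Ordered pairs, because A -> B and B -> A have different confidence.
        for a, b in permutations(items, 2):
            pair_count[(a, b)] = pair_count.get((a, b), 0) + 1
    rules = []
    for (a, b), count in pair_count.items():
        support = count / n               # P(A and B)
        confidence = count / item_count[a]  # P(B | A)
        if support >= min_support and confidence >= min_confidence:
            rules.append((a, b, support, confidence))
    return sorted(rules)


# Hypothetical annotated herb records (property and efficacy labels are made up).
records = [
    {"qi:cold", "flavor:bitter", "eff:clear_heat"},
    {"qi:cold", "flavor:bitter", "eff:clear_heat"},
    {"qi:warm", "flavor:pungent", "eff:dispel_cold"},
    {"qi:cold", "flavor:sweet", "eff:clear_heat"},
]
rules = mine_rules(records, min_support=0.5, min_confidence=0.9)
```

Raising or lowering `min_support` and `min_confidence` changes how many rules survive, which is why a parameter sensitivity analysis of the kind the abstract mentions is a natural step before interpreting the rules.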
Funding: Supported by the National Basic Research Program of China (No. 2011CB505106); the National Natural Science Foundation of China (No. 30902020); the Foundation of National Department of Public Benefit Research of China (No. 200807007); the Creation Fund for Significant New Drugs of China (No. 2009ZX09502-018); the Foundation of International Science and Technology Cooperation of China (No. 2008DFA30610)
Abstract: Unstable angina (UA) is the most dangerous type of coronary heart disease (CHD), causing increasing mortality and morbidity worldwide. Identifying biomarkers for UA at the proteomic and metabolomic levels is a promising avenue for understanding its underlying mechanism. Feature-selection-based data mining methods are well suited to identifying biomarkers of UA. In this study, we conducted a clinical epidemiological survey to collect plasma from UA inpatients and controls. Proteomics and metabolomics data were obtained via two-dimensional difference gel electrophoresis and gas chromatography techniques. We present a novel computational strategy to select as few biomarkers as possible for UA in the two groups of data. First, a decision tree was used to select biomarkers for UA, and 3-fold cross-validation was used to evaluate the computational performance of the three methods. Alternatively, we combined the independent t test with a classification-based data mining method and the backward elimination technique to select as few protein and metabolite biomarkers as possible with the best classification performance. With this method, we selected 6 proteins and 5 metabolites for UA. The novel method presented here provides better insight into the pathology of a disease.
Funding: Supported by National Natural Science Foundation grant No. 40976108; Public Projects of Science and Technology Ministry grant No. 201105033
Abstract: OBJECTIVE: We applied data mining techniques to the study of acupuncture as a treatment for juvenile myopia, with the aim of identifying hidden patterns in the data. METHODS: Fifty patients with juvenile myopia were selected and treated with acupuncture, and data mining was used to analyze the effects of treatment and the influence of behavioral variables. Clustering analysis was used to divide myopia patients into two classes before acupuncture treatment. An artificial neural network with the BP algorithm was adopted to analyze the roles of different factors in changes in diopters. An association algorithm was used to analyze factors associated with the subjective experience of acupuncture and average diopter. RESULTS: The two-class clustering results were fully consistent with the understanding of the ophthalmology community. The daily duration of Internet use and TV watching was the main factor that affected vision. The subjective sensation of acupuncture correlated strongly with the therapeutic effect. An acupuncture experience score of good or above could slow the progression of juvenile myopia. CONCLUSION: Applying data mining to data collected from patients with juvenile myopia can extract hidden rules and knowledge from the research evidence. Decision support can then be provided to improve the effects of clinical acupuncture treatment.
Funding: Supported by the National Natural Science Foundation of China (61371172); the International S&T Cooperation Program of China (2015DFR10220); the Ocean Engineering Project of National Key Laboratory Foundation (1213); the Fundamental Research Funds for the Central Universities (HEUCF1608)
Abstract: For a multi-mode radar working in the modern electronic battlefield, different working states of a single radar are prone to being classified as multiple emitters when traditional classification methods are used to process intercepted signals, which has a negative effect on signal classification. A classification method based on spatial data mining is presented to address this challenge. Inspired by the idea of spatial data mining, the classification method applies a nuclear field to depict the distribution information of pulse samples in feature space, and uncovers the hidden cluster information by analyzing distribution characteristics. In addition, a membership-degree criterion that quantifies the correlation among all classes is established, which ensures the classification accuracy of signal samples. Numerical experiments show that the presented method can effectively prevent different working states of a multi-mode emitter from being classified as several emitters, and achieves higher classification accuracy.
Funding: Supported by the Foundation of Zhejiang Province Educational Committee (No. Y201330180)
Abstract: Tanshinone IIA is a pharmacologically active compound isolated from Danshen (Salvia miltiorrhiza), a traditional Chinese herbal medicine for the management of cardiac diseases and other disorders. However, its underlying molecular mechanisms of action are still unclear. The present investigation utilized a data mining approach based on network pharmacology to uncover the potential protein targets of Tanshinone IIA. Network pharmacology, an integrated multidisciplinary study, incorporates systems biology, network analysis, connectivity, redundancy, and pleiotropy, providing powerful new tools and insights into elucidating the fine details of drug-target interactions. In the present study, two separate drug-target networks for Tanshinone IIA were constructed using the Agilent Literature Search (ALS) and STITCH (search tool for interactions of chemicals) methods. Analysis of the ALS-constructed network revealed a target network with a scale-free topology and five top nodes (protein targets) corresponding to Fos, Jun, Src, phosphatidylinositol-4,5-bisphosphate 3-kinase, catalytic subunit alpha (PIK3CA), and mitogen-activated protein kinase kinase 1 (MAP2K1), whereas analysis of the STITCH-constructed network revealed three top nodes corresponding to cytochrome P450 3A4 (CYP3A4), cytochrome P450 1A1 (CYP1A1), and nuclear factor kappa B1 (NFκB1). The discrepancies were probably due to differences in the computer mining tools and databases employed by the two methods. However, it is conceivable that all eight proteins mediate important biological functions of Tanshinone IIA, contributing to its overall drug-target network. In conclusion, the current results may assist in developing a comprehensive understanding of the molecular mechanisms and signaling pathways of Tanshinone IIA in a simple, compact, and visual manner.
Funding: Supported by Scientific Research Special Project of TCM Profession (200907001E); Science and Technology Special Major Project for "Significant New Drugs Formulation" (2009ZX09301-005-02)
Abstract: Objective: To analyze the composition rules of Chinese patent medicines for anti-influenza and develop new anti-influenza prescriptions by unsupervised data mining methods. Methods: Chinese patent medicine recipes for anti-influenza were collected and recorded in a database, and then the correlation coefficients between herbs, the core combinations of herbs, and new prescriptions were analyzed using modified mutual information, complex system entropy clustering, and unsupervised hierarchical clustering, respectively. Results: Based on analysis of 126 Chinese patent medicine recipes, the occurrence frequency of each herb, 54 frequently used herb pairs, and 34 core combinations were determined, and 4 new recipes for influenza were developed. Conclusion: Unsupervised data mining methods are able to mine the composition rules quickly and develop new prescriptions.
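As a rough illustration of how herb-herb correlation can be scored from recipe data, the sketch below computes plain mutual information (in bits) between the presence of two herbs across a set of recipes. This is the textbook definition, not the "modified mutual information" named in the abstract, whose exact form is not given here; the function name and the toy recipes are assumptions.

```python
import math


def mutual_information(recipes, herb_x, herb_y):
    """Mutual information (bits) between presence of herb_x and herb_y."""
    n = len(recipes)
    mi = 0.0
    for x_present in (True, False):
        for y_present in (True, False):
            # Empirical joint and marginal probabilities over the recipes.
            joint = sum(
                1 for r in recipes
                if (herb_x in r) == x_present and (herb_y in r) == y_present
            ) / n
            px = sum(1 for r in recipes if (herb_x in r) == x_present) / n
            py = sum(1 for r in recipes if (herb_y in r) == y_present) / n
            if joint > 0:
                mi += joint * math.log2(joint / (px * py))
    return mi


# Hypothetical recipes with placeholder herb names.
paired = [{"A", "B"}, {"A", "B"}, {"C"}, {"C"}]
independent = [{"A", "B"}, {"A"}, {"B"}, set()]
mi_paired = mutual_information(paired, "A", "B")
mi_independent = mutual_information(independent, "A", "B")
```

One caveat of the plain definition: it is high for any strong dependence, so two herbs that never appear together score as highly as two that always do. A pair-ranking step therefore needs some way to separate co-occurrence from mutual exclusion.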
Abstract: Background: Traditional Chinese medicine (TCM) is becoming a popular complementary approach in pediatric oncology. However, few or no meta-analyses have focused on clinical studies of the use of TCM in pediatric oncology. Objective: We explored the patterns of TCM use and its efficacy in children with cancer, using a systematic review, meta-analysis, and data mining study. Search strategy: We conducted a search of five English (Allied and Complementary Medicine Database, Embase, PubMed, Cochrane Central Register of Controlled Trials, and ClinicalTrials.gov) and four Chinese databases (Wanfang Data, China National Knowledge Infrastructure, Chinese Biomedical Literature Database, and VIP Chinese Science and Technology Periodicals Database) for clinical studies published before October 2021, using keywords related to "pediatric," "cancer," and "TCM." Inclusion criteria: We included studies that were randomized controlled trials (RCTs) or observational clinical studies, focused on patients aged <19 years who had been diagnosed with cancer, and included at least one group of subjects receiving TCM treatment. Data extraction and analysis: The methodological quality of RCTs and observational studies was assessed using the six-item Jadad scale and the Effective Public Healthcare Panacea Project Quality Assessment Tool, respectively. Meta-analysis was used to evaluate the efficacy of combining TCM with chemotherapy. Study outcomes included the treatment response rate and occurrence of cancer-related symptoms. Association rule mining (ARM) was used to investigate the associations among medicinal herbs and patient symptoms. Results: The fifty-four studies included in this analysis comprised RCTs (63.0%) and observational studies (37.0%). Most RCTs focused on hematological malignancies (41.2%). The study outcomes included chemotherapy-induced toxicities (76.5%), infection rate (35.3%), and response, survival, or relapse rate (23.5%). The methodological quality of most of the RCTs (82.4%) and observational studies (80.0%) was rated as "moderate." In studies of leukemia patients, adding TCM to conventional treatment significantly improved the clinical response rate (odds ratio [OR] = 2.55; 95% confidence interval [CI] = 1.49-4.36), lowered infection rate (OR = 0.23; 95% CI = 0.13-0.40), and reduced nausea and vomiting (OR = 0.13; 95% CI = 0.08-0.23). ARM showed that Radix Astragali, the most commonly used medicinal herb (58.0%), was associated with treating myelosuppression, gastrointestinal complications, and infection. Conclusion: There is growing evidence that TCM is an effective adjuvant therapy for children with cancer. We proposed a checklist to improve the quality of TCM trials in pediatric oncology. Future work will examine the use of ARM techniques on real-world data to evaluate the efficacy of medicinal herbs and drug-herb interactions in children receiving TCM as a part of integrated cancer therapy.
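For readers unfamiliar with the statistic reported above, the following sketch shows how an odds ratio and its 95% confidence interval are obtained from a single 2x2 table using the standard log-odds (Woolf) approximation. The counts are invented and do not come from the study, and the pooling across studies that a meta-analysis performs is not shown.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and approximate 95% CI for a 2x2 table.

    a: treated with event      b: treated without event
    c: control with event      d: control without event
    All four cells must be non-zero for this approximation.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of ln(OR) under the Woolf (logit) method.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Hypothetical counts: 30/40 responders with the adjuvant, 20/40 without.
odds_ratio, lower, upper = odds_ratio_ci(30, 10, 20, 20)
```

An interval that lies entirely above 1 (or entirely below 1, as for the infection and nausea outcomes in the abstract) is what supports calling the effect statistically significant at the 5% level.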
Funding: National Natural Science Foundation of China (No. 71501103); Natural Science Foundation of Inner Mongolia, China (No. 2015BS0705); the Program of Higher-Level Talents of Inner Mongolia University, China (No. 20700-5145131)
Abstract: A complex repairable system is composed of thousands of components. Some maintenance management and decision problems require classifying a set of components into several classes based on data mining. Furthermore, as the complexity of industrial equipment increases, it is important for managers to pay more attention to key components and carry out lean management. Therefore, the idea of "customer segmentation" from "precise marketing" can be applied to the maintenance management of multi-component systems. Following the idea of segmentation, the components of a multi-component system are subdivided into groups based on specific attributes relevant to maintenance, such as maintenance cost, mean time between failures, and failure frequency. For the targeted groups of parts, the optimal maintenance policy, health assessment, and maintenance scheduling can then be determined. The proposed analysis framework is presented, and a numerical example is given to illustrate the effectiveness of the method.
Funding: Project supported by the National Natural Science Foundation of China (Nos. 60772098 and 60672068); the New Century Excellent Talents in University of China (No. NCET-06-0393)
Abstract: Privacy is a critical requirement in distributed data mining. Cryptography-based secure multiparty computation is a main approach to privacy preservation. However, it performs poorly in large-scale distributed systems. Meanwhile, data perturbation techniques are comparatively efficient but are mainly used in centralized privacy-preserving data mining (PPDM). In this paper, we propose a lightweight anonymous data perturbation method for efficient privacy preservation in distributed data mining. We first define the privacy constraints for data-perturbation-based PPDM in a semi-honest distributed environment. Two protocols are proposed to address these constraints and protect data statistics and the randomization process against collusion attacks: the adaptive privacy-preserving summary protocol and the anonymous exchange protocol. Finally, a distributed data perturbation framework based on these protocols is proposed to realize distributed PPDM. Experimental results show that our approach achieves a high security level and is very efficient in a large-scale distributed environment.
Abstract: Although big data are widely used in various fields, their application is still rare in the study of mining subsidence prediction (MSP) caused by underground mining. Traditional research in MSP tends to oversimplify geological mining conditions, ignoring the spatial fluctuation of rock layers. In the context of geospatial big data, a data-intensive FLAC3D (Fast Lagrangian Analysis of Continua in 3 Dimensions) model based on borehole logs is proposed in this paper. In the modeling process, we developed a method to handle geospatial big data and were able to make full use of borehole logs. The effectiveness of the proposed method was verified by comparing the results of the traditional method, the proposed method, and field observation. The findings show that the proposed method has obvious advantages over the traditional prediction results. The relative error of the maximum surface subsidence predicted by the proposed method decreased by 93.7%, and the standard deviation of the prediction results (which was 70 points) decreased by 39.4%, on average. The data-intensive modeling method is of great significance for improving the accuracy of mining subsidence predictions.
Funding: supported by a Lee Kong Chian School of Medicine Dean's Postdoctoral Fellowship (021207-00001) from Nanyang Technological University (NTU) Singapore and a Mistletoe Research Fellowship (022522-00001) from the Momental Foundation USA. Jialiu Zeng is supported by a Presidential Postdoctoral Fellowship (021229-00001) from NTU Singapore and an Open Fund Young Investigator Research Grant (OF-YIRG) (MOH-001147) from the National Medical Research Council (NMRC) Singapore. Su Bin Lim is supported by the National Research Foundation (NRF) of Korea (Grant Nos.: 2020R1A6A1A03043539, 2020M3A9D8037604, 2022R1C1C1004756) and a grant of the Korea Health Technology R&D Project through the Korea Health Industry Development Institute (KHIDI), funded by the Ministry of Health & Welfare, Republic of Korea (Grant No.: HR22C1734).
Abstract: Bioinformatic analysis of large and complex omics datasets has become increasingly useful in modern-day biology by providing a great depth of information, with its application to neuroscience termed neuroinformatics. Data mining of omics datasets has enabled the generation of new hypotheses based on differentially regulated biological molecules associated with disease mechanisms, which can be tested experimentally for improved diagnostic and therapeutic targeting of neurodegenerative diseases. Importantly, integrating multi-omics data using a systems bioinformatics approach will advance the understanding of the layered and interactive network of biological regulation that exchanges systemic knowledge to facilitate the development of a comprehensive human brain profile. In this review, we first summarize data mining studies utilizing datasets from individual types of omics analysis, including epigenetics/epigenomics, transcriptomics, proteomics, metabolomics, lipidomics, and spatial omics, pertaining to Alzheimer's disease, Parkinson's disease, and multiple sclerosis. We then discuss multi-omics integration approaches, including independent biological integration and unsupervised integration methods, for more intuitive and informative interpretation of the biological data obtained across different omics layers. We further assess studies that integrate multi-omics in data mining, which provide convoluted biological insights and offer proof-of-concept propositions towards systems bioinformatics in the reconstruction of brain networks. Finally, we recommend a combination of high-dimensional bioinformatics analysis with experimental validation to achieve translational neuroscience applications, including biomarker discovery, therapeutic development, and elucidation of disease mechanisms. We conclude by providing future perspectives and opportunities in applying integrative multi-omics and systems bioinformatics to achieve precision phenotyping of neurodegenerative diseases and towards personalized medicine.
Funding: Supported by the National Grand Fundamental Research "973" Program of China (2004CB318109); the National High-Technology Research and Development Plan of China (2006AA01Z452); the National Information Security "242" Program of China (2005C39).
Abstract: Anomaly detection has been an active research topic in the field of network intrusion detection for many years. A novel method is presented for anomaly detection based on system calls into the kernels of Unix or Linux systems. The method uses a data mining technique to model the normal behavior of a privileged program and uses a variable-length pattern matching algorithm to compare the current behavior with historic normal behavior, which is more suitable for this problem than the fixed-length pattern matching algorithm proposed by Forrest et al. At the detection stage, the particularity of the audit data is taken into account, and two alternative schemes can be used to distinguish between normal behavior and intrusions. The method balances computational efficiency and detection accuracy and is especially applicable to on-line detection. The performance of the method is evaluated using a typical testing data set, and the results show that it is significantly better than the anomaly detection method based on hidden Markov models proposed by Yan et al. and the method based on fixed-length patterns proposed by Forrest and Hofmeyr. The novel method has been applied to practical host-based intrusion detection systems and achieved high detection performance.
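The general idea of matching a system-call trace against variable-length normal patterns can be sketched as follows. This is an assumed, simplified scheme, not the authors' algorithm: normal patterns are taken to be all call subsequences of length 2 to 4 that occur at least twice in training traces, detection matches greedily with the longest pattern first, and the function names, thresholds, and traces are hypothetical.

```python
from collections import Counter


def learn_patterns(traces, min_len=2, max_len=4, min_count=2):
    """Collect call subsequences (as tuples) seen at least min_count times."""
    counts = Counter()
    for trace in traces:
        for n in range(min_len, max_len + 1):
            for i in range(len(trace) - n + 1):
                counts[tuple(trace[i:i + n])] += 1
    return {p for p, c in counts.items() if c >= min_count}


def mismatch_rate(trace, patterns, min_len=2, max_len=4):
    """Fraction of matching steps at which no normal pattern fits."""
    i = 0
    steps = 0
    mismatches = 0
    while i < len(trace):
        steps += 1
        # Try the longest pattern first, then progressively shorter ones.
        for n in range(max_len, min_len - 1, -1):
            if i + n <= len(trace) and tuple(trace[i:i + n]) in patterns:
                i += n
                break
        else:
            mismatches += 1  # no pattern of any length starts here
            i += 1
    return mismatches / steps if steps else 0.0


# Hypothetical traces of system-call names.
normal_traces = [["open", "read", "write", "close"]] * 3
patterns = learn_patterns(normal_traces)
normal_rate = mismatch_rate(["open", "read", "write", "close"], patterns)
suspect_rate = mismatch_rate(["open", "exec", "socket", "close"], patterns)
```

A trace would be flagged when its mismatch rate exceeds a chosen threshold. Note that in this sketch a leftover tail shorter than `min_len` always counts as a mismatch, which a real detector would need to handle more carefully.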
Funding: Supported by a grant from the Administration of Traditional Chinese Medicine of Guangdong Province in China (No. 20131161); the Specialized Research Fund for the Doctoral Program of Higher Education of China (No. 20124425110004)
Abstract: OBJECTIVE: To analyze the composition rules of Chinese medicines in fuming-washing therapy for knee osteoarthritis (KOA), and develop new fuming-washing prescriptions for KOA through unsupervised data mining methods. METHODS: Chinese medicine recipes for fuming-washing therapy for KOA were collected and recorded in a database. The correlation coefficients among herbs, core combinations of herbs, and new prescriptions were analyzed using modified mutual information, complex system entropy clustering, and unsupervised hierarchical clustering, respectively. RESULTS: Based on analysis of 345 Chinese medicine recipes for fuming-washing therapy, 68 herbs occurred frequently, 33 herb pairs occurred frequently, and 12 core combinations were found. Five new fuming-washing recipes for KOA were developed. CONCLUSION: Chinese medicines for fuming-washing therapy of KOA mainly consist of wind-dampness-dispelling and cold-dispersing herbs, blood-activating and stasis-resolving herbs, and wind-dampness-dispelling and heat-clearing herbs. The treatment principles of fuming-washing therapy for KOA accordingly include dispelling wind-dampness and dispersing cold, activating blood and resolving stasis, and dispelling wind-dampness and clearing heat. Zhenzhutougucao (Herba Speranskiae Tuberculatae), Honghua (Flos Carthami), Niuxi (Radix Achyranthis Bidentatae), Shenjincao (Herba Lycopodii Japonici), Weilingxian (Radix et Rhizoma Clematidis Chinensis), Chuanwu (Radix Aconiti), Haitongpi (Cortex Erythrinae Variegatae), Ruxiang (Olibanum), Danggui (Radix Angelicae Sinensis), Caowu (Radix Aconiti Kusnezoffii), Moyao (Myrrha), and Aiye (Folium Artemisiae Argyi) are the main herbs used in the fuming-washing treatment for KOA.
Abstract: Data mining (DM) methods are being increasingly used for prediction with time series data, in addition to traditional statistical approaches. This paper presents a literature review of the use of DM with time series data, focusing on short-term stock prediction. This is an area that has been attracting a great deal of attention from researchers in the field. The main contribution of this paper is to provide an outline of the use of DM with time series data, mainly using examples related to short-term stock prediction. This is important for a better understanding of the field. Some of the main trends and open issues are also introduced.
Abstract: This paper considers the problem of applying data mining techniques to the aeronautical field. The truncation method, one of the techniques in aeronautical data mining, can be used to handle air-combat behavior data efficiently. An air-combat behavior data mining technique based on the truncation method is proposed to discover air-combat rules or patterns. A simulation platform for air-combat behavior data mining that supports two fighters is implemented. The simulation results show that the proposed technique is feasible in terms of both efficiency and effectiveness.
Funding: the National Natural Science Foundation of China (70572070); the Liaoning Province Talents Fund Projects (2005219005); the Technology Key Project of Liaoning Province (2006220019)
Abstract: A data mining technique is proposed to predict gas disasters, in view of the characteristics of coal mine gas disasters and feature knowledge about them. Rough set theory was used to establish a data mining model for gas disaster prediction, and the relations among rough set attributes in the prediction model were discussed, using information entropy criteria to compensate for the shortcomings of the rough set reduction method. The effectiveness and practicality of data mining technology in gas disaster prediction were confirmed through practical application.