481 articles found
1. Accuracies and Training Times of Data Mining Classification Algorithms: An Empirical Comparative Study (Cited by: 2)
Authors: S. Olalekan Akinola, O. Jephthar Oyabugbe. 《Journal of Software Engineering and Applications》, 2015, No. 9, pp. 470-477 (8 pages)
Two important performance indicators for data mining algorithms are classification/prediction accuracy and training time. These indicators are useful for selecting the best algorithms for classification/prediction tasks in data mining, yet empirical studies of them are few. This study was therefore designed to determine how data mining classification algorithms perform as the input data size increases. Three data mining classification algorithms (Decision Tree, Multi-Layer Perceptron (MLP) Neural Network, and Naïve Bayes) were subjected to simulated data of varying sizes. The training times and classification accuracies of the algorithms were analyzed for the different data sizes. Results show that Naïve Bayes takes the least time to train but has the lowest accuracy compared with the MLP and Decision Tree algorithms.
Keywords: Artificial Neural Network; classification; data mining; decision tree; Naive Bayesian; Performance Evaluation
2. Research on Scholarship Evaluation System based on Decision Tree Algorithm (Cited by: 1)
Authors: YIN Xiao, WANG Ming-yu. 《电脑知识与技术》, 2015, No. 3X, pp. 11-13 (3 pages)
Under China's modern education system, the annual scholarship evaluation is vital for many college students. This paper adopts the C4.5 decision tree classification algorithm, an improvement on the ID3 algorithm, and constructs a data set for the scholarship evaluation system by analyzing the related attributes in scholarship evaluation information. Through analysis of moral education, intellectual education, and culture and PE, it also identifies some factors that play a significant role in the development of college students.
Keywords: data mining; scholarship evaluation system; decision tree algorithm; C4.5 algorithm
3. Data mining and well logging interpretation: application to a conglomerate reservoir (Cited by: 8)
Authors: 石宁, 李洪奇, 罗伟平. 《Applied Geophysics》 SCIE CSCD, 2015, No. 2, pp. 263-272, 276 (11 pages)
Data mining is the process of extracting implicit but potentially useful information from incomplete, noisy, and fuzzy data. Data mining offers excellent nonlinear modeling and self-organized learning, and it can play a vital role in the interpretation of well logging data of complex reservoirs. We used data mining to identify the lithologies in a complex reservoir. The reservoir lithologies served as the classification task target and were identified using feature extraction, feature selection, and modeling of data streams. We used independent component analysis to extract information from well curves. We then used the branch-and-bound algorithm to look for the optimal feature subsets and eliminate redundant information. Finally, we used the C5.0 decision-tree algorithm to set up disaggregated models of the well logging curves. The modeling results and actual logging data were in good agreement, showing the usefulness of data mining methods in complex reservoirs.
Keywords: data mining; well logging interpretation; independent component analysis; branch-and-bound algorithm; C5.0 decision tree
4. Forecasting Model of Agro-meteorological Disaster Grade Based on Decision Tree (Cited by: 2)
Author: 司巧梅. 《Meteorological and Environmental Research》 CAS, 2010, No. 2, pp. 85-87, 90 (4 pages)
Based on a discussion of the basic concepts of data mining technology and the decision tree method, and using data samples of wind and hailstorm disasters in some counties of the Mudanjiang region, a forecasting model of agro-meteorological disaster grade was established with the C4.5 decision tree classification algorithm. The model can forecast the degree of direct economic loss, providing a rational data mining model and effective analysis results.
Keywords: data mining; Agro-meteorology; decision tree; C4.5 algorithm; classification mining; China
5. Developing a prediction model for customer churn from electronic banking services using data mining (Cited by: 5)
Authors: Abbas Keramati, Hajar Ghaneei, Seyed Mohammad Mirmohammadi. 《Financial Innovation》, 2016, No. 1, pp. 122-134 (13 pages)
Background: Given the importance of customers as the most valuable assets of organizations, customer retention seems to be an essential, basic requirement for any organization. Banks are no exception to this rule. The competitive atmosphere in which different banks provide electronic banking services increases the need for customer retention. Methods: Building on existing information technologies that allow data to be collected from organizations' databases, data mining offers a powerful tool for extracting knowledge from huge amounts of data. In this research, the decision tree technique was applied to build a model incorporating this knowledge. Results: The results represent the characteristics of churned customers. Conclusions: Bank managers can identify future churners using the results of the decision tree. They should provide strategies for customers whose features increasingly resemble those of churners.
Keywords: Customer churn; data mining; Electronic banking services; decision tree; classification
6. Improving Decision Tree Performance by Exception Handling (Cited by: 1)
Authors: Appavu Alias Balamurugan Subramanian, S. Pramala, B. Rajalakshmi, Ramasamy Rajaram. 《International Journal of Automation and Computing》 EI, 2010, No. 3, pp. 372-380 (9 pages)
This paper focuses on improving decision tree induction algorithms when a kind of tie appears during the rule generation procedure for specific training datasets. The tie occurs when the target class outcomes appear in equal proportions among a leaf node's records, so that majority voting cannot be applied. To handle this exception, we propose basing the prediction on the naive Bayes (NB) estimate, k-nearest neighbour (k-NN), and association rule mining (ARM). The other features used for splitting the parent nodes are also taken into consideration.
Keywords: data mining; classification; decision tree; majority voting; naive Bayes (NB); k-nearest neighbour (k-NN); association rule mining (ARM)
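The leaf-level tie described in the abstract above can be illustrated with a minimal sketch. It only shows where a tie is detected and where a secondary predictor would take over. The function name `predict_at_leaf` and the `fallback` callable are hypothetical; the listing does not show how the paper wires in naive Bayes, k-NN, or ARM.

```python
from collections import Counter


def predict_at_leaf(leaf_labels, fallback):
    """Return the majority class of a leaf, or defer to `fallback` on a tie.

    `fallback` is a hypothetical zero-argument callable standing in for a
    secondary predictor (for example a naive Bayes estimate).
    """
    counts = Counter(leaf_labels).most_common()
    if not counts:
        # An empty leaf has no majority class at all.
        return fallback()
    # A tie exists when the two most frequent classes have equal counts,
    # so majority voting cannot pick a winner.
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return fallback()
    return counts[0][0]
```

Passing the secondary predictor as a callable keeps the tie check independent of whichever fallback model is chosen.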
7. Exploiting Empirical Variance for Data Stream Classification
Authors: ZIA-UR REHMAN Muhammad, 李天瑞, 李涛. 《Journal of Shanghai Jiaotong University (Science)》 EI, 2012, No. 2, pp. 245-250 (6 pages)
Classification using the decision tree algorithm is a widely studied problem in data streams. The challenge is deciding when to split a decision node into multiple leaves. Concentration inequalities that exploit variance information, such as Bernstein's and Bennett's inequalities, are often substantially tighter than Hoeffding's bound, which disregards variance. Many machine learning algorithms for stream classification, such as the very fast decision tree (VFDT) learner, AdaBoost, and support vector machines (SVMs), use Hoeffding's bound as a performance guarantee. In this paper, we propose a new algorithm based on the recently proposed empirical Bernstein's bound to achieve a better probabilistic bound on the accuracy of the decision tree. Experimental results on four synthetic and two real-world data sets demonstrate the performance gain of our proposed technique.
Keywords: Hoeffding and Bernstein's bounds; data stream classification; decision tree; anytime algorithm
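To make the comparison in the abstract above concrete, the sketch below computes two deviation bounds for the mean of n bounded observations. The Hoeffding form is the one commonly used in VFDT. The empirical Bernstein form follows a widely cited statement with a ln(3/δ) term, and it is an assumption here that the paper uses this exact variant. Function and parameter names are illustrative.

```python
import math


def hoeffding_bound(value_range, delta, n):
    """Hoeffding deviation: sqrt(R^2 * ln(1/delta) / (2n)); ignores variance."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def empirical_bernstein_bound(sample_std, value_range, delta, n):
    """Empirical Bernstein deviation (assumed variant):
    sigma_hat * sqrt(2 * ln(3/delta) / n) + 3 * R * ln(3/delta) / n.
    """
    log_term = math.log(3.0 / delta)
    return (sample_std * math.sqrt(2.0 * log_term / n)
            + 3.0 * value_range * log_term / n)
```

When the observed standard deviation is small relative to the range and n is large, the variance-aware bound is the smaller of the two; with high variance and few samples it can be larger, so neither bound dominates in every setting.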
8. Study on the Grouping of Patients with Chronic Infectious Diseases Based on Data Mining
Author: Min Li. 《Journal of Biosciences and Medicines》, 2019, No. 11, pp. 119-135 (17 pages)
Objective: Following the RFM model theory of customer relationship management, data mining technology was used to group patients with chronic infectious diseases and to explore the effect of customer segmentation on the management of patients with different characteristics. Methods: A total of 170,246 outpatient records were extracted from the hospital management information system (HIS) for January 2016 to July 2016; 43,448 records remained after data cleaning. The K-Means clustering algorithm was used to classify patients with chronic infectious diseases, and the C5.0 decision tree algorithm was then used to predict their treatment situation. Results: Male patients accounted for 58.7%, and patients living in Shanghai accounted for 85.6%. The average patient age was 45.88 years, and the high-incidence age range was 25 to 65 years. Patients were grouped into three categories: 1) Cluster 1, important patients (4786 people, 11.72%, R = 2.89, F = 11.72, M = 84,302.95); 2) Cluster 2, major patients (23,103 people, 53.2%, R = 5.22, F = 3.45, M = 9146.39); 3) Cluster 3, potential patients (15,559 people, 35.8%, R = 19.77, F = 1.55, M = 1739.09). In the C5.0 decision tree prediction of the treatment situation, the final treatment time (weeks) was an important predictor, and the accuracy rate was 99.94%, as verified by the confusion model. Conclusion: Medical institutions should strengthen adherence education for patients with chronic infectious diseases, establish a chronic infectious disease and customer relationship management database, and take the initiative to help patients improve treatment adherence. Chinese governments at all levels should speed up the construction of hospital information systems, establish a chronic infectious disease database, and strengthen the blocking of mother-to-child transmission, in order to effectively curb chronic infectious diseases and reduce disease burden and mortality.
Keywords: data mining; K-Means Clustering algorithm; C5.0 decision tree algorithm; Customer Relationship Management; Patients with Chronic Infectious Disease
9. Innovative data mining approaches for outcome prediction of trauma patients
Authors: Eleni-Maria Theodoraki, Stylianos Katsaragakis, Christos Koukouvinos, Christina Parpoula. 《Journal of Biomedical Science and Engineering》, 2010, No. 8, pp. 791-798 (8 pages)
Trauma is the most common cause of death among young people, and many of these deaths are preventable [1]. Predicting the outcome of trauma patients has been a difficult problem to investigate until now. In this study, prediction models are built and their ability to accurately predict mortality is assessed. The analysis includes a comparison of data mining techniques using classification, clustering, and association algorithms. Data were collected by the Hellenic Trauma and Emergency Surgery Society from 30 Greek hospitals. The dataset contains records of 8544 patients suffering from severe injuries, collected from 2005 to 2006. Factors include patients' demographic elements and several other variables registered from the time and place of the accident until hospital treatment and final outcome. The results obtained are compared in terms of sensitivity, specificity, positive predictive value, and negative predictive value, and the ROC curve depicts the performance of these methods.
Keywords: data mining; Medical data; decision trees; classification Rules; Association Rules; Clusters; Confusion Matrix; ROC
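The four comparison measures named in the abstract above all derive from a binary confusion matrix. The sketch below shows the standard definitions; the function name is illustrative, and any counts passed to it are the caller's own, since the listing reports no confusion-matrix values for this study.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard measures from binary confusion-matrix counts.

    tp/fp/tn/fn are true positives, false positives, true negatives,
    and false negatives. Each denominator must be non-zero.
    """
    return {
        "sensitivity": tp / (tp + fn),  # share of actual positives detected
        "specificity": tn / (tn + fp),  # share of actual negatives detected
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }
```

Sensitivity and specificity describe the classifier itself, while the two predictive values also depend on how common the positive outcome is in the data.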
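10. The SPRINT Algorithm and Methods for Improving It (Cited by: 3)
Authors: 罗可, 张学茂. 《计算机工程与应用》 CSCD PKU Core, 2005, No. 32, pp. 178-180, 189 (4 pages)
Classification is an important research topic in data mining. This paper introduces the SPRINT classification algorithm. To improve the algorithm's overall classification efficiency on massive databases, the authors propose two new methods for handling discrete attributes; these methods markedly reduce the computation needed to find the best split point and speed up the algorithm.
Keywords: data mining; classification; decision tree; SPRINT algorithm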
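11. Research on Parallel Decision Tree Classification Based on the SPRINT Method (Cited by: 18)
Author: 魏红宁. 《计算机应用》 CSCD PKU Core, 2005, No. 1, pp. 39-41 (3 pages)
One of the biggest problems of decision tree techniques is that their computational complexity is proportional to the size of the training data, so building a decision tree on a large data set takes too long. Building decision trees in parallel is an effective way to address this problem. Based on the idea of synchronous tree construction, this paper analyzes the parallelism of the SPRINT method in detail and proposes directions for further research.
Keywords: data mining; SPRINT decision tree classification; parallelism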
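12. Analysis and Research of the SPRINT Algorithm Based on the Hadoop Platform (Cited by: 2)
Authors: 黄刚, 孙媛. 《南京师大学报(自然科学版)》 CAS CSCD PKU Core, 2016, No. 4, pp. 25-30 (6 pages)
When traditional decision tree algorithms mine massive data on a single machine, they are easily limited by computing and storage capacity, and so suffer from long running times, poor fault tolerance, and small storage capacity. The emergence of the Hadoop platform, with its high reliability and fault tolerance, offers a new approach to parallelizing decision tree algorithms. This paper designs and implements a parallel SPRINT classification algorithm based on the Hadoop platform. Experimental results show that, compared with the non-parallelized SPRINT algorithm, the Hadoop-based SPRINT classification algorithm achieves better classification accuracy, lower time complexity, and better parallel performance, and markedly speeds up the search for the best split point.
Keywords: Hadoop; MapReduce; data mining; decision tree; SPRINT algorithm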
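13. Mining Risk Rules from Insurance Business Data with the SPRINT Classification Algorithm (Cited by: 1)
Author: 宾宁. 《广东工业大学学报》 CAS, 2007, No. 2, pp. 99-102 (4 pages)
This paper proposes using the SPRINT algorithm for risk analysis of insurance business data. For the medical insurance business, it describes in detail the design and implementation of the algorithm's preprocessing, best-split computation, and split execution, and derives some practical risk rules.
Keywords: SPRINT algorithm; classification algorithm; data mining; insurance business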
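14. A Data Mining Model Based on an Improved SPRINT Classification Algorithm (Cited by: 2)
Authors: 林敏, 王李杰. 《信息技术》, 2024, No. 3, pp. 170-174, 187 (6 pages)
To address the long classification time and low mining accuracy of current data mining models, a data mining model based on an improved decision tree classification algorithm (SPRINT) is proposed. First, the min-max normalization formula is used to apply a linear transformation to the raw data. The improved SPRINT classification algorithm then classifies the input data according to its characteristics, collaborative filtering is used to generate attribute sets similar to the data, attribute similarity is computed, and a semantic rule set is generated to provide users with better data services. A company's marketing data set was selected as the sample for comparative experiments. The results show that, compared with the baseline models, the proposed model has a shorter classification time and higher mining accuracy, and can provide users with better data services.
Keywords: decision tree classification algorithm; collaborative filtering; semantic rule set; data mining model; neural network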
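The min-max normalization step mentioned in the abstract above is a standard linear rescaling, v' = (v - min) / (max - min) * (new_max - new_min) + new_min. The sketch below shows that formula only; the target range and the handling of a constant column are assumptions, since the listing does not give the paper's settings.

```python
def min_max_normalize(values, new_min=0.0, new_max=1.0):
    """Linearly rescale `values` into [new_min, new_max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant column maps to new_min to avoid dividing by zero.
        return [new_min for _ in values]
    return [(v - lo) / (hi - lo) * (new_max - new_min) + new_min
            for v in values]
```

Because the transformation is linear, it preserves the ordering of values, so candidate split points on a normalized attribute correspond one-to-one with those on the raw attribute.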
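15. Research and Application of Medical Prognosis Analysis Based on the SPRINT Classification Algorithm (Cited by: 2)
Author: 雷炜. 《现代计算机》, 2008, No. 10, pp. 67-69 (3 pages)
SPRINT is a data classification method that scales well and supports parallel processing, and rules can be conveniently extracted from the decision trees it generates. It is a recommended research method for prognosis analysis on massive medical databases. This paper studies the algorithm in depth and applies it to prognosis analysis, which offers insights for similar medical information processing.
Keywords: data mining; decision tree; SPRINT algorithm; prognosis analysis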
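16. Implementation of a SPRINT-Based Classification Algorithm
Author: 谭勇. 《湖北民族学院学报(自然科学版)》 CAS, 2004, No. 2, pp. 72-75 (4 pages)
After introducing the concepts of data mining and classification algorithms, this paper presents a concrete decision tree generation algorithm. To reduce data volume and improve the data structures used in implementing the decision tree algorithm, it describes in detail the implementation of a classification algorithm based on SPRINT (scalable parallelizable induction of decision trees) and gives a performance evaluation of the SPRINT algorithm.
Keywords: data mining; classification; decision tree; SPRINT algorithm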
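17. An Improved SPRINT Algorithm
Authors: 白玲玲, 韩天鹏. 《韶关学院学报》, 2018, No. 9, pp. 20-25 (6 pages)
Since the advent of the big data era, data-intensive computing has attracted considerable attention, yet research on data mining in data-intensive computing environments is still in its early stages. This paper proposes M-BCBT, a decision tree classification algorithm based on the MapReduce programming framework and the SPRINT algorithm. M-BCBT inherits the advantages of MapReduce, making the algorithm better suited to data-intensive computing applications. The algorithm's performance is analyzed and evaluated on examples. Experimental results show that M-BCBT can shorten running time and improve accuracy in big data environments.
Keywords: SPRINT; MapReduce; decision tree; data mining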
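18. A Classification Optimization Method for Mixed-Attribute Big Data Using an Improved Decision Tree Algorithm
Authors: 剧树春, 李来杰. 《电子设计工程》, 2026, No. 1, pp. 45-49 (5 pages)
To simplify the classification of mixed-attribute big data and ensure accurate classification results based on the intrinsic characteristics of each attribute type, this paper proposes a classification optimization method for mixed-attribute big data using an improved decision tree algorithm. Principal component analysis is used to uncover the intrinsic regularities in mixed-attribute big data and extract its key features. A classification model based on an improved C4.5 decision tree algorithm is built; the extracted key features are fed in, their information entropy and information gain ratio are computed, and pattern learning is performed with dynamic adjustment so that the information entropy is corrected dynamically. This optimizes node splitting, further improves classification accuracy, and outputs the classification results for mixed-attribute big data. Experiments show that the method is highly accurate, clearly separates data of different classes, performs stably, classifies more efficiently and reliably, and effectively resists the adverse effect of noise on classification performance, demonstrating the stability and reliability of the proposed method for mixed-attribute big data classification.
Keywords: improved decision tree algorithm; mixed-attribute big data; classification optimization; C4.5 algorithm; information entropy; information gain ratio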
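The information entropy and information gain ratio mentioned in the abstract above are the standard C4.5 split criteria. The sketch below computes them for one categorical attribute. It covers the textbook definitions only and does not attempt the paper's dynamic entropy correction, which the listing does not specify; function names are illustrative.

```python
import math
from collections import Counter, defaultdict


def entropy(items):
    """Shannon entropy (in bits) of the value distribution in `items`."""
    total = len(items)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(items).values())


def gain_ratio(attribute_values, labels):
    """C4.5 gain ratio: information gain divided by split information."""
    total = len(labels)
    groups = defaultdict(list)
    for value, label in zip(attribute_values, labels):
        groups[value].append(label)
    # Expected entropy of the labels after partitioning on the attribute.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    # Split information penalizes attributes with many distinct values.
    split_info = entropy(attribute_values)
    return gain / split_info if split_info > 0 else 0.0
```

Dividing by the split information is what distinguishes C4.5 from ID3: an attribute that fragments the data into many small groups gets a large denominator and is not favored merely for having many values.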
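19. An Improved Method for Finding Split Points of Continuous Attributes in the SPRINT Algorithm (Cited by: 2)
Authors: 彭程, 罗可. 《计算机工程与应用》 CSCD PKU Core, 2006, No. 27, pp. 155-157 (3 pages)
To address the heavy computation required to find the best split point of a continuous attribute in the SPRINT algorithm, this paper improves the method for finding that split point. The improved method reduces the number of candidate split points, and thus the amount of computation and the computing time.
Keywords: data mining; decision tree; SPRINT algorithm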
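One common way to reduce candidate split points for a continuous attribute, shown here as an illustrative sketch and not necessarily the authors' method, is to evaluate the Gini index only at class boundaries: midpoints between adjacent distinct values are skipped when both neighbouring value groups are pure and share the same class. The function names and the returned `evaluated` counter are hypothetical.

```python
from collections import Counter
from itertools import groupby


def gini(counts, total):
    """Gini impurity 1 - sum(p_j^2) from class counts."""
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


def best_gini_split(values, labels):
    """Return (split_point, weighted_gini, candidates_evaluated)."""
    records = sorted(zip(values, labels), key=lambda r: r[0])
    # One class histogram per distinct attribute value, in sorted order.
    groups = [(v, Counter(lbl for _, lbl in grp))
              for v, grp in groupby(records, key=lambda r: r[0])]
    n = len(records)
    below, above = Counter(), Counter(labels)
    best_point, best_score, evaluated = None, float("inf"), 0
    for (value, counts), (next_value, next_counts) in zip(groups, groups[1:]):
        below += counts
        above -= counts
        # Skip midpoints that lie inside a run of a single class.
        if (len(counts) == 1 and len(next_counts) == 1
                and set(counts) == set(next_counts)):
            continue
        n_below = sum(below.values())
        score = (n_below / n * gini(below, n_below)
                 + (n - n_below) / n * gini(above, n - n_below))
        evaluated += 1
        if score < best_score:
            best_point, best_score = (value + next_value) / 2, score
    return best_point, best_score, evaluated
```

On a sorted attribute with long single-class runs, most midpoints fall inside a run and are never scored, which is where the saving in computation comes from.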
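20. Improvement of the SPRINT Classification Algorithm (Cited by: 3)
Author: 王云飞. 《科学技术与工程》, 2008, No. 23, pp. 6248-6252 (5 pages)
Classification is an important research direction in data mining, and SPRINT is a well-known classification algorithm. This paper analyzes the shortcomings of the SPRINT algorithm and where it can be improved, and proposes a new method for speeding up its tree construction.
Keywords: data mining; classification; decision tree; SPRINT algorithm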