Intensifying competition, economic recession, and financial crises have increased business failures, prompting researchers to develop new approaches that yield more accurate and more reliable predictions. The classification and regression tree (CART) is one of the modeling techniques developed for this purpose. This study explains the CART method and tests its power to predict financial failure. CART is applied to data on industrial companies traded on the Istanbul Stock Exchange (ISE) between 1997 and 2007. The results show that CART has high predictive power for financial failure one, two, and three years prior to failure, and that profitability ratios are the most important ratios in predicting failure.
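The core step of CART, choosing the binary split that minimizes impurity, can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's implementation: the Gini criterion is the usual CART choice for classification, and the return-on-assets values and failure labels are hypothetical toy data.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_split(values, labels):
    """Return (threshold, weighted_gini) of the best binary split on one feature."""
    pairs = sorted(zip(values, labels))
    best = (None, gini(labels))  # no split yet: impurity of the parent node
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # identical values cannot be separated by a threshold
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for x, y in pairs if x <= threshold]
        right = [y for x, y in pairs if x > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best[1]:
            best = (threshold, score)
    return best

# Hypothetical toy data: a profitability ratio and a failure label (1 = failed).
roa = [-0.12, -0.05, -0.01, 0.03, 0.06, 0.10]
failed = [1, 1, 1, 0, 0, 0]
threshold, impurity = best_split(roa, failed)
print(threshold, impurity)
```

A full CART tree applies this search to every feature at every node and recurses on the two child nodes until a stopping rule is met.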
This paper presents a supervised learning algorithm for retinal vessel segmentation based on the classification and regression tree (CART) algorithm and an improved adaptive boosting (AdaBoost) algorithm. Local binary pattern (LBP) texture features and local features are obtained by extracting, inverting, dilating, and enhancing the green channel of retinal images, and are combined into a 17-dimensional feature vector. A dataset is built from these feature vectors and the labels manually marked by experts. The features are used to grow CART binary trees, which serve as the AdaBoost weak classifiers, and AdaBoost is improved by adding re-judgment functions to form a strong classifier. The proposed algorithm is evaluated on the Digital Retinal Images for Vessel Extraction (DRIVE) dataset. The experimental results show that the proposed algorithm achieves higher segmentation accuracy for blood vessels and that the result retains nearly complete vessel detail. Moreover, the segmented vessel tree has good connectivity and largely reflects the distribution of the blood vessels. Compared with the traditional AdaBoost classifier and a support vector machine (SVM) based classifier, the proposed algorithm has higher average accuracy and reliability, and its results are similar to those of state-of-the-art segmentation algorithms.
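The boosting step described above can be sketched with standard discrete AdaBoost over depth-1 trees (decision stumps). This is a simplified stand-in under stated assumptions: the paper's re-judgment functions are not specified in the abstract and are omitted, the weak learners here are stumps rather than full CART trees, and the one-dimensional data are hypothetical.

```python
import math

def train_stump(X, y, w):
    """Pick the (error, feature, threshold, polarity) stump with lowest weighted error."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for polarity in (1, -1):
                pred = [polarity if row[j] <= t else -polarity for row in X]
                err = sum(wi for wi, p, yi in zip(w, pred, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, t, polarity)
    return best

def stump_predict(stump, row):
    _, j, t, polarity = stump
    return polarity if row[j] <= t else -polarity

def adaboost(X, y, rounds=3):
    """Discrete AdaBoost; labels must be +1 or -1."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)  # avoid log(0) and division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Increase the weight of misclassified samples, decrease the rest.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, yi, row in zip(w, y, X)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, row):
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1

# Hypothetical 1-D data that no single stump can separate.
X = [[1], [2], [3], [4], [5], [6]]
y = [1, 1, -1, -1, 1, 1]
ensemble = adaboost(X, y, rounds=3)
print([predict(ensemble, row) for row in X])
```

On this toy set, no single stump classifies every point, but the weighted vote of three stumps does, which is the sense in which weak classifiers combine into a strong one.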
Researchers in bioinformatics, biostatistics, and related fields seek biomarkers for many purposes, including risk assessment, disease diagnosis, and prognosis, all of which can be formulated as patient classification problems. This paper introduces a new method for biomarker data analysis that uses a regression tree to improve a logistic classification model. The numerical results show that the linear logistic model can be significantly improved by fitting a regression tree to its residuals. Although this research discusses only the classification of binary responses, the idea extends easily to multinomial responses.
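One way to read "a tree regression on the residuals" is sketched below. The abstract does not give the paper's exact procedure, so everything here is an assumption made for illustration: a one-feature logistic model fitted by gradient descent, a depth-1 regression tree fitted to the residuals y - p, and a corrected probability obtained by adding the leaf mean and clipping to [0, 1]. The biomarker values are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.1, epochs=2000):
    """One-feature logistic regression fitted by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(sigmoid(w * x + b) - y for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def fit_residual_stump(xs, residuals):
    """Depth-1 regression tree: the split minimizing squared error of the residuals."""
    pairs = sorted(zip(xs, residuals))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue
        left = [r for _, r in pairs[:i]]
        right = [r for _, r in pairs[i:]]
        mean_l, mean_r = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((r - mean_l) ** 2 for r in left)
               + sum((r - mean_r) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, (pairs[i - 1][0] + pairs[i][0]) / 2, mean_l, mean_r)
    return best[1:]  # (threshold, left leaf mean, right leaf mean)

def corrected_probability(x, w, b, stump):
    """Logistic probability plus the tree's residual correction, clipped to [0, 1]."""
    threshold, mean_l, mean_r = stump
    p = sigmoid(w * x + b) + (mean_l if x <= threshold else mean_r)
    return min(1.0, max(0.0, p))

# Hypothetical centered biomarker: risk is high at both extremes, which a
# linear logistic model cannot represent.
xs = [-2.5, -1.5, -0.5, 0.5, 1.5, 2.5]
ys = [1, 1, 0, 0, 1, 1]
w, b = fit_logistic(xs, ys)
residuals = [y - sigmoid(w * x + b) for x, y in zip(xs, ys)]
stump = fit_residual_stump(xs, residuals)
sse_before = sum(r ** 2 for r in residuals)
sse_after = sum((y - corrected_probability(x, w, b, stump)) ** 2
                for x, y in zip(xs, ys))
print(sse_before, sse_after)
```

Because each leaf mean is the least-squares constant for its residuals, the correction cannot increase the squared error on the training data; whether it helps on new patients has to be checked on held-out data.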
China still lacks an effective means of analyzing in depth, and making use of, its large volume of domestic test data on the quality and safety of agricultural products. A neural network algorithm, the classification and regression tree algorithm, and a Bayesian network algorithm were selected according to the principles for choosing component models; each was used to build a model, and the models were then combined. The resulting combination model has relatively high precision, strong robustness, and better interpretability, and is used to predict the results of spoilage monitoring for perishable food in transit. This yields a relatively optimal prediction model for the perishable food transport spoilage monitoring system. The model can guide actual food quality and safety sampling by predicting where substandard food is likely to occur, so that typical and effective samples are selected for testing. This improves the efficiency and effectiveness of sampling, helps keep spoiled perishable food from reaching the market, and safeguards the quality and safety of perishable food in transit, thereby protecting the health of perishable food consumers.
The sub-pixel impervious surface percentage (SPIS) is the fraction of impervious surface area in one pixel, and it is an important indicator of urbanization. Using remote sensing data, the spatial distribution of SPIS values over large areas can be extracted, and these data are significant for studies of urban climate, environment, and hydrology. To develop a stable, multi-temporal SPIS estimation method suitable for typical temperate semi-arid climate zones with distinct seasons, an optimal model for estimating SPIS values within Beijing Municipality was built based on the classification and regression tree (CART) algorithm. First, models with different input variables for SPIS estimation were built by integrating multi-source remote sensing data with other auxiliary data. The optimal model was selected by analyzing and comparing the assessed accuracy of these models. Subsequently, multi-temporal SPIS mapping was carried out based on the optimal model. The results are as follows. 1) Multi-seasonal images and nighttime light (NTL) data are the optimal input variables for SPIS estimation within Beijing Municipality, where the intra-annual variability in vegetation is distinct. The multi-seasonal images effectively detect the different spectral characteristics of cultivated land caused by differences in farming practice and vegetation phenology. NTL data effectively reduce the misestimation caused by the spectral similarity between bare land and impervious surfaces. After testing, the SPIS modeling correlation coefficient (r) is approximately 0.86, the average error (AE) is approximately 12.8%, and the relative error (RE) is approximately 0.39. 2) The SPIS results were divided into areas with high-density impervious cover (70%–100%), medium-density impervious cover (40%–70%), low-density impervious cover (10%–40%), and natural cover (0%–10%). The SPIS model performed better in estimating values for high-density urban areas than for the other categories. 3) Multi-temporal SPIS mapping (1991–2016) was conducted based on the optimized SPIS results for 2005. After testing, AE ranges from 12.7% to 15.2%, RE ranges from 0.39 to 0.46, and r ranges from 0.81 to 0.86. This demonstrates that the proposed approach for estimating sub-pixel impervious surface by integrating the CART algorithm with multi-source remote sensing data is feasible and suitable for multi-temporal SPIS mapping of areas with distinct intra-annual variability in vegetation.
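Distributed photovoltaic (PV) generation is strongly affected by weather, so estimating the distributed PV hosting capacity of a 110 kV power supply area is of great significance for regional power supply. To this end, a hosting capacity estimation model for 110 kV supply areas based on the classification and regression tree (CART) is proposed. The model takes as its basis data such as the output power of distributed generation, the share of regional distributed generation in total generation, and the incremental line loss caused by local distributed generation. It uses a CART decision tree to build the hosting capacity estimation model and an improved whale optimization algorithm to solve for the result. Experimental tests show that the model estimates distributed PV hosting capacity with high accuracy, can effectively estimate the hosting capacity of different test areas in different seasons, and has high application value.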
In enterprise operations, maintaining manual rules for enterprise processes can be expensive, time-consuming, and dependent on specialized knowledge of the enterprise domain. Recently, rule generation has been automated in enterprises, particularly through machine learning, to streamline routine tasks. Typically, these machine learning models are black boxes whose reasons for a decision are not always transparent, and end users need to verify the model's proposals during user acceptance testing before they can trust it. In such scenarios, rules have an advantage over machine learning models because end users can verify the rules and therefore trust them more. In many scenarios the truth label changes frequently, so a machine learning model struggles to learn until a considerable amount of data has accumulated, whereas rules can be adapted to the new truth. This paper presents a novel framework for generating human-understandable rules using the Classification and Regression Tree (CART) decision tree method, which ensures both optimization and user trust in automated decision-making processes. The framework generates comprehensible rules of the form "if condition, then predicted class", even in domains where noise is present. The proposed system transforms enterprise operations by automating the production of human-readable rules from structured data, resulting in increased efficiency and transparency. Removing the need for manual rule construction saves time and money while ensuring that users can readily check and trust the system's automatic judgments. The framework achieves 99.85% accuracy and 96.30% precision, which further supports its effectiveness in translating complex data into comprehensible rules, ultimately empowering users and enhancing organizational decision-making.
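Turning a fitted tree into "if condition, then class" rules amounts to listing every root-to-leaf path. The sketch below illustrates that idea only; the nested-dict tree layout, the feature names, and the class labels are all hypothetical and are not taken from the paper's framework.

```python
def extract_rules(node, conditions=()):
    """Walk a binary decision tree and return one IF/THEN rule string per leaf."""
    if "label" in node:  # leaf node: emit the accumulated path as a rule
        premise = " AND ".join(conditions) if conditions else "TRUE"
        return [f"IF {premise} THEN class = {node['label']}"]
    feature, threshold = node["feature"], node["threshold"]
    left = extract_rules(node["left"], conditions + (f"{feature} <= {threshold}",))
    right = extract_rules(node["right"], conditions + (f"{feature} > {threshold}",))
    return left + right

# Hypothetical tree, written by hand for illustration.
tree = {
    "feature": "invoice_amount", "threshold": 1000,
    "left": {"label": "auto_approve"},
    "right": {
        "feature": "vendor_age_days", "threshold": 90,
        "left": {"label": "manual_review"},
        "right": {"label": "auto_approve"},
    },
}

rules = extract_rules(tree)
for rule in rules:
    print(rule)
```

Each rule can be read and checked by a domain expert on its own, which is the transparency property the abstract emphasizes.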
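Taking Guizhou Province, where poverty is severe and the geographic environment is markedly spatially heterogeneous, as a case study, this paper introduces the Classification and Regression Tree (CART) model into poverty research, analyzes the factors influencing the spatial pattern of poverty, and formulates corresponding countermeasures. The conclusions are as follows. (1) Poverty in Guizhou shows a typical open "horseshoe" pattern: high in the east, south, and west, and lower in the center and north. (2) The CART-based importance ranking of the factors influencing poverty in Guizhou is: mean isolation > road network density > water area ratio > mean remoteness > NDVI > mean annual precipitation. (3) Based on the decision rules of the CART model, the following recommendations are made for poverty alleviation in Guizhou. First, more "targeted" relocation-based poverty alleviation and village-town system planning should be adopted to reduce settlement isolation, keeping the mean isolation between settlements below 4847 m. Second, once settlement distances are determined, water conditions for production and daily life should be improved scientifically, raising the water area ratio to above 0.8% where possible, to secure domestic water and irrigation and to increase water resource carrying capacity. Finally, with settlement isolation improved and water abundance raised, attention should be paid to ecological protection and restoration in karst rocky desertification areas, raising county-level NDVI to above 0.45, increasing regional ecological assets, strengthening the resilience of poor communities, combining ecological protection with poverty alleviation, and promoting harmonious human-land relations in the region.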
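Objective: To analyze the risk factors for medical adhesive-related skin injury (MARSI) at the implantable port site in lung cancer patients during treatment, and to build risk prediction models as a reference for clinical nursing interventions. Methods: Data were retrospectively collected on 650 patients who used a chest wall implantable port in the Department of Respiratory and Critical Care Medicine of a tertiary grade-A general hospital between January 2023 and April 2024. A logistic regression model, a classification and regression tree (CART) decision tree model, and a random forest model were each used to build a MARSI risk prediction model for lung cancer patients with implantable ports during treatment. Performance was evaluated by comparing the three models' accuracy, sensitivity, specificity, positive predictive value, negative predictive value, Kappa coefficient, and area under the receiver operating characteristic (ROC) curve (AUC). Results: For the logistic regression, CART decision tree, and random forest models respectively, accuracy was 84%, 86%, and 86%; specificity was 97%, 98%, and 97%; sensitivity was 54%, 59%, and 61%; positive predictive value was 54%, 59%, and 61%; negative predictive value was 97%, 98%, and 97%; Kappa was 0.57, 0.63, and 0.64; and AUC was 0.83, 0.87, and 0.86. The differences in AUC among the logistic regression, CART decision tree, and random forest models were all statistically significant (P<0.05). Skin toxicity was a predictor common to all three models. Conclusion: The CART decision tree and random forest models perform better than the logistic regression model in building a MARSI risk prediction model for lung cancer patients with implantable ports during treatment, and can serve as a reference for clinical nurses in predicting this risk.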
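The evaluation metrics named above all follow from a binary confusion matrix. The sketch below uses the standard definitions; the counts are hypothetical and are not the study's data.

```python
def binary_metrics(tp, fp, fn, tn):
    """Standard binary classification metrics from confusion-matrix counts."""
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n
    # Agreement expected by chance, from the row and column totals (Cohen's kappa).
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    return {
        "accuracy": accuracy,
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
        "kappa": (accuracy - expected) / (1 - expected),
    }

# Hypothetical counts for illustration only.
metrics = binary_metrics(tp=40, fp=10, fn=20, tn=130)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and positive predictive value share a numerator but have different denominators, as do specificity and negative predictive value, so each pair generally takes different values unless the number of false positives equals the number of false negatives.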
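In forecasting renewable energy output, the volatility and intermittency of renewable sources make it difficult to guarantee reliable predictions. To address this, a high-accuracy forecasting method for renewable energy output based on XGBoost and QRLSTM is proposed. Extreme gradient boosting (XGBoost) is used to construct an objective function for the renewable output data, and the objective function is approximated with a second-order Taylor expansion. Quantile regression (QR) is combined with a long short-term memory (LSTM) recurrent neural network to build the QRLSTM model; the approximated data are fed into this model, and the output forecast is produced through its gates. In the tests, the method's predictions of renewable generating unit output under different environmental conditions all fit the actual values closely, indicating high accuracy.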
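Quantile regression replaces the squared-error loss with the quantile (pinball) loss, which penalizes under- and over-prediction asymmetrically. The sketch below shows only that loss, as an illustration of the QR component; how the paper wires it into the LSTM is not stated in the abstract, and the megawatt values are hypothetical.

```python
def pinball_loss(y_true, y_pred, q):
    """Mean quantile (pinball) loss for a quantile level q in (0, 1)."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        diff = y - p
        # Under-prediction (diff >= 0) costs q per unit; over-prediction costs 1 - q.
        total += q * diff if diff >= 0 else (q - 1) * diff
    return total / len(y_true)

# Hypothetical actual output of 10 MW, forecast 2 MW too low and 2 MW too high.
loss_under = pinball_loss([10.0], [8.0], q=0.9)
loss_over = pinball_loss([10.0], [12.0], q=0.9)
print(loss_under, loss_over)
```

At q = 0.9 an under-prediction costs nine times as much as an over-prediction of the same size, so a model trained on this loss is pushed toward the 90th percentile of the output; training at several values of q gives a prediction interval rather than a single point forecast.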
Funding: National Natural Science Foundation of China (No. 61163010)
Funding: National Natural Science Foundation of China (No. 41671339)