Abstract: Research on distributed data mining that makes use of grid computing has recently become popular. This paper introduces a data mining algorithm based on a distributed decision tree, which takes advantage of the conveniences and services supplied by the grid computing platform and can perform distributed classification mining on the grid.
Funding: National Natural Science Foundation of China, Grant/Award Number: 72004112.
Abstract: Background: As China's population ages, its disease spectrum is changing, and the coexistence of multiple chronic diseases has become the norm in the health status of its elderly population. However, the health institution choices of older patients with multimorbidity in the stabilization period remain under-researched. This study investigates the factors influencing the choices of older patients with multimorbidity to provide references for the rational allocation of healthcare resources. Methods: A multistage, stratified, whole-group random-sampling method was used to select eligible older patients who attended the Community Health Service Center of Guangdong Province from September to December of 2022. We adopted a self-designed questionnaire to collect patients' general, disease-related, and social-support information, as well as their intention to choose a healthcare provider. A binary logistic regression and a decision tree model based on the Chi-squared automatic interaction detector algorithm were implemented to analyze the associated factors. Results: A total of 998 patients in the stabilization period were included in the study, of whom 593 (59.42%) chose a hospital and 405 (40.58%) chose primary care. Our binary logistic regression results revealed that age, sex, individual average annual income, educational level, self-reported health status, activities of daily living, alcohol consumption, family doctor contracting, and family supervision of medication or exercise were the principal factors influencing the choice of medical institution for older patients with multimorbidity (p<0.05). The decision-tree model had three levels and 11 nodes, and it screened a total of four influencing factors: activities of daily living, age, a family doctor contract, and patient sex. The logistic regression model had an accuracy of 72.9% and the decision tree model had an accuracy of 68.7%. Prediction using the binary logistic regression was thus statistically superior to the categorical decision-tree model based on the Chi-squared automatic interaction detector algorithm (Z=3.238, p=0.001). Conclusion: More than half of older patients with multimorbidity in the stabilization period chose hospitals for healthcare. Efforts should be made to improve the quality of healthcare services and to increase the medical contracting rate and recognition of family doctors so as to attract older patients with multimorbidity to primary medical institutions.
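The reported shares in the Results can be checked from the two counts in the abstract. The following is a minimal sketch of that arithmetic only; the function name `share_percent` and the variable names are illustrative and do not come from the study.

```python
# Counts taken from the abstract above; everything else is illustrative.
chose_hospital = 593
chose_primary_care = 405
total = chose_hospital + chose_primary_care  # 998 patients in the study


def share_percent(count, total):
    """Return count as a percentage of total, rounded to two decimals."""
    return round(100 * count / total, 2)


hospital_share = share_percent(chose_hospital, total)
primary_care_share = share_percent(chose_primary_care, total)
print(hospital_share, primary_care_share)
```

Both values agree with the percentages stated in the abstract (59.42% and 40.58%).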
Abstract: Floods are one of the major hazards worldwide. They are a source of huge risks in rural and urban areas, with severe impacts on civil society, industry, and the economy. The Elbe River has suffered many severe floods during recent decades. In this study, the zones flooded during 2011 were analyzed using TerraSAR-X images and a digital elevation model of the area in order to identify possible ways to mitigate future flood hazards with regard to sustainable land use. Two study areas are investigated, around the Walmsburg oxbow and the Wehningen oxbow. These are located between Elbe-Kilometer 505-520 and 533-543, respectively, within the Lower Saxonian Elbe River Biosphere Reserve. These areas are characterized by several types of land use, with agricultural land use being predominant. The study investigated the possibility of using a Decision-Tree object-based classifier for determining the major land uses and the extent of the inundation areas. The inundation areas identified for 2011 submerged some agricultural fields, which must be added to existing flood risk maps, and future cultivation activities there should be prevented to avoid possible economic losses. Furthermore, part of the residential area is located within the high flood zone and must be included in risk maps to avoid possible human and economic losses and to achieve sustainable land use for the areas studied.
Abstract: Preterm births have psychological and financial implications; current surveys suggest that, among the various methods of preterm prediction, there is not yet a reliable and standard means of predicting preterm births. This study investigates the application of electrohysterogram and tocogram signals acquired at various points during the third trimester of pregnancy, alongside information about the pregnancy from the patients' medical health records, to preterm prediction and an associated delivery imminency timeline. In addition, the impact of both linear and non-linear dimensional embedding methods on preterm prediction is explored. The classification exercises were carried out using a support vector machine and a decision tree, both of which offer a certain degree of model interpretability and have the potential to be introduced into a clinical operating framework.
Funding: The authors acknowledge the support of the Research Centre for Greenhouse Gas Innovation (RCGI), hosted by the University of Sao Paulo (USP) and sponsored by FAPESP (grants #2014/50279-4, #2020/15230-5, and #2022/07974-0) and Shell Brasil, and the strategic importance of the support given by Brazil's National Oil, Natural Gas and Biofuels Agency (ANP) through the R&D levy regulation. Equally importantly, Felipe Almeida is sponsored by the National Council for Scientific and Technological Development (CNPq), grant #140253/2021-1.
Abstract: The need for renewable energy sources has challenged most countries to comply with environmental protection actions and to handle climate change. Solar energy is a natural option, despite its intermittency. Brazil has a green energy matrix, with significant expansion of solar energy in recent years. To preserve the Amazon basin, the use of solar energy can help communities and cities improve their living standards without building new hydroelectric units or burning biomass, avoiding harsh environmental consequences. The novelty of this work is the use of data science with machine-learning tools to predict the solar incidence (Wh/m^2) in four cities in Amazonas state (north-west Brazil), using data from NASA satellites for the period 2013-22. Decision-tree-based models and vector autoregressive (time-series) models were used with three time aggregations: day, week, and month. The predictor model can aid in the economic assessment of solar energy in the Amazon basin; the use of satellite data was motivated by the lack of data from ground stations. The mean absolute error was selected as the output indicator, with the lowest values, close to 0.20, obtained from the adaptive boosting and light gradient boosting algorithms, in the same order of magnitude as similar references.
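For readers unfamiliar with the output indicator named above, the mean absolute error is the average of the absolute differences between observed and predicted values. The sketch below shows only that standard definition; the sample values are hypothetical and are not taken from the study or from NASA data.

```python
def mean_absolute_error(observed, predicted):
    """Average absolute difference between paired observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


# Hypothetical values for illustration only (not data from the study).
observed = [5.0, 4.5, 6.0, 5.5]
predicted = [5.25, 4.5, 5.5, 5.75]
example_mae = mean_absolute_error(observed, predicted)
print(example_mae)
```

A lower value means the predictions sit closer to the observations on average, expressed in the same unit as the predicted quantity.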
Abstract: The global shift towards sustainable and environmentally friendly transportation options has led to the increasing adoption of electric buses (Ebuses). To optimize the deployment and operational strategies of Ebuses, it is imperative to accurately predict their energy consumption under varying conditions, particularly in cold climates where battery life is typically degraded. The exploration of this aspect within the Canadian context has been limited. In addition, we have found that existing models in the literature perform poorly in the Canadian environment, giving rise to the need for new models using Canadian data. This paper focuses on the development, comparison, and evaluation of various data-driven models designed to predict the energy consumption of different Ebuses with different heating technologies under a wide range of climate conditions. We specifically use Canadian data as a good representative of cold climates in general. The results show that the performance of the different bus types varies substantially under the exact same conditions. In addition, the tree-based family of models proves to be the most suitable approach for predicting the Ebus consumption rate. The results indicate that the Random Forest method emerges as the superior choice for predicting the energy consumption rate, with a resulting mean absolute error of 0.09–0.1 kWh/km observed across the different models. Furthermore, SHAP analysis shows that the main variables influencing the energy consumption rate depend on the type of heating system adopted (using the battery for heating or using an auxiliary system that utilizes diesel for heating).