Funding: RH and SS were supported in part or in full by the Companion Animal Parasite Council. SSAM were supported in part by the Research Center for Child Well-Being [NIGMS P20GM130420].
Abstract: Disease forecasting and surveillance often involve fitting models to a tremendous volume of historical testing data collected over space and time. Bayesian spatio-temporal regression models fit with Markov chain Monte Carlo (MCMC) methods are commonly used for such data. When the spatio-temporal support of the model is large, implementing an MCMC algorithm becomes a significant computational burden. This research proposes a computationally efficient gradient boosting algorithm for fitting a Bayesian spatio-temporal mixed effects binomial regression model. We demonstrate our method on a disease forecasting model and compare it to a computationally optimized MCMC approach. Both methods are used to produce monthly forecasts for Lyme disease, anaplasmosis, ehrlichiosis, and heartworm disease in domestic dogs for the contiguous United States. The data have a spatial support of 3108 counties and a temporal support of 108 to 138 months, with 71 to 135 million test results. The proposed estimation approach is several orders of magnitude faster than the optimized MCMC algorithm, with a similar mean absolute prediction error.
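To make the idea of boosting a binomial regression concrete, here is a minimal sketch in Python. It is not the authors' algorithm: it drops the Bayesian priors and the spatio-temporal random effects entirely and fits only a plain logit model with fixed effects. The function name `boost_binomial`, the Newton-scaled componentwise update, and the default step size are all assumptions made for illustration.

```python
import numpy as np

def boost_binomial(X, successes, trials, n_iter=500, step=0.1):
    """Componentwise boosting sketch for a binomial logit model.

    Hypothetical illustration only: no priors, no spatial or temporal
    random effects. Each iteration nudges the intercept and the single
    coefficient that most improves the binomial log-likelihood.
    """
    n_features = X.shape[1]
    intercept, beta = 0.0, np.zeros(n_features)
    X_sq = X ** 2
    for _ in range(n_iter):
        prob = 1.0 / (1.0 + np.exp(-(intercept + X @ beta)))
        # Gradient of the binomial log-likelihood w.r.t. the linear predictor.
        grad = successes - trials * prob
        # Matching curvature (negative second derivative), kept positive.
        curv = trials * prob * (1.0 - prob) + 1e-12
        # Damped Newton update of the intercept.
        intercept += step * grad.sum() / curv.sum()
        # Per-feature score and curvature; pick the feature with largest gain.
        score = X.T @ grad
        hess = X_sq.T @ curv
        j = np.argmax(score ** 2 / hess)
        # Damped Newton update of the selected coefficient only.
        beta[j] += step * score[j] / hess[j]
    return intercept, beta
```

Each pass costs a few matrix-vector products and needs no posterior sampling, which hints at why a boosting approach can be much cheaper than MCMC when the number of counties and months is large.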
Funding: This work was financially supported by the National Natural Science Fund of China (Grant Nos. 61371143 and 61662033); the authors who received the grants were, respectively, Z.YM and H.L. Sponsor website: http://www.nsfc.gov.cn/.
Abstract: The prediction of particles less than 2.5 micrometers in diameter (PM2.5) in fog and haze has attracted growing attention, but prediction accuracy remains unsatisfactory. Haze prediction algorithms based on traditional numerical and statistical methods perform poorly on nonlinear haze data. To improve prediction, this paper proposes a haze feature extraction and pollution-level identification pre-warning algorithm based on feature selection and ensemble learning. The Minimum Redundancy Maximum Relevance method is used to extract low-level features of haze, and a deep belief network is used to extract high-level features. The eXtreme Gradient Boosting algorithm is adopted to fuse the low-level and high-level features and to predict haze. A PM2.5 concentration pollution-grade classification index is established and used to grade the forecast data. Expert knowledge is used to assist in optimizing the pre-warning results. Experimental results show that the presented algorithm predicts better than the widely used Support Vector Machine (SVM) and Back Propagation (BP) methods, with greatly improved accuracy over both.
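The grading step can be illustrated with a short Python sketch that maps a forecast PM2.5 concentration to a pollution grade. The abstract does not state the paper's own index, so the breakpoints below are an assumption: they follow the 24-hour PM2.5 breakpoints commonly associated with China's HJ 633-2012 air quality index standard, and the function name `pm25_grade` and the grade labels are hypothetical.

```python
from bisect import bisect_left

# Assumed upper bounds (micrograms per cubic meter, 24-hour average) for the
# first five grades; these are NOT taken from the paper.
BREAKPOINTS = [35, 75, 115, 150, 250]
GRADES = [
    "excellent",
    "good",
    "lightly polluted",
    "moderately polluted",
    "heavily polluted",
    "severely polluted",
]

def pm25_grade(concentration):
    """Return the pollution grade for a forecast PM2.5 concentration."""
    if concentration < 0:
        raise ValueError("concentration must be non-negative")
    # bisect_left makes each breakpoint the inclusive upper bound of its grade.
    return GRADES[bisect_left(BREAKPOINTS, concentration)]
```

For example, `pm25_grade(80)` falls in the 75 to 115 band under these assumed breakpoints and is therefore graded "lightly polluted".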