Abstract: In this paper, we define generalized linear models (GLM) based on observed data with incomplete information and random censorship, in the case where the regressors are stochastic. Under the given conditions, we obtain a law of the iterated logarithm and a Chung-type law of the iterated logarithm for the maximum likelihood estimator (MLE) in the present model.
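For context, the two results named in this abstract generalize classical limit theorems for sums of i.i.d. random variables. A sketch of those classical prototypes (with $S_n = X_1 + \cdots + X_n$, $\mathbb{E}X_i = 0$, $\operatorname{Var} X_i = \sigma^2$) is given below; these are the textbook statements, not the paper's exact results for the MLE:

```latex
% Hartman--Wintner law of the iterated logarithm (limsup form)
\limsup_{n\to\infty} \frac{S_n}{\sqrt{2\sigma^2 n \log\log n}} = 1
\quad \text{a.s.}

% Chung-type law of the iterated logarithm (liminf form)
\liminf_{n\to\infty} \sqrt{\frac{\log\log n}{n}}\,
\max_{1\le k\le n} |S_k| = \frac{\pi\sigma}{\sqrt{8}}
\quad \text{a.s.}
```

The paper's contribution is to establish analogues of these two statements with $S_n$ replaced by the suitably normalized MLE error under incomplete information and random censorship.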
Abstract: The development of many estimators of the parameters of the linear regression model is traceable to the non-validity of the assumptions under which the model is formulated, especially when applied to real-life situations. This notwithstanding, regression analysis may aim at prediction. Consequently, this paper examines the performances of the Ordinary Least Squares (OLS) estimator, the Cochrane-Orcutt (COR) estimator, the Maximum Likelihood (ML) estimator and estimators based on Principal Component (PC) analysis in the prediction of the linear regression model under joint violations of the assumptions of non-stochastic regressors and of independent regressors and error terms. With correlated stochastic normal variables as regressors and autocorrelated error terms, Monte Carlo experiments were conducted, and the study further identifies the best estimator for prediction purposes by means of the goodness-of-fit statistics of the estimators. From the results, it is observed that the performance of COR at each level of correlation (multicollinearity), and that of ML, especially when the sample size is large, have a convex-like pattern over the levels of autocorrelation, while those of OLS and PC are concave-like. Also, as the levels of multicollinearity increase, the estimators, except the PC estimators when multicollinearity is negative, rapidly perform better over the levels of autocorrelation. The COR and ML estimators are generally best for prediction in the presence of multicollinearity and autocorrelated error terms. However, at low levels of autocorrelation, the OLS estimator is either best or competes consistently with the best estimator, while the PC estimator is either best or competes with the best when the multicollinearity level is high (λ > 0.8 or λ < −0.49).
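The simulation design this abstract describes (correlated stochastic normal regressors plus AR(1) autocorrelated errors, with OLS and Cochrane-Orcutt among the compared estimators) can be sketched in a minimal NumPy-only form. The function names, parameter values and the simple iterated Cochrane-Orcutt procedure below are illustrative assumptions, not the paper's actual code:

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(n=100, lam=0.8, rho=0.6, beta=(1.0, 2.0, -1.5)):
    """One Monte Carlo draw: two regressors with correlation lam,
    AR(1) errors with autocorrelation rho, true coefficients beta."""
    cov = np.array([[1.0, lam], [lam, 1.0]])
    X2 = rng.multivariate_normal(np.zeros(2), cov, size=n)
    e = np.empty(n)
    e[0] = rng.normal()
    for t in range(1, n):            # AR(1) error process
        e[t] = rho * e[t - 1] + rng.normal()
    y = beta[0] + X2 @ np.array(beta[1:]) + e
    return np.column_stack([np.ones(n), X2]), y

def ols(X, y):
    """Ordinary Least Squares via least-squares solve."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def cochrane_orcutt(X, y, iters=10):
    """Iterated Cochrane-Orcutt: estimate rho from OLS residuals,
    quasi-difference the data, refit, and repeat."""
    b = ols(X, y)
    rho_hat = 0.0
    for _ in range(iters):
        r = y - X @ b
        rho_hat = (r[:-1] @ r[1:]) / (r[:-1] @ r[:-1])
        Xs = X[1:] - rho_hat * X[:-1]   # intercept column becomes (1 - rho_hat)
        ys = y[1:] - rho_hat * y[:-1]
        b = ols(Xs, ys)
    return b, rho_hat

X, y = simulate()
b_ols = ols(X, y)
b_cor, rho_hat = cochrane_orcutt(X, y)
print("OLS:", np.round(b_ols, 2), "COR:", np.round(b_cor, 2),
      "rho_hat:", round(rho_hat, 2))
```

In a full experiment one would loop this over grids of λ (regressor correlation, including negative values) and ρ (autocorrelation), over replications and sample sizes, and compare goodness-of-fit statistics across all four estimators, which is roughly what the study reports.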
Abstract: A good machine learning model would greatly contribute to accurate crime prediction. Thus, researchers select advanced models more frequently than basic models. To find out whether advanced models have a prominent advantage, this study shifts the focus from obtaining crime predictions to comparing model performance between these two types of models on crime prediction. In this study, we aimed to predict burglary occurrence in Los Angeles City, and compared a basic model using only prior-year burglary occurrence with advanced models, namely a linear regressor and a random forest regressor. In addition, American Community Survey data were used to provide neighborhood-level socio-economic features. After data preprocessing steps that regularized the dataset, recursive feature elimination was used to determine the final features and the parameters of the two advanced models. Finally, to find the best-fitting model, three metrics were used to evaluate model performance: R squared, adjusted R squared and mean squared error. The results indicate that the linear regressor is the most suitable of the three models applied in the study, with a slightly smaller mean squared error than that of the basic model, whereas the random forest model performed worse than the basic model. Despite much more complex learning steps, the advanced models did not show prominent advantages, and directions for further research to extend the current study are discussed.
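The evaluation protocol in this abstract (a prior-year baseline compared against a fitted regression on R², adjusted R² and mean squared error) can be illustrated with a NumPy-only sketch on synthetic data. The data-generating process, coefficient values and variable names below are invented stand-ins for the burglary counts and ACS features, not the study's data:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical stand-in data: "prior" mimics prior-year burglary counts per
# neighborhood; Z mimics three ACS-style socio-economic features.
n = 400
prior = rng.poisson(20, size=n).astype(float)
Z = rng.normal(size=(n, 3))
y = 0.8 * prior + Z @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=2.0, size=n)

def metrics(y_true, y_pred, p):
    """R squared, adjusted R squared (p = number of predictors), and MSE."""
    resid = y_true - y_pred
    mse = float(np.mean(resid ** 2))
    ss_res = resid @ resid
    ss_tot = ((y_true - y_true.mean()) ** 2).sum()
    r2 = 1.0 - ss_res / ss_tot
    m = len(y_true)
    adj_r2 = 1.0 - (1.0 - r2) * (m - 1) / (m - p - 1)
    return r2, adj_r2, mse

# Basic model: predict this year's count with last year's count alone.
r2_b, adj_b, mse_b = metrics(y, prior, p=1)

# "Advanced" linear model: OLS on prior-year count plus the ACS-style features.
X = np.column_stack([np.ones(n), prior, Z])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
r2_l, adj_l, mse_l = metrics(y, X @ beta, p=X.shape[1] - 1)

print(f"basic:  R2={r2_b:.3f} adjR2={adj_b:.3f} MSE={mse_b:.2f}")
print(f"linear: R2={r2_l:.3f} adjR2={adj_l:.3f} MSE={mse_l:.2f}")
```

The study additionally tunes a random forest and selects features via recursive feature elimination (e.g. scikit-learn's `RFE`); this sketch only shows the metric comparison that decides which model is "best fit".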