Abstract
Using the variables in the Boston housing price data set, we build a linear regression model of Boston housing prices in R and test the significance of both the regression equation and the regression coefficients. Where the basic model assumptions are violated, we apply a Box-Cox transformation and refit the model. Lasso regression is then used to simplify the equation, but the coefficients of the resulting model are small; the reason given is that the variables in this data set are not multicollinear, which agrees with the multicollinearity diagnostics obtained in R. Finally, a linear regression equation is fitted between the response variable and the independent variables whose correlation coefficient with it exceeds 0.5 in absolute value, and this equation is used to predict housing prices. Because the spread of Boston housing prices changes with the influencing factors and the median is relatively robust, we also fit a regression model for the median of housing prices, that is, a quantile regression model.
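Two of the steps named in the abstract, the Box-Cox transformation and the screening of predictors by absolute correlation above 0.5, can be illustrated with a minimal sketch. It is written in plain Python rather than the R used in the paper, the function names (`box_cox`, `pearson`, `screen_predictors`) are hypothetical, and the toy numbers are made up for illustration and are not taken from the Boston data set.

```python
import math

def box_cox(y, lam):
    """Box-Cox transform of strictly positive values y for a given lambda."""
    if any(v <= 0 for v in y):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        # The lambda = 0 case is defined as the natural logarithm.
        return [math.log(v) for v in y]
    return [(v ** lam - 1.0) / lam for v in y]

def pearson(x, y):
    """Sample Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def screen_predictors(columns, response, threshold=0.5):
    """Keep the names of predictors with |correlation| above the threshold."""
    return [name for name, values in columns.items()
            if abs(pearson(values, response)) > threshold]

# Made-up toy values (assumption: illustration only, not real housing data).
toy_price = [10.0, 12.0, 15.0, 19.0, 24.0, 30.0]
toy_columns = {
    "x_strong": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],  # rises with the response
    "x_weak": [3.0, 1.0, 4.0, 1.0, 5.0, 2.0],    # only loosely related
}

selected = screen_predictors(toy_columns, toy_price)
transformed = box_cox(toy_price, 0.5)
```

In practice the Box-Cox parameter would be estimated from the data (for example by maximum likelihood) rather than fixed at 0.5 as it is here, and the screened predictors would then be passed to an ordinary least squares fit.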
Source
Statistics and Application (《统计学与应用》)
2020, No. 3, pp. 335-344 (10 pages)