In many countries, the results of mass appraisal are used as the basis for calculating real estate tax. Therefore, regardless of the methods used to calculate it, the resulting value should be as close as possible to the market value of the real estate, to maintain a balance of interests between the state and the rights holders. In practice, this condition is not always met: first, the quality of market data is often very low, and second, some markets are characterized by low activity, which shows up as a shortage of information on asking prices. The aim of the work is ecological valuation of land use: how regression-based mass appraisal can inform ecological conservation, land degradation, and sustainable land management. Four multiple regression models were constructed for an AI-generated map of land plots for recreational use in St. Petersburg (Russia) with different volumes of market information (32, 30, 20 and 15 units of market information with four price-forming factors). Analysis of model quality showed that the best result comes from the model built on the maximum sample size, followed by the model based on 15 analogs, which shows that a larger number of analog objects does not always lead to better results, since the more analog objects there are...
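The comparison of regression models fitted on 32, 30, 20 and 15 observations with four price-forming factors can be sketched as follows. This is only an illustration under stated assumptions: the data are synthetic (the study's market data are not given here), `fit_ols` is a hypothetical helper name, and adjusted R^2 is used as one possible quality measure, not necessarily the one the authors used.

```python
import numpy as np

def fit_ols(X, y):
    """Fit y ~ b0 + X @ b by ordinary least squares.

    Returns the coefficient vector (intercept first) and the adjusted R^2.
    """
    n, k = X.shape
    A = np.column_stack([np.ones(n), X])          # add intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)  # least-squares solution
    resid = y - A @ beta
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    r2 = 1.0 - ss_res / ss_tot
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)
    return beta, adj_r2

# Synthetic stand-in for market data: 32 plots, 4 price-forming factors.
rng = np.random.default_rng(0)
n_total, n_factors = 32, 4
X_all = rng.normal(size=(n_total, n_factors))
true_beta = np.array([2.0, -1.0, 0.5, 0.0])       # assumed, for illustration only
y_all = 10.0 + X_all @ true_beta + rng.normal(scale=0.5, size=n_total)

# Refit the model on progressively smaller samples and record the fit quality.
results = {}
for n in (32, 30, 20, 15):
    _, adj_r2 = fit_ols(X_all[:n], y_all[:n])
    results[n] = adj_r2

for n, adj_r2 in results.items():
    print(f"n = {n:2d}  adjusted R^2 = {adj_r2:.3f}")
```

Because adjusted R^2 penalizes the number of factors relative to the sample size, it is a reasonable way to compare fits across samples of different size, though a held-out error measure would be a stricter check.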
The work proposes a distributed Kalman filtering (KF) algorithm to cooperatively track a time-varying unknown signal process for a stochastic regression model over network systems. We provide a stability analysis of the proposed distributed KF algorithm without independence or stationarity assumptions on the signals, which implies that the theoretical results can be applied to stochastic feedback systems. The main difficulty of the stability analysis lies in analyzing the properties of the product of non-independent and non-stationary random matrices involved in the error equation. We employ analysis techniques such as stochastic Lyapunov functions, stability theory of stochastic systems, and algebraic graph theory to deal with this issue. The stochastic spatio-temporal cooperative information condition captures the cooperative property of multiple sensors: even though no local sensor can track the time-varying unknown signal on its own, the distributed KF algorithm can complete the filtering task cooperatively. Finally, we illustrate the properties of the proposed distributed KF algorithm with a simulation example.
Taking the nonlinear nature of the runoff system into account, and combining the auto-regression and multi-regression methods, a Nonlinear Mixed Regression Model (NMR) was established to analyze the impact of temperature and precipitation changes on the annual river runoff process. The model was calibrated and verified using a BP neural network with observed meteorological and runoff data from Daiying Hydrological Station on the Chaohe River in Hebei Province for 1956–2000. Compared with the auto-regression model, the linear multi-regression model and the linear mixed regression model, NMR improves forecasting precision remarkably. Therefore, the simulation of climate change scenarios was carried out with NMR. The results show that the nonlinear mixed regression model can simulate annual river runoff well.
A fuzzy observations-based radial basis function neural network (FORBFNN) is presented for modeling nonlinear systems in which the observations of the response are imprecise but can be represented as fuzzy membership functions. In the FORBFNN model, the weight coefficients of the nodes in the hidden layer are identified using the fuzzy expectation-maximization (EM) algorithm, whereas the optimal number of these nodes, as well as the centers and widths of the radial basis functions, are constructed automatically using a data-driven method. Namely, the method starts with an initial node, and then a new node is added to the hidden layer according to some rules. This procedure does not terminate until the model meets the preset requirements. The method considers both the accuracy and the complexity of the model. Numerical simulation results show that the modeling method is effective and that the established model has high prediction accuracy.
Under-fitting problems usually occur in regression models for dam safety monitoring. To overcome the local convergence of the regression, a genetic algorithm (GA) was proposed that uses real parameter coding, a ranking selection operator, an arithmetical crossover operator and a uniform mutation operator, with the least-squares error between the observed and computed values as its fitness function. The elitist strategy was used to improve the speed of convergence. The modified genetic algorithm was then applied to reassess the coefficients of the regression model, and a genetic regression model was set up. A slotted gravity dam in Northeast China was used as an example. The computational results show that the genetic regression model can solve the under-fitting problems perfectly.
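The operators named in this abstract (real coding, ranking selection, arithmetical crossover, uniform mutation, elitism, least-squares fitness) can be combined as in the minimal sketch below. It is not the authors' implementation: the population size, mutation rate, search bounds, linear rank weighting and the synthetic data are all assumptions, and `ga_regression` and `sse` are hypothetical names.

```python
import numpy as np

def sse(coef, A, y):
    """Least-squares error between observed y and computed A @ coef."""
    r = y - A @ coef
    return float(r @ r)

def ga_regression(A, y, pop_size=40, generations=200,
                  bounds=(-10.0, 10.0), p_mut=0.1, seed=0):
    """Real-coded GA that searches for regression coefficients minimizing SSE."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    dim = A.shape[1]
    pop = rng.uniform(lo, hi, size=(pop_size, dim))   # real parameter coding
    history = []
    for _ in range(generations):
        errors = np.array([sse(ind, A, y) for ind in pop])
        order = np.argsort(errors)                    # best individual first
        pop, errors = pop[order], errors[order]
        history.append(float(errors[0]))
        elite = pop[0].copy()                         # elitist strategy
        # Ranking selection: selection probability falls linearly with rank.
        ranks = np.arange(pop_size, 0, -1, dtype=float)
        probs = ranks / ranks.sum()
        pa = pop[rng.choice(pop_size, size=pop_size - 1, p=probs)]
        pb = pop[rng.choice(pop_size, size=pop_size - 1, p=probs)]
        # Arithmetical crossover: convex combination of the two parents.
        alpha = rng.uniform(size=(pop_size - 1, 1))
        children = alpha * pa + (1.0 - alpha) * pb
        # Uniform mutation: redraw selected genes uniformly within the bounds.
        mask = rng.uniform(size=children.shape) < p_mut
        children[mask] = rng.uniform(lo, hi, size=int(mask.sum()))
        pop = np.vstack([elite, children])
    errors = np.array([sse(ind, A, y) for ind in pop])
    history.append(float(errors.min()))
    return pop[np.argmin(errors)], history

# Synthetic monitoring-style data (assumed): intercept plus two regressors.
rng = np.random.default_rng(1)
x = rng.uniform(-1.0, 1.0, size=(50, 2))
A = np.column_stack([np.ones(50), x])
y = A @ np.array([1.0, 2.0, -3.0]) + rng.normal(scale=0.1, size=50)

best, history = ga_regression(A, y)
print("GA coefficients:", np.round(best, 3), " final SSE:", round(history[-1], 4))
```

Carrying the best individual over unchanged guarantees that the best error never gets worse from one generation to the next, which is the role elitism plays in the convergence argument.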
In the era of big data, traditional regression models cannot deal with uncertain big data efficiently and accurately. To make up for this deficiency, this paper proposes a quantum fuzzy regression model, which uses fuzzy theory to describe the uncertainty in big data sets and uses quantum computing to exponentially improve the efficiency of data set preprocessing and parameter estimation. Data envelopment analysis (DEA) is used to calculate the degree of importance of each data point. Meanwhile, the Harrow, Hassidim and Lloyd (HHL) algorithm and quantum swap circuits are used to improve the efficiency of high-dimensional data matrix calculation. Applying the quantum fuzzy regression model to small-scale financial data shows that its accuracy is greatly improved compared with the quantum regression model. Moreover, owing to the introduction of quantum computing, the speed of processing high-dimensional data matrices improves exponentially compared with the fuzzy regression model. The proposed quantum fuzzy regression model combines the advantages of fuzzy theory and quantum computing: it can efficiently calculate high-dimensional data matrices and complete parameter estimation using quantum computing while retaining the uncertainty in big data. Thus, it is a new model for efficient and accurate big data processing in uncertain environments.
Mixture of Experts (MoE) regression models are widely studied in statistics and machine learning for modeling heterogeneity in data for regression, clustering and classification. The Laplace distribution is one of the most important statistical tools for analyzing thick-tailed data. Laplace Mixture of Linear Experts (LMoLE) regression models are based on the Laplace distribution, which is more robust. Similar to modelling the variance parameter in a homogeneous population, in this paper we propose and study a new class of models, heteroscedastic Laplace mixture of experts regression models, to analyze heteroscedastic data coming from a heterogeneous population. The issues of maximum likelihood estimation are addressed. In particular, a Minorization-Maximization (MM) algorithm for estimating the regression parameters is developed. Properties of the estimators of the regression coefficients are evaluated through Monte Carlo simulations. Results from the analysis of two real data sets are presented.
Rainfall is an important factor in estimating the event mean concentration (EMC), which is used to quantify the washed-off pollutant concentrations from non-point sources (NPSs). Pollutant loads can also be calculated using rainfall, catchment area and runoff coefficient. In this study, runoff quantity and quality data gathered during 28 months of monitoring at road and parking lot sites in Korea were evaluated using multiple linear regression (MLR) to develop equations for estimating pollutant loads and EMCs as a function of rainfall variables. The results revealed that total event rainfall and average rainfall intensity are possible predictors of pollutant loads. Overall, the models indicate the high uncertainties of NPSs; accurate estimates of EMCs and loads may require water quality sampling, or long-term monitoring may be needed to gather more data for the development of estimation models.
In this paper, based on the theory of parameter estimation, we give a selection method that we consider very reasonable in the sense that it yields parameter estimates with good properties. Moreover, we offer a method for calculating the selection statistic and an applied example.
The radial basis function (RBF) emerged as a variant of the artificial neural network. The generalized regression neural network (GRNN) is one type of RBF, and its principal advantages are that it learns quickly and converges rapidly to the optimal regression surface with a large number of data sets. Hyperspectral reflectance (350 to 2500 nm) data were recorded at two different rice sites in two experiment fields with two cultivars, three nitrogen treatments and one plant density (45 plants m^-2). A stepwise multivariable regression model (SMR) and RBF were compared for their ability to predict the leaf area index (LAI) and green leaf chlorophyll density (GLCD) of rice based on reflectance (R) and three transformations of it: the first derivative reflectance (D1), the second derivative reflectance (D2) and the log-transformed reflectance (LOG). GRNN based on D1 was the best model for the prediction of rice LAI and GLCD. The relationships between the different transformations of reflectance and the rice parameters could be further improved when RBF was employed. Owing to its strong capacity for nonlinear mapping and good robustness, GRNN could maximize the sensitivity to chlorophyll content using D1. It is concluded that RBF may provide a useful exploratory and predictive tool for the estimation of rice biophysical parameters.
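In its standard textbook form, a GRNN prediction is a Gaussian-kernel weighted average of the training targets, which is why no iterative training is needed. The sketch below illustrates that form only; the smoothing width `sigma`, the function name `grnn_predict` and the synthetic sine data are assumptions and stand in for the reflectance features used in the study.

```python
import numpy as np

def grnn_predict(X_train, y_train, X_query, sigma=0.1):
    """GRNN prediction: kernel-weighted average of training targets.

    Each training sample acts as one pattern unit with a Gaussian radial
    basis function of width sigma centred on it.
    """
    # Squared Euclidean distances between every query and training point.
    d2 = ((X_query[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    w = np.exp(-d2 / (2.0 * sigma ** 2))      # pattern-layer activations
    # Summation and output layers: weighted sum divided by sum of weights.
    # Note: queries far from all training points can make the denominator
    # underflow to zero; this sketch does not guard against that.
    return (w @ y_train) / w.sum(axis=1)

# Synthetic one-dimensional example (assumed data, for illustration only).
x_train = np.linspace(0.0, 1.0, 50)[:, None]
y_train = np.sin(2.0 * np.pi * x_train[:, 0])
x_query = np.array([[0.25], [0.75]])

y_hat = grnn_predict(x_train, y_train, x_query, sigma=0.05)
print(np.round(y_hat, 3))
```

The only tunable quantity is `sigma`: small values follow the training data closely, while large values smooth the prediction toward the mean of the targets.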
In this paper we apply the nonlinear time series analysis method to small-time scale traffic measurement data. The prediction-based method is used to determine the embedding dimension of the traffic data. Based on the reconstructed phase space, the local support vector machine prediction method is used to predict the traffic measurement data, and the BIC-based neighbouring point selection method is used to choose the number of the nearest neighbouring points for the local support vector machine regression model. The experimental results show that the local support vector machine prediction method whose neighbouring points are optimized can effectively predict the small-time scale traffic measurement data and can reproduce the statistical features of real traffic measurements.
In this article, a procedure is defined for estimating the coefficient functions in functional-coefficient regression models with different smoothing variables in different coefficient functions. In the first step, initial estimates of the coefficient functions are obtained by the local linear technique and the averaged method. In the second step, based on the initial estimates, efficient estimates of the coefficient functions are proposed by a one-step back-fitting procedure. The efficient estimators share the same asymptotic normality as the local linear estimators for functional-coefficient models with a single smoothing variable in different functions. Two simulated examples show that the procedure is effective.
In this paper, a class of functional-coefficient regression models is proposed and an estimation procedure based on locally weighted least squares is suggested. This class of models, with the proposed estimation method, is a powerful means for exploratory data analysis.
Spatial models are effective in obtaining local details on grassland biomass, and their accuracy has important practical significance for the stable management of grasses and livestock. To this end, the present study used measured quadrat data of grass yield across different regions in the main growing season of temperate grasslands in Ningxia, China (August 2020), combined with hydrometeorology, elevation, net primary productivity (NPP), and other auxiliary data over the same period. Non-stationary characteristics of the spatial scale and the effects of influencing factors on grass yield were analyzed using a mixed geographically weighted regression (MGWR) model. The results showed that the model was suitable for correlation analysis. The spatial scale of the ratio resident-area index (PRI) was the largest, followed by the digital elevation model, NPP, distance from gully, distance from river, average July rainfall, and daily temperature range, whereas the spatial scales of night light, distance from roads, and relative humidity (RH) were the most limited. All influencing factors had both positive and negative effects on grass yield, except for RH, whose effect was strictly negative. The regression results revealed a multiscale differential spatial response regularity of different influencing factors on grass yield. Regression parameters revealed that the results of the ordinary least squares (OLS) model (adjusted R^2 = 0.642) and the geographically weighted regression (GWR) model (adjusted R^2 = 0.797) were worse than those of the MGWR model (adjusted R^2 = 0.889). Based on the RMSE and radius index, the simulation effect was also ranked MGWR > GWR > OLS. Ultimately, the MGWR model had the strongest prediction performance (R^2 = 0.8306). Spatially, the grass yield was high in the south and west, and low in the north and east of the study area. The results of this study provide new technical support for rapid and accurate estimation of grassland yield to dynamically adjust grazing decisions in the semi-arid loess hilly region.
Wavelets are applied to detect the jumps in a heteroscedastic regression model. It is shown that the wavelet coefficients of the data have significantly large absolute values across fine scale levels near the jump points. Then a procedure is developed to estimate the jumps and jump heights. All estimators are proved to be consistent.
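The idea that detail coefficients become large near a jump can be illustrated with a Haar-type difference of local means. This is a simplified sketch, not the estimator analyzed in the paper: it assumes a single jump, uses one fixed window half-width `h`, and works on synthetic data with noise whose standard deviation grows along the design; `haar_details` and `estimate_jump` are hypothetical names.

```python
import numpy as np

def haar_details(y, h):
    """Undecimated Haar-type detail: mean of the next h samples minus
    the mean of the previous h samples, at every interior position."""
    n = len(y)
    c = np.concatenate([[0.0], np.cumsum(y)])   # prefix sums for fast window means
    d = np.zeros(n)
    idx = np.arange(h, n - h + 1)
    right = (c[idx + h] - c[idx]) / h           # mean of y[i : i + h]
    left = (c[idx] - c[idx - h]) / h            # mean of y[i - h : i]
    d[idx] = right - left
    return d

def estimate_jump(y, h=4):
    """Locate a single jump at the largest |detail| and read off its height."""
    d = haar_details(y, h)
    loc = int(np.argmax(np.abs(d)))
    return loc, float(d[loc])

# Synthetic heteroscedastic data (assumed): a step of height 3 at index 60.
rng = np.random.default_rng(0)
n = 128
t = np.arange(n) / n
signal = 1.0 + 3.0 * (np.arange(n) >= 60)
noise_sd = 0.05 + 0.1 * t                       # variance changes along the design
y_obs = signal + rng.normal(size=n) * noise_sd

loc, height = estimate_jump(y_obs, h=4)
print("estimated jump location:", loc, " estimated height:", round(height, 2))
```

A smaller `h` localizes the jump more sharply but averages less noise, which mirrors the trade-off between fine and coarse scale levels in a wavelet analysis.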
A geometric framework is proposed for semiparametric nonlinear regression models based on the concept of the least favorable curve, introduced by Severini and Wong (1992). The authors use this framework to derive three kinds of improved approximate confidence regions for the parameter and parameter subset in terms of curvatures. The results obtained by Hamilton et al. (1982), Hamilton (1986) and Wei (1994) are extended to semiparametric nonlinear regression models.
Internal solitary wave propagation over a submarine ridge results in energy dissipation, in which the hydrodynamic interaction between the wave and the ridge affects the marine environment. This study analyzes the effects of ridge height and potential energy during wave-ridge interaction with binary and cumulative logistic regression models. In testing the global null hypothesis, all values are p < 0.001 under three statistical methods: Likelihood Ratio, Score, and Wald. Comparing the two kinds of models, the test values obtained by the cumulative logistic regression models are better than those obtained by the binary logistic regression models. Although this study employed a cumulative logistic regression model, three probability functions, p^1, p^2 and p^3, are used to investigate the weighted influence of factors on wave reflection. Deviance and Pearson tests are applied to check the goodness-of-fit of the proposed model. The analytical results demonstrated that both ridge height (X1) and potential energy (X2) significantly impact (p < 0.0001) the amplitude-based reflected rate; the p-values for the deviance and Pearson tests are both > 0.05 (0.2839 and 0.3438, respectively). That is, the goodness-of-fit between ridge height (X1) and potential energy (X2) can further predict parameters under the scenario of the best parsimonious model. Investigation of six measures of predictive power (R^2, Max-rescaled R^2, Somers' D, Gamma, Tau-a, and c) indicates that the proposed model has better predictive ability than ridge height alone, and is very similar to the interaction of ridge height and potential energy. It can be concluded that the goodness-of-fit and prediction ability of the cumulative logistic regression model are better than those of the binary logistic regression model.
This study used a spatial autoregression (SAR) model and a geographically weighted regression (GWR) model to model the spatial patterns of farmland density and its temporal change in Gucheng County, Hubei Province, China in 1999 and 2009, and discussed the difference between global and local spatial autocorrelations in terms of spatial heterogeneity and non-stationarity. Results showed that strong positive spatial correlations existed in the spatial distributions of farmland density, its temporal change and the driving factors, and that the coefficients of spatial autocorrelation decreased as the spatial lag distance increased. The SAR models revealed the global spatial relations between dependent and independent variables, while the GWR model showed the spatially varying fitting degree and local weighting coefficients of driving factors and farmland indices (i.e., farmland density and temporal change). The GWR model has a smooth process when constructing the farmland spatial model. The coefficients of the GWR model can show the influence of different driving factors on farmland at different geographical locations. The performance indices showed that the GWR model produced more accurate simulation results than the other models at different times, and that its improvement in precision was obvious. The global and local farmland models used in this study showed different characteristics in the spatial distributions of farmland indices at different scales, which may provide a theoretical basis for protecting farmland from the influence of different driving factors.
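The local coefficients that GWR produces come from a separate weighted least-squares fit at each location, with weights that decay with distance. The sketch below shows that basic estimator, beta(u) = (A^T W(u) A)^{-1} A^T W(u) y, with a Gaussian kernel. It assumes a fixed bandwidth and synthetic data; the studies listed here may use other kernels, adaptive bandwidths or bandwidth selection procedures, and `gwr_coefficients` is a hypothetical name.

```python
import numpy as np

def gwr_coefficients(coords, X, y, bandwidth):
    """Estimate one coefficient vector per location by weighted least squares.

    coords: (n, 2) locations; X: (n, k) predictors; y: (n,) response.
    Returns an (n, k + 1) array with the intercept in column 0.
    """
    n = len(y)
    A = np.column_stack([np.ones(n), X])
    betas = np.empty((n, A.shape[1]))
    for i in range(n):
        d2 = ((coords - coords[i]) ** 2).sum(axis=1)
        w = np.exp(-0.5 * d2 / bandwidth ** 2)    # Gaussian distance-decay weights
        Aw = A * w[:, None]                       # rows of A scaled by their weights
        # Solve (A^T W A) beta = A^T W y for the local coefficients.
        betas[i] = np.linalg.solve(A.T @ Aw, Aw.T @ y)
    return betas

# Synthetic example (assumed): the slope increases from west to east.
rng = np.random.default_rng(0)
n = 200
coords = rng.uniform(0.0, 1.0, size=(n, 2))
x = rng.normal(size=(n, 1))
local_slope = 1.0 + 2.0 * coords[:, 0]
y = 0.5 + local_slope * x[:, 0] + rng.normal(scale=0.05, size=n)

betas = gwr_coefficients(coords, x, y, bandwidth=0.2)
print("slope range across locations:",
      round(betas[:, 1].min(), 2), "to", round(betas[:, 1].max(), 2))
```

With a very large bandwidth all weights approach one and every location returns the same global OLS coefficients, which is the sense in which a global model is a limiting case of the local one.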
Compared with methods such as support vector machines, the Gaussian process (GP) has fewer parameters, a simpler model and probabilistic output. Selection of the hyper-parameters is critical to the performance of a Gaussian process model. However, the commonly used algorithm has several disadvantages: the number of iteration steps is difficult to determine, the optimization result depends heavily on the initial values, and it easily falls into a local optimum. To solve this problem, a method combining the Gaussian process with a memetic algorithm was proposed. In this method, the memetic algorithm was used to search for the optimal hyper-parameters of the Gaussian process regression (GPR) model during training, forming the MA-GPR algorithm, and the model was then used for prediction and testing. When used in battle effectiveness evaluation of the marine long-range precision strike system (LPSS), the proposed MA-GPR model significantly improved prediction accuracy compared with the conjugate gradient method and the genetic algorithm optimization process.
This paper studies the relationship between accessibility and housing prices in Dalian using an improved geographically weighted regression model and data on house prices, traffic, remote sensing images, etc. Multi-source data improves the accuracy of the spatial differentiation that reflects the impact of traffic accessibility on house prices. The results are as follows. First, the average house price is 12 436 yuan (RMB)/m^2 and shows a declining trend from coastal areas to inland areas. The exception is Guilin Street, which shows a local peak of house prices that decreases from the center of the street to its periphery. Second, the accessibility value is 33 minutes on average, excluding the northern and eastern fringe areas, where it is over 50 minutes. Third, the significant spatial correlation coefficient between accessibility and house prices is 0.423, and the coefficient increases in the southeastern direction. The strongest impact of accessibility on house prices is on the southeastern coast, in the Lehua, Yingke, and Hushan communities, while the weakest impact is in the northwestern fringe, in the Yingchengzi, Xixiaomo, and Daheishi community areas.
Funding (mass appraisal study): Financed as part of the project "Development of a methodology for instrumental base formation for analysis and modeling of the spatial socio-economic development of systems based on internal reserves in the context of digitalization" (FSEG-2023-0008), funded by the Russian Science Foundation (Agreement 23-41-10001, https://doi.org/https://rscf.ru/project/23-41-10001/).
Funding (distributed Kalman filtering study): Supported in part by the Sichuan Science and Technology Program under Grant No. 2025ZNSFSC151, in part by the Strategic Priority Research Program of the Chinese Academy of Sciences under Grant No. XDA27030201, in part by the Natural Science Foundation of China under Grant No. U21B6001, and in part by the Natural Science Foundation of Tianjin under Grant No. 24JCQNJC01930.
Funding (nonlinear mixed regression runoff study): Under the auspices of the National Natural Science Foundation of China (No. 50809004).
Funding (FORBFNN study): The National Natural Science Foundation of China (No. 51106025, 51106027, 51036002); the Specialized Research Fund for the Doctoral Program of Higher Education (No. 20130092110061); the Youth Foundation of Nanjing Institute of Technology (No. QKJA201303).
Funding (quantum fuzzy regression study): This work is supported by the National Natural Science Foundation of China (No. 62076042); the Key Research and Development Project of Sichuan Province (Nos. 2021YFSY0012, 2020YFG0307, 2021YFG0332); the Science and Technology Innovation Project of Sichuan (No. 2020017); the Key Research and Development Project of Chengdu (No. 2019-YF05-02028-GX); the Innovation Team of Quantum Security Communication of Sichuan Province (No. 17TD0009); and the Academic and Technical Leaders Training Funding Support Projects of Sichuan Province (No. 2016120080102643).
Funding: the National Natural Science Foundation of China (11861041, 11261025).
Abstract: Mixture of Experts (MoE) regression models are widely studied in statistics and machine learning for modeling heterogeneity in data for regression, clustering and classification. The Laplace distribution is one of the most important statistical tools for analyzing thick-tailed data. Laplace Mixture of Linear Experts (LMoLE) regression models are based on the Laplace distribution, which is more robust. Similar to modelling the variance parameter in a homogeneous population, in this paper we propose and study a new class of models, heteroscedastic Laplace mixture of experts regression models, to analyze heteroscedastic data coming from a heterogeneous population. The issues of maximum likelihood estimation are addressed. In particular, a Minorization-Maximization (MM) algorithm for estimating the regression parameters is developed. Properties of the estimators of the regression coefficients are evaluated through Monte Carlo simulations. Results from the analysis of two real data sets are presented.
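As a much-simplified illustration of the MM idea, the sketch below fits a single straight line under a Laplace error model (equivalently, least absolute deviations). It is not the heteroscedastic mixture-of-experts estimator of the paper. Each step majorizes the absolute residual by a quadratic, which reduces to a weighted least-squares update. The function names, iteration count, `eps` floor and toy data are assumptions.

```python
def weighted_line(x, y, w):
    """Closed-form weighted least squares for y ~ b0 + b1 * x."""
    sw = sum(w)
    sx = sum(wi * xi for wi, xi in zip(w, x))
    sy = sum(wi * yi for wi, yi in zip(w, y))
    sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    b1 = (sw * sxy - sx * sy) / (sw * sxx - sx * sx)
    return (sy - b1 * sx) / sw, b1

def laplace_line_mm(x, y, iterations=100, eps=1e-6):
    """Hypothetical helper: MM iterations for a Laplace (absolute-error) line fit."""
    b0, b1 = weighted_line(x, y, [1.0] * len(x))  # start from ordinary least squares
    for _ in range(iterations):
        # Majorize |r| by r^2 / (2|r_k|) + |r_k| / 2, which gives weights 1 / |r_k|.
        # eps keeps the weights finite when a residual is (almost) zero.
        w = [1.0 / max(abs(yi - b0 - b1 * xi), eps) for xi, yi in zip(x, y)]
        b0, b1 = weighted_line(x, y, w)
    return b0, b1

# Toy data (made up): points on y = 1 + 2x with one gross outlier.
x = [float(i) for i in range(10)]
y = [1.0 + 2.0 * xi for xi in x]
y[9] = 100.0
b0, b1 = laplace_line_mm(x, y)
print(b0, b1)
```

On this toy data the fitted line stays close to the nine uncontaminated points, which is the robustness property the abstract attributes to the Laplace distribution.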
Funding: provided by the Korean Ministry of Environment and the Eco Star Project.
Abstract: Rainfall is an important factor in estimating the event mean concentration (EMC), which is used to quantify the washed-off pollutant concentrations from non-point sources (NPSs). Pollutant loads can also be calculated using rainfall, catchment area and runoff coefficient. In this study, runoff quantity and quality data gathered from 28 months of monitoring at road and parking lot sites in Korea were evaluated using multiple linear regression (MLR) to develop equations for estimating pollutant loads and EMCs as a function of rainfall variables. The results revealed that total event rainfall and average rainfall intensity are possible predictors of pollutant loads. Overall, the models indicate the high uncertainties of NPSs; accurate estimation of EMCs and loads may require water quality sampling, or long-term monitoring may be needed to gather more data for developing estimation models.
Funding: Supported by the Natural Science Foundation of Anhui Education Committee
Abstract: In this paper, based on the theory of parameter estimation, we give a selection method that we consider very reasonable in the sense that it yields parameter estimates with good properties. Moreover, we offer a method for calculating the selection statistic and an applied example.
Funding: Project supported by the National Natural Science Foundation of China (No. 40571115) and the National High Technology Research and Development Program (863 Program) of China (Nos. 2006AA120101 and 2007AA10Z205)
Abstract: The radial basis function (RBF) network emerged as a variant of artificial neural networks. The generalized regression neural network (GRNN) is one type of RBF network, and its principal advantages are that it learns quickly and converges rapidly to the optimal regression surface with a large number of data sets. Hyperspectral reflectance (350 to 2500 nm) data were recorded at two different rice sites in two experiment fields with two cultivars, three nitrogen treatments and one plant density (45 plants m^-2). A stepwise multivariable regression (SMR) model and RBF were compared for their ability to predict the leaf area index (LAI) and green leaf chlorophyll density (GLCD) of rice based on reflectance (R) and its three different transformations: the first derivative reflectance (D1), the second derivative reflectance (D2) and the log-transformed reflectance (LOG). GRNN based on D1 was the best model for the prediction of rice LAI and GLCD. The relationships between different transformations of reflectance and rice parameters could be further improved when RBF was employed. Owing to its strong capacity for nonlinear mapping and good robustness, GRNN could maximize the sensitivity to chlorophyll content using D1. It is concluded that RBF may provide a useful exploratory and predictive tool for the estimation of rice biophysical parameters.
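A GRNN prediction is commonly written as a Gaussian-kernel weighted average of the training targets. The sketch below shows that form only. The function name, the smoothing parameter `sigma` and the toy data are assumptions, not values from the study.

```python
import math

def grnn_predict(x_train, y_train, x_new, sigma=0.5):
    """Kernel-weighted average of training targets (hypothetical helper)."""
    weights = [
        math.exp(-sum((a - b) ** 2 for a, b in zip(x, x_new)) / (2.0 * sigma ** 2))
        for x in x_train
    ]
    total = sum(weights)
    if total == 0.0:
        # Query is too far from every training point: fall back to the plain mean.
        return sum(y_train) / len(y_train)
    return sum(w * y for w, y in zip(weights, y_train)) / total

# Toy data (made up): one input feature, target y = x^2.
x_train = [[0.0], [1.0], [2.0], [3.0]]
y_train = [0.0, 1.0, 4.0, 9.0]
print(grnn_predict(x_train, y_train, [2.0], sigma=0.1))
print(grnn_predict(x_train, y_train, [2.0], sigma=1000.0))
```

A small `sigma` makes the prediction follow the nearest training point, while a very large `sigma` pulls it toward the overall mean of the targets, so `sigma` is the single parameter that has to be tuned.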
Funding: Project supported by the National Natural Science Foundation of China (Grant No 60573065), the Natural Science Foundation of Shandong Province, China (Grant No Y2007G33), and the Key Subject Research Foundation of Shandong Province, China (Grant No XTD0708)
Abstract: In this paper we apply the nonlinear time series analysis method to small-time scale traffic measurement data. The prediction-based method is used to determine the embedding dimension of the traffic data. Based on the reconstructed phase space, the local support vector machine prediction method is used to predict the traffic measurement data, and the BIC-based neighbouring point selection method is used to choose the number of nearest neighbouring points for the local support vector machine regression model. The experimental results show that the local support vector machine prediction method with optimized neighbouring points can effectively predict the small-time scale traffic measurement data and can reproduce the statistical features of real traffic measurements.
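Phase-space reconstruction is usually done by time-delay embedding, and a local predictor then works on the nearest neighbours of the current state vector. The sketch below shows only these two building blocks. The function names, embedding dimension, delay and toy series are assumptions, and the support vector machine and BIC steps are not reproduced.

```python
def delay_embed(series, dim, tau=1):
    """Build delay vectors [s[i], s[i + tau], ..., s[i + (dim - 1) * tau]]."""
    n = len(series) - (dim - 1) * tau
    if n <= 0:
        raise ValueError("series too short for this dim and tau")
    return [[series[i + j * tau] for j in range(dim)] for i in range(n)]

def nearest_neighbours(vectors, query, k):
    """Indices of the k delay vectors closest to query (squared Euclidean distance)."""
    def dist2(i):
        return sum((a - b) ** 2 for a, b in zip(vectors[i], query))
    return sorted(range(len(vectors)), key=dist2)[:k]

# Toy series (made up).
series = [1, 2, 3, 4, 5]
vectors = delay_embed(series, dim=3, tau=1)
print(vectors)
print(nearest_neighbours(vectors, [2, 3, 4], k=1))
```

A local regression model would then be trained only on the selected neighbours and their successors in the series.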
Abstract: In this article, a procedure is defined for estimating the coefficient functions of functional-coefficient regression models with different smoothing variables in different coefficient functions. In the first step, initial estimates of the coefficient functions are obtained by the local linear technique and the averaged method. In the second step, based on the initial estimates, efficient estimates of the coefficient functions are proposed by a one-step back-fitting procedure. The efficient estimators share the same asymptotic normality as the local linear estimators for functional-coefficient models with a single smoothing variable in different functions. Two simulated examples show that the procedure is effective.
Abstract: In this paper, a class of functional-coefficient regression models is proposed and an estimation procedure based on locally weighted least squares is suggested. This class of models, with the proposed estimation method, is a powerful means for exploratory data analysis.
Abstract: Spatial models are effective in obtaining local details on grassland biomass, and their accuracy has important practical significance for the stable management of grasses and livestock. To this end, the present study utilized measured quadrat data of grass yield across different regions in the main growing season of temperate grasslands in Ningxia, China (August 2020), combined with hydrometeorology, elevation, net primary productivity (NPP), and other auxiliary data over the same period. Accordingly, non-stationary characteristics of the spatial scale and the effects of influencing factors on grass yield were analyzed using a mixed geographically weighted regression (MGWR) model. The results showed that the model was suitable for correlation analysis. The spatial scale of ratio resident-area index (PRI) was the largest, followed by the digital elevation model, NPP, distance from gully, distance from river, average July rainfall, and daily temperature range, whereas the spatial scales of night light, distance from roads, and relative humidity (RH) were the most limited. All influencing factors had both positive and negative effects on grass yield, save for the strictly negative effect of RH. The regression results revealed a multiscale differential spatial response regularity of different influencing factors on grass yield. Regression parameters revealed that the results of the ordinary least squares (OLS) (adjusted R^(2)=0.642) and geographically weighted regression (GWR) (adjusted R^(2)=0.797) models were worse than those of the MGWR (adjusted R^(2)=0.889) model. Based on the results of the RMSE and radius index, the simulation performance also ranked MGWR > GWR > OLS. Ultimately, the MGWR model had the strongest prediction performance (R^(2)=0.8306). Spatially, the grass yield was high in the south and west, and low in the north and east of the study area. The results of this study provide new technical support for rapid and accurate estimation of grassland yield to dynamically adjust grazing decisions in the semi-arid loess hilly region.
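The core of geographically weighted regression is a separate weighted least-squares fit at each location, with weights that decay with distance. The sketch below shows this for a single predictor and a Gaussian kernel. It is a plain GWR step, not the MGWR model of the study, and the function name, bandwidth and toy data are assumptions.

```python
import math

def gwr_fit_at(target, coords, x, y, bandwidth):
    """Local fit y ~ b0 + b1 * x at one location (hypothetical helper)."""
    # Gaussian distance-decay weights: nearby observations count more.
    w = [
        math.exp(-0.5 * (math.hypot(target[0] - cx, target[1] - cy) / bandwidth) ** 2)
        for cx, cy in coords
    ]
    sw = sum(w)
    sx = sum(wi * xi for wi, xi in zip(w, x))
    sy = sum(wi * yi for wi, yi in zip(w, y))
    sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    b1 = (sw * sxy - sx * sy) / (sw * sxx - sx * sx)
    return (sy - b1 * sx) / sw, b1

# Toy data (made up): a 4 x 4 grid of sites with a spatially constant relation y = 2 + 3x.
coords = [(float(i), float(j)) for i in range(4) for j in range(4)]
x = [float((3 * i + 7 * j) % 5) for i in range(4) for j in range(4)]
y = [2.0 + 3.0 * xi for xi in x]
print(gwr_fit_at((1.5, 1.5), coords, x, y, bandwidth=1.0))
```

When the true relationship does not vary in space, as in this toy data, every local fit returns the same coefficients. Spatial non-stationarity shows up as coefficients that change with the target location.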
Abstract: Wavelets are applied to detect the jumps in a heteroscedastic regression model. It is shown that the wavelet coefficients of the data have significantly large absolute values across fine scale levels near the jump points. Then a procedure is developed to estimate the jumps and jump heights. All estimators are proved to be consistent.
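The idea that wavelet coefficients are large near a jump can be illustrated with a Haar-type contrast: the mean of the next few points minus the mean of the previous few. This is only a sketch with a hypothetical function name, window length and noise-free toy signal. The paper's estimators and their consistency results are not reproduced.

```python
def haar_jump(y, h=4):
    """Return (index, estimated height) of the largest Haar-type contrast."""
    best_i, best_d = None, 0.0
    for i in range(h, len(y) - h + 1):
        # Right-window mean minus left-window mean, a scaled Haar coefficient at scale h.
        d = sum(y[i:i + h]) / h - sum(y[i - h:i]) / h
        if abs(d) > abs(best_d):
            best_i, best_d = i, d
    return best_i, best_d

# Toy signal (made up): a step of height 5 at index 50.
signal = [0.0] * 50 + [5.0] * 50
print(haar_jump(signal))
```

With noisy data, a larger window `h` averages out more noise but blurs the location of the jump.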
Abstract: A geometric framework is proposed for semiparametric nonlinear regression models based on the concept of least favorable curve, introduced by Severini and Wong (1992). The authors use this framework to derive three kinds of improved approximate confidence regions for the parameter and parameter subset in terms of curvatures. The results obtained by Hamilton et al. (1982), Hamilton (1986) and Wei (1994) are extended to semiparametric nonlinear regression models.
Funding: This paper was financially supported by NSC96-2628-E-366-004-MY2 and NSC96-2628-E-132-001-MY2
Abstract: Internal solitary wave propagation over a submarine ridge results in energy dissipation, in which the hydrodynamic interaction between the wave and the ridge affects the marine environment. This study analyzes the effects of ridge height and potential energy during wave-ridge interaction with binary and cumulative logistic regression models. In testing the global null hypothesis, all p-values are < 0.001 under three statistical tests: Likelihood Ratio, Score, and Wald. Comparing the two kinds of models, the test values obtained by the cumulative logistic regression models are better than those obtained by the binary logistic regression models. In the cumulative logistic regression model, three probability functions, p^1, p^2 and p^3, are utilized for investigating the weighted influence of factors on wave reflection. Deviance and Pearson tests are applied to check the goodness-of-fit of the proposed model. The analytical results demonstrate that both ridge height (X1) and potential energy (X2) significantly impact (p < 0.0001) the amplitude-based reflected rate; the p-values for the deviance and Pearson tests are both > 0.05 (0.2839 and 0.3438, respectively). That is, the goodness-of-fit between ridge height (X1) and potential energy (X2) can further predict parameters under the scenario of the best parsimonious model. Investigation of six predictive powers (R^2, Max-rescaled R^2, Somers' D, Gamma, Tau-a, and c) indicates that the predictive estimates of the proposed model have better predictive ability than ridge height alone, and are very similar to those of the interaction of ridge height and potential energy. It can be concluded that the goodness-of-fit and prediction ability of the cumulative logistic regression model are better than those of the binary logistic regression model.
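For orientation, a cumulative logistic regression model with two predictors is often written in the proportional-odds form below. The abstract does not state the exact parameterization or the number of response categories, so the form, the sign convention, and the symbols \alpha_j, \beta_1, \beta_2 and J are assumptions.

```latex
\[
\operatorname{logit} P(Y \le j \mid X_1, X_2)
  = \ln \frac{P(Y \le j \mid X_1, X_2)}{1 - P(Y \le j \mid X_1, X_2)}
  = \alpha_j + \beta_1 X_1 + \beta_2 X_2,
  \qquad j = 1, \dots, J - 1 .
\]
```

Under this form, each of the J - 1 cumulative probabilities has its own intercept \alpha_j while the slopes for ridge height (X_1) and potential energy (X_2) are shared. A binary logistic model is the special case J = 2.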
Funding: Under the auspices of National Natural Science Foundation of China (No. 40601073, 41101192, 41201571), Fundamental Research Funds for the Central Universities (No. 2011PY112, 2011QC041, 2011QC091), and Huazhong Agricultural University Scientific & Technological Self-innovation Foundation (No. 2011SC21)
Abstract: This study used a spatial autoregression (SAR) model and a geographically weighted regression (GWR) model to model the spatial patterns of farmland density and its temporal change in Gucheng County, Hubei Province, China in 1999 and 2009, and discussed the difference between global and local spatial autocorrelations in terms of spatial heterogeneity and non-stationarity. Results showed that strong positive spatial correlations existed in the spatial distributions of farmland density, its temporal change and the driving factors, and the coefficients of spatial autocorrelation decreased as the spatial lag distance increased. SAR models revealed the global spatial relations between dependent and independent variables, while the GWR model showed the spatially varying fitting degree and local weighting coefficients of driving factors and farmland indices (i.e., farmland density and temporal change). The GWR model follows a smooth process when constructing the farmland spatial model. The coefficients of the GWR model can show the accurate degrees of influence of different driving factors on the farmland at different geographical locations. The performance indices showed that the GWR model produced more accurate simulation results than the other models at different times, and its improvement in precision was obvious. The global and local farmland models used in this study showed different characteristics in the spatial distributions of farmland indices at different scales, which may provide a theoretical basis for protecting farmland from the influence of different driving factors.
Funding: Project (513300303) supported by the General Armament Department, China
Abstract: Compared with methods such as support vector machines, the Gaussian process (GP) has fewer parameters, a simpler model and an output with a probabilistic interpretation. Selection of the hyper-parameters is critical to the performance of a Gaussian process model. However, the commonly used algorithm has the disadvantages that the number of iteration steps is difficult to determine, the optimization result depends heavily on the initial values, and it easily falls into a local optimum. To solve this problem, a method combining the Gaussian process with a memetic algorithm was proposed. Based on this method, the memetic algorithm was used to search for the optimal hyper-parameters of the Gaussian process regression (GPR) model in the training process, forming the MA-GPR algorithm, and the model was then used to predict and test the results. When used in battle effectiveness evaluation of the marine long-range precision strike system (LPSS), the proposed MA-GPR model significantly improved the prediction accuracy compared with the conjugate gradient method and the genetic algorithm optimization process.
Funding: Under the auspices of National Natural Science Foundation of China (No. 41471140, 41771178) and Liaoning Province Outstanding Youth Program (No. LJQ2015058)
Abstract: This paper studies the relationship between accessibility and housing prices in Dalian by using an improved geographically weighted regression model together with data on house prices, traffic, remote sensing images, etc. Multi-source data improve the accuracy of the spatial differentiation that reflects the impact of traffic accessibility on house prices. The results are as follows. First, the average house price is 12 436 yuan (RMB)/m^2, and it shows a declining trend from coastal areas to inland areas. The exception is Guilin Street, which shows a local peak of house prices that decreases from the center of the street to its periphery. Second, the accessibility value is 33 minutes on average, excluding the northern and eastern fringe areas, where it is over 50 minutes. Third, the significant spatial correlation coefficient between accessibility and house prices is 0.423, and the coefficient increases in the southeastern direction. The strongest impact of accessibility on house prices is on the southeastern coast, and can be seen in the Lehua, Yingke, and Hushan communities, while the weakest impact is in the northwestern fringe, and can be seen in the Yingchengzi, Xixiaomo, and Daheishi community areas.