In clinical research, subgroup analysis can help identify patient groups that respond better or worse to specific treatments and can improve therapeutic effect and safety, which makes it highly significant for precision medicine. This article considers subgroup analysis methods for longitudinal data containing multiple covariates and biomarkers. We define subgroups by whether a linear combination of these biomarkers exceeds a predetermined threshold, and assess the heterogeneity of treatment effects across subgroups using the interaction between subgroup membership and exposure variables. Quantile regression is used to better characterize the whole distribution of the response variable, and sparsity penalties are imposed to select covariates and biomarkers. The effectiveness of the proposed methodology for both variable selection and parameter estimation is verified through simulation studies. Finally, we demonstrate the method by analyzing data from the PA.3 trial, further illustrating its practicality.
As core components of inertial navigation systems, fiber optic gyroscopes (FOGs), with technical advantages such as low power consumption, long lifespan, fast startup, and flexible structural design, are widely used in aerospace, unmanned driving, and other fields. However, because optical devices are temperature sensitive, environmental temperature causes errors in FOGs and greatly limits their output accuracy. This work studies machine-learning-based temperature error compensation techniques for FOGs. Specifically, it focuses on compensating for the bias errors generated in the fiber ring by the Shupe effect. It proposes a composite model based on k-means clustering, support vector regression, and particle swarm optimization, and significantly reduces redundancy within the samples by adopting interval sequence samples. Root mean square error (RMSE), mean absolute error (MAE), bias stability, and Allan variance are selected to evaluate the model's performance and compensation effectiveness. The approach enhances the consistency between data and models across different temperature ranges and temperature gradients, improving the bias stability of the FOG from 0.022 °/h to 0.006 °/h. Compared with existing methods that use a single machine learning model, the proposed method raises the improvement in bias stability of the compensated FOG from 57.11% to 71.98%, and raises the suppression of the rate ramp noise coefficient from 2.29% to 14.83%. This work improves the accuracy of the FOG after compensation and provides theoretical guidance and technical references for sensor error compensation in other fields.
In many countries the results of mass appraisal are used as a basis for calculating real estate tax. Therefore, regardless of the methods used to calculate it, the resulting value should be as close as possible to the market value of the real estate, to maintain a balance of interests between the state and the rights holders. In practice, this condition is not always met: first, the quality of market data is often very low, and second, some markets are characterized by low activity, which is expressed in a deficit of information on asking prices. The aim of the work is the ecological valuation of land use: how regression-based mass appraisal can inform ecological conservation, land degradation, and sustainable land management. Four multiple regression models were constructed for an AI-generated map of land plots for recreational use in St. Petersburg (Russia) with different volumes of market information (32, 30, 20 and 15 units of market information with four price-forming factors). Analysis of model quality revealed that the best result is shown by the model built on the maximum sample size, followed by the model based on 15 analogs, which shows that a larger number of analog objects does not always give better results, since the more analog objects there are.
This work proposes a distributed Kalman filtering (KF) algorithm that tracks a time-varying unknown signal process for a stochastic regression model over network systems in a cooperative way. We provide a stability analysis of the proposed distributed KF algorithm without assuming independent and stationary signals, which implies that the theoretical results can be applied to stochastic feedback systems. The main difficulty of the stability analysis lies in analyzing the properties of the product of non-independent and non-stationary random matrices involved in the error equation. We employ analysis techniques such as stochastic Lyapunov functions, stability theory of stochastic systems, and algebraic graph theory to deal with this issue. The stochastic spatio-temporal cooperative information condition shows the cooperative property of multiple sensors: even though no local sensor can track the time-varying unknown signal on its own, the distributed KF algorithm can accomplish the filtering task cooperatively. Finally, we illustrate the property of the proposed distributed KF algorithm with a simulation example.
Under-fitting problems usually occur in regression models for dam safety monitoring. To overcome the local convergence of the regression, a genetic algorithm (GA) was proposed that uses real parameter coding, a ranking selection operator, an arithmetical crossover operator and a uniform mutation operator, with the least-squares error between the observed and computed values as its fitness function. An elitist strategy was used to improve the speed of convergence. The modified genetic algorithm was then applied to reassess the coefficients of the regression model, and a genetic regression model was set up. As an example, a slotted gravity dam in Northeast China was introduced. The computational results show that the genetic regression model can solve the under-fitting problems well.
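The operators named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the linear model form, the function name `fit_ga`, the population size, mutation rate, coefficient bounds and rank weights are all assumptions chosen only for illustration.

```python
import random


def fit_ga(xs, ys, n_coef, pop_size=40, generations=200,
           bounds=(-10.0, 10.0), p_mut=0.1, seed=0):
    """Fit linear-model coefficients with a small real-coded GA (illustrative)."""
    rng = random.Random(seed)
    lo, hi = bounds

    def sse(coef):
        # Fitness: least-squares error between observed and computed values.
        return sum((y - sum(c * xi for c, xi in zip(coef, x))) ** 2
                   for x, y in zip(xs, ys))

    # Real parameter coding: each individual is a list of floats.
    pop = [[rng.uniform(lo, hi) for _ in range(n_coef)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=sse)
        history.append(sse(pop[0]))
        # Ranking selection: selection weight depends only on rank (assumed linear).
        weights = [pop_size - rank for rank in range(pop_size)]
        children = [pop[0][:]]  # elitist strategy: keep the best individual unchanged
        while len(children) < pop_size:
            a, b = rng.choices(pop, weights=weights, k=2)
            lam = rng.random()
            # Arithmetical crossover: convex combination of the two parents.
            child = [lam * u + (1.0 - lam) * v for u, v in zip(a, b)]
            # Uniform mutation: redraw a gene uniformly within the bounds.
            child = [rng.uniform(lo, hi) if rng.random() < p_mut else g
                     for g in child]
            children.append(child)
        pop = children
    best = min(pop, key=sse)
    history.append(sse(best))
    return best, history


# Hypothetical data: y = 2 + 3*t, with features (1, t).
xs = [(1.0, float(t)) for t in range(6)]
ys = [2.0 + 3.0 * t for t in range(6)]
best, history = fit_ga(xs, ys, n_coef=2)
```

Because the best individual is copied unchanged into each new generation, the best error recorded in `history` never increases, which is the sense in which elitism protects convergence in this sketch.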
A fuzzy observations-based radial basis function neural network (FORBFNN) is presented for modeling nonlinear systems in which the observations of the response are imprecise but can be represented as fuzzy membership functions. In the FORBFNN model, the weight coefficients of nodes in the hidden layer are identified using the fuzzy expectation-maximization (EM) algorithm, whereas the optimal number of these nodes and the centers and widths of the radial basis functions are constructed automatically by a data-driven method. The method starts with an initial node and then adds new nodes to the hidden layer according to a set of rules, continuing until the model meets the preset requirements. The method considers both the accuracy and the complexity of the model. Numerical simulation results show that the modeling method is effective and that the established model has high prediction accuracy.
The conventional single-model strategy may be ill-suited because of the multiplicity of operation phases and system uncertainty. A novel global-local discriminant analysis (GLDA) based Gaussian process regression (GPR) approach is developed for the quality prediction of nonlinear and multiphase batch processes. After the collected data are preprocessed through batchwise unfolding, a hidden Markov model (HMM) is applied to identify the different operation phases. A GLDA algorithm is also presented to extract the process variables most highly correlated with the quality variables, decreasing the complexity of modeling. Multiple local GPR models are then built in the reduced-dimensional space for all the identified operation phases. Furthermore, HMM-based state estimation is used to classify each measurement sample of a test batch into the corresponding phase by maximum likelihood estimation, so that the local GPR model for that specific phase is selected for online prediction. The effectiveness of the proposed prediction approach is demonstrated on the multiphase penicillin fermentation process. The comparison results show that the proposed GLDA-GPR approach is superior to the regular GPR model and the GPR based on HMM (HMM-GPR) model.
In this paper, based on the theory of parameter estimation, we give a selection method that we consider reasonable in the sense that it leads to parameter estimates with good properties. We also offer a method for calculating the selection statistic and an applied example.
Rainfall is an important factor in estimating the event mean concentration (EMC), which is used to quantify the washed-off pollutant concentrations from non-point sources (NPSs). Pollutant loads can also be calculated using rainfall, catchment area and runoff coefficient. In this study, runoff quantity and quality data gathered during 28 months of monitoring at road and parking lot sites in Korea were evaluated using multiple linear regression (MLR) to develop equations for estimating pollutant loads and EMCs as a function of rainfall variables. The results revealed that total event rainfall and average rainfall intensity are possible predictors of pollutant loads. Overall, the models reflect the high uncertainty of NPSs; accurate estimates of EMCs and loads may require water quality sampling, or long-term monitoring may be needed to gather more data for developing estimation models.
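The statement that loads can be calculated from rainfall, catchment area and runoff coefficient can be made concrete with a short sketch. The abstract gives no formula, so the relation used here (runoff volume = runoff coefficient × rainfall depth × area, load = EMC × runoff volume) is an assumption based on the common simple-method form, and the function names and input values are hypothetical.

```python
def runoff_volume_m3(rain_mm, area_m2, runoff_coef):
    """Event runoff volume, assuming volume = C * rainfall depth * catchment area."""
    return runoff_coef * (rain_mm / 1000.0) * area_m2


def pollutant_load_kg(emc_mg_per_l, rain_mm, area_m2, runoff_coef):
    """Event pollutant load, assuming load = EMC * runoff volume."""
    volume = runoff_volume_m3(rain_mm, area_m2, runoff_coef)
    # 1 mg/L equals 1 g/m^3, so EMC * volume is in grams; divide by 1000 for kg.
    return emc_mg_per_l * volume / 1000.0


# Hypothetical event: 10 mm of rain on a 1000 m^2 parking lot, C = 0.9, EMC = 100 mg/L.
load = pollutant_load_kg(emc_mg_per_l=100.0, rain_mm=10.0,
                         area_m2=1000.0, runoff_coef=0.9)
print(f"{load:.2f} kg")
```

For these illustrative inputs the runoff volume is 9 m^3 and the load is 0.9 kg. A regression such as the MLR described above would instead estimate the load or EMC directly from rainfall variables.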
Wastewater treatment is one of the critical issues faced by water utilities and has received increasing attention recently. This study investigated energy consumption modeling in biochemical wastewater treatment via a general and robust approach based on Bayesian semi-parametric quantile regression. The dataset was derived from a municipal wastewater treatment plant, where the energy consumption per unit of chemical oxygen demand (COD) reduction was the response variable of interest. The proposed approach gave a comprehensive picture of the regression relationships between energy consumption and the truly influential factors at lower, median and higher energy consumption levels, and supported specific energy-saving proposals for the different cases. First, the lower level of energy consumption was closely associated with the temperature of the influent wastewater, and chroma-rich wastewater also proved helpful for energy saving. Second, at the median energy consumption level, COD-rich wastewater played a decisive role in reducing energy consumption, while higher quality of treated water made treatment slightly more energy-intensive. Third, the higher level of energy consumption was most likely attributable to relatively high wastewater temperature and total nitrogen (TN)-rich wastewater, and both factors should preferably be avoided to alleviate the burden of energy consumption. The study provides an efficient approach to controlling the energy consumption of wastewater treatment from the perspective of statistical regression modeling, and offers valuable suggestions for future energy saving.
The construction of the background value in the original multi-variable grey model (MGM(1,m)) is improved by addressing the source of its construction errors. The MGM(1,m) with optimized background value is used to eliminate the random fluctuations or errors in the observational data of all variables, and a combined prediction model with multiple linear regression is established to improve simulation and prediction accuracy. Finally, a combined model of the MGM(1,2) with optimized background value and binary linear regression is constructed for an example. The results show that the model performs well in simulation and prediction.
In this paper we apply nonlinear time series analysis to small-time scale traffic measurement data. A prediction-based method is used to determine the embedding dimension of the traffic data. Based on the reconstructed phase space, the local support vector machine prediction method is used to predict the traffic measurement data, and a BIC-based neighbouring point selection method is used to choose the number of nearest neighbouring points for the local support vector machine regression model. The experimental results show that the local support vector machine prediction method with optimized neighbouring points can effectively predict small-time scale traffic measurement data and can reproduce the statistical features of real traffic measurements.
This paper presents a semiparametric adjustment method suitable for general cases. Assuming that the regularizer matrix is positive definite, the calculation method is discussed and the corresponding formulae are presented. Finally, a simulated adjustment problem is constructed to illustrate the method. The results from the semiparametric model and the G-M model are compared. They demonstrate that model errors or systematic errors in the observations can be detected correctly with the semiparametric estimation method.
The Bailongjiang watershed in southern Gansu province is one of the most landslide-prone regions in China, characterized by a very high frequency of landslide occurrence. To predict landslide occurrence, a comprehensive landslide susceptibility map is required, which may help significantly in reducing loss of property and human life. In this study, an integrated model of the information value method and logistic regression is proposed that exploits the merits of both methods and overcomes their weaknesses, which may enhance the precision and accuracy of landslide susceptibility assessment. A detailed and reliable landslide inventory with 1587 landslides was prepared and randomly divided into two groups: (i) a training dataset and (ii) a testing dataset. Eight landslide conditioning factors were selected for landslide susceptibility mapping: lithology, slope gradient, aspect, elevation, distance to drainages, distance to faults, distance to roads and vegetation coverage. The resulting landslide susceptibility maps were validated by success rate and prediction rate curves. The validation results show that the success rate and the prediction rate of the integrated model are 81.7% and 84.6%, respectively, which indicates that the proposed integrated method can reliably produce an accurate landslide susceptibility map and that the results may be used for landslide management and mitigation.
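The information value step can be sketched as follows. The abstract does not give the formula, so this assumes the commonly used form I_i = ln((N_i / N) / (S_i / S)), where N_i and S_i are the landslide count and area of factor class i, and N and S are the totals. The function name, class names and numbers are hypothetical.

```python
import math


def information_values(landslide_counts, class_areas):
    """Information value per factor class: ln((N_i / N) / (S_i / S))."""
    n_total = sum(landslide_counts.get(k, 0) for k in class_areas)
    s_total = sum(class_areas.values())
    values = {}
    for name, area in class_areas.items():
        n_i = landslide_counts.get(name, 0)
        if n_i == 0:
            # No landslides in this class: the logarithm tends to minus infinity.
            values[name] = float("-inf")
        else:
            values[name] = math.log((n_i / n_total) / (area / s_total))
    return values


# Hypothetical slope-gradient classes: landslide counts and areas in km^2.
counts = {"steep": 30, "gentle": 10}
areas = {"steep": 25.0, "gentle": 75.0}
iv = information_values(counts, areas)
```

A positive value marks a class where landslides are over-represented relative to its area, a negative value the opposite. In an integrated model of the kind described above, such class values could replace categorical factor classes as numeric inputs to the logistic regression, although the abstract does not state how the two methods are combined.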
Background: Evaluating the growth performance of pigs in real time is laborious and expensive, so mathematical models based on easily accessible variables have been developed. Multiple regression (MR) is the most widely used tool for building prediction models in swine nutrition, while artificial neural network (ANN) models are reported to predict more accurately than MR models. Therefore, the potential of ANN models for predicting the growth performance of pigs was evaluated and compared with MR models in this study. Results: Body weight (BW), net energy (NE) intake, standardized ileal digestible lysine (SID Lys) intake, and their quadratic terms were selected from 10 candidate variables as input variables to predict ADG and F/G. In the training phase, MR models showed high accuracy in both ADG and F/G prediction (R^2_ADG = 0.929, R^2_F/G = 0.886), while ANN models with 4 and 6 neurons and a radial basis activation function yielded the best performance in ADG and F/G prediction (R^2_ADG = 0.964, R^2_F/G = 0.932). In the testing phase, these ANN models showed better accuracy than the MR models in ADG prediction (CCC: 0.976 vs. 0.861, R^2: 0.951 vs. 0.584) and F/G prediction (CCC: 0.952 vs. 0.900, R^2: 0.905 vs. 0.821). Over-fitting occurred in the MR models but not in the ANN models. On validation data from the animal trial, ANN models were superior to MR models in both ADG and F/G prediction (P < 0.01). Moreover, the growth stage has a significant effect on the prediction accuracy of the models. Conclusion: Body weight, NE intake and SID Lys intake can be used as input variables to predict the growth performance of growing-finishing pigs, and trained ANN models are more flexible and accurate than MR models. It is therefore promising to use ANN models in related swine nutrition studies in the future.
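The abstract reports CCC alongside R^2 without expanding the abbreviation. Assuming it denotes Lin's concordance correlation coefficient, a minimal sketch of how such a value is computed from observed and predicted values is given below. The function name and the data are hypothetical.

```python
def concordance_cc(observed, predicted):
    """Lin's concordance correlation coefficient (population moments)."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    var_o = sum((o - mean_o) ** 2 for o in observed) / n
    var_p = sum((p - mean_p) ** 2 for p in predicted) / n
    cov = sum((o - mean_o) * (p - mean_p)
              for o, p in zip(observed, predicted)) / n
    # Penalizes both scatter (through the covariance) and systematic shift
    # (through the squared difference of the means).
    return 2.0 * cov / (var_o + var_p + (mean_o - mean_p) ** 2)


# Hypothetical predictions that are perfectly correlated but biased by +1.
obs = [1.0, 2.0, 3.0]
pred = [2.0, 3.0, 4.0]
print(concordance_cc(obs, pred))
```

In this example the Pearson correlation is 1, yet the coefficient is 4/7 because of the constant bias. This is why a concordance measure can separate models more sharply than a correlation-based measure when predictions are shifted or rescaled.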
In this paper, a class of functional-coefficient regression models is proposed and an estimation procedure based on locally weighted least squares is suggested. This class of models, with the proposed estimation method, is a powerful means for exploratory data analysis.
Wavelets are applied to detect the jumps in a heteroscedastic regression model. It is shown that the wavelet coefficients of the data have significantly large absolute values across fine scale levels near the jump points. Then a procedure is developed to estimate the jumps and jump heights. All estimators are proved to be consistent.
The radial basis function (RBF) network emerged as a variant of the artificial neural network. The generalized regression neural network (GRNN) is one type of RBF network, and its principal advantages are that it learns quickly and converges rapidly to the optimal regression surface with a large number of data sets. Hyperspectral reflectance (350 to 2500 nm) data were recorded at two different rice sites in two experiment fields with two cultivars, three nitrogen treatments and one plant density (45 plants m^-2). A stepwise multivariable regression model (SMR) and RBF were compared in their ability to predict the leaf area index (LAI) and green leaf chlorophyll density (GLCD) of rice based on reflectance (R) and three transformations of it: the first derivative reflectance (D1), the second derivative reflectance (D2) and the log-transformed reflectance (LOG). GRNN based on D1 was the best model for the prediction of rice LAI and GLCD. The relationships between the different transformations of reflectance and rice parameters could be further improved when RBF was employed. Owing to its strong capacity for nonlinear mapping and good robustness, GRNN could maximize the sensitivity to chlorophyll content using D1. It is concluded that RBF may provide a useful exploratory and predictive tool for the estimation of rice biophysical parameters.
A geometric framework is proposed for semiparametric nonlinear regression models based on the concept of the least favorable curve, introduced by Severini and Wong (1992). The authors use this framework to derive three kinds of improved approximate confidence regions for the parameter and parameter subset in terms of curvatures. The results obtained by Hamilton et al. (1982), Hamilton (1986) and Wei (1994) are extended to semiparametric nonlinear regression models.
Internal solitary wave propagation over a submarine ridge results in energy dissipation, in which the hydrodynamic interaction between the wave and the ridge affects the marine environment. This study analyzes the effects of ridge height and potential energy during wave-ridge interaction with binary and cumulative logistic regression models. In testing the global null hypothesis, all p-values are < 0.001 under three statistical methods: Likelihood Ratio, Score, and Wald. Comparing the two kinds of models, the test values obtained by the cumulative logistic regression models are better than those obtained by the binary logistic regression models. Although this study employed the cumulative logistic regression model, three probability functions p^1, p^2 and p^3 are used to investigate the weighted influence of factors on wave reflection. Deviance and Pearson tests are applied to check the goodness-of-fit of the proposed model. The analytical results demonstrate that both ridge height (X1) and potential energy (X2) significantly affect (p < 0.0001) the amplitude-based reflected rate; the p-values for the deviance and Pearson tests are both > 0.05 (0.2839 and 0.3438, respectively). That is, the goodness-of-fit between ridge height (X1) and potential energy (X2) can further predict parameters under the scenario of the best parsimonious model. Investigation of six measures of predictive power (R^2, Max-rescaled R^2, Somers' D, Gamma, Tau-a, and c) indicates that these predictive estimates of the proposed model have better predictive ability than ridge height alone, and are very similar to the interaction of ridge height and potential energy. It can be concluded that the goodness-of-fit and prediction ability of the cumulative logistic regression model are better than those of the binary logistic regression model.
Funding (subgroup analysis of longitudinal data): Supported by the Natural Science Foundation of Fujian Province (2022J011177, 2024J01903) and the Key Project of Fujian Provincial Education Department (JZ230054).
Funding (FOG temperature error compensation): Supported by the National Natural Science Foundation of China (62375013).
Funding (regression-based mass appraisal of land plots): Financed as part of the project "Development of a methodology for instrumental base formation for analysis and modeling of the spatial socio-economic development of systems based on internal reserves in the context of digitalization" (FSEG-2023-0008), funded by the Russian Science Foundation (Agreement 23-41-10001, https://doi.org/https://rscf.ru/project/23-41-10001/).
Funding (distributed Kalman filtering): Supported in part by the Sichuan Science and Technology Program under Grant No. 2025ZNSFSC151, in part by the Strategic Priority Research Program of Chinese Academy of Sciences under Grant No. XDA27030201, the Natural Science Foundation of China under Grant No. U21B6001, and in part by the Natural Science Foundation of Tianjin under Grant No. 24JCQNJC01930.
Funding: The National Natural Science Foundation of China (No. 51106025, 51106027, 51036002); the Specialized Research Fund for the Doctoral Program of Higher Education (No. 20130092110061); the Youth Foundation of Nanjing Institute of Technology (No. QKJA201303).
Abstract: A fuzzy observations-based radial basis function neural network (FORBFNN) is presented for modeling nonlinear systems in which the observations of the response are imprecise but can be represented as fuzzy membership functions. In the FORBFNN model, the weight coefficients of the nodes in the hidden layer are identified using the fuzzy expectation-maximization (EM) algorithm, whereas the optimal number of these nodes, as well as the centers and widths of the radial basis functions, are constructed automatically by a data-driven method. Specifically, the method starts with an initial node and then adds new nodes to the hidden layer according to some rules. This procedure continues until the model meets the preset requirements. The method considers both the accuracy and the complexity of the model. Numerical simulation results show that the modeling method is effective and that the established model has high prediction accuracy.
Funding: The Fundamental Research Funds for the Central Universities (No. JUDCF12027, JUSRP51323B); the Scientific Innovation Research of College Graduates in Jiangsu Province (No. CXLX12_0734).
Abstract: The conventional single-model strategy may be ill-suited to batch processes because of the multiplicity of operation phases and system uncertainty. A novel global-local discriminant analysis (GLDA) based Gaussian process regression (GPR) approach is developed for the quality prediction of nonlinear and multiphase batch processes. After the collected data are preprocessed through batchwise unfolding, a hidden Markov model (HMM) is applied to identify the different operation phases. A GLDA algorithm is also presented to extract the process variables that are highly correlated with the quality variables, which decreases the complexity of modeling. Multiple local GPR models are then built in the reduced-dimensional space for all the identified operation phases. Furthermore, HMM-based state estimation is used to classify each measurement sample of a test batch into the corresponding phase by maximum likelihood estimation. The local GPR model for that specific phase is then selected for online prediction. The effectiveness of the proposed prediction approach is demonstrated on the multiphase penicillin fermentation process. The comparison results show that the proposed GLDA-GPR approach is superior to the regular GPR model and to the HMM-based GPR (HMM-GPR) model.
Funding: Supported by the Natural Science Foundation of Anhui Education Committee.
Abstract: Based on the theory of parameter estimation, this paper gives a selection method that we consider very reasonable in the sense that the resulting parameter estimates have good properties. Moreover, we offer a method for calculating the selection statistic and an applied example.
Funding: Provided by the Korean Ministry of Environment and the Eco Star Project.
Abstract: Rainfall is an important factor in estimating the event mean concentration (EMC), which is used to quantify the washed-off pollutant concentrations from non-point sources (NPSs). Pollutant loads can also be calculated from rainfall, catchment area and runoff coefficient. In this study, runoff quantity and quality data gathered during 28 months of monitoring at road and parking lot sites in Korea were evaluated using multiple linear regression (MLR) to develop equations for estimating pollutant loads and EMCs as functions of rainfall variables. The results revealed that total event rainfall and average rainfall intensity are possible predictors of pollutant loads. Overall, the models reflect the high uncertainties of NPSs; accurate estimates of EMCs and loads may require water quality sampling, or long-term monitoring may be needed to gather more data for the development of estimation models.
Funding: Supported by the National Natural Science Foundation of China (Nos. 51478025, 11701023, 71420107025).
Abstract: Wastewater treatment is one of the critical issues faced by water utilities and has received increasing attention recently. This study investigated energy consumption modeling in biochemical wastewater treatment using a general and robust approach based on Bayesian semi-parametric quantile regression. The dataset was derived from a municipal wastewater treatment plant, where the energy consumption per unit of chemical oxygen demand (COD) reduction was the response variable of interest. With the proposed approach, a comprehensive regression picture of energy consumption and its truly influential factors was obtained, i.e., the regression relationships at lower, median and higher energy consumption levels were characterized separately. Specific proposals for energy saving in the different cases were also derived. First, the lower level of energy consumption was closely associated with the temperature of the influent wastewater, and chroma-rich wastewater also proved helpful for energy saving. Second, at the median energy consumption level, COD-rich wastewater played a determining role in reducing energy consumption, while a higher quality of treated water made treatment slightly more energy intensive. Third, the higher level of energy consumption was most likely attributable to relatively high wastewater temperature and total nitrogen (TN)-rich wastewater, and both factors should preferably be avoided to alleviate the burden of energy consumption. The study provides an efficient approach to controlling the energy consumption of wastewater treatment from the perspective of statistical regression modeling and offers valuable suggestions for future energy saving.
Funding: Supported by the National Natural Science Foundation of China (71071077), the Ministry of Education Key Project of National Educational Science Planning (DFA090215), the China Postdoctoral Science Foundation (20100481137), and the Funding of Jiangsu Innovation Program for Graduate Education (CXZZ11-0226).
Abstract: The construction method of the background value in the original multi-variable grey model (MGM(1,m)) is improved by addressing the source of its construction errors. The MGM(1,m) with optimized background value is used to eliminate the random fluctuations or errors in the observational data of all variables, and a combined prediction model that incorporates multiple linear regression is established to improve simulation and prediction accuracy. Finally, a combined model of the MGM(1,2) with optimized background value and binary linear regression is constructed for an example. The results show that the model performs well in simulation and prediction.
Funding: Project supported by the National Natural Science Foundation of China (Grant No. 60573065), the Natural Science Foundation of Shandong Province, China (Grant No. Y2007G33), and the Key Subject Research Foundation of Shandong Province, China (Grant No. XTD0708).
Abstract: In this paper we apply nonlinear time series analysis to small-time-scale traffic measurement data. A prediction-based method is used to determine the embedding dimension of the traffic data. Based on the reconstructed phase space, the local support vector machine prediction method is used to predict the traffic measurement data, and a BIC-based neighbouring point selection method is used to choose the number of nearest neighbouring points for the local support vector machine regression model. The experimental results show that the local support vector machine prediction method with optimized neighbouring points can effectively predict small-time-scale traffic measurement data and can reproduce the statistical features of real traffic measurements.
Abstract: This paper presents a semiparametric adjustment method suitable for general cases. Assuming that the regularizer matrix is positive definite, the calculation method is discussed and the corresponding formulae are presented. Finally, a simulated adjustment problem is constructed to explain the method given in this paper. The results from the semiparametric model and the G_M model are compared. The results demonstrate that the model errors or the systematic errors of the observations can be detected correctly with the semiparametric estimation method.
Funding: Supported by the Project of the 12th Five-year National Sci-Tech Support Plan of China (2011BAK12B09) and the China Special Project of Basic Work of Science and Technology (2011FY110100-2).
Abstract: The Bailongjiang watershed in southern Gansu province is one of the most landslide-prone regions in China, characterized by a very high frequency of landslide occurrence. To predict landslide occurrence, a comprehensive landslide susceptibility map is required, which may help significantly in reducing the loss of property and human life. In this study, an integrated model of the information value method and logistic regression is proposed that exploits the merits of both methods while overcoming their weaknesses, which may enhance the precision and accuracy of landslide susceptibility assessment. A detailed and reliable landslide inventory with 1587 landslides was prepared and randomly divided into two groups: (i) a training dataset and (ii) a testing dataset. Eight distinct landslide conditioning factors, namely lithology, slope gradient, aspect, elevation, distance to drainages, distance to faults, distance to roads and vegetation coverage, were selected for landslide susceptibility mapping. The resulting landslide susceptibility maps were validated using success rate and prediction rate curves. The validation results show that the success rate and the prediction rate of the integrated model are 81.7% and 84.6%, respectively, which indicates that the proposed integrated method can reliably produce an accurate landslide susceptibility map and that the results may be used for landslide management and mitigation.
Funding: Funded by the National Natural Science Foundation of China (32072764, 31702121), the 2115 Talent Development Program of China Agricultural University, and the National Key Research and Development Program of China (2019YFD1002605).
Abstract: Background: Evaluating the growth performance of pigs in real time is laborious and expensive, so mathematical models based on easily accessible variables have been developed. Multiple regression (MR) is the most widely used tool for building prediction models in swine nutrition, while artificial neural network (ANN) models are reported to be more accurate than MR models in prediction performance. Therefore, the potential of ANN models for predicting the growth performance of pigs was evaluated and compared with MR models in this study. Results: Body weight (BW), net energy (NE) intake, standardized ileal digestible lysine (SID Lys) intake, and their quadratic terms were selected from 10 candidate variables as input variables to predict ADG and F/G. In the training phase, MR models showed high accuracy in both ADG and F/G prediction (R² = 0.929 for ADG, R² = 0.886 for F/G), while ANN models with 4 and 6 neurons and a radial basis activation function yielded the best performance in ADG and F/G prediction (R² = 0.964 for ADG, R² = 0.932 for F/G). In the testing phase, these ANN models showed better accuracy than the MR models in ADG prediction (CCC: 0.976 vs. 0.861, R²: 0.951 vs. 0.584) and F/G prediction (CCC: 0.952 vs. 0.900, R²: 0.905 vs. 0.821). Meanwhile, over-fitting occurred in the MR models but not in the ANN models. On validation data from the animal trial, ANN models were superior to MR models in both ADG and F/G prediction (P < 0.01). Moreover, the growth stage has a significant effect on the prediction accuracy of the models. Conclusion: Body weight, NE intake and SID Lys intake can be used as input variables to predict the growth performance of growing-finishing pigs, and trained ANN models are more flexible and accurate than MR models. Therefore, ANN models are promising for related swine nutrition studies in the future.
Abstract: In this paper, a class of functional-coefficient regression models is proposed and an estimation procedure based on locally weighted least squares is suggested. This class of models, with the proposed estimation method, is a powerful means for exploratory data analysis.
Abstract: Wavelets are applied to detect the jumps in a heteroscedastic regression model. It is shown that the wavelet coefficients of the data have significantly large absolute values across fine scale levels near the jump points. Then a procedure is developed to estimate the jumps and jump heights. All estimators are proved to be consistent.
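The idea that detail coefficients become large near a jump can be illustrated with a Haar-type difference of local means. This is only a sketch under stated assumptions: the abstract does not say which wavelet or estimator the authors use, the noise is omitted, and the names `haar_detail`, `estimate_jump` and the window `width` are hypothetical.

```python
def haar_detail(signal, width):
    """Haar-type detail at each index i: mean of the next `width` samples
    minus mean of the previous `width` samples (0.0 where the window does not fit)."""
    n = len(signal)
    out = [0.0] * n
    for i in range(width, n - width + 1):
        left = sum(signal[i - width:i]) / width
        right = sum(signal[i:i + width]) / width
        out[i] = right - left
    return out

def estimate_jump(signal, width=4):
    """Return (location, height) of the largest absolute Haar-type detail."""
    detail = haar_detail(signal, width)
    loc = max(range(len(signal)), key=lambda i: abs(detail[i]))
    return loc, detail[loc]

# A noise-free step: the level changes between index 49 and index 50.
step = [0.0] * 50 + [2.0] * 50
location, height = estimate_jump(step)
```

Away from the jump the two local means agree and the detail is near zero, while at the jump the difference of means approximates the jump height. With noisy data, a larger `width` averages out more noise at the cost of coarser localization.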
Funding: Project supported by the National Natural Science Foundation of China (No. 40571115) and the National High Technology Research and Development Program (863 Program) of China (Nos. 2006AA120101 and 2007AA10Z205).
Abstract: The radial basis function (RBF) network emerged as a variant of the artificial neural network. The generalized regression neural network (GRNN) is one type of RBF network, and its principal advantages are that it learns quickly and converges rapidly to the optimal regression surface with a large number of data sets. Hyperspectral reflectance (350 to 2500 nm) data were recorded at two different rice sites in two experiment fields with two cultivars, three nitrogen treatments and one plant density (45 plants m^-2). A stepwise multivariable regression (SMR) model and RBF were compared in terms of their ability to predict the leaf area index (LAI) and green leaf chlorophyll density (GLCD) of rice based on reflectance (R) and three transformations of it: the first derivative reflectance (D1), the second derivative reflectance (D2) and the log-transformed reflectance (LOG). GRNN based on D1 was the best model for the prediction of rice LAI and GLCD. The relationships between the different transformations of reflectance and the rice parameters could be further improved when RBF was employed. Owing to its strong capacity for nonlinear mapping and good robustness, GRNN could maximize the sensitivity to chlorophyll content using D1. It is concluded that RBF may provide a useful exploratory and predictive tool for the estimation of rice biophysical parameters.
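In its standard formulation, a GRNN prediction is a Gaussian-kernel weighted average of the training targets, so it needs no iterative training. The sketch below illustrates that formula only. It is not the authors' model: the function name `grnn_predict`, the smoothing width `sigma` and the toy data are assumptions made for the example, and no spectral data are involved.

```python
import math

def grnn_predict(train_x, train_y, query, sigma=0.1):
    """GRNN output for one query point: sum_i w_i * y_i / sum_i w_i,
    with w_i = exp(-||query - x_i||^2 / (2 * sigma^2))."""
    weights = []
    for x in train_x:
        sq_dist = sum((q - t) ** 2 for q, t in zip(query, x))
        weights.append(math.exp(-sq_dist / (2.0 * sigma ** 2)))
    total = sum(weights)  # can underflow to 0.0 if the query is far from all samples
    return sum(w * y for w, y in zip(weights, train_y)) / total

# Toy one-dimensional data (hypothetical values).
train_x = [[0.0], [1.0], [2.0]]
train_y = [0.0, 1.0, 4.0]
near = grnn_predict(train_x, train_y, [1.0], sigma=0.1)    # dominated by the sample at x = 1
smooth = grnn_predict(train_x, train_y, [1.0], sigma=1e6)  # approaches the mean of train_y
```

The single parameter `sigma` controls the trade-off: a small value reproduces the nearest training target, while a large value smooths the prediction toward the overall mean of the targets.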
Abstract: A geometric framework is proposed for semiparametric nonlinear regression models based on the concept of the least favorable curve, introduced by Severini and Wong (1992). The authors use this framework to derive three kinds of improved approximate confidence regions for the parameter and parameter subsets in terms of curvatures. The results obtained by Hamilton et al. (1982), Hamilton (1986) and Wei (1994) are extended to semiparametric nonlinear regression models.
Funding: This paper was financially supported by NSC96-2628-E-366-004-MY2 and NSC96-2628-E-132-001-MY2.
Abstract: Internal solitary wave propagation over a submarine ridge results in energy dissipation, in which the hydrodynamic interaction between the wave and the ridge affects the marine environment. This study analyzes the effects of ridge height and potential energy during wave-ridge interaction with binary and cumulative logistic regression models. In testing the global null hypothesis, all three statistics (likelihood ratio, score, and Wald) give p < 0.001. When the two kinds of models are compared, the test values obtained with cumulative logistic regression models are better than those obtained with binary logistic regression models. Within the cumulative logistic regression model, three probability functions p^1, p^2 and p^3 are used to investigate the weighted influence of the factors on wave reflection. Deviance and Pearson tests are applied to check the goodness-of-fit of the proposed model. The analytical results demonstrate that both ridge height (X1) and potential energy (X2) significantly affect (p < 0.0001) the amplitude-based reflected rate; the p-values for the deviance and Pearson tests are both > 0.05 (0.2839 and 0.3438, respectively). That is, the model with ridge height (X1) and potential energy (X2) fits adequately and can be used for prediction as the best parsimonious model. Investigation of six measures of predictive power (R^2, max-rescaled R^2, Somers' D, Gamma, Tau-a, and c) indicates that the proposed model has better predictive ability than ridge height alone and performs very similarly to the model with the interaction of ridge height and potential energy. It can be concluded that the goodness-of-fit and predictive ability of the cumulative logistic regression model are better than those of the binary logistic regression model.