We introduce a new wavelet-based procedure for detecting outliers in financial discrete time series. The procedure focuses on the analysis of residuals obtained from a model fit and is applied to Generalized Autoregressive Conditional Heteroskedasticity (GARCH)-like models, but it is not limited to these models. We apply the Maximal-Overlap Discrete Wavelet Transform (MODWT) to the residuals and compare their wavelet coefficients against quantile thresholds to detect outliers. Our methodology has several advantages over existing methods that make use of the standard Discrete Wavelet Transform (DWT): the series sample size does not need to be a power of 2, and the transform can use any wavelet filter and be run up to the desired level. Simulated wavelet quantiles from a Normal and a Student t-distribution are used as thresholds for the maximum of the absolute value of the wavelet coefficients. The performance of the procedure is illustrated on two real series: the closing price of the Saudi stock market and the S&P 500 index. The efficiency of the proposed method is demonstrated, and the method can be considered an important addition to the existing methods.
Recurrent event gap times data frequently arise in biomedical studies, and often more than one type of event is of interest. To evaluate the effects of covariates on the marginal recurrent event hazard functions, there exist two types of hazards models: the multiplicative hazards model and the additive hazards model. In this paper, we propose a more flexible additive-multiplicative hazards model for multiple types of recurrent gap times data, wherein some covariates are assumed to be additive while others are multiplicative. An estimating equation approach is presented to estimate the regression parameters. We establish asymptotic properties of the proposed estimators.
Given a sample of regression data from (Y, Z), a new diagnostic plotting method is proposed for checking the hypothesis H0: the data are from a given Cox model with the time-dependent covariates Z. It compares two estimates of the marginal distribution FY of Y. One is an estimate of the modified expression of FY under H0, based on a consistent estimate of the parameter under H0 and on the baseline distribution of the data. The other is the Kaplan-Meier estimator of FY, together with its confidence band. The new plot, called the marginal distribution plot, can be viewed as a test of H0. The main advantage of the test over the existing residual tests arises when the data do not satisfy any Cox model or the Cox model is mis-specified. In that case the new test is still valid, whereas the residual tests are not and often make a type II error with very large probability.
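One of the two estimates compared in the plot above is the Kaplan-Meier estimator. The sketch below is a minimal, illustrative implementation of the standard product-limit formula for right-censored data; it is not the authors' code, the function name `kaplan_meier` and the toy data are hypothetical, and the confidence band mentioned in the abstract is not computed.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function S(t) = 1 - F_Y(t).

    times  : observed times (event or censoring time)
    events : 1 if the event was observed, 0 if the observation was right-censored
    Returns a list of (t, S(t)) pairs at each distinct event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all observations tied at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Both events and censored observations leave the risk set.
        n_at_risk -= removed
    return curve


# Toy data (hypothetical): five subjects, two of them censored.
curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
for t, s in curve:
    print(f"t={t}  S(t)={s:.3f}  F_Y(t)={1.0 - s:.3f}")
```

The estimate of FY used in such a plot would be 1 minus the returned survival values; a model-based estimate under H0 would be drawn on the same axes for comparison.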
Regional climate change impact assessments are becoming increasingly important for developing adaptation strategies in an uncertain future with respect to hydro-climatic extremes. There are a number of Global Climate Models (GCMs) and emission scenarios providing predictions of future changes in climate. As a result, there is a level of uncertainty associated with the decision of which climate models to use for the assessment of climate change impacts. The IPCC has recommended using as many global climate model scenarios as possible; however, this approach may be impractical for regional assessments that are computationally demanding. Methods have been developed to select climate model scenarios, generally consisting of selecting a model with the highest skill (validation), creating an ensemble, or selecting one or more extremes. Validation methods limit analyses to models with higher skill in simulating historical climate, ensemble methods typically take multi-model means, medians, or percentiles, and extremes methods tend to use scenarios which bound the projected changes in precipitation and temperature. In this paper, a quantile regression based validation method is developed and applied to generate a reduced set of GCM scenarios to analyze daily maximum streamflow uncertainty in the Upper Thames River Basin, Canada, while extremes and percentile ensemble approaches are also used for comparison. Results indicate that the validation method was able to effectively rank and reduce the set of scenarios, while the extremes and percentile ensemble methods were found not to necessarily correlate well with the range of extreme flows for all calendar months and return periods.
Background: Daily paediatric asthma readmissions within 28 days are a good example of a low-count time series and are not easily amenable to the common time series methods used in studies of asthma seasonality and time trends. We sought to model and predict daily trends of childhood asthma readmissions over time in Victoria, Australia. Methods: We used a database of 75,000 childhood asthma admissions from the Department of Health, Victoria, Australia in 1997-2009. Daily admissions over time were modeled using a semi-parametric Generalized Additive Model (GAM), overall and by sex and age group. Predictions were also estimated by using these models. Results: N = 2401 asthma readmissions within 28 days occurred during the study period. Of these, n = 1358 (57%) were boys. Overall, seasonal peaks occurred in winter (30.5%) followed by autumn (28.6%) and then spring (24.6%) (p…
Funding: Supported by the Natural Science Foundation of Fujian Province (2022J011177, 2024J01903) and the Key Project of Fujian Provincial Education Department (JZ230054).
Abstract: In clinical research, subgroup analysis can help identify patient groups that respond better or worse to specific treatments and can improve therapeutic effect and safety; it is of great significance in precision medicine. This article considers subgroup analysis methods for longitudinal data containing multiple covariates and biomarkers. We divide subgroups based on whether a linear combination of these biomarkers exceeds a predetermined threshold, and assess the heterogeneity of treatment effects across subgroups using the interaction between subgroups and exposure variables. Quantile regression is used to better characterize the global distribution of the response variable, and sparsity penalties are imposed to achieve variable selection of covariates and biomarkers. The effectiveness of the proposed methodology for both variable selection and parameter estimation is verified through random simulations. Finally, we demonstrate the application of this method by analyzing data from the PA.3 trial, further illustrating its practicality.
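The subgroup rule described above (membership decided by whether a linear combination of biomarkers exceeds a threshold, with heterogeneity captured by a subgroup-by-exposure interaction) can be illustrated with a small sketch. The exact model form in the paper is not given in the abstract, so the predictor below, including the names `in_subgroup`, `linear_predictor`, `main_effect` and `interaction`, is an assumption made only for illustration.

```python
def in_subgroup(biomarkers, weights, threshold):
    """Return 1 if the weighted biomarker score exceeds the threshold, else 0."""
    score = sum(w * b for w, b in zip(weights, biomarkers))
    return 1 if score > threshold else 0


def linear_predictor(covariates, beta, treatment, subgroup, main_effect, interaction):
    """Assumed form: x'beta + main_effect * treatment + interaction * treatment * subgroup."""
    baseline = sum(b * x for b, x in zip(beta, covariates))
    return baseline + main_effect * treatment + interaction * treatment * subgroup


# Hypothetical patient: two biomarkers, one covariate, treated (treatment = 1).
g = in_subgroup([1.0, 2.0], weights=[0.5, 0.5], threshold=1.0)
treated = linear_predictor([1.0], [2.0], 1, g, main_effect=0.5, interaction=1.5)
control = linear_predictor([1.0], [2.0], 0, g, main_effect=0.5, interaction=1.5)
print(g, treated - control)  # treatment effect inside the subgroup
```

Under this assumed form, the treatment effect is `main_effect` outside the subgroup and `main_effect + interaction` inside it, so a nonzero interaction coefficient signals heterogeneous treatment effects.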
Funding: Supported by the National Natural Science Foundation of China (Grant No. 52109010), the Postdoctoral Science Foundation of China (Grant No. 2021M701047), and the China National Postdoctoral Program for Innovative Talents (Grant No. BX20200113).
Abstract: Copula functions have been widely used in stochastic simulation and prediction of streamflow. However, existing models are usually limited to single two-dimensional or three-dimensional copulas with the same bivariate block for all months. To address this limitation, this study developed a mixed D-vine copula-based conditional quantile model that can capture temporal correlations. This model can generate streamflow by selecting different historical streamflow variables as the conditions for different months and by exploiting the conditional quantile functions of streamflows in different months with mixed D-vine copulas. The up-to-down sequential method, which couples the maximum weight approach with the Akaike information criterion and the maximum likelihood approach, was used to determine the structures of multivariate D-vine copulas. The developed model was used in a case study to synthesize the monthly streamflow at the Tangnaihai hydrological station, the inflow control station of the Longyangxia Reservoir in the Yellow River Basin. The results showed that the developed model outperformed the commonly used bivariate copula model in simulating the seasonality and interannual variability of streamflow. This model provides useful information for water-related natural hazard risk assessment and integrated water resources management and utilization.
Abstract: In this paper, we propose double-penalized quantile regression estimators in partially linear models. An iterative algorithm is proposed for solving the resulting optimization problem. Numerical examples illustrate that the finite-sample performance of the proposed method is better than that of the least squares based method with regard to the non-causal selection rate (NSR) and the median of model error (MME) when the error distribution is heavy-tailed. Finally, we apply the proposed methodology to analyze the ragweed pollen level dataset.
Abstract: Composite quantile regression should provide an estimation efficiency gain over a single quantile regression. In this paper, we extend composite quantile regression to nonparametric models with randomly censored data. The asymptotic normality of the proposed estimator is established. The proposed methods are applied to the lung cancer data. Extensive simulations are reported, showing that the proposed method works well in practical settings.
Funding: The National Key Research and Development Program of China (No. 2016YFB0501701), the National High-tech Research and Development Program of China (No. 2013AA122501), and the National Natural Science Foundation of China (Nos. 41876103, 41874016).
Abstract: This paper focuses on the problem of low vertical accuracy in the absolute positioning of seafloor control points when the survey ship sails in a circle. A method for handling the systematic error based on a semi-parametric adjustment model is proposed. First, the influence of sound velocity change on the ranging error is analyzed. Second, a semi-parametric adjustment model for determining the three-dimensional coordinates of seafloor control points is established. Solutions are then proposed for two different conditions: the observation duration is an integer multiple or a non-integer multiple of the long-period term of the ranging error. The simulation experiment shows that this method clearly improves the accuracy of the vertical solution of the seafloor control point compared with the difference technique and the least-squares method when internal waves exist and the observation duration is less than an integer multiple of the long-period term of the ranging error.
Abstract: In this paper, three smoothed empirical log-likelihood ratio functions for the parameters of nonlinear models with missing responses are suggested. Under some regularity conditions, the corresponding Wilks phenomena are obtained, and the confidence regions for the parameters can be constructed easily.
Funding: Supported by the Key Project of National Key Technology R&D Program of China (2009BADA9B01).
Abstract: This paper studies how the price movements of pork, chicken and eggs respond to those of related cost factors in the short term in the Chinese market. We employ a linear quantile approach not only to explore potential data heteroscedasticity but also to generate confidence bands for the purpose of price stability study. We then evaluate our models by comparing the prediction intervals generated from the quantile regression models with in-sample and out-of-sample forecasts. Using monthly data from January 2000 to October 2010, we observed these findings: (i) the price changes of cost factors asymmetrically and unequally influence those of the livestock products across different quantiles; (ii) the performance of our models is robust and consistent for both in-sample and out-of-sample forecasts; (iii) the confidence intervals generated from the 0.05th and 0.95th quantile regression models are good tools for forecasting livestock price fluctuation.
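To make the interval construction in finding (iii) concrete, the sketch below shows the check (pinball) loss that quantile regression minimizes and uses it, in the simplest intercept-only case, to obtain a band from the 0.05 and 0.95 quantiles. This is an illustrative simplification, not the paper's model: there are no cost-factor regressors, and the price changes are made-up numbers.

```python
def check_loss(residual, tau):
    """Quantile (pinball) loss: tau * u if u >= 0, (tau - 1) * u if u < 0."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def best_constant_quantile(values, tau):
    """Return the sample point that minimizes the total check loss at level tau."""
    return min(values, key=lambda q: sum(check_loss(v - q, tau) for v in values))


# Hypothetical monthly price changes in percent (not data from the paper).
changes = [x * 0.5 for x in range(-10, 11)]  # -5.0, -4.5, ..., 5.0

lower = best_constant_quantile(changes, 0.05)
median = best_constant_quantile(changes, 0.50)
upper = best_constant_quantile(changes, 0.95)
print(lower, median, upper)
```

With regressors, the same loss would be minimized over regression coefficients instead of a single constant, and the fitted 0.05 and 0.95 quantile lines would bound roughly 90% of future observations if the model is adequate.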
Funding: Sponsored by the National Natural Science Foundation of China (Grant Nos. 51908421 and 41172246).
Abstract: Scour has been widely accepted as a key reason for bridge failures. Bridges are susceptible and sensitive to the scour phenomenon, which describes the loss of riverbed sediments around the bridge supports because of flow. The carrying capacity of a deep-water foundation is influenced by the formation of a scour hole, which means that severe scour can lead to a bridge failure without warning. Most current scour predictions are based on deterministic models, while other loads on bridges are usually provided as probabilistic values. To integrate scour factors with other loads in bridge design and research, a quantile regression model was used to estimate scour depth. Field data and experimental data from previous studies were collected to build the model. Moreover, scour estimations using the HEC-18 equation and the proposed method were compared. By using the "CCC (Calculate, Confirm, and Check)" procedure, the probabilistic concept could be used to calculate various scour depths with the targeted likelihood according to a specified chance of bridge failure. The study shows that with a sufficiently large and continuously updated database, the proposed model could present reasonable results and provide guidance for scour mitigation.
Abstract: This paper considers quantile regression analysis based on semi-competing risks data in which a non-terminal event may be dependently censored by a terminal event. The major interest is the covariate effects on the quantile of the non-terminal event time. Dependent censoring is handled by assuming that the joint distribution of the two event times follows a parametric copula model with unspecified marginal distributions. The technique of inverse probability weighting (IPW) is adopted to adjust for the selection bias. Large-sample properties of the proposed estimator are derived and a model diagnostic procedure is developed to check the adequacy of the model assumption. Simulation results show that the proposed estimator performs well. For illustrative purposes, our method is applied to analyze the bone marrow transplant data in [1].
Abstract: Because the U.S. is a major player in the international oil market, it is interesting to study whether aggregate and state-level economic conditions can predict the subsequent realized volatility of oil price returns. To address this research question, we frame our analysis in terms of variants of the popular heterogeneous autoregressive realized volatility (HAR-RV) model. To estimate the models, we use quantile-regression and quantile machine learning (Lasso) estimators. Our estimation results highlight the differential effects of economic conditions on the quantiles of the conditional distribution of realized volatility. Using weekly data for the period April 1987 to December 2021, we document evidence of predictability at biweekly and monthly horizons.
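For readers unfamiliar with the HAR-RV idea, the sketch below builds the regressors of the textbook daily specification, in which next-period realized volatility is explained by the latest value and by its averages over a short and a long window. The window lengths (1, 5, 22) are the conventional daily choices and are an assumption here; the paper works with weekly data and model variants whose exact windows and extra economic predictors are not stated in the abstract. The function name `har_features` and the series are hypothetical.

```python
def har_features(rv, t, medium=5, long=22):
    """Return the three HAR regressors available at time index t.

    rv     : list of realized volatilities, oldest first
    t      : index of the most recent observation to use
    medium : length of the medium window (5 in the textbook daily model)
    long   : length of the long window (22 in the textbook daily model)
    """
    if t + 1 < long:
        raise ValueError("not enough history for the long window")
    short_term = rv[t]
    medium_term = sum(rv[t - medium + 1 : t + 1]) / medium
    long_term = sum(rv[t - long + 1 : t + 1]) / long
    return short_term, medium_term, long_term


# Hypothetical realized-volatility series: 1.0, 2.0, ..., 30.0.
rv = [float(v) for v in range(1, 31)]
print(har_features(rv, t=29))  # regressors for predicting the next period
```

These three columns, possibly extended with economic-conditions indices, would then enter a quantile regression (plain or Lasso-penalized) for each quantile level of interest.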
Abstract: Sparse phase retrieval aims to recover a sparse signal from quadratic measurements. However, the measurements are often affected by outliers and asymmetrically distributed noise. This paper introduces a novel method that combines quantile regression and the L_{1/2}-regularizer. The result is a non-convex, non-smooth, non-Lipschitz optimization problem. We propose an efficient algorithm based on the Alternating Direction Method of Multipliers (ADMM) to solve the corresponding optimization problem. Numerous numerical experiments show that this method can recover sparse signals with fewer measurements and is robust to dense bounded noise and Laplace noise.
Funding: Supported by the National Natural Science Foundation of China (Nos. 51478025, 11701023, 71420107025)
Abstract: Wastewater treatment is one of the critical issues faced by water utilities and has received increasing attention recently. This study investigated energy consumption modeling in biochemical wastewater treatment via a general and robust approach based on Bayesian semi-parametric quantile regression. The dataset was derived from a municipal wastewater treatment plant, where the energy consumption per unit of chemical oxygen demand (COD) reduction was the response variable of interest. With the proposed approach, a comprehensive regression picture of energy consumption and its truly influential factors was obtained, i.e., the regression relationships at lower, median, and higher energy consumption levels were characterized separately. Specific proposals for energy saving in the different cases were also derived. First, the lower level of energy consumption was closely associated with the temperature of the influent wastewater, and chroma-rich wastewater also proved helpful for energy saving. Second, at the median energy consumption level, COD-rich wastewater played a decisive role in reducing energy consumption, while a higher quality of treated water made treatment slightly more energy intensive. Third, the higher level of energy consumption was most likely attributable to relatively high wastewater temperature and total nitrogen (TN)-rich wastewater, and both factors should preferably be avoided to alleviate the burden of energy consumption. The study provides an efficient approach to controlling the energy consumption of wastewater treatment from the perspective of statistical regression modeling and offers valuable suggestions for future energy saving.
Abstract: Despite the notable benefits associated with Dirichlet crash frequency models and spatial models, little research has explored their combined advantages. Such an ensemble approach could be a viable alternative to existing models, as it accounts for unobserved heterogeneity by relaxing the constraint of a specific distribution placed on the intercept while addressing the spatial correlations among roadway entities. To fill this gap, the authors aimed to develop Dirichlet semi-parametric models within the overdispersed generalized linear model framework while also incorporating spatially structured random effects using a distance-based weight matrix. Five models were developed: four semi-parametric models with a flexible intercept and one parametric base model for comparison purposes. The four semi-parametric models comprised two models with the popular stick-breaking Dirichlet process (DP) specification and two models with an alternative Dirichlet distribution (DD) approach, which are applied here for the first time in the field of traffic safety. All four models were estimated for a mixture of points (discrete) and a mixture of normals (continuous). The posterior density plots for the precision parameter justified the use of the flexible Dirichlet approach to fit the crash data and supported the assumed prior for the precision parameter. All four Dirichlet models demonstrated the presence of distinct subpopulations, suggesting that the intercepts of the models were not generated from a common distribution. The DP model based on a mixture of normals performed better, indicating its potential superiority in fitting both in-sample and out-of-sample crash data. This finding indicated that continuous densities, unlike discrete points, may lend more flexibility in fitting the data.
Abstract: We introduce a new wavelet-based procedure for detecting outliers in financial discrete time series. The procedure focuses on the analysis of residuals obtained from a model fit and is applied to Generalized Autoregressive Conditional Heteroskedasticity (GARCH)-like models, but is not limited to these models. We apply the Maximal-Overlap Discrete Wavelet Transform (MODWT) to the residuals and compare their wavelet coefficients against quantile thresholds to detect outliers. Our methodology has several advantages over existing methods that make use of the standard Discrete Wavelet Transform (DWT). The series sample size does not need to be a power of 2, and the transform can use any wavelet filter and be run up to the desired level. Simulated wavelet quantiles from normal and Student's t-distributions are used as thresholds for the maximum of the absolute value of the wavelet coefficients. The performance of the procedure is illustrated on two real series: the closing price of the Saudi stock market and the S&P 500 index. The efficiency of the proposed method is demonstrated, and it can be considered an important addition to the existing methods.
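The thresholding step described above can be sketched with the simplest possible case, a level-1 Haar MODWT with circular boundary handling. The function names are hypothetical, and the threshold is passed in directly, whereas the abstract obtains it from simulated quantiles of the maximum absolute coefficient; higher levels and other wavelet filters are not covered.

```python
def haar_modwt_level1(x):
    """Level-1 Haar MODWT wavelet coefficients, W_t = (x_t - x_{t-1}) / 2.

    The boundary is treated as circular: for t = 0, x[t - 1] is x[-1],
    the last element. Works for any series length, not only powers of 2.
    """
    return [(x[t] - x[t - 1]) / 2.0 for t in range(len(x))]


def flag_outliers(residuals, threshold):
    """Return indices whose absolute wavelet coefficient exceeds `threshold`.

    `threshold` is assumed given (e.g. a simulated quantile); choosing it
    is outside this sketch.
    """
    coeffs = haar_modwt_level1(residuals)
    return [t for t, w in enumerate(coeffs) if abs(w) > threshold]
```

Note that a single isolated spike produces two large coefficients, one at the spike and one at the following index, so flagged positions need a final localization step.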
Funding: The Science Foundation (JA12301) of Fujian Educational Committee; the Teaching Quality Project (ZL0902/TZ(SJ)) of Higher Education in Fujian Provincial Education Department
Abstract: Recurrent event gap time data frequently arise in biomedical studies, and often more than one type of event is of interest. To evaluate the effects of covariates on the marginal recurrent event hazard functions, there exist two types of hazards models: the multiplicative hazards model and the additive hazards model. In this paper, we propose a more flexible additive-multiplicative hazards model for multiple types of recurrent gap time data, wherein some covariates are assumed to be additive while others are multiplicative. An estimating equation approach is presented to estimate the regression parameters. We establish asymptotic properties of the proposed estimators.
Abstract: Given a sample of regression data from (Y, Z), a new diagnostic plotting method is proposed for checking the hypothesis H<sub>0</sub>: the data are from a given Cox model with the time-dependent covariates Z. It compares two estimates of the marginal distribution F<sub>Y</sub> of Y. One is an estimate of the modified expression of F<sub>Y</sub> under H<sub>0</sub>, based on a consistent estimate of the parameter under H<sub>0</sub> and on the baseline distribution of the data. The other is the Kaplan-Meier estimator of F<sub>Y</sub>, together with its confidence band. The new plot, called the marginal distribution plot, can be viewed as a test of H<sub>0</sub>. The main advantage of the test over existing residual tests arises when the data do not satisfy any Cox model or the Cox model is misspecified: the new test remains valid, whereas the residual tests do not and often commit a type II error with very large probability.
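One of the two curves compared in the plot is the Kaplan-Meier estimate, which can be sketched in a few lines. The function name `kaplan_meier` and the 0/1 encoding of `events` are assumptions for illustration; the confidence band and the model-based estimate under the Cox hypothesis are not shown.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    events[i] is 1 for an observed event and 0 for a censored observation.
    The estimated distribution function is F(t) = 1 - S(t).
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all observations tied at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve
```

Censored observations lower the number at risk without producing a step in the curve, which is why the estimate only changes at observed event times.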
Abstract: Regional climate change impact assessments are becoming increasingly important for developing adaptation strategies in an uncertain future with respect to hydro-climatic extremes. A number of Global Climate Models (GCMs) and emission scenarios provide predictions of future changes in climate. As a result, there is a level of uncertainty associated with the decision of which climate models to use for the assessment of climate change impacts. The IPCC has recommended using as many global climate model scenarios as possible; however, this approach may be impractical for regional assessments that are computationally demanding. Methods have been developed to select climate model scenarios, generally consisting of selecting a model with the highest skill (validation), creating an ensemble, or selecting one or more extremes. Validation methods limit analyses to models with higher skill in simulating historical climate, ensemble methods typically take multi-model means, medians, or percentiles, and extremes methods tend to use scenarios that bound the projected changes in precipitation and temperature. In this paper, a quantile-regression-based validation method is developed and applied to generate a reduced set of GCM scenarios to analyze daily maximum streamflow uncertainty in the Upper Thames River Basin, Canada, while extremes and percentile ensemble approaches are also used for comparison. Results indicate that the validation method was able to effectively rank and reduce the set of scenarios, while the extremes and percentile ensemble methods were found not to necessarily correlate well with the range of extreme flows for all calendar months and return periods.
Abstract: Background: Daily paediatric asthma readmissions within 28 days are a good example of a low-count time series and are not easily amenable to the common time series methods used in studies of asthma seasonality and time trends. We sought to model and predict daily trends of childhood asthma readmissions over time in Victoria, Australia. Methods: We used a database of 75,000 childhood asthma admissions from the Department of Health, Victoria, Australia, for 1997-2009. Daily admissions over time were modeled using a semi-parametric Generalized Additive Model (GAM), overall and by sex and age group. Predictions were also estimated using these models. Results: N = 2401 asthma readmissions within 28 days occurred during the study period. Of these, n = 1358 (57%) were boys. Overall, seasonal peaks occurred in winter (30.5%) followed by autumn (28.6%) and then spring (24.6%) (p