The question of how to choose a copula model that best fits a given dataset is a major limitation of the copula approach, and the present study investigates goodness-of-fit tests for multi-dimensional copulas. A goodness-of-fit test based on Rosenblatt's transformation was extended mathematically from two dimensions to three, and procedures for a bootstrap version of the test were provided. Through stochastic copula simulation, an empirical application to historical drought data at the Lintong Gauge Station shows that the goodness-of-fit tests perform well, revealing that both trivariate Gaussian and Student t copulas are acceptable for modeling the dependence structures of the observed drought duration, severity, and peak. Goodness-of-fit tests for multi-dimensional copulas can support the application of a wider range of copulas to describe the associations of correlated hydrological variables. However, applying copulas in more than three dimensions requires greater computational effort as well as further exploration and parameterization of the corresponding copulas.
The logistic regression model has become commonly used to study the association between a binary response variable and a set of explanatory variables; its widespread application rests on its ease of application and interpretation. The assessment of goodness-of-fit in the logistic regression model has attracted the attention of many scientists and researchers. Goodness-of-fit tests are methods to determine the suitability of the fitted model. Many methods have been proposed and discussed for assessing goodness-of-fit in the logistic regression model; however, the asymptotic distributions of the goodness-of-fit statistics have been less examined and need further investigation. This work focuses on assessing the behavior of the asymptotic distributions of goodness-of-fit tests, compares global goodness-of-fit tests, and evaluates them by simulation.
Goodness-of-fit testing for regression models has received much attention in the literature. In this paper, empirical likelihood (EL) goodness-of-fit tests for regression models, including classical parametric and autoregressive (AR) time series models, are proposed. Unlike the existing locally smoothing and globally smoothing methodologies, the new method has the advantage that the tests are self-scale invariant and that the asymptotic null distribution is chi-squared. Simulations are carried out to illustrate the methodology.
A family of integral-type goodness-of-fit tests is investigated. This family includes some existing tests, such as the Cramér-von Mises test and the Anderson-Darling test. The asymptotic distributions of the tests in the family under the null and local alternative hypotheses are established. The almost sure convergence under a fixed underlying distribution is obtained. Furthermore, simulations are conducted to compare the powers of the tests in the family. Simulation results show that the most powerful test differs from one alternative to another, and that the parameter λ has great influence on the tests in small-sample cases.
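For orientation, the two named members of this family can be computed as in the minimal Python sketch below. It assumes a fully specified continuous null distribution and uses the standard computing formulas for the Cramér-von Mises and Anderson-Darling statistics; the function name `cvm_and_ad` is illustrative and not taken from the paper, and the paper's parameterized family itself is not reproduced.

```python
import math


def cvm_and_ad(sample, cdf):
    """Return (W2, A2): Cramer-von Mises and Anderson-Darling statistics.

    `cdf` is the fully specified null CDF (an assumption of this sketch);
    every transformed value must lie strictly between 0 and 1.
    """
    # Probability integral transform, then sort.
    u = sorted(cdf(x) for x in sample)
    n = len(u)
    # W2 = 1/(12n) + sum_i (u_(i) - (2i-1)/(2n))^2, written with 0-based i.
    w2 = 1.0 / (12 * n) + sum(
        (u[i] - (2 * i + 1) / (2 * n)) ** 2 for i in range(n)
    )
    # A2 = -n - (1/n) sum_i (2i-1) [ln u_(i) + ln(1 - u_(n+1-i))].
    a2 = -n - sum(
        (2 * i + 1) * (math.log(u[i]) + math.log(1.0 - u[n - 1 - i]))
        for i in range(n)
    ) / n
    return w2, a2


if __name__ == "__main__":
    # Uniform(0, 1) null: the CDF is the identity on (0, 1).
    print(cvm_and_ad([0.125, 0.375, 0.625, 0.875], lambda x: x))
```

Both statistics weight the squared distance between the empirical and null CDFs; Anderson-Darling puts more weight on the tails, which is one reason their relative power varies with the alternative.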
The chi-square test is a well-known goodness-of-fit test. It is available for an arbitrary alternative hypothesis, particularly for a very general alternative. However, when the alternative is a “one-sided” hypothesis, which usually appears in genetic linkage analysis, the chi-square test does not use the information offered by the one-sided hypothesis. Therefore, it is possible that an appropriate one-sided test, which uses the information, will be better than the chi-square test. This paper gives such an efficient one-sided test. Monte Carlo simulation results show that it is more powerful than the chi-square test, and its power is increased by 30 percent compared with that of the chi-square test in most situations.
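As a point of reference, the baseline Pearson chi-square statistic that the proposed test is compared against can be sketched as follows. This is a minimal illustration only: the function name `pearson_chi_square` and the example counts are hypothetical, and the paper's one-sided test is not reproduced here.

```python
def pearson_chi_square(observed, null_probs):
    """Pearson chi-square goodness-of-fit statistic.

    `observed` holds category counts and `null_probs` the category
    probabilities under the null hypothesis (they should sum to 1).
    """
    n = sum(observed)
    # Sum over categories of (observed - expected)^2 / expected.
    return sum(
        (o - n * p) ** 2 / (n * p) for o, p in zip(observed, null_probs)
    )


if __name__ == "__main__":
    # Hypothetical counts tested against three equally likely categories.
    print(pearson_chi_square([10, 20, 30], [1 / 3, 1 / 3, 1 / 3]))
```

Because every deviation is squared, the statistic cannot distinguish the direction of a departure from the null, which is the information a one-sided test can exploit.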
In this paper, a new statistic for testing whether two samples come from the same population is derived from a simple linear model with an artificial parameter. Its limit distribution is a chi-squared distribution with 2 degrees of freedom under the null hypothesis and a noncentral chi-squared distribution with 2 degrees of freedom under a certain sequence of alternative hypotheses. Finally, we compare its power with that of other two-sample tests, in particular the Smirnov statistic.
Based on the martingale difference divergence, a recently proposed metric for quantifying conditional mean dependence, we introduce a consistent test of U-type for the goodness-of-fit of linear models under conditional mean restriction. Methodologically, our test allows heteroscedastic regression models without imposing any condition on the distribution of the error, effectively utilizes important information contained in the distance of the vector of covariates, has a simple form, is easy to implement, and is free of the subjective choice of parameters. Theoretically, our mathematical analysis is of independent interest since it does not rely on empirical process theory and provides some insights into the asymptotic behavior of U-statistics in the framework of model diagnostics. The asymptotic null distribution of the proposed test statistic is derived, and its asymptotic power behavior against fixed alternatives and local alternatives converging to the null at the parametric rate is also presented. In particular, we show that its asymptotic null distribution is very different from that obtained for the true error, and that the differences are related to the form of the expression for the estimated parameter vector embodied in the regression function and to a martingale difference divergence matrix. Since the asymptotic null distribution of the test statistic depends on the data generating process, we propose a wild bootstrap scheme to approximate its null distribution. The consistency of the bootstrap scheme is justified. Numerical studies are undertaken to show the good performance of the new test.
The Laplace distribution can be compared against the normal distribution. The Laplace distribution has an unusual, symmetric shape with a sharp peak and tails that are longer than the tails of a normal distribution. It has recently become quite popular in modeling financial variables (Brownian Laplace motion) like stock returns because of the heavier tails. The Laplace distribution is very extensively reviewed in the monograph (Kotz et al., The Laplace Distribution and Generalizations: A Revisit with Applications to Communications, Economics, Engineering, and Finance. Birkhäuser, Boston, 2001). In this article, we propose a density-based empirical likelihood ratio (DBELR) goodness-of-fit test statistic for the Laplace distribution. The test statistic is constructed based on the approach proposed by Vexler and Gurevich (Comput Stat Data Anal 54:531-545, 2010). In order to compute the test statistic, parameters of the Laplace distribution are estimated by the maximum likelihood method. Critical values and power values of the proposed test are obtained by Monte Carlo simulations. Also, power comparisons of the proposed test with some known competing tests are carried out. Finally, two illustrative examples are presented and analyzed.
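The parameter-estimation step mentioned in this abstract has a closed form, sketched below under the usual location-scale parameterization of the Laplace density: the maximum likelihood estimate of the location is the sample median, and that of the scale is the mean absolute deviation from the median. The function name `laplace_mle` is illustrative, and the DBELR statistic itself is not reproduced.

```python
import statistics


def laplace_mle(sample):
    """Maximum likelihood estimates (location, scale) for Laplace data."""
    # The log-likelihood is maximized in the location by the sample median.
    mu = statistics.median(sample)
    # Given the location, the scale MLE is the mean absolute deviation.
    b = sum(abs(x - mu) for x in sample) / len(sample)
    return mu, b


if __name__ == "__main__":
    print(laplace_mle([1, 2, 3, 4, 10]))
```

Note that for an even sample size any value between the two middle order statistics maximizes the likelihood; `statistics.median` returns their midpoint.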
Due to cost effectiveness and high efficiency, two-phase case-control sampling has been widely used in epidemiology studies. We develop a semi-parametric empirical likelihood approach to two-phase case-control data under the logistic regression model. We show that the maximum empirical likelihood estimator has an asymptotically normal distribution, and the empirical likelihood ratio follows an asymptotically central chi-square distribution. We find that the maximum empirical likelihood estimator is equal to Breslow and Holubkov's (1997) maximum likelihood estimator. Even so, the limiting distribution of the likelihood ratio, the likelihood-ratio-based interval, and the test are all new. Furthermore, we construct new Kolmogorov-Smirnov type goodness-of-fit tests to test the validity of the underlying logistic regression model. Our simulation results and a real application show that the likelihood-ratio-based interval and test have certain merits over the Wald-type counterparts and that the proposed goodness-of-fit test is valid.
The paper proposes and studies some diagnostic tools for checking the goodness-of-fit of general parametric vector autoregressive models in time series. The resulting tests are asymptotically chi-squared under the null hypothesis and can detect alternatives converging to the null at a parametric rate. The tests involve weight functions, which provide the flexibility to choose scores for enhancing power performance, especially under directional alternatives. When the alternatives are not directional, we construct asymptotically distribution-free maximin tests for a large class of alternatives. The possibility of constructing score-based omnibus tests is discussed for the case where the alternative is saturated. The power performance is also investigated. In addition, when the sample size is small, a nonparametric Monte Carlo test approach for dependent data is proposed to improve the performance of the tests. The algorithm is easy to implement. Simulation studies and real applications are carried out for illustration.
A test statistic is proposed to perform the goodness-of-fit test in the unbinned maximum likelihood fit. Without using a detailed expression of the efficiency function, the test statistic is found to be strongly correlated with the maximum likelihood function if the efficiency function varies smoothly. We point out that the correlation coefficient can be estimated by the Monte Carlo technique. With the established method, two examples are given to illustrate the performance of the test statistic.
Multiple dominant gear meshing frequencies are present in the vibration signals collected from gearboxes, and the conventional spiky features that represent initial gear fault conditions are usually difficult to detect. To solve this problem, we propose a new gearbox deterioration detection technique based on autoregressive modeling and hypothesis testing. A stationary autoregressive model was built by using a normal vibration signal from each shaft. The established autoregressive model was then applied to process fault signals from each shaft of a two-stage gearbox. The paper investigates a combined technique that unites a time-varying autoregressive model and a two-sample Kolmogorov-Smirnov goodness-of-fit test to detect the deterioration of a gearing system with simultaneously variable shaft speed and variable load. The time-varying autoregressive model residuals representing both healthy and faulty gear conditions were compared with the original healthy time-synchronous average signals. Compared with the traditional kurtosis statistic, this technique for gearbox deterioration detection has shown significant advantages in highlighting the presence of incipient gear faults in all the different speed shafts involved in the meshing motion under variable conditions.
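The comparison step in this abstract rests on the two-sample Kolmogorov-Smirnov statistic, the largest gap between two empirical distribution functions. A minimal sketch follows; the function name `ks_two_sample_statistic` and the toy residual values are hypothetical, and the autoregressive modeling that would produce the residuals is assumed to have been done already.

```python
import bisect


def ks_two_sample_statistic(x, y):
    """Largest absolute gap between the empirical CDFs of x and y."""
    xs, ys = sorted(x), sorted(y)
    n, m = len(xs), len(ys)
    d = 0.0
    # Both empirical CDFs are step functions that only change at data
    # points, so the supremum is attained at one of the pooled values.
    for v in xs + ys:
        fx = bisect.bisect_right(xs, v) / n  # share of x values <= v
        fy = bisect.bisect_right(ys, v) / m  # share of y values <= v
        d = max(d, abs(fx - fy))
    return d


if __name__ == "__main__":
    healthy = [1, 2, 3, 4]  # hypothetical residuals, healthy condition
    faulty = [3, 4, 5, 6]   # hypothetical residuals, faulty condition
    print(ks_two_sample_statistic(healthy, faulty))
```

A statistic near 0 indicates that the two residual distributions are similar, while a value near 1 indicates almost no overlap; a p-value would still have to be obtained from the Kolmogorov distribution or a library routine.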
This study explored and reviewed the logistic regression (LR) model, a multivariable method for modeling the relationship between multiple independent variables and a categorical dependent variable, with emphasis on medical research. Thirty-seven research articles published between 2000 and 2018 which employed logistic regression as the main statistical tool, as well as six textbooks on logistic regression, were reviewed. Logistic regression concepts such as odds, odds ratio, logit transformation, logistic curve, assumptions, selecting dependent and independent variables, model fitting, reporting and interpreting were presented. Upon perusing the literature, considerable deficiencies were found in both the use and reporting of LR. For many studies, the ratio of the number of outcome events to predictor variables (events per variable) was sufficiently small to call into question the accuracy of the regression model. Also, most studies did not report on validation analysis, regression diagnostics or goodness-of-fit measures, which authenticate the robustness of the LR model. Here, we demonstrate a good example of the application of the LR model using data obtained on a cohort of pregnant women and the factors that influence their decision to opt for caesarean delivery or vaginal birth. It is recommended that researchers be more rigorous and pay greater attention to guidelines concerning the use and reporting of LR models.
Internal solitary wave propagation over a submarine ridge results in energy dissipation, in which the hydrodynamic interaction between the wave and the ridge affects the marine environment. This study analyzes the effects of ridge height and potential energy during wave-ridge interaction with binary and cumulative logistic regression models. In testing the global null hypothesis, all p-values are < 0.001 under three statistical methods: Likelihood Ratio, Score, and Wald. When the two kinds of models are compared, the test values obtained by the cumulative logistic regression models are better than those obtained by the binary logistic regression models. Although this study employed the cumulative logistic regression model, three probability functions, p^1, p^2 and p^3, are utilized for investigating the weighted influence of factors on wave reflection. Deviance and Pearson tests are applied to check the goodness-of-fit of the proposed model. The analytical results demonstrate that both ridge height (X1) and potential energy (X2) significantly affect (p < 0.0001) the amplitude-based reflected rate; the p-values for the deviance and Pearson tests are both > 0.05 (0.2839 and 0.3438, respectively). That is, the goodness-of-fit between ridge height (X1) and potential energy (X2) can further predict parameters under the scenario of the best parsimonious model. Investigation of six measures of predictive power (R^2, Max-rescaled R^2, Somers' D, Gamma, Tau-a, and c) indicates that the proposed model has better predictive ability than ridge height alone, and is very similar to the interaction of ridge height and potential energy. It can be concluded that the goodness-of-fit and prediction ability of the cumulative logistic regression model are better than those of the binary logistic regression model.
Random behavior of concrete C45/55 XF2 used for prefabricated pre-stressed bridge beams is described on the basis of evaluating a vast set of measurements. A detailed statistical analysis is carried out on 133 test results of cylinders 150 × 300 mm in size. The tests were run in laboratories of the Klokner Institute. A single worker took all specimens throughout the period, and the subsequent measurements of the static modulus of elasticity and the compressive strength of the concrete were performed. The measurements were made 28 days after specimen casting, and only one testing machine with the same capping method was used. Suitable theoretical distribution models are determined on the basis of goodness-of-fit tests, using the χ² and Bernstein criteria. The set of concrete compressive strengths (from the 133 cylinders of 150 × 300 mm, measured after the test of the static modulus of elasticity) shows relatively high skewness in this specific case. As a result, a bounded beta distribution fits better than the normal or lognormal distributions generally recommended for strength. The modulus of elasticity is not significantly affected by skewness because its design value is based on the mean value.
We propose a subsampling method for robust estimation of regression models which is built on classical methods such as the least squares method. It makes use of the non-robust nature of the underlying classical method to find a good sample from regression data contaminated with outliers, and then applies the classical method to the good sample to produce robust estimates of the regression model parameters. The subsampling method is a computational method rooted in the bootstrap methodology which trades analytical treatment for intensive computation; it finds the good sample through repeated fitting of the regression model to many random subsamples of the contaminated data instead of through an analytical treatment of the outliers. The subsampling method can be applied to all regression models for which non-robust classical methods are available. In the present paper, we focus on the basic formulation and robustness property of the subsampling method that are valid for all regression models. We also discuss variations of the method and apply it to three examples involving three different regression models.
For structural comparisons of paired prokaryotic genomes, an important topic in synthetic and evolutionary biology, the locations of shared orthologous genes (henceforth orthologs) are observed as binned data. These and other data, e.g., wind directions recorded at monitoring sites and intensive care unit arrival times on the 24-hour clock, are counted in binned circular arcs, so modeling them by discrete circular distributions (DCDs) is required. We propose a novel method to construct a DCD from a base continuous circular distribution (CCD). The probability mass function is defined to take the normalized values of the probability density function at some pre-fixed equidistant points on the circle. Five families of constructed DCDs which have normalizing constants in closed form are presented. Simulation studies show that DCDs outperform the corresponding CCDs in modeling grouped (discrete) circular data, and minimum chi-square estimation outperforms maximum likelihood estimation for parameters. We apply the constructed DCDs, invariant wrapped Poisson and wrapped discrete skew Laplace to compare the structures of paired bacterial genomes. Specifically, the discrete four-parameter wrapped Cauchy (nonnegative trigonometric sums) distribution models multi-modal shared orthologs in Clostridium (Sulfolobus) better than the others considered, in terms of AIC and Freedman’s goodness-of-fit test. The result that different DCDs fit the shared orthologs is consistent with the fact that they belong to two kingdoms. Nevertheless, these prokaryotes have a common favored site around 70° on the unit circle; this finding is important for building synthetic prokaryotic genomes in synthetic biology. These DCDs can also be applied to other binned circular data.
A multivariate random sample of n observations also generates n univariate nearest neighbour distance (NND) observations, and from these NND observations a set of n auxiliary observations can be obtained. When these auxiliary observations are combined with the original multivariate observations of the random sample, a class of pseudodistances D_h can be used, and inference methods can be developed using this class. The D_h estimators obtained from this class can achieve high efficiencies and have robustness properties. Model testing can also be handled in a unified way by means of goodness-of-fit test statistics derived from this class, which have an asymptotic normal distribution. These properties make the developed inference methods relatively simple to implement, and the methods appear to be suitable for analyzing multivariate data which are often encountered in applications.
Funding: supported by the Program of Introducing Talents of Disciplines to Universities of the Ministry of Education and State Administration of the Foreign Experts Affairs of China (the 111 Project, Grant No. B08048) and the Special Basic Research Fund for Methodology in Hydrology of the Ministry of Sciences and Technology of China (Grant No. 2011IM011000).
Abstract: The question of how to choose a copula model that best fits a given dataset is a major limitation of the copula approach, and the present study investigates goodness-of-fit tests for multi-dimensional copulas. A goodness-of-fit test based on Rosenblatt's transformation was extended mathematically from two dimensions to three, and procedures for a bootstrap version of the test were provided. Through stochastic copula simulation, an empirical application to historical drought data at the Lintong Gauge Station shows that the goodness-of-fit tests perform well, revealing that both trivariate Gaussian and Student t copulas are acceptable for modeling the dependence structures of the observed drought duration, severity, and peak. Goodness-of-fit tests for multi-dimensional copulas can support the application of a wider range of copulas to describe the associations of correlated hydrological variables. However, applying copulas in more than three dimensions requires greater computational effort as well as further exploration and parameterization of the corresponding copulas.
Abstract: The logistic regression model has become commonly used to study the association between a binary response variable and a set of explanatory variables; its widespread application rests on its ease of application and interpretation. The assessment of goodness-of-fit in the logistic regression model has attracted the attention of many scientists and researchers. Goodness-of-fit tests are methods to determine the suitability of the fitted model. Many methods have been proposed and discussed for assessing goodness-of-fit in the logistic regression model; however, the asymptotic distributions of the goodness-of-fit statistics have been less examined and need further investigation. This work focuses on assessing the behavior of the asymptotic distributions of goodness-of-fit tests, compares global goodness-of-fit tests, and evaluates them by simulation.
Funding: This work was supported by the Research Grants Council of Hong Kong of China and the National Natural Science Foundation of China (Grant No. 10661003).
Abstract: Goodness-of-fit testing for regression models has received much attention in the literature. In this paper, empirical likelihood (EL) goodness-of-fit tests for regression models, including classical parametric and autoregressive (AR) time series models, are proposed. Unlike the existing locally smoothing and globally smoothing methodologies, the new method has the advantage that the tests are self-scale invariant and that the asymptotic null distribution is chi-squared. Simulations are carried out to illustrate the methodology.
Funding: supported by the National Natural Science Foundation of China under Grant Nos. 10661003 and 10371126, the Guangxi Science Foundation under Grant No. 0832102, and the Doctor Foundation of Guangxi Normal University.
Abstract: A family of integral-type goodness-of-fit tests is investigated. This family includes some existing tests, such as the Cramér-von Mises test and the Anderson-Darling test. The asymptotic distributions of the tests in the family under the null and local alternative hypotheses are established. The almost sure convergence under a fixed underlying distribution is obtained. Furthermore, simulations are conducted to compare the powers of the tests in the family. Simulation results show that the most powerful test differs from one alternative to another, and that the parameter λ has great influence on the tests in small-sample cases.
Abstract: The chi-square test is a well-known goodness-of-fit test. It is available for an arbitrary alternative hypothesis, particularly for a very general alternative. However, when the alternative is a “one-sided” hypothesis, which usually appears in genetic linkage analysis, the chi-square test does not use the information offered by the one-sided hypothesis. Therefore, it is possible that an appropriate one-sided test, which uses the information, will be better than the chi-square test. This paper gives such an efficient one-sided test. Monte Carlo simulation results show that it is more powerful than the chi-square test, and its power is increased by 30 percent compared with that of the chi-square test in most situations.
Funding: This project is supported by the Beijing Natural Science Foundation and by the Chinese Natural Science Foundation.
Abstract: In this paper, a new statistic for testing whether two samples come from the same population is derived from a simple linear model with an artificial parameter. Its limit distribution is a chi-squared distribution with 2 degrees of freedom under the null hypothesis and a noncentral chi-squared distribution with 2 degrees of freedom under a certain sequence of alternative hypotheses. Finally, we compare its power with that of other two-sample tests, in particular the Smirnov statistic.
Funding: supported by the National Natural Science Foundation of China (No. 12271005 and No. 11901006); the Natural Science Foundation of Anhui Province (2308085Y06, 1908085QA06); the Young Scholars Program of Anhui Province (2023); the Anhui Provincial Natural Science Foundation (Grant No. 2008085MA08); and the Foundation of Anhui Provincial Education Department (Grant No. KJ2021A1523).
Abstract: Based on the martingale difference divergence, a recently proposed metric for quantifying conditional mean dependence, we introduce a consistent test of U-type for the goodness-of-fit of linear models under conditional mean restriction. Methodologically, our test allows heteroscedastic regression models without imposing any condition on the distribution of the error, effectively utilizes important information contained in the distance of the vector of covariates, has a simple form, is easy to implement, and is free of the subjective choice of parameters. Theoretically, our mathematical analysis is of independent interest since it does not rely on empirical process theory and provides some insights into the asymptotic behavior of U-statistics in the framework of model diagnostics. The asymptotic null distribution of the proposed test statistic is derived, and its asymptotic power behavior against fixed alternatives and local alternatives converging to the null at the parametric rate is also presented. In particular, we show that its asymptotic null distribution is very different from that obtained for the true error, and that the differences are related to the form of the expression for the estimated parameter vector embodied in the regression function and to a martingale difference divergence matrix. Since the asymptotic null distribution of the test statistic depends on the data generating process, we propose a wild bootstrap scheme to approximate its null distribution. The consistency of the bootstrap scheme is justified. Numerical studies are undertaken to show the good performance of the new test.
Abstract: The Laplace distribution can be compared against the normal distribution. The Laplace distribution has an unusual, symmetric shape with a sharp peak and tails that are longer than the tails of a normal distribution. It has recently become quite popular in modeling financial variables (Brownian Laplace motion) like stock returns because of the heavier tails. The Laplace distribution is very extensively reviewed in the monograph (Kotz et al., The Laplace Distribution and Generalizations: A Revisit with Applications to Communications, Economics, Engineering, and Finance. Birkhäuser, Boston, 2001). In this article, we propose a density-based empirical likelihood ratio (DBELR) goodness-of-fit test statistic for the Laplace distribution. The test statistic is constructed based on the approach proposed by Vexler and Gurevich (Comput Stat Data Anal 54:531-545, 2010). In order to compute the test statistic, parameters of the Laplace distribution are estimated by the maximum likelihood method. Critical values and power values of the proposed test are obtained by Monte Carlo simulations. Also, power comparisons of the proposed test with some known competing tests are carried out. Finally, two illustrative examples are presented and analyzed.
Funding: The research was supported by the National Natural Science Foundation of China [grant number 11771144], the State Key Program of National Natural Science Foundation of China [grant number 71931004], [grant number 32030063], the development fund for Shanghai talents, and the 111 project (B14019).
Abstract: Due to cost effectiveness and high efficiency, two-phase case-control sampling has been widely used in epidemiology studies. We develop a semi-parametric empirical likelihood approach to two-phase case-control data under the logistic regression model. We show that the maximum empirical likelihood estimator has an asymptotically normal distribution, and the empirical likelihood ratio follows an asymptotically central chi-square distribution. We find that the maximum empirical likelihood estimator is equal to Breslow and Holubkov's (1997) maximum likelihood estimator. Even so, the limiting distribution of the likelihood ratio, the likelihood ratio based interval, and the test are all new. Furthermore, we construct new Kolmogorov-Smirnov type goodness-of-fit tests to test the validity of the underlying logistic regression model. Our simulation results and a real application show that the likelihood ratio based interval and test have certain merits over the Wald-type counterparts and that the proposed goodness-of-fit test is valid.
Funding: Supported by the Research Grants Council of Hong Kong (Grant No. HKBU2-030/07P); Wu Jianhong was also supported by a grant from Humanities and Social Sciences in Chinese University (Grant No. 07JJD790154), the Science Fund for Young Scholars of Zhejiang Gongshang University (Grant No. Q09-12), and the Zhejiang Provincial Natural Science Foundation of China (Grant No. Y6090172).
Abstract: The paper proposes and studies some diagnostic tools for checking the goodness-of-fit of general parametric vector autoregressive models in time series. The resulting tests are asymptotically chi-squared under the null hypothesis and can detect alternatives converging to the null at a parametric rate. The tests involve weight functions, which provide the flexibility to choose scores for enhancing power performance, especially under directional alternatives. When the alternatives are not directional, we construct asymptotically distribution-free maximin tests for a large class of alternatives. A possibility to construct score-based omnibus tests is discussed when the alternative is saturated. The power performance is also investigated. In addition, when the sample size is small, a nonparametric Monte Carlo test approach for dependent data is proposed to improve the performance of the tests. The algorithm is easy to implement. Simulation studies and real applications are carried out for illustration.
Funding: Supported by National Natural Science Foundation of China (10775077, 10225522)
Abstract: A test statistic is proposed to perform the goodness-of-fit test in the unbinned maximum likelihood fit. Without using a detailed expression of the efficiency function, the test statistic is found to be strongly correlated with the maximum likelihood function if the efficiency function varies smoothly. We point out that the correlation coefficient can be estimated by the Monte Carlo technique. With the established method, two examples are given to illustrate the performance of the test statistic.
Funding: Supported by National Natural Science Foundation of China (Grant No. 50675232), Key Project of Ministry of Education of China, and Chongqing Municipal Natural Science Key Foundation of China (Grant No. 2007BA6021)
Abstract: Multiple dominant gear meshing frequencies are present in the vibration signals collected from gearboxes, and the conventional spiky features that represent initial gear fault conditions are usually difficult to detect. In order to solve this problem, we propose a new gearbox deterioration detection technique based on autoregressive modeling and hypothesis testing. A stationary autoregressive model was built by using a normal vibration signal from each shaft. The established autoregressive model was then applied to process fault signals from each shaft of a two-stage gearbox. This paper investigates a combined technique that unites a time-varying autoregressive model and a two-sample Kolmogorov-Smirnov goodness-of-fit test to detect the deterioration of a gearing system with simultaneously variable shaft speed and variable load. The time-varying autoregressive model residuals representing both healthy and faulty gear conditions were compared with the original healthy time-synchronous average signals. Compared with the traditional kurtosis statistic, this technique for gearbox deterioration detection has shown significant advantages in highlighting the presence of incipient gear faults in all the different speed shafts involved in the meshing motion under variable conditions.
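The combination described here, an autoregressive model fitted to a healthy signal followed by a two-sample Kolmogorov-Smirnov comparison of residuals, can be illustrated with a toy sketch. It assumes a stationary AR(1) model instead of the paper's time-varying model, synthetic signals instead of gearbox vibration data, and a "fault" that simply inflates the innovation scale. All names are hypothetical.

```python
import random

def fit_ar1(signal):
    """Least-squares AR(1) coefficient for a zero-mean signal."""
    num = sum(signal[t] * signal[t - 1] for t in range(1, len(signal)))
    den = sum(signal[t - 1] ** 2 for t in range(1, len(signal)))
    return num / den

def ar1_residuals(signal, phi):
    """One-step prediction errors under the fitted healthy model."""
    return [signal[t] - phi * signal[t - 1] for t in range(1, len(signal))]

def ks_two_sample(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        v = min(a[i], b[j])
        while i < len(a) and a[i] <= v:
            i += 1
        while j < len(b) and b[j] <= v:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d

def simulate_ar1(n, phi, noise_sd, rng):
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, noise_sd)
        out.append(x)
    return out

rng = random.Random(0)
healthy_train = simulate_ar1(2000, 0.6, 1.0, rng)
healthy_new = simulate_ar1(2000, 0.6, 1.0, rng)
faulty = simulate_ar1(2000, 0.6, 3.0, rng)  # toy fault: larger innovations

phi_hat = fit_ar1(healthy_train)
reference = ar1_residuals(healthy_train, phi_hat)
d_healthy = ks_two_sample(reference, ar1_residuals(healthy_new, phi_hat))
d_faulty = ks_two_sample(reference, ar1_residuals(faulty, phi_hat))
```

In this setup `d_faulty` should be clearly larger than `d_healthy`, which is the signal a KS-based detector would act on. A real application would need a higher-order or time-varying model and a proper significance threshold.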
Abstract: This study explored and reviewed the logistic regression (LR) model, a multivariable method for modeling the relationship between multiple independent variables and a categorical dependent variable, with emphasis on medical research. Thirty-seven research articles published between 2000 and 2018 which employed logistic regression as the main statistical tool, as well as six textbooks on logistic regression, were reviewed. Logistic regression concepts such as odds, odds ratio, logit transformation, logistic curve, assumptions, selecting dependent and independent variables, model fitting, reporting and interpreting were presented. Upon perusing the literature, considerable deficiencies were found in both the use and reporting of LR. For many studies, the ratio of the number of outcome events to predictor variables (events per variable) was sufficiently small to call into question the accuracy of the regression model. Also, most studies did not report on validation analysis, regression diagnostics or goodness-of-fit measures, which authenticate the robustness of the LR model. Here, we demonstrate a good example of the application of the LR model using data obtained on a cohort of pregnant women and the factors that influence their decision to opt for caesarean delivery or vaginal birth. It is recommended that researchers be more rigorous and pay greater attention to guidelines concerning the use and reporting of LR models.
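The basic quantities listed in this abstract (odds, odds ratio, logit transformation, logistic curve, events per variable) are simple enough to write down directly. The sketch below uses hypothetical helper names, and the events-per-variable helper assumes the common convention of counting the less frequent outcome.

```python
import math

def odds(p):
    """Odds corresponding to probability p."""
    return p / (1.0 - p)

def logit(p):
    """Logit transformation: natural log of the odds."""
    return math.log(odds(p))

def inv_logit(z):
    """Logistic curve: maps a linear predictor back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def odds_ratio(exposed_events, exposed_nonevents, unexposed_events, unexposed_nonevents):
    """Odds ratio from a 2x2 table."""
    return (exposed_events * unexposed_nonevents) / (exposed_nonevents * unexposed_events)

def events_per_variable(n_events, n_nonevents, n_predictors):
    """EPV, assuming the smaller outcome group is the limiting count."""
    return min(n_events, n_nonevents) / n_predictors
```

For instance, with made-up counts of 40 events and 160 non-events and 8 predictors, `events_per_variable(40, 160, 8)` is 5.0, a ratio low enough to raise the accuracy concern the abstract describes.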
Funding: This paper was financially supported by NSC96-2628-E-366-004-MY2 and NSC96-2628-E-132-001-MY2
Abstract: Internal solitary wave propagation over a submarine ridge results in energy dissipation, in which the hydrodynamic interaction between a wave and the ridge affects the marine environment. This study analyzes the effects of ridge height and potential energy during wave-ridge interaction with binary and cumulative logistic regression models. In testing the global null hypothesis, all p-values are < 0.001 under the three statistical methods used (Likelihood Ratio, Score, and Wald). When the two kinds of models are compared, the test values obtained by the cumulative logistic regression models are better than those obtained by the binary logistic regression models. Although this study employed a cumulative logistic regression model, three probability functions, p^1, p^2 and p^3, are utilized for investigating the weighted influence of factors on wave reflection. Deviance and Pearson tests are applied to check the goodness-of-fit of the proposed model. The analytical results demonstrated that both ridge height (X1) and potential energy (X2) significantly impact (p < 0.0001) the amplitude-based reflected rate; the p-values for the deviance and Pearson tests are both > 0.05 (0.2839 and 0.3438, respectively). That is, the model with ridge height (X1) and potential energy (X2) fits adequately and can further predict parameters under the scenario of the best parsimonious model. Investigation of six measures of predictive power (R^2, Max-rescaled R^2, Somers' D, Gamma, Tau-a, and c) indicates that the proposed model has better predictive ability than ridge height alone, and is very similar to the model with the interaction of ridge height and potential energy. It can be concluded that the goodness-of-fit and prediction ability of the cumulative logistic regression model are better than those of the binary logistic regression model.
Abstract: Random behavior of concrete C45/55 XF2 used for prefabricated pre-stressed bridge beams is described on the basis of evaluating a vast set of measurements. A detailed statistical analysis is carried out on 133 test results of cylinders 150 × 300 mm in size. The tests were run in the laboratories of the Klokner Institute. A single worker took all specimens throughout the period, and the subsequent measurements of the static modulus of elasticity and the compressive strength of the concrete were performed. The measurements were made at the age of 28 days after casting of the specimens, and only one testing machine with the same capping method was used. Suitable theoretical distribution models are determined on the basis of goodness-of-fit tests, using the χ2 and Bernstein criteria. The set of concrete compressive strengths (obtained from 133 test results of cylinders 150 × 300 mm after the test of the static modulus of elasticity) shows relatively high skewness in this specific case. As a result, a bounded beta distribution fits better than the generally recommended theoretical distributions for strength, the normal or lognormal. The modulus of elasticity is not significantly affected by skewness because the design value is based on the mean value.
Abstract: We propose a subsampling method for robust estimation of regression models which is built on classical methods such as the least squares method. It makes use of the non-robust nature of the underlying classical method to find a good sample from regression data contaminated with outliers, and then applies the classical method to the good sample to produce robust estimates of the regression model parameters. The subsampling method is a computational method rooted in the bootstrap methodology which trades analytical treatment for intensive computation; it finds the good sample through repeated fitting of the regression model to many random subsamples of the contaminated data instead of through an analytical treatment of the outliers. The subsampling method can be applied to all regression models for which non-robust classical methods are available. In the present paper, we focus on the basic formulation and robustness property of the subsampling method that are valid for all regression models. We also discuss variations of the method and apply it to three examples involving three different regression models.
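The idea of finding a good sample by repeatedly fitting a non-robust estimator to random subsamples can be illustrated with a deliberately simplified sketch. It assumes simple linear regression and keeps the single subsample with the smallest least-squares mean squared error; the paper's actual rule for selecting and combining subsamples is not given in the abstract and may differ. The data and function names are invented for illustration.

```python
import random

def ols(points):
    """Least-squares intercept and slope for (x, y) pairs."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    slope = sum((x - mx) * (y - my) for x, y in points) / sxx
    return my - slope * mx, slope

def mse(points, intercept, slope):
    return sum((y - intercept - slope * x) ** 2 for x, y in points) / len(points)

def subsample_fit(points, size, draws=500, seed=0):
    """Fit OLS to many random subsamples and keep the best-fitting one.

    Least squares is not robust, so a subsample containing outliers tends to
    fit badly, while an outlier-free subsample fits well.
    """
    rng = random.Random(seed)
    best = None
    for _ in range(draws):
        sub = rng.sample(points, size)
        a, b = ols(sub)
        score = mse(sub, a, b)
        if best is None or score < best[0]:
            best = (score, a, b)
    return best[1], best[2]

# Synthetic data: y = 1 + 2x plus small noise, with five gross outliers.
data_rng = random.Random(42)
points = [(float(x), 1.0 + 2.0 * x + data_rng.gauss(0.0, 0.1)) for x in range(30)]
for k in range(25, 30):
    points[k] = (points[k][0], points[k][1] + 30.0)

naive_intercept, naive_slope = ols(points)
robust_intercept, robust_slope = subsample_fit(points, size=10)
```

On this toy data the full-sample slope is pulled well away from 2 by the outliers, while the slope from the selected subsample stays close to it. The subsample size and number of draws are arbitrary choices here.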
Funding: Supported by JSPS KAKENHI Grant Number 18K13459, and Grace S. Shieh was supported in part by MOST 106-2118-M-001-017 and MOST 107-2118-M-001-009-MY2.
Abstract: For structural comparisons of paired prokaryotic genomes, an important topic in synthetic and evolutionary biology, the locations of shared orthologous genes (henceforth orthologs) are observed as binned data. This and other data, e.g., wind directions recorded at monitoring sites and intensive care unit arrival times on the 24-hour clock, are counted in binned circular arcs, so modeling them by discrete circular distributions (DCDs) is required. We propose a novel method to construct a DCD from a base continuous circular distribution (CCD). The probability mass function is defined to take the normalized values of the probability density function at some pre-fixed equidistant points on the circle. Five families of constructed DCDs which have normalizing constants in closed form are presented. Simulation studies show that DCDs outperform the corresponding CCDs in modeling grouped (discrete) circular data, and minimum chi-square estimation outperforms maximum likelihood estimation for parameters. We apply the constructed DCDs, invariant wrapped Poisson and wrapped discrete skew Laplace to compare the structures of paired bacterial genomes. Specifically, the discrete four-parameter wrapped Cauchy (nonnegative trigonometric sums) distribution models multi-modal shared orthologs in Clostridium (Sulfolobus) better than the others considered, in terms of AIC and Freedman's goodness-of-fit test. The result that different DCDs fit the shared orthologs is consistent with the fact that they belong to two kingdoms. Nevertheless, these prokaryotes have a common favored site around 70° on the unit circle; this finding is important for building synthetic prokaryotic genomes in synthetic biology. These DCDs can also be applied to other binned circular data.
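The construction stated in this abstract, taking the normalized values of a continuous circular density at equidistant points, can be written in a few lines. The sketch below uses a von Mises kernel as the base density purely as an illustrative choice; it is not claimed to be one of the paper's five families, and the function names are hypothetical.

```python
import math

def von_mises_kernel(mu, kappa):
    """Unnormalized von Mises density; its constant cancels after normalization."""
    return lambda theta: math.exp(kappa * math.cos(theta - mu))

def discretize_circular(density, m):
    """Probability mass function on m equidistant points of the circle.

    Each mass is the density value at 2*pi*j/m divided by the sum of all
    m density values, so the masses add up to one.
    """
    points = [2.0 * math.pi * j / m for j in range(m)]
    values = [density(t) for t in points]
    total = sum(values)
    return points, [v / total for v in values]

# Example: 36 bins of 10 degrees each, base density centred at 90 degrees.
points, pmf = discretize_circular(von_mises_kernel(math.pi / 2, 2.0), 36)
```

Since only ratios of density values enter the mass function, the base density's own normalizing constant (a Bessel function in the von Mises case) never has to be computed.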
Abstract: A multivariate random sample of n observations also generates n univariate nearest neighbour distance (NND) observations, and from these NND observations a set of n auxiliary observations can be obtained. Combining these auxiliary observations with the original multivariate observations of the random sample allows a class of pseudodistances D_h to be used, and inference methods can be developed using this class of pseudodistances. The D_h estimators obtained from this class can achieve high efficiencies and have robustness properties. Model testing can also be handled in a unified way by means of goodness-of-fit test statistics derived from this class, which have an asymptotic normal distribution. These properties make the developed inference methods relatively simple to implement, and they appear to be suitable for analyzing multivariate data which are often encountered in applications.