Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 12231017, 72171216, 71921001 and 71991474) and the National Key R&D Program of China (Grant No. 2022YFA1003803); supported by the National Natural Science Foundation of China (Grant Nos. 71873128 and 72293573).
Abstract: In this paper, we propose a novel approach to the multiple-sample testing problem using a studentized test statistic based on the random lifter technique. The method reformulates the classical K-sample test as an independence test between two random variables, enabling more efficient handling of complex data types. Degenerate U-statistics have non-normal limiting distributions; the random lifter approach circumvents this problem and yields a test statistic that is asymptotically normal under the null hypothesis. Extensive simulations and real-world applications show that the method performs well on many data types, including Euclidean, directional, and symmetric positive definite data, and that it controls the Type I error well. The method is also computationally efficient, outperforming existing K-sample tests, particularly on large datasets. These results suggest that the proposed method is a powerful and practical solution for multiple-sample testing in complex data scenarios.
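The reformulation described in this abstract can be illustrated with a minimal sketch: pool the K samples into one data array X, record group membership in a label variable Y, and test whether X and Y are independent. The sketch below is not the random lifter statistic from the paper. It assumes a simple between-group mean statistic calibrated by permutation, and all function names are hypothetical.

```python
import numpy as np

def pooled_with_labels(samples):
    """Stack K samples into one array X and build the group-label vector Y."""
    X = np.concatenate(samples)
    Y = np.concatenate([np.full(len(s), k) for k, s in enumerate(samples)])
    return X, Y

def between_group_stat(X, Y):
    """Illustrative dependence measure (an assumption, not the paper's statistic):
    size-weighted squared distance of each group mean from the pooled mean."""
    grand = X.mean(axis=0)
    stat = 0.0
    for k in np.unique(Y):
        grp = X[Y == k]
        stat += len(grp) * np.sum((grp.mean(axis=0) - grand) ** 2)
    return float(stat)

def permutation_pvalue(samples, n_perm=999, seed=0):
    """Test independence of X and Y by permuting the labels Y."""
    rng = np.random.default_rng(seed)
    X, Y = pooled_with_labels(samples)
    observed = between_group_stat(X, Y)
    exceed = 0
    for _ in range(n_perm):
        if between_group_stat(X, rng.permutation(Y)) >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    return (exceed + 1) / (n_perm + 1)

rng = np.random.default_rng(1)
groups = [rng.normal(0.0, 1.0, size=(30, 2)),
          rng.normal(0.0, 1.0, size=(30, 2)),
          rng.normal(3.0, 1.0, size=(30, 2))]  # third group has a shifted mean
p_value = permutation_pvalue(groups)
```

Equality of the K distributions is equivalent to independence of X and Y, which is why permuting the labels gives a valid reference distribution here. A permutation scheme costs many statistic evaluations, whereas an asymptotically normal statistic such as the one the abstract describes would need only one.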
Funding: Supported by the National University of Singapore Academic Research Grant (Grant No. R-155-000-085-112).
Abstract: For several decades, much attention has been paid to the two-sample Behrens-Fisher (BF) problem, which tests the equality of the means or mean vectors of two normal populations with unequal variance/covariance structures. Little work, however, has been done on the k-sample BF problem for high-dimensional data, which tests the equality of the mean vectors of several high-dimensional normal populations with unequal covariance structures. In this paper we study this challenging problem by extending Scheffé's well-known transformation method, which reduces the k-sample BF problem to a one-sample problem. The induced one-sample problem can be easily tested by the classical Hotelling's T² test when the size of the resulting sample is very large relative to its dimensionality. For high-dimensional data, however, the dimensionality of the resulting sample is often very large, and even much larger than its sample size, which makes the classical Hotelling's T² test lose power or even become undefined. To overcome this difficulty, we propose and study an L2-norm based test. The asymptotic powers of the proposed L2-norm based test and Hotelling's T² test are derived and theoretically compared. Methods for implementing the L2-norm based test are described. Simulation studies are conducted to compare the L2-norm based test and Hotelling's T² test when the latter is well defined, and to compare the proposed implementation methods for the L2-norm based test otherwise. The methodologies are motivated and illustrated by a real data example.
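The reason Hotelling's T² breaks down in high dimensions can be seen in a small sketch: the sample covariance of n observations has rank at most n - 1, so it cannot be inverted once the dimension p reaches n. The statistic below, n times the squared Euclidean norm of the sample mean, is an assumed generic form of an L2-norm statistic and may differ from the exact statistic in the paper. The function names are hypothetical.

```python
import numpy as np

def l2_norm_statistic(Z):
    """n * ||mean(Z)||^2 for an (n, p) one-sample data matrix Z.
    Assumed generic form; no covariance inverse is required."""
    n = Z.shape[0]
    zbar = Z.mean(axis=0)
    return n * float(zbar @ zbar)

def hotelling_is_defined(Z):
    """Hotelling's T^2 needs an invertible sample covariance, i.e. rank p."""
    n, p = Z.shape
    S = np.atleast_2d(np.cov(Z, rowvar=False))
    return bool(np.linalg.matrix_rank(S) == p)

rng = np.random.default_rng(0)
Z_high = rng.normal(size=(20, 50))   # p = 50 exceeds n = 20
Z_low = rng.normal(size=(200, 5))    # n is much larger than p

high_ok = hotelling_is_defined(Z_high)   # covariance is singular here
low_ok = hotelling_is_defined(Z_low)     # covariance has full rank here
t_high = l2_norm_statistic(Z_high)       # still computable when p > n
```

The sketch only computes the statistic. Obtaining a p-value requires an approximation to its null distribution, which is what the implementation methods mentioned in the abstract address.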
Abstract: In this paper, we consider the general linear hypothesis testing (GLHT) problem in heteroscedastic one-way MANOVA. The well-known Wald-type test statistic is used. Its null distribution is approximated by a Hotelling T² distribution with one parameter estimated from the data, resulting in the so-called approximate Hotelling T² (AHT) test. The AHT test is shown to be invariant under affine transformations, under different choices of the contrast matrix specifying the same hypothesis, and under different labeling schemes of the mean vectors. The AHT test can be conducted simply, using the usual F-distribution. Simulation studies and real data applications show that the AHT test substantially outperforms the test of [1] and is comparable to the parametric bootstrap (PB) test of [2] for the multivariate k-sample Behrens-Fisher problem, which is a special case of the GLHT problem in heteroscedastic one-way MANOVA.
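A Wald-type statistic for this setting can be sketched as follows, assuming the standard form T = (Cμ̂ - c)ᵀ (C Σ̂ Cᵀ)⁻¹ (Cμ̂ - c), where μ̂ stacks the k sample mean vectors and Σ̂ is block diagonal with blocks Σ̂ᵢ / nᵢ. The sketch stops at the statistic: the estimated parameter of the approximating Hotelling T² distribution is not given in the abstract and is not reproduced. The function names are hypothetical.

```python
import numpy as np

def wald_statistic(samples, C, c=None):
    """Wald-type statistic for H0: C @ mu = c with unequal group covariances."""
    k = len(samples)
    p = samples[0].shape[1]
    mu_hat = np.concatenate([s.mean(axis=0) for s in samples])
    # Block-diagonal covariance of the stacked mean vector.
    Sigma = np.zeros((k * p, k * p))
    for i, s in enumerate(samples):
        block = np.atleast_2d(np.cov(s, rowvar=False)) / s.shape[0]
        Sigma[i * p:(i + 1) * p, i * p:(i + 1) * p] = block
    if c is None:
        c = np.zeros(C.shape[0])
    d = C @ mu_hat - c
    return float(d @ np.linalg.solve(C @ Sigma @ C.T, d))

def contrast_vs_last(k, p):
    """Contrast for mu_1 = ... = mu_k, comparing each group with the last."""
    base = np.hstack([np.eye(k - 1), -np.ones((k - 1, 1))])
    return np.kron(base, np.eye(p))

def contrast_vs_first(k, p):
    """Same hypothesis, comparing each group with the first instead."""
    base = np.hstack([-np.ones((k - 1, 1)), np.eye(k - 1)])
    return np.kron(base, np.eye(p))

rng = np.random.default_rng(0)
samples = [rng.normal(0.0, 1.0, size=(25, 3)),
           rng.normal(0.0, 2.0, size=(40, 3)),
           rng.normal(0.5, 0.5, size=(30, 3))]
t_last = wald_statistic(samples, contrast_vs_last(3, 3))
t_first = wald_statistic(samples, contrast_vs_first(3, 3))
```

The two contrast matrices have the same row space, so the two values agree up to rounding. This is the invariance to the choice of contrast matrix at the level of the statistic.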
Funding: National Natural Science Foundation of China (Grant Nos. 12271286, 11931001 and 11771241).
Abstract: Cui and Zhong (2019) (Computational Statistics & Data Analysis, 139, 117–133) proposed a test based on the mean variance (MV) index to test independence between a categorical random variable Y with R categories and a continuous random variable X. They ingeniously proved the asymptotic normality of the MV test statistic when R diverges to infinity, which brings many merits to the MV test, including making it more convenient for independence testing when R is large. This paper considers a new test called the integral Pearson chi-square (IPC) test, whose test statistic can be viewed as a modified MV test statistic. A central limit theorem for martingale differences is used to show that the asymptotic null distribution of the standardized IPC test statistic is also normal when R diverges, so the IPC test shares many of the merits of the MV test. As an application of this theoretical finding, the IPC test is extended to test independence between continuous random variables. The finite-sample performance of the proposed test is assessed by Monte Carlo simulations, and a real data example is presented for illustration.
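A sample version of the MV index can be sketched as below. It assumes the commonly used definition MV(X | Y) = Σᵣ pᵣ ∫ [Fᵣ(x) - F(x)]² dF(x), where Fᵣ is the conditional distribution function of X given Y = r and F is the marginal one. The abstract does not state the formula, so this form is an assumption, and the IPC modification is not reproduced. The function name `mv_index` is hypothetical.

```python
import numpy as np

def mv_index(x, y):
    """Plug-in estimate of the MV index between continuous x and categorical y."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y)
    # Pooled empirical CDF evaluated at every observation.
    F_all = (x[None, :] <= x[:, None]).mean(axis=1)
    mv = 0.0
    for label in np.unique(y):
        in_group = (y == label)
        p_r = in_group.mean()                      # estimated P(Y = r)
        x_r = x[in_group]
        # Within-category empirical CDF evaluated at every observation.
        F_r = (x_r[None, :] <= x[:, None]).mean(axis=1)
        mv += p_r * np.mean((F_r - F_all) ** 2)
    return float(mv)

separated = mv_index([0, 1, 2, 3], [0, 0, 1, 1])   # categories split x cleanly
one_group = mv_index([0, 1, 2, 3], [0, 0, 0, 0])   # no category information
```

The index is zero when every conditional distribution function coincides with the marginal one, and it grows as the categories separate the values of X.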