Funding: Supported by the National Social Science Foundation of China under Grant No. 21 BT1048 and the National Scientific Foundation of China under Grant Nos. 12371276 and 12131006.
Abstract: Model checking is crucial in statistical analysis and has received significant attention in the literature. However, challenges persist in settings that involve large-scale datasets and limited resources. This research introduces a novel subsampling methodology for testing regression models with continuous and categorical predictors, referred to as the Subsampling Adaptive Projection-Test (SAPT). The approach substantially improves test power under both local and global alternatives, outperforming conventional uniform subsampling. The authors rigorously establish the asymptotic properties of SAPT and characterize its maximum achievable asymptotic power. Comprehensive simulations and real-data applications support the theoretical results.
Funding: Supported by the National Natural Science Foundation of China under Grant No. 12301365; the National Natural Science Foundation of China under Grant No. 2241200071; and the Guangdong Basic and Applied Basic Research Foundation under Grant No. 2023A1515110001.
Abstract: In this note, the authors revisit envelope dimension reduction, which was first introduced for estimating a sufficient dimension reduction subspace without inverting the sample covariance. Motivated by recent developments in envelope methods and algorithms, the authors refresh envelope inverse regression as a flexible alternative to existing inverse regression methods in dimension reduction. The authors discuss the versatility of the envelope approach and demonstrate the advantages of envelope dimension reduction through simulation studies.
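For context on the "existing inverse regression methods" mentioned above, the following is a minimal sketch of classical sliced inverse regression (SIR), not the envelope estimator from the abstract. The function name `sir_directions` and its parameters are hypothetical, chosen for illustration. Note that this classical version explicitly uses the inverse square root of the sample covariance, which is the step the envelope approach is described as avoiding.

```python
import numpy as np

def sir_directions(X, y, n_slices=10, n_directions=1):
    """Classical sliced inverse regression (illustrative sketch only)."""
    n, p = X.shape
    mean = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)

    # Inverse square root of the sample covariance via eigendecomposition.
    # This inversion is what becomes unstable when cov is ill-conditioned.
    evals, evecs = np.linalg.eigh(cov)
    cov_inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Z = (X - mean) @ cov_inv_sqrt  # standardized predictors

    # Partition observations into slices of roughly equal size by sorted y.
    slices = np.array_split(np.argsort(y), n_slices)

    # Weighted covariance of the within-slice means of Z.
    M = np.zeros((p, p))
    for idx in slices:
        m = Z[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(m, m)

    # Leading eigenvectors of M, mapped back to the original X scale.
    _, v = np.linalg.eigh(M)          # eigenvalues in ascending order
    top = v[:, ::-1][:, :n_directions]
    B = cov_inv_sqrt @ top
    return B / np.linalg.norm(B, axis=0)
```

Each column of the returned matrix is a unit-length estimate of one direction in the dimension reduction subspace; the sign of each column is arbitrary.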
Abstract: The authors extend the marginal coordinate test for predictor contribution (Cook, 2004) to the case of multivariate responses. Instead of explicitly specifying the link functions between the responses and the predictors, an asymptotic test is proposed under a normality assumption on the predictors and an asymmetry assumption on the unknown regression mean function. When these assumptions are violated, the asymptotic test with elliptical trimming and clustering remains valid and shows desirable numerical performance.
Abstract: In this paper, the authors propose a nonlinear dimension reduction technique based on Fréchet inverse regression to achieve sufficient dimension reduction for responses in metric spaces and predictors in Riemannian manifolds. The authors rigorously establish statistical properties of the estimators, providing formal proofs of their consistency and asymptotic behavior. The effectiveness of the proposed method is demonstrated through extensive simulations and applications to real-world datasets, which highlight its practical utility for complex data with non-Euclidean structures.
Abstract: Classical linear discriminant analysis (LDA) (Fisher, 1936) implicitly assumes that the classification boundary depends on only one linear combination of the predictors. This restriction can lead to poor classification in applications where the decision boundary depends on multiple linear combinations of the predictors. To overcome this challenge, the authors first project the predictors onto an envelope central space and then perform LDA based on the sufficient predictor. The improvement in classification accuracy achieved by the proposed method is demonstrated on both synthetic data and real applications.
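To make the "one linear combination" restriction concrete, here is a minimal sketch of the classical two-class Fisher discriminant, not the envelope-based method from the abstract. The names `fisher_direction` and `predict` are hypothetical, labels are assumed to be coded 0/1, and the midpoint threshold assumes equal class priors.

```python
import numpy as np

def fisher_direction(X, labels):
    """Two-class Fisher discriminant: w proportional to S_w^{-1} (mu1 - mu0)."""
    X0, X1 = X[labels == 0], X[labels == 1]
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)

    # Pooled within-class scatter matrix.
    Sw = (X0 - mu0).T @ (X0 - mu0) + (X1 - mu1).T @ (X1 - mu1)

    # Solve S_w w = mu1 - mu0 rather than forming an explicit inverse.
    w = np.linalg.solve(Sw, mu1 - mu0)
    w = w / np.linalg.norm(w)

    # Midpoint between class means (assumes equal priors).
    midpoint = (mu0 + mu1) / 2
    return w, midpoint

def predict(X, w, midpoint):
    """Classify by the sign of the single projection (x - midpoint) @ w."""
    return ((X - midpoint) @ w > 0).astype(int)
```

Every prediction depends on the data only through the scalar projection onto `w`, so a boundary that requires two or more linear combinations cannot be represented by this rule.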
Abstract: Multi-dimensional arrays are referred to as tensors. Tensor-valued predictors are commonly encountered in modern biomedical applications, such as electroencephalogram (EEG), magnetic resonance imaging (MRI), functional MRI (fMRI), diffusion-weighted MRI, and longitudinal health data. In survival analysis, it is both important and challenging to integrate clinically relevant information, such as gender, age, and disease state, with medical imaging tensor data or longitudinal health data to predict disease outcomes. Most existing higher-order sufficient dimension reduction regressions for matrix- or array-valued data focus solely on tensor data, often neglecting established clinical covariates that are readily available and known to have predictive value. Based on the idea of Folded-Minimum Average Variance Estimation (Folded-MAVE; Xue and Yin, 2014), the authors propose a new method, Partial Dimension Folded-MAVE (PF-MAVE), to address regression mean functions with tensor-valued covariates while simultaneously incorporating clinical covariates, which are typically categorical variables. Theorems and simulation studies demonstrate the importance of incorporating these categorical clinical predictors. A survival analysis of data from a longitudinal study of primary biliary cirrhosis (PBC) is included to illustrate the proposed method.
Abstract: It is well understood that for conventional survey designs the set of unordered distinct units in a sample is a minimal sufficient statistic. This means that for inference from the sample, the values of the sampled units, rather than the sample design, are what matter. Sampling rare populations presents distinct challenges. Examples of rare populations arise in biology, with rare and endangered animals of which only a few individuals remain, and in social science, with the low incidence of people from an unusually high (or low) income group. When sampling rare populations, many of the sample units tend to contain no information on the characteristic of interest (e.g., the rare animal, or people from the unusual income group). For finite rare populations, the set of unordered distinct rare units in a sample is a minimal sufficient statistic. In a case study of a rare buttercup, the properties of the minimal sufficient estimator are explored. We compare the efficiency of the estimator of the population total based on the minimal sufficient statistic with that of the standard estimator for a range of sample sizes. The variance of the minimal sufficient estimator was always smaller than the variance of the sufficient estimator. For rare populations where non-rare units can be distinguished from rare units because they have the same fixed value, the minimal sufficient statistic is the rare units, if any, in the sample.
Funding: National Natural Science Foundation of China (No. 11071110)
Abstract: A new concept of (Φ, ρ, α)-V-invexity for differentiable vector-valued functions is introduced, which generalizes differentiable scalar-valued (Φ, ρ)-invexity. Based on (Φ, ρ, α)-V-invex functions, sufficient optimality conditions and Mond-Weir type duality theorems are derived for a class of nondifferentiable multiobjective fractional programming problems in which every component of the objective function and each constraint function contains a term involving the support function of a compact convex set.