The solution properties of the semiparametric model are analyzed, in particular the fact that penalized least squares for the semiparametric model becomes invalid when the matrix B^TPB is ill-posed or singular. Following the principle of ridge estimation for linear parametric models, generalized penalized least squares for the semiparametric model is put forward, and formulae and statistical properties of the estimates are derived. Finally, some helpful conclusions are drawn from simulation examples.
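The ridge principle that this abstract builds on can be illustrated with a minimal sketch. It covers only the plain linear parametric case with unit weights (P = I), not the generalized semiparametric estimator derived in the paper. The design matrix `B`, the observations `y`, the ridge parameter `alpha`, and both helper functions are hypothetical names chosen for illustration.

```python
# Toy illustration (assumed setup, not the paper's estimator): when B^T B is
# singular, adding alpha * I to the normal matrix makes the system solvable.

def solve_2x2(a, b):
    """Solve the 2x2 system a x = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    if abs(det) < 1e-12:
        raise ValueError("normal matrix is singular")
    x0 = (b[0] * a[1][1] - a[0][1] * b[1]) / det
    x1 = (a[0][0] * b[1] - b[0] * a[1][0]) / det
    return [x0, x1]


def ridge_normal_equations(B, y, alpha):
    """Solve (B^T B + alpha I) x = B^T y for a design with two columns."""
    normal = [[sum(row[i] * row[j] for row in B) + (alpha if i == j else 0.0)
               for j in range(2)] for i in range(2)]
    rhs = [sum(row[i] * yi for row, yi in zip(B, y)) for i in range(2)]
    return solve_2x2(normal, rhs)


# Two identical columns: B^T B = [[3, 3], [3, 3]] has determinant zero,
# so ordinary least squares (alpha = 0) has no unique solution.
B = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]
y = [2.0, 2.0, 2.0]

# A small ridge term restores a unique solution that splits the effect evenly.
x_ridge = ridge_normal_equations(B, y, alpha=0.1)
```

Calling `ridge_normal_equations(B, y, alpha=0.0)` on the same data raises `ValueError`, which is the failure mode the abstract describes for an ill-posed or singular normal matrix.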
When the total least squares (TLS) solution is used to estimate the parameters of the errors-in-variables (EIV) model, the resulting parameter estimates are unreliable if the observations contain systematic errors. To solve this problem, we propose adding a nonparametric part (the systematic errors) to the partial EIV model, and we build the extended partial EIV model to weaken the influence of systematic errors. Then, having rewritten the model as a nonlinear model, we derive the formula for the parameter estimates based on the penalized total least squares criterion. Furthermore, based on the second-order approximation method of precision estimation, we derive the second-order bias and covariance of the parameter estimates and calculate the mean square error (MSE). For the selection of the smoothing factor, we propose using the U-curve method. The experiments show that, compared with the traditional method, the proposed method can mitigate the influence of systematic errors to a certain extent and yields more reliable parameter estimates and precision information, which validates its feasibility and effectiveness.
In this article, we use penalized splines to estimate the hazard function from a set of censored failure time data. A new approach to estimating the amount of smoothing is provided. Under regularity conditions, we establish the consistency and asymptotic normality of the penalized likelihood estimators. Numerical studies and an example are used to evaluate the performance of the new procedure.
The penalized least squares (PLS) method with appropriate weights has proved to be a successful baseline estimation method for various spectral analyses. It can extract the baseline from the spectrum while retaining the signal peaks in the presence of random noise. The algorithm is implemented by iterating over the weights of the data points. In this study, we propose a new approach for assigning weights based on the Bayesian rule. The proposed method provides a self-consistent weighting formula and performs well, particularly for baselines with different curvature components. The method was applied to analyze Schottky spectra obtained in 86Kr projectile fragmentation measurements in the experimental Cooler Storage Ring (CSRe) at Lanzhou. It provides an accurate and reliable storage lifetime with a smaller error bar than existing PLS methods. It is also a universal baseline-subtraction algorithm that can be used for spectrum-related experiments, such as precision nuclear mass and lifetime measurements in storage rings.
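The weight iteration mentioned in this abstract can be sketched with the simplest existing scheme, asymmetric least squares, in which points above the current baseline get a small weight `p` and points below get `1 - p`. This is explicitly not the Bayesian weighting rule proposed in the paper. The function names, the second-difference penalty, and the defaults `lam`, `p`, and `n_iter` are assumptions made for illustration.

```python
# Minimal sketch of iteratively reweighted penalized least squares for
# baseline estimation (asymmetric weights, NOT the paper's Bayesian rule).

def solve_dense(a, b):
    """Gaussian elimination without pivoting; adequate here because the
    system matrix W + lam * D^T D is symmetric positive definite."""
    n = len(b)
    a = [row[:] for row in a]
    b = b[:]
    for k in range(n):
        for i in range(k + 1, n):
            f = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= f * a[k][j]
            b[i] -= f * b[k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / a[i][i]
    return x


def asls_baseline(y, lam=100.0, p=0.01, n_iter=10):
    """Minimize sum_i w_i (y_i - z_i)^2 + lam * ||D z||^2, updating w each pass."""
    n = len(y)
    # Build D^T D for the second-difference operator D (rows 1, -2, 1).
    dtd = [[0.0] * n for _ in range(n)]
    for r in range(n - 2):
        idx, coef = (r, r + 1, r + 2), (1.0, -2.0, 1.0)
        for i, ci in zip(idx, coef):
            for j, cj in zip(idx, coef):
                dtd[i][j] += ci * cj
    w = [1.0] * n
    z = y[:]
    for _ in range(n_iter):
        a = [[lam * dtd[i][j] + (w[i] if i == j else 0.0) for j in range(n)]
             for i in range(n)]
        z = solve_dense(a, [w[i] * y[i] for i in range(n)])
        # Points above the baseline are treated as peaks and down-weighted.
        w = [p if y[i] > z[i] else 1.0 - p for i in range(n)]
    return z


y_line = [0.5 * i + 1.0 for i in range(12)]   # pure linear baseline, no peak
y_peak = y_line[:]
y_peak[6] += 5.0                              # one synthetic peak
z_line = asls_baseline(y_line)
z_peak = asls_baseline(y_peak)
```

A linear input has zero second differences, so the fitted baseline reproduces it, while the synthetic peak in `y_peak` stays above its fitted baseline. The dense solver keeps the sketch dependency-free; a practical implementation would use a sparse banded solver.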
This paper considers penalized least squares estimators with convex penalties or regularization norms. We provide sparsity oracle inequalities for the prediction error for a general convex penalty and for the particular cases of the Lasso and Group Lasso estimators in a regression setting. The main contribution is that our oracle inequalities are established for the more general case where the observation noise is drawn from probability measures that satisfy a weak spectral gap (or Poincaré) inequality rather than from Gaussian distributions. We illustrate our results on a heavy-tailed example and a sub-Gaussian one; in particular, we give explicit bounds for the oracle inequalities in these two special cases.
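For readers unfamiliar with the Lasso case mentioned here, a minimal sketch of its best-known closed form may help. It assumes an orthonormal design (X^T X = I) and the objective 0.5 * ||y - X b||^2 + lam * ||b||_1, in which case the Lasso solution is the soft-thresholded least squares coefficient vector. The function names and input values are hypothetical, and the sketch does not touch the oracle inequalities themselves.

```python
# Soft-thresholding: the proximal operator of the L1 penalty (illustrative sketch).

def soft_threshold(z, lam):
    """Shrink z toward zero by lam, returning exactly 0 when |z| <= lam."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_orthonormal(ols_coefs, lam):
    """Lasso solution under an orthonormal design: soft-threshold each
    least squares coefficient (assumed to be X^T y) independently."""
    return [soft_threshold(z, lam) for z in ols_coefs]


# Hypothetical least squares coefficients; the small one is set exactly to zero.
coefs = lasso_orthonormal([3.0, -0.5, 1.2], lam=1.0)
```

The exact zero for the middle coefficient is what makes the Lasso a variable selection method as well as a shrinkage method.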
Penalized splines have been a popular method for estimating an unknown function in nonparametric regression because their low-rank spline bases make computations tractable. However, their performance is poor when estimating functions that vary rapidly in some regions and are smooth in others. This is attributed to the use of a global smoothing parameter, which provides a constant amount of smoothing across the function. To make the spline spatially adaptive, we introduce hierarchical penalized splines, which are obtained by modelling the global smoothing parameter as another spline.
The Thin Plate Regression Spline (TPRS) was introduced to smooth the differences between satellite and in-situ observations during the two-dimensional (2D) blending process used to calibrate ocean chlorophyll. The result was a remarkable improvement in the predictive capability of the penalized model that makes use of the satellite observations. The blending process has also been extended to three dimensions (3D), since most physical systems are believed to exist in three dimensions. This article presents an attempt to obtain more reliable and accurate predictions of ocean chlorophyll by extending the penalization process to 3D blending. Penalty matrices were computed using integrated least squares (ILS) and the integrated squared derivative (ISD). Results obtained with ILS were not encouraging, whereas those obtained with ISD showed a reasonable improvement in predicting ocean chlorophyll, especially where the validation datum was surrounded by available data from the satellite data set. However, the process appeared computationally expensive, and the results matched those of the other methods on a general scale. In both cases, the procedure for implementing the penalization process in 3D blending with penalty matrices calculated by the two techniques is well established and can be used in any similar three-dimensional problem when necessary.
Improving the ability to assess potential stroke deficit may aid the selection of patients most likely to benefit from acute stroke therapies. Methods based only on "at risk" volumes or initial neurological condition do predict eventual outcome, but not perfectly. Given the close relationship between anatomy and function in the brain, we propose using a modified version of partial least squares (PLS) regression to examine how well stroke outcome covaries with infarct location. The modified version of PLS incorporates penalized regression and can handle either binary or ordinal data. This version is known as partial least squares with penalized logistic regression (PLS-PLR) and was originally developed for high-dimensional microarray data. We have adapted the algorithm for imaging data and demonstrate its use in a set of patients with aphasia (a high-level language disorder) following stroke.
With the rapid development of information technology, contemporary data from a variety of fields have become extremely large. In many datasets the number of features is well above the sample size; such data are called high-dimensional. In statistics, variable selection approaches are required to extract useful information from high-dimensional data. The most popular approach is to add a penalty function coupled with a tuning parameter to the log-likelihood function, which is called the penalized likelihood method. However, almost all penalized likelihood approaches consider only noise accumulation and spurious correlation, while ignoring endogeneity, which also appears frequently in high-dimensional space. In this paper, we explore the cause of endogeneity and its influence on penalized likelihood approaches. Simulations based on five classical penalized approaches are provided to demonstrate their inconsistency under endogeneity. The results show that, when endogenous variables exist, the positive selection rate of all five approaches increases gradually but the false selection rate does not consistently decrease; that is, the approaches do not satisfy selection consistency.
Penalized splines have been widely applied in many research areas, including but not limited to disease modeling and epidemiology. However, because of spatial heterogeneity in the data, where different regions call for different amounts of smoothing, a penalized spline with a single smoothing parameter is not always appropriate for fitting the data. This study assessed the properties of the penalized spline hierarchical model; the hierarchical penalty improves the fit as well as the accuracy of inference. The simulation demonstrates the potential benefits of using the hierarchical penalty, which is obtained by modelling the global smoothing parameter as another spline. The results showed that the mixed model with the hierarchical penalty had a better fit than the mixed model without hierarchy, as demonstrated by the rapid convergence of the model's posterior parameters and the smallest DIC value. Therefore, the hierarchical model with fifteen sub-knots provides a better fit to the data.
We give an existence result for the obstacle parabolic equations $\frac{\partial b(x,u)}{\partial t} - \operatorname{div}(a(x,t,u,\nabla u)) + \operatorname{div}(\phi(x,t,u)) = f$ in $Q_T$, where $b(x,u)$ is a bounded function of $u$, the term $\operatorname{div}(a(x,t,u,\nabla u))$ is a Leray-Lions type operator, and the function $\phi$ is a nonlinear lower-order term that satisfies only a growth condition. The right-hand side $f$ belongs to $L^1(Q_T)$. The proof of the existence of a solution is based on penalization methods.
Clustering serves as a pivotal instrument in gene expression data analysis. This paper proposes a Biclustering Coefficient Estimation (BCE) method to identify groups among both individuals and genes. An alternating direction method of multipliers (ADMM) algorithm with a double fusion penalty is developed to solve the problem. The authors rigorously establish the oracle properties of the proposed penalized estimator. Numerical studies, including simulations and an analysis of a lung adenocarcinoma dataset, suggest that the proposed method can simultaneously recover reasonable potential groups of samples and covariates and provide satisfactory estimates of group coefficients.
Sonic Hedgehog Medulloblastoma (SHH-MB) is one of the four primary molecular subgroups of medulloblastoma (MB). It is estimated to be responsible for nearly one-third of all MB cases. Using transcriptomic and DNA methylation profiling techniques, recent work in this field has identified four molecular subtypes of SHH-MB. SHH-MB subtypes show distinct DNA methylation patterns that allow them to be discriminated from overlapping subtypes and that predict clinical outcomes. Class overlapping occurs when two or more classes share common features, making it difficult to distinguish them as separate. Using the DNA methylation dataset, a novel classification technique is presented to address the issue of overlapping SHH-MB subtypes. Penalized multinomial regression (PMR), Tomek links (TL), and singular value decomposition (SVD) are integrated into a single framework. SVD and group lasso improve computational efficiency, address the problem of high-dimensional datasets, and clarify class distinctions by removing redundant or irrelevant features that might lead to class overlap. To address decision boundary overlap and class imbalance in the classification task, TL enhances dataset balance and increases the clarity of decision boundaries by eliminating overlapping samples. Using fivefold cross-validation, the proposed method (TL-SVDPMR) achieved an overall accuracy of almost 95% in the classification of SHH-MB molecular subtypes. The results demonstrate the strong performance of the proposed classification model across the SHH-MB subtypes, with a high average area under the curve (AUC). Additionally, a statistical significance test indicates that TL-SVDPMR is more accurate than both SVM and random forest algorithms in classifying the overlapping SHH-MB subtypes, highlighting its importance for precision medicine applications. Our findings emphasize the success of combining SVD, TL, and PMR techniques to improve classification performance for biomedical applications with many features and overlapping subtypes.
The minimax concave penalty (MCP) has been demonstrated theoretically and practically to be effective in nonconvex penalization for variable selection and parameter estimation. In this paper, we develop an efficient alternating direction method of multipliers (ADMM) with continuation algorithm for solving the MCP-penalized least squares problem in high dimensions. Under some mild conditions, we study the convergence properties and the Karush-Kuhn-Tucker (KKT) optimality conditions of the proposed method. A high-dimensional BIC is developed to select the optimal tuning parameters. Simulations and a real data example are presented to illustrate the efficiency and accuracy of the proposed method.
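As background to the MCP itself, the following minimal sketch shows the penalty and its scalar thresholding rule, that is, the minimizer of 0.5 * (z - b)^2 + MCP(b) for `gamma > 1`. It illustrates a building block common to many MCP solvers and is not the ADMM with continuation algorithm developed in the paper. The function names and the parameter values `lam` and `gamma` are assumptions for illustration.

```python
# MCP penalty and its univariate thresholding operator (illustrative sketch).

def mcp_penalty(b, lam, gamma):
    """Minimax concave penalty: behaves like the L1 penalty near zero and is
    constant (gamma * lam^2 / 2) once |b| exceeds gamma * lam."""
    a = abs(b)
    if a <= gamma * lam:
        return lam * a - a * a / (2.0 * gamma)
    return 0.5 * gamma * lam * lam


def mcp_threshold(z, lam, gamma):
    """Minimizer over b of 0.5 * (z - b)^2 + mcp_penalty(b, lam, gamma),
    valid for gamma > 1."""
    if abs(z) > gamma * lam:
        return z                      # large inputs are left unshrunk
    if abs(z) <= lam:
        return 0.0                    # small inputs are set exactly to zero
    shrunk = abs(z) - lam             # soft-threshold, then rescale
    sign = 1.0 if z > 0 else -1.0
    return sign * shrunk / (1.0 - 1.0 / gamma)


small = mcp_threshold(0.5, lam=1.0, gamma=3.0)   # inside the dead zone
middle = mcp_threshold(2.0, lam=1.0, gamma=3.0)  # shrunk, but less than Lasso
large = mcp_threshold(5.0, lam=1.0, gamma=3.0)   # returned unchanged
```

Unlike soft-thresholding, this rule leaves large coefficients unshrunk, which is the usual motivation for nonconvex penalties such as MCP.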
This paper considers variable selection for moment restriction models. We propose a penalized empirical likelihood (PEL) approach that has desirable asymptotic properties comparable to those of the penalized likelihood approach, which relies on a correct parametric likelihood specification. In addition to being consistent and having the oracle property, PEL admits inference on the parameter without having to estimate the covariance of its estimator. An approximate algorithm, along with a consistent BIC-type criterion for selecting the tuning parameters, is provided for PEL. The proposed algorithm enjoys considerable computational efficiency and overcomes the drawback of the local quadratic approximation of nonconcave penalties. Simulation studies comparing the performance of our method with that of existing ones show that PEL is competitive and robust. The proposed method is illustrated with two real examples.
The purpose of this paper is twofold. First, we investigate estimation for varying coefficient partially linear models in which covariates in the nonparametric part are measured with errors. As there may be some spurious covariates in the linear part, a penalized profile least squares estimation is suggested with the help of the smoothly clipped absolute deviation penalty. However, the estimator is often biased because of the measurement errors, so a bias correction is proposed under which estimation consistency with the oracle property is proved. Second, based on the estimator, a test statistic is constructed to check a linear hypothesis on the parameters, and its asymptotic properties are studied. We prove that the presence of measurement errors makes the limiting null distribution intractable, so that a Monte Carlo approximation is required, whereas the absence of the errors leads to a chi-square limit. Furthermore, confidence regions for the parameter of interest can also be constructed. Simulation studies and a real data example are conducted to examine the performance of our estimators and test statistic.
The seamless-L_0 (SELO) penalty is a smooth function that very closely resembles the L_0 penalty, and it has been demonstrated theoretically and practically to be effective in nonconvex penalization for variable selection. In this paper, the authors first generalize the SELO penalty to a class of penalties retaining the good features of SELO, and then develop variable selection and parameter estimation in Cox models using the proposed generalized SELO (GSELO) penalized log partial likelihood (PPL) approach. The authors show that the GSELO-PPL procedure possesses the oracle property with a diverging number of predictors under certain mild, interpretable regularity conditions. The entire path of GSELO-PPL estimates can be efficiently computed through a smoothing quasi-Newton (SQN) with continuation algorithm. The authors propose a consistent modified BIC (MBIC) tuning parameter selector for GSELO-PPL, and show that under some regularity conditions the GSELO-PPL-MBIC procedure consistently identifies the true model. Simulation studies and real data analysis are conducted to evaluate the finite sample performance of the proposed method.
This paper proposes a double penalized quantile regression for the linear mixed effects model, which can select fixed and random effects simultaneously. Instead of using two tuning parameters, the proposed iterative algorithm requires only one optimal tuning parameter in each step and is more efficient. The authors establish asymptotic normality for the proposed estimators of the quantile regression coefficients. Simulation studies show that the new method is robust to a variety of error distributions at different quantiles. It outperforms traditional regression models under a wide array of simulated data models and is flexible enough to accommodate changes in fixed and random effects. In high-dimensional data scenarios, the new method can still correctly select important variables and exclude noise variables with high probability. A case study based on hierarchical education data illustrates the practical utility of the proposed approach.
Based on the double penalized estimation method, a new variable selection procedure is proposed for partially linear models with longitudinal data. The proposed procedure avoids the effects of the nonparametric estimator on variable selection for the parametric components. Under some regularity conditions, the rate of convergence and the asymptotic normality of the resulting estimators are established. In addition, to improve the efficiency of the regression coefficient estimates, estimation of the working covariance matrix is included in the proposed iterative algorithm. Simulation studies are carried out to demonstrate that the proposed method performs well.
In the statistics and machine learning communities, the last fifteen years have witnessed a surge of high-dimensional models backed by penalized methods and other state-of-the-art variable selection techniques. The high-dimensional models we refer to differ from conventional models in that the total number of parameters p and the number of significant parameters s are both allowed to grow with the sample size T. When field-specific knowledge is preliminary, and in view of the recent and potential abundance of data from genetics, finance, on-line social networks, and so on, such (s, T, p)-triply diverging models offer great modeling flexibility and can be used as a data-guided first step of investigation. However, model selection consistency and other theoretical properties have been addressed only for independent data, leaving time series largely uncovered. This paper applies a penalized least squares (PLS) approach to a simple linear regression model endowed with a weakly dependent sequence. Under regularity conditions, we show sign consistency, derive a finite-sample bound that holds with high probability for the estimation error, and prove that the PLS estimate is consistent in the L_2 norm with rate (s log s/T)^{1/2}.
Funding (generalized penalized least squares for the semiparametric model): funded by the National Natural Science Foundation of China (No. 40274005).
Funding (penalized total least squares for the partial EIV model): supported by the National Natural Science Foundation of China (Nos. 41874001 and 41664001), the Support Program for Outstanding Youth Talents in Jiangxi Province (No. 20162BCB23050), and the National Key Research and Development Program (No. 2016YFB0501405).
Funding (penalized spline estimation of the hazard function): supported by the Natural Science Foundation of China (10771017, 10971015, 10231030) and the Key Project of the Ministry of Education of the People's Republic of China (309007).
Funding (Bayesian-weighted penalized least squares baseline estimation): supported by the National Key R&D Program of China (No. 2018YFA0404401), the CAS Project for Young Scientists in Basic Research (No. YSBR-002), and the Strategic Priority Research Program of the Chinese Academy of Sciences (No. XDB34000000).
Funding (sparsity oracle inequalities for penalized least squares with convex penalties): this work has been (partially) supported by the Project EFI ANR-17-CE40-0030 of the French National Research Agency.
文摘Penalized spline has been a popular method for estimating an unknown function in the non-parametric regression due to their use of low-rank spline bases, which make computations tractable. However its performance is poor when estimating functions that are rapidly varying in some regions and are smooth in other regions. This is contributed by the use of a global smoothing parameter that provides a constant amount of smoothing across the function. In order to make this spline spatially adaptive we have introduced hierarchical penalized splines which are obtained by modelling the global smoothing parameter as another spline.
文摘The Thin Plate Regression Spline (TPRS) was introduced as a means of smoothing off the differences between the satellite and in-situ observations during the two dimensional (2D) blending process in an attempt to calibrate ocean chlorophyll. The result was a remarkable improvement on the predictive capabilities of the penalized model making use of the satellite observation. In addition, the blending process has been extended to three dimensions (3D) since it is believed that most physical systems exist in the three dimensions (3D). In this article, an attempt to obtain more reliable and accurate predictions of ocean chlorophyll by extending the penalization process to three dimensional (3D) blending is presented. Penalty matrices were computed using the integrated least squares (ILS) and integrated squared derivative (ISD). Results obtained using the integrated least squares were not encouraging, but those obtained using the integrated squared derivative showed a reasonable improvement in predicting ocean chlorophyll especially where the validation datum was surrounded by available data from the satellite data set, however, the process appeared computationally expensive and the results matched the other methods on a general scale. In both case, the procedure for implementing the penalization process in three dimensional blending when penalty matrices were calculated using the two techniques has been well established and can be used in any similar three dimensional problem when it becomes necessary.
文摘Improving the ability to assess potential stroke deficit may aid the selection of patients most likely to benefit from acute stroke therapies. Methods based only on ‘at risk’ volumes or initial neurological condition do predict eventual outcome but not perfectly. Given the close relationship between anatomy and function in the brain, we propose the use of a modified version of partial least squares (PLS) regression to examine how well stroke outcome covary with infarct location. The modified version of PLS incorporates penalized regression and can handle either binary or ordinal data. This version is known as partial least squares with penalized logistic regression (PLS-PLR) and has been adapted from its original use for high-dimensional microarray data. We have adapted this algorithm for use in imaging data and demonstrate the use of this algorithm in a set of patients with aphasia (high level language disorder) following stroke.
文摘<div style="text-align:justify;"> With the high speed development of information technology, contemporary data from a variety of fields becomes extremely large. The number of features in many datasets is well above the sample size and is called high dimensional data. In statistics, variable selection approaches are required to extract the efficacious information from high dimensional data. The most popular approach is to add a penalty function coupled with a tuning parameter to the log likelihood function, which is called penalized likelihood method. However, almost all of penalized likelihood approaches only consider noise accumulation and supurious correlation whereas ignoring the endogeneity which also appeared frequently in high dimensional space. In this paper, we explore the cause of endogeneity and its influence on penalized likelihood approaches. Simulations based on five classical pe-nalized approaches are provided to vindicate their inconsistency under endogeneity. The results show that the positive selection rate of all five approaches increased gradually but the false selection rate does not consistently decrease when endogenous variables exist, that is, they do not satisfy the selection consistency. </div>
Abstract: Penalized splines have been widely applied in research, including but not limited to disease modeling and epidemiology. However, because of spatial heterogeneity in the data, different regions call for different amounts of smoothing, so the penalized spline has not always been appropriate for fitting the data. This study assessed the properties of a penalized spline hierarchical model; the hierarchical penalty improves the fit as well as the accuracy of inference. The simulation demonstrates the potential benefits of using the hierarchical penalty, which is obtained by modelling the global smoothing parameter as another spline. The results showed that the mixed model with the hierarchical penalty had a better fit than the mixed model without hierarchy, as demonstrated by the rapid convergence of the model's posterior parameters and the smallest DIC value. Therefore, the hierarchical model with fifteen sub-knots provides a better fit to the data.
Abstract: We give an existence result for the obstacle parabolic equation $\frac{\partial b(x,u)}{\partial t} - \operatorname{div}(a(x,t,u,\nabla u)) + \operatorname{div}(\phi(x,t,u)) = f$ in $Q_T$, where $b(x,u)$ is a bounded function of $u$, the term $-\operatorname{div}(a(x,t,u,\nabla u))$ is a Leray type operator, and the function $\phi$ is a nonlinear lower-order term that satisfies only a growth condition. The right-hand side $f$ belongs to $L^1(Q_T)$. The proof of existence of a solution is based on the penalization method.
Funding: Supported by the National Science Foundation of China under Grant Nos. 12171450 and 71921001.
Abstract: Clustering serves as a pivotal instrument in the realm of gene expression data analysis. This paper proposes a Biclustering Coefficient Estimation (BCE) method to identify groups in the individuals and genes. An alternating direction method of multipliers (ADMM) algorithm with a double fusion penalty is developed to solve the problem. The authors rigorously establish the oracle properties for the proposed penalized estimator. Numerical studies, including simulations and analysis of a lung adenocarcinoma dataset, suggest that the proposed method is expected to simultaneously recover reasonable potential groups of samples and covariates and provide satisfactory estimates of group coefficients.
Funding: Funded by the Deanship of Graduate Studies and Scientific Research at Jouf University under Grant No. DGSSR-2024-02-01137.
Abstract: Sonic Hedgehog Medulloblastoma (SHH-MB) is one of the four primary molecular subgroups of Medulloblastoma (MB). It is estimated to be responsible for nearly one-third of all MB cases. Using transcriptomic and DNA methylation profiling techniques, new developments in this field determined four molecular subtypes for SHH-MB. SHH-MB subtypes show distinct DNA methylation patterns that allow their discrimination from overlapping subtypes and predict clinical outcomes. Class overlapping occurs when two or more classes share common features, making it difficult to distinguish them as separate. Using the DNA methylation dataset, a novel classification technique is presented to address the issue of overlapping SHH-MB subtypes. Penalized multinomial regression (PMR), Tomek links (TL), and singular value decomposition (SVD) are integrated into a single framework. SVD and group lasso improve computational efficiency, address the problem of high-dimensional datasets, and clarify class distinctions by removing redundant or irrelevant features that might lead to class overlap. To address decision boundary overlap and class imbalance in the classification task, TL enhances dataset balance and increases the clarity of decision boundaries by eliminating overlapping samples. Using fivefold cross-validation, our proposed method (TL-SVDPMR) achieved an overall accuracy of almost 95% in the classification of SHH-MB molecular subtypes. The results demonstrate the strong performance of the proposed classification model among the various SHH-MB subtypes, given a high average of the area under the curve (AUC) values. Additionally, the statistical significance test indicates that TL-SVDPMR is more accurate than both SVM and random forest algorithms in classifying the overlapping SHH-MB subtypes, highlighting its importance for precision medicine applications. Our findings emphasize the success of combining SVD, TL, and PMR techniques to improve the classification performance for biomedical applications with many features and overlapping subtypes.
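The Tomek-link step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the common definition (two samples of different classes that are each other's nearest neighbour) and squared Euclidean distance; the function name `tomek_links` and the toy data are hypothetical and are not taken from the paper, whose exact removal rule is not stated here.

```python
def tomek_links(points, labels):
    """Return index pairs (i, j), i < j, that form Tomek links.

    Assumed definition: i and j are mutual nearest neighbours
    (squared Euclidean distance) and carry different class labels.
    Requires at least two points.
    """
    def sqdist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    n = len(points)
    # nearest[i] is the index of the closest other point to point i.
    nearest = [
        min((j for j in range(n) if j != i),
            key=lambda j, i=i: sqdist(points[i], points[j]))
        for i in range(n)
    ]
    return [
        (i, j)
        for i, j in enumerate(nearest)
        if i < j and nearest[j] == i and labels[i] != labels[j]
    ]


# Toy one-dimensional data: samples 1 and 2 sit close together
# but belong to different classes, so they form a Tomek link.
toy_points = [(0.0,), (1.0,), (1.2,), (3.0,)]
toy_labels = [0, 0, 1, 1]
links = tomek_links(toy_points, toy_labels)
```

A cleaning step would then drop one or both members of each returned pair before fitting the classifier; which member is dropped is a design choice that this sketch leaves open.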
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 11571263, 11501579, 11701571 and 41572315) and the Fundamental Research Funds for the Central Universities, China University of Geosciences (Wuhan) (Grant No. CUGW150809).
Abstract: The minimax concave penalty (MCP) has been demonstrated theoretically and practically to be effective in nonconvex penalization for variable selection and parameter estimation. In this paper, we develop an efficient alternating direction method of multipliers (ADMM) with continuation algorithm for solving the MCP-penalized least squares problem in high dimensions. Under some mild conditions, we study the convergence properties and the Karush-Kuhn-Tucker (KKT) optimality conditions of the proposed method. A high-dimensional BIC is developed to select the optimal tuning parameters. Simulations and a real data example are presented to illustrate the efficiency and accuracy of the proposed method.
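For readers unfamiliar with the MCP, the sketch below shows the penalty in its usual parameterization and the closed-form minimizer of the scalar problem 0.5*(z - b)^2 + MCP(b). This is only the standard scalar building block, not the ADMM-with-continuation algorithm of the paper; the names `mcp_penalty`, `mcp_threshold`, `lam` and `gamma` are illustrative, and `gamma > 1` is assumed.

```python
import math


def mcp_penalty(t, lam, gamma):
    """MCP value at t, assuming lam > 0 and gamma > 1."""
    a = abs(t)
    if a <= gamma * lam:
        # Concave quadratic part near the origin.
        return lam * a - a * a / (2.0 * gamma)
    # Constant beyond gamma * lam: large coefficients are not shrunk further.
    return 0.5 * gamma * lam * lam


def mcp_threshold(z, lam, gamma):
    """Minimizer over b of 0.5 * (z - b)**2 + mcp_penalty(b, lam, gamma)."""
    a = abs(z)
    if a <= lam:
        return 0.0
    if a <= gamma * lam:
        # Rescaled soft-thresholding on the middle range.
        return math.copysign((a - lam) / (1.0 - 1.0 / gamma), z)
    # Large inputs pass through unchanged.
    return z
```

Unlike soft-thresholding, the rule leaves inputs larger than `gamma * lam` in absolute value untouched, which is the source of the reduced bias usually attributed to the MCP.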
Funding: Supported partly by the National Natural Science Foundation of China (Grant No. 11071045) and the Shanghai Leading Academic Discipline Project (Grant No. B210).
Abstract: This paper considers variable selection for moment restriction models. We propose a penalized empirical likelihood (PEL) approach that has desirable asymptotic properties comparable to the penalized likelihood approach, which relies on a correct parametric likelihood specification. In addition to being consistent and having the oracle property, PEL admits inference on the parameters without having to estimate the covariance of their estimators. An approximate algorithm, along with a consistent BIC-type criterion for selecting the tuning parameters, is provided for PEL. The proposed algorithm enjoys considerable computational efficiency and overcomes the drawback of the local quadratic approximation of nonconcave penalties. Simulation studies that evaluate and compare the performance of our method with that of existing ones show that PEL is competitive and robust. The proposed method is illustrated with two real examples.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 11401006, 11671299 and 11671042), a grant from the University Grants Council of Hong Kong, the China Postdoctoral Science Foundation (Grant No. 2017M611083), and the National Statistical Science Research Program of China (Grant No. 2015LY55).
Abstract: The purpose of this paper is twofold. First, we investigate estimation for varying coefficient partially linear models in which covariates in the nonparametric part are measured with errors. As there may be some spurious covariates in the linear part, a penalized profile least squares estimation is suggested with the assistance of the smoothly clipped absolute deviation penalty. However, the estimator is often biased due to the existence of measurement errors, so a bias correction is proposed under which estimation consistency with the oracle property is proved. Second, based on the estimator, a test statistic is constructed to check a linear hypothesis on the parameters, and its asymptotic properties are studied. We prove that the existence of measurement errors makes the limiting null distribution intractable, so that a Monte Carlo approximation is required, whereas the absence of the errors leads to a chi-square limit. Furthermore, confidence regions for the parameter of interest can also be constructed. Simulation studies and a real data example are conducted to examine the performance of our estimators and test statistic.
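The smoothly clipped absolute deviation (SCAD) penalty named in the abstract above has a standard piecewise form, sketched below. The function name `scad_penalty` and the default `a = 3.7` (a value commonly used in the SCAD literature) are assumptions for illustration; the abstract does not say which value the authors use, and the profile least squares and bias-correction steps are not reproduced here.

```python
def scad_penalty(t, lam, a=3.7):
    """SCAD penalty at t, assuming lam > 0 and a > 2."""
    x = abs(t)
    if x <= lam:
        # Behaves like the lasso penalty near zero.
        return lam * x
    if x <= a * lam:
        # Quadratic transition that joins the two outer pieces smoothly.
        return (2.0 * a * lam * x - x * x - lam * lam) / (2.0 * (a - 1.0))
    # Constant for large |t|, so large coefficients are not penalized further.
    return lam * lam * (a + 1.0) / 2.0
```

The three pieces agree at `|t| = lam` and `|t| = a * lam`, which keeps the penalty continuous.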
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 11801531, 11501578, 11501579, 11701571, 11871474 and 41572315, and the Fundamental Research Funds for the Central Universities under Grant No. CUGW150809.
Abstract: The seamless-L_0 (SELO) penalty is a smooth function that very closely resembles the L_0 penalty and has been demonstrated theoretically and practically to be effective in nonconvex penalization for variable selection. In this paper, the authors first generalize the SELO penalty to a class of penalties retaining good features of SELO, and then develop variable selection and parameter estimation in Cox models using the proposed generalized SELO (GSELO) penalized log partial likelihood (PPL) approach. The authors show that the GSELO-PPL procedure possesses the oracle property with a diverging number of predictors under certain mild, interpretable regularity conditions. The entire path of GSELO-PPL estimates can be efficiently computed through a smoothing quasi-Newton (SQN) with continuation algorithm. The authors propose a consistent modified BIC (MBIC) tuning parameter selector for GSELO-PPL, and show that under some regularity conditions, the GSELO-PPL-MBIC procedure consistently identifies the true model. Simulation studies and real data analysis are conducted to evaluate the finite sample performance of the proposed method.
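As a rough illustration of how a smooth function can mimic the L_0 penalty, the sketch below evaluates the original SELO penalty in the form usually given in the SELO literature, (lam / log 2) * log(|b| / (|b| + tau) + 1). This form is quoted from memory and is an assumption here; the generalized GSELO class of the paper is not reproduced, and the names `selo_penalty`, `lam` and `tau` are illustrative.

```python
import math


def selo_penalty(beta, lam, tau):
    """Original SELO penalty (assumed form), with lam > 0 and tau > 0."""
    b = abs(beta)
    # The ratio b / (b + tau) rises from 0 toward 1, so the value
    # rises from 0 toward lam, approximating lam * 1{beta != 0}.
    return lam / math.log(2.0) * math.log(b / (b + tau) + 1.0)
```

For small `tau` the penalty is close to `lam` for any coefficient that is clearly nonzero and exactly zero at the origin, which is the sense in which it resembles the L_0 penalty.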
Funding: Supported by the National Social Science Fund under Grant No. 17BJY210.
Abstract: This paper proposes a double penalized quantile regression for the linear mixed effects model, which can select fixed and random effects simultaneously. Instead of using two tuning parameters, the proposed iterative algorithm requires only one optimal tuning parameter in each step and is more efficient. The authors establish asymptotic normality for the proposed estimators of the quantile regression coefficients. Simulation studies show that the new method is robust to a variety of error distributions at different quantiles. It outperforms the traditional regression models under a wide array of simulated data models and is flexible enough to accommodate changes in fixed and random effects. In high-dimensional data scenarios, the new method can still correctly select important variables and exclude noise variables with high probability. A case study based on hierarchical education data illustrates the practical utility of the proposed approach.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 11101119), the Training Program for Excellent Young Teachers in Guangxi Universities, and the Philosophy and Social Sciences Foundation of Guangxi (Grant No. 11FTJ002).
Abstract: Based on the double penalized estimation method, a new variable selection procedure is proposed for partially linear models with longitudinal data. The proposed procedure can avoid the effects of the nonparametric estimator on the variable selection for the parametric components. Under some regularity conditions, the rate of convergence and asymptotic normality of the resulting estimators are established. In addition, to improve efficiency for the regression coefficients, the estimation of the working covariance matrix is involved in the proposed iterative algorithm. Some simulation studies are carried out to demonstrate that the proposed method performs well.
Funding: Supported by the National Science Foundation of the USA (Grant Nos. DMS1206464 and DMS1613338) and the National Institutes of Health of the USA (Grant Nos. R01GM072611, R01GM100474 and R01GM120507).
Abstract: In the statistics and machine learning communities, the last fifteen years have witnessed a surge of high-dimensional models backed by penalized methods and other state-of-the-art variable selection techniques. The high-dimensional models we refer to differ from conventional models in that the number of all parameters p and the number of significant parameters s are both allowed to grow with the sample size T. When the field-specific knowledge is preliminary, and in view of the recent and potential affluence of data from genetics, finance, on-line social networks, etc., such (s, T, p)-triply diverging models enjoy ultimate flexibility in terms of modeling, and they can be used as a data-guided first step of investigation. However, model selection consistency and other theoretical properties have been addressed only for independent data, leaving time series largely uncovered. On a simple linear regression model endowed with a weakly dependent sequence, this paper applies a penalized least squares (PLS) approach. Under regularity conditions, we show sign consistency, derive a finite-sample bound that holds with high probability for the estimation error, and prove that the PLS estimate is consistent in the L_2 norm with rate $(s \log s / T)^{1/2}$.