Composite quantile regression is expected to provide an efficiency gain in estimation over a single quantile regression. In this paper, we extend composite quantile regression to nonparametric models with randomly censored data. The asymptotic normality of the proposed estimator is established. The proposed methods are applied to lung cancer data. Extensive simulations are reported, showing that the proposed method works well in practical settings.
This paper presents a simple nonparametric regression approach to data-driven computing in elasticity. We apply kernel regression to the material data set and formulate a system of nonlinear equations that is solved to obtain a static equilibrium state of an elastic structure. Preliminary numerical experiments illustrate that, compared with existing methods, the proposed method finds a reasonable solution even when the data points are coarsely distributed in the given material data set.
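To make the kernel regression step concrete, here is a minimal sketch of a Nadaraya-Watson estimator with a Gaussian kernel. The abstract does not state which kernel estimator, kernel, or bandwidth the paper uses, so the estimator choice, the function name `nadaraya_watson`, and the strain-stress samples below are all illustrative assumptions.

```python
import math


def nadaraya_watson(x_query, x_data, y_data, bandwidth):
    """Gaussian-kernel Nadaraya-Watson estimate of E[y | x = x_query].

    Assumed estimator for illustration only; the paper's exact
    formulation is not given in the abstract.
    """
    # Kernel weight of each data point, decaying with distance from x_query.
    weights = [
        math.exp(-0.5 * ((x_query - x) / bandwidth) ** 2) for x in x_data
    ]
    # Weighted average of the responses.
    return sum(w * y for w, y in zip(weights, y_data)) / sum(weights)


# Hypothetical, coarsely sampled material data: stress = 2 * strain.
strain = [0.0, 0.5, 1.0, 1.5, 2.0]
stress = [2.0 * s for s in strain]

estimate = nadaraya_watson(1.0, strain, stress, bandwidth=0.3)
print(estimate)
```

Because the hypothetical samples are symmetric about the query point and the response is linear, the weighted average recovers the underlying value at that point; with real, irregular data the bandwidth would control the trade-off between smoothness and fidelity.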
Oxygen uptake plays a crucial role in the evaluation of endurance performance during exercise and is extensively utilized for metabolic assessment. This study records the oxygen uptake during the exercise phase (i.e., ascending or descending) of the stair exercise, utilizing an experimental dataset that includes ten participants and covers various exercise periods. Based on the designed experiment protocol, a non-parametric modeling method with kernel-based regularization is applied to estimate the oxygen uptake changes during the switching stairs exercise, which closely resembles daily life activities. The modeling results indicate the effectiveness of the non-parametric modeling approach when compared to fixed-order models in terms of accuracy, stability, and compatibility. The influence of exercise duration on estimated fitness reveals that the model of the phase-oxygen uptake system is not time-invariant, a finding related to respiratory metabolism regulation and muscle fatigue. Consequently, the approach allows us to study the human conversion mechanism at different metabolic rates and facilitates the standardization and development of exercise prescriptions.
Landslide susceptibility mapping is significant for landslide prevention. Many approaches have been used for landslide susceptibility prediction; however, their performance is unstable. This study constructed a hybrid model, namely a box counting dimension-based kernel logistic regression model, which uses the fractal dimension calculated by the box counting method as input data, based on the grid cell mapping unit and the terrain mapping unit. The performance of this model was evaluated in an application in Zhidan County, Shaanxi Province, China. Firstly, a total of 221 landslides were identified and mapped, and 11 landslide predisposing factors were considered. Secondly, the landslide susceptibility maps (LSMs) of the study area were obtained by constructing the model on two different mapping units. Finally, the results were evaluated with five statistical indexes: sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and accuracy. The statistical indexes of the model obtained on the terrain mapping unit were larger than those based on the grid cell mapping unit. For the training and validation datasets, the areas under the receiver operating characteristic curve (AUC) of the model based on the terrain mapping unit were 0.9374 and 0.9527, respectively, indicating that establishing this model on the terrain mapping unit was advantageous in the study area. The results show that the fractal dimension improves the prediction ability of the kernel logistic model. In addition, the terrain mapping unit is a more promising mapping unit in Loess areas.
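The five statistical indexes named above follow directly from a binary confusion matrix. The sketch below uses the standard definitions; the function name `classification_indexes` and the counts are hypothetical and are not taken from the study.

```python
def classification_indexes(tp, fp, tn, fn):
    """Standard binary-classification indexes from confusion-matrix counts.

    tp/fn: landslide cells predicted as landslide / non-landslide.
    tn/fp: non-landslide cells predicted as non-landslide / landslide.
    """
    return {
        "sensitivity": tp / (tp + fn),  # share of true landslides detected
        "specificity": tn / (tn + fp),  # share of stable cells recognised
        "ppv": tp / (tp + fp),          # precision of landslide predictions
        "npv": tn / (tn + fn),          # precision of stable predictions
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical counts, for illustration only.
indexes = classification_indexes(tp=80, fp=10, tn=90, fn=20)
for name, value in indexes.items():
    print(f"{name}: {value:.4f}")
```

The AUC reported in the abstract is a separate, threshold-free measure and is not computed by this sketch.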
In this article, we consider the varying coefficient multiplicative regression model, which is very useful for modeling a positive response. The least product relative error (LPRE) criterion is extended to the varying coefficient multiplicative regression model by kernel smoothing techniques. Consistency and asymptotic normality of the proposed estimator are established. Numerical simulations are carried out to assess the performance of the proposed estimator. As an illustration, the ethanol data are analyzed.
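For orientation, the LPRE criterion for a fixed-coefficient multiplicative model $Y_i = \exp(X_i^\top \beta)\,\varepsilon_i$ with $Y_i > 0$ can be written as below. The second display is only a sketch of how a kernel-weighted, locally constant version at an index value $u$ might look; the index variable $U_i$, the kernel $K_h$, and the local-constant form are assumptions, since the abstract does not give the paper's exact objective.

```latex
% LPRE criterion: product of the two relative errors, which simplifies
% because (Y - e)^2 / (Y e) = Y/e + e/Y - 2 for Y, e > 0.
\mathrm{LPRE}_n(\beta)
  = \sum_{i=1}^{n}
    \left| \frac{Y_i - \exp(X_i^\top \beta)}{Y_i} \right|
    \cdot
    \left| \frac{Y_i - \exp(X_i^\top \beta)}{\exp(X_i^\top \beta)} \right|
  = \sum_{i=1}^{n}
    \left\{ Y_i \exp(-X_i^\top \beta) + Y_i^{-1} \exp(X_i^\top \beta) - 2 \right\}

% Assumed local-constant, kernel-weighted version at index value u:
\hat{\beta}(u)
  = \arg\min_{b}
    \sum_{i=1}^{n} K_h(U_i - u)
    \left\{ Y_i \exp(-X_i^\top b) + Y_i^{-1} \exp(X_i^\top b) - 2 \right\}
```

The simplified form is smooth and convex in $\beta$, which is one reason the criterion is convenient to optimize.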
Sintered neodymium-iron-boron (NdFeB) magnets are indispensable in high-performance applications, but their optimization is challenged by complex structure-property relationships and limited data. In this work, we curate the first multi-domain database for this system (1994 industrial and academic samples) and systematically evaluate active learning (AL) strategies on classical and quantum-enhanced regressors. First, our "domain-aware" analysis reveals quantitative differences in design heuristics between industrial and academic data. Second, we present a methodological blueprint for integrating quantum kernel regression into an AL framework using a bootstrapped ensemble for uncertainty quantification. Finally, and most significantly, our results reveal that AL effectiveness is strongly model-dependent. Its advantage ranges from significant acceleration (Random Forest, SVR) to being diminished (XGBoost), or even inverted (proving detrimental compared to random sampling), as shown in our quantum-enhanced SVR case study. This finding provides critical new insights for the strategic application of machine learning in materials discovery.
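By fusing multiple kernel functions, a diagnostic method coupling multi-kernel principal component analysis (MKPCA) with binary logistic regression (the MKPCA-Logistic regression model) is proposed for diagnosing coronary heart disease, which largely resolves the limited adaptability of a single kernel function. Six influencing factors are selected: the first diastolic wave height U_1, the third diastolic wave height U_3, the first systolic wave height D_1, the second systolic wave height D_2, the third systolic wave height D_3, and the fluctuation value of the systolic wave ±U_1. A logistic regression model and the MKPCA-Logistic regression model are then built to diagnose coronary heart disease. The predictive accuracy of the two models is assessed using the prediction accuracy, the misclassification rate, and the receiver operating characteristic (ROC) curve. The results show that the MKPCA-Logistic regression model predicts coronary heart disease with an accuracy of 97%, clearly higher than the 92.5% achieved by the logistic regression model. In the ROC analysis, the area under the curve (AUC) is 0.783 for the logistic regression model and 0.874 for the MKPCA-Logistic regression model, so the coupled model has the higher classification accuracy.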
As biological studies become more expensive to conduct, a frequently encountered question is how to take advantage of the available auxiliary covariate information when the exposure variable is not measured. In this paper, we propose an induced cure rate mean residual life time regression model to accommodate survival data with a cure fraction and an auxiliary covariate, in which the exposure variable is only assessed in a validation set, but a corresponding continuous auxiliary covariate is ascertained for all subjects in the study cohort. Simulation studies elucidate the practical performance of the proposed method under finite samples. As an illustration, we apply the proposed method to heart disease data from the Study of Left Ventricular Dysfunction.
This paper considers the convergence rates for nonparametric estimators of the error distribution in semi-parametric regression models. By establishing some general laws of the iterated logarithm, it shows that the rates of convergence of either the empirical distribution function or a smoothed version of it match exactly the rates obtained for an independent sample from the error distribution.
This article investigates testing for linearity of a multivariate stochastic regression model. The use of nonparametric regression procedures for developing regression diagnostics has been the subject of several recent research efforts. However, when the dimension of the regressor is large, some traditional nonparametric methods, such as kernel estimation, may be inefficient. In this article, we suggest two test statistics based on the projection pursuit technique and the kernel method. The proposed tests are consistent against all fixed smooth alternatives to linearity and are asymptotically distribution-free with respect to the distribution of the error. Furthermore, the tests are applied to a real-life data example and to some simulated data sets to demonstrate the applicability of the proposed tests.
Funding (data-driven elasticity study): Supported by JSPS KAKENHI (Grants 17K06633 and 18K18898).
Funding (oxygen uptake study): Supported by the National Natural Science Foundation of China (No. 62103449), the Start-up Research Fund of Southeast University (RF1028623007), and the Zhishan Youth Scholar Support Program of Southeast University (2242023R40044).
Funding (landslide susceptibility study): Funded by the National Key Research and Development Program of China, Ecological Safety Guarantee Technology and Demonstration Channel and Slope Treatment Project in Loess Hilly and Gully Area (Grant No. 2017YFC0504700).
Funding (NdFeB active learning study): Supported by the National Key Research & Development Program of China (No. 2024YFC3906900); the National Natural Science Foundation of China (No. 22173114 and 22333003 to Y. Ma; 52401251 to H. Xu); and the Strategic Priority Research Program (XDB0500101), the Youth Innovation Promotion Association (No. 2022168), and the Project of Ganjiang Innovation Academy (E455F001) of the Chinese Academy of Sciences. Some of the computational experiments were run on the ORISE and ERA supercomputers; the authors also thank the supporting team for their help.
Funding (cure rate mean residual life study): Supported by the National Natural Science Foundation of China (No. 11971362, 12101256).
Funding (error distribution convergence rates study): Supported by the National Science Foundation of China under Grant Nos. 11201422, 11301481, and 11371321; the Zhejiang Provincial Natural Science Foundation of China under Grant Nos. Y6110639, Y6110110, LQ12A01018, and LQ12A01017; the National Statistical Science Research Project of China under Grant No. 2012LY174; the Foundation for Young Talents of ZJGSU under Grant No. 1020XJ1314019; and the Zhejiang Provincial Key Research Base for Humanities and Social Science Research (Statistics).