Seismic fault rupture can extend to the surface, and the resulting surface deformation can cause severe damage to civil engineering structures crossing the fault zones. Coseismic Surface Rupture Prediction Models (CSRPMs) play a crucial role in the structural design of fault-crossing engineering and in the hazard analysis of fault-intensive areas. In this study, a new global coseismic surface rupture database was constructed by compiling 171 earthquake events (Mw: 5.5-7.9) that caused surface rupture. In contrast to the fault classification in traditional empirical relationships, this study categorizes earthquake events as strike-slip, dip-slip, and oblique-slip. CSRPMs utilizing Bayesian ridge regression (BRR) were developed to estimate parameters such as surface rupture length, average displacement, and maximum displacement. Based on Bayesian theory, BRR combines the benefits of both ridge regression and Bayesian linear regression. This approach effectively addresses overfitting while keeping the model robust. The reliability of the CSRPMs was validated by residual analysis and by comparison with post-earthquake observations from the 2023 Türkiye earthquake doublet. The BRR-CSRPMs with the new fault classification criteria are more suitable for the probabilistic hazard analysis of complex fault systems and the dislocation design of fault-crossing engineering.
In the spectral analysis of laser-induced breakdown spectroscopy, abundant characteristic spectral lines and severe interference coexist in the original spectral data. Here, a feature selection method called recursive feature elimination based on ridge regression (Ridge-RFE) is recommended for the original spectral data to make full use of the valid information in the spectra. In the Ridge-RFE method, the absolute value of the ridge regression coefficient was used as the criterion for screening spectral features: the feature with the smallest absolute weight in the current input subset was removed by recursive feature elimination (RFE), and the selected features were used as inputs to the partial least squares regression (PLS) model. The Ridge-RFE-based PLS model was used to measure Fe, Si, Mg, Cu, Zn and Mn in 51 aluminum alloy samples, and the results showed that the root mean square error of prediction decreased greatly compared with the PLS model that takes the full spectrum as input. The overall results demonstrate that the Ridge-RFE method is more efficient at removing redundant features, gives the PLS model better quantitative analysis results, and improves model generalization ability.
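The elimination loop described in this abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper names (`solve`, `ridge_fit`, `ridge_rfe`), the ridge parameter `lam`, and the toy data are all hypothetical, the ridge fit has no intercept, and the downstream PLS step is omitted.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def ridge_fit(X, y, lam):
    """Ridge coefficients w = (X^T X + lam I)^{-1} X^T y (no intercept)."""
    p = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0)
            for j in range(p)] for i in range(p)]
    Xty = [sum(row[i] * t for row, t in zip(X, y)) for i in range(p)]
    return solve(XtX, Xty)


def ridge_rfe(X, y, n_keep, lam=1.0):
    """Repeatedly drop the feature with the smallest |ridge coefficient|."""
    active = list(range(len(X[0])))
    while len(active) > n_keep:
        Xa = [[row[j] for j in active] for row in X]
        w = ridge_fit(Xa, y, lam)
        weakest = min(range(len(active)), key=lambda k: abs(w[k]))
        active.pop(weakest)
    return active


# Hypothetical toy data: 8 samples, 4 mutually orthogonal +/-1 features.
X = [[(-1.0) ** (i & 1),
      (-1.0) ** ((i >> 1) & 1),
      (-1.0) ** ((i >> 2) & 1),
      (-1.0) ** ((i & 1) ^ ((i >> 1) & 1))] for i in range(8)]
# The target depends strongly on features 0 and 2, weakly on 1, not on 3.
y = [3.0 * r[0] + 0.1 * r[1] - 2.0 * r[2] for r in X]

selected = ridge_rfe(X, y, n_keep=2)
print(selected)  # -> [0, 2]
```

On real spectra the features would normally be standardized first so that coefficient magnitudes are comparable, and the retained columns would then be passed to a PLS model as the abstract describes.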
This article provides the first application of the machine-learning approach to the study of cross-sections for neutron-capture reactions, using the kernel ridge regression (KRR) approach. It is found that the KRR approach can reduce the root-mean-square (rms) deviation of the relative errors between the experimental data of the Maxwellian-averaged (n,γ) cross-sections and the corresponding theoretical predictions from 69.8% to 35.4%. By including data at different temperatures in the training set, the rms deviation can be further reduced significantly, to 2.0%. Moreover, the extrapolation performance of the KRR approach along different temperatures is found to be effective and reliable.
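As a rough illustration of how KRR can correct a theoretical prediction, the sketch below fits the residual (experiment minus theory) with a Gaussian kernel and evaluates the learned correction. Everything here is an assumption made for illustration: the one-dimensional input, the kernel width `sigma`, the penalty `lam`, the residual values, and the function names are hypothetical and are not taken from the paper.

```python
import math


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def gaussian_kernel(a, b, sigma):
    """Gaussian (RBF) kernel between two scalar inputs."""
    return math.exp(-((a - b) ** 2) / (2.0 * sigma ** 2))


def krr_fit(xs, residuals, sigma, lam):
    """Dual weights alpha solving (K + lam I) alpha = residuals."""
    n = len(xs)
    K = [[gaussian_kernel(xs[i], xs[j], sigma) + (lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    return solve(K, residuals)


def krr_predict(x, xs, alpha, sigma):
    """Predicted residual at x: sum_i alpha_i * k(x, x_i)."""
    return sum(a * gaussian_kernel(x, xi, sigma) for a, xi in zip(alpha, xs))


# Hypothetical training data: inputs and (experiment - theory) residuals.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
residuals = [0.5, -0.2, 0.1, 0.4, -0.3]

# A tiny penalty nearly interpolates the training residuals.
alpha_tight = krr_fit(xs, residuals, sigma=0.5, lam=1e-6)
fit_tight = [krr_predict(x, xs, alpha_tight, 0.5) for x in xs]

# A larger penalty shrinks the learned correction toward zero.
alpha_smooth = krr_fit(xs, residuals, sigma=0.5, lam=1.0)
fit_smooth = [krr_predict(x, xs, alpha_smooth, 0.5) for x in xs]
```

The corrected prediction at a new point would be the theoretical value plus `krr_predict` at that point. The penalty `lam` trades off fidelity to the training residuals against smoothness.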
A novel pilot-aided ridge regression (RR) channel estimation for SC-FDE systems on time-varying frequency-selective fading channels is derived. The previous least squares (LS) channel estimation, which does not take the influence of noise into account, performs poorly when the observed signal is abnormally corrupted by noise. To overcome this inherent disadvantage of LS estimation, the proposed RR estimation accounts for the influence of noise to achieve better performance. The performance of the new estimator is examined. Numerical results show that the new estimation improves estimation accuracy, especially at low channel signal-to-noise ratio (CSNR) levels, and outperforms LS estimation. In addition, the proposed RR estimation achieves a gain of about 1 dB over LS estimation.
At present, the prevalence of diabetes is increasing because the human body cannot properly metabolize glucose. Accurate prediction of diabetes in patients is an important research area. Many researchers have proposed techniques to predict this disease through data mining and machine learning methods. In prediction, feature selection is a key preprocessing step: only the features that are relevant to the disease are used for prediction, which improves prediction accuracy. Selecting the right features from the whole feature set is a complicated process, and many researchers are concentrating on it to produce a predictive model with high accuracy. In this work, a wrapper-based feature selection method called recursive feature elimination is combined with ridge regression (L2) to form a hybrid L2-regularized feature selection algorithm that overcomes the overfitting problem of the data set. Overfitting is a major problem in feature selection, where the model fits new data poorly because the training data are small. Ridge regression is mainly used to overcome the overfitting problem. The features are selected using the proposed feature selection method, and a random forest classifier is used to classify the data on the basis of the selected features. This work uses the Pima Indians Diabetes data set, and the evaluated results are compared with existing algorithms to assess the accuracy of the proposed algorithm. The accuracy of the proposed algorithm in predicting diabetes is 100%, and its area under the curve is 97%. The proposed algorithm outperforms existing algorithms.
Ridge regression spectrophotometry (LHG) is used for the simultaneous determination of five components (acetaminophen, p-aminophenol, caffeine, chlorphenamine maleate and guaifenesin) in cough syrup. The computer program for LHG is written in the VB language. The difficulties caused by the overlapping absorption spectra of the five compounds are overcome by this procedure. The experimental results show that the recovery of each component is in the range from 97.9% to 103.3% and that each component is determined satisfactorily without any pre-separation.
The extended kernel ridge regression (EKRR) method with odd-even effects was adopted to improve the description of the nuclear charge radius using five commonly used nuclear models. These are: (i) the isospin-dependent A^(1/3) formula, (ii) relativistic continuum Hartree-Bogoliubov (RCHB) theory, (iii) the Hartree-Fock-Bogoliubov (HFB) model HFB25, (iv) the Weizsäcker-Skyrme (WS) model WS*, and (v) the HFB25* model. In the last two models, the charge radii were calculated using a five-parameter formula with the nuclear shell corrections and deformations obtained from the WS and HFB25 models, respectively. For each model, the resultant root-mean-square deviation for the 1014 nuclei with proton number Z≥8 can be significantly reduced to 0.009-0.013 fm after considering the modification with the EKRR method. The best among them was the RCHB model, with a root-mean-square deviation of 0.0092 fm. The extrapolation abilities of the KRR and EKRR methods for the neutron-rich region were examined, and it was found that after considering the odd-even effects, the extrapolation power was improved compared with that of the original KRR method. The strong odd-even staggering of the nuclear charge radii of Ca and Cu isotopes and the abrupt kinks across the neutron N=126 and 82 shell closures were also calculated and are reproduced quite well by the EKRR method.
With the development of UAV technology, UAV aerial magnetic surveys play an important role in airborne geophysical prospecting. In an aeromagnetic survey, the magnetic field interference generated by the magnetic components on the aircraft greatly affects the accuracy of the survey results. Therefore, aeromagnetic compensation technology is needed to eliminate the interfering magnetic field. So far, the aeromagnetic compensation methods in use are mainly linear regression compensation methods based on the T-L equation. Least squares is one of the most commonly used methods for solving multiple linear regressions. However, because correlation between the data may make the algorithm unstable, we use the ridge regression algorithm to address the multicollinearity problem in the T-L equation. This method is then applied to aeromagnetic survey data, and the standard deviation is selected as the index for evaluating the compensation effect and verifying the effectiveness of the method.
In view of the difficulty in calculating the atomic structure parameters of high-Z elements, the Hartree-Fock with relativistic corrections (HFR) theory combined with the ridge regression (RR) algorithm, rather than the Cowan code's least squares fitting (LSF) method, is proposed and applied. By analyzing the energy level structure parameters of the HFR theory and using the fitting experimental energy level extrapolation method, some excited state energy levels of the Yb I (Z=70) atom including the 4f open shell are calculated. The advantages of the ridge regression algorithm are demonstrated by comparing it with the Cowan code's LSF results. In addition, the results obtained by the new method are compared with experimental results and other theoretical results to demonstrate the reliability and accuracy of our approach.
Based on an in-depth matrix-theory analysis of the model structure of the influence coefficient method, this paper explains why the least squares (LS) influence coefficient method yields unreasonable and unstable correction masses with a larger mean squared error (MSE) when there are correlated planes in dynamic balancing. It also presents a new ridge regression method for solving for the correction masses according to Tikhonov regularization theory, and describes why ridge regression can eliminate this disadvantage of the LS method. Applying the new method to the dynamic balancing of a gas turbine shows that it is superior to the LS method when the influence coefficient matrix is ill-conditioned: minimal correction masses and residual vibration are obtained in the dynamic balancing of rotors.
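For reference, the contrast between the LS and ridge solutions can be written in the standard Tikhonov form. The symbols are assumptions chosen for illustration and are not the paper's notation: $A$ is the influence coefficient matrix, $v_0$ the measured initial vibration, $w$ the vector of correction masses, and $\lambda > 0$ the ridge parameter.

```latex
\begin{aligned}
\hat{w}_{\mathrm{LS}}
  &= \arg\min_{w} \lVert A w + v_0 \rVert_2^2
   = -\,(A^{H} A)^{-1} A^{H} v_0, \\
\hat{w}_{\mathrm{ridge}}
  &= \arg\min_{w} \left( \lVert A w + v_0 \rVert_2^2 + \lambda \lVert w \rVert_2^2 \right)
   = -\,(A^{H} A + \lambda I)^{-1} A^{H} v_0 .
\end{aligned}
```

In terms of the singular values $\sigma_i$ of $A$, the LS solution scales each component by $1/\sigma_i$, which blows up as $\sigma_i \to 0$, whereas ridge scales it by $\sigma_i/(\sigma_i^2 + \lambda)$, which stays bounded. This is the usual explanation for why the ridge correction masses remain small when $A$ is ill-conditioned.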
In regression, despite both being aimed at estimating the Mean Squared Prediction Error (MSPE), Akaike's Final Prediction Error (FPE) and the Generalized Cross Validation (GCV) selection criteria are usually derived from two quite different perspectives. Here, settling on the most commonly accepted definition of the MSPE as the expectation of the squared prediction error loss, we provide theoretical expressions for it, valid for any linear model (LM) fitter, under either random or non-random designs. Specializing these MSPE expressions for each of them, we derive closed formulas of the MSPE for some of the most popular LM fitters: Ordinary Least Squares (OLS), with or without a full column rank design matrix; and Ordinary and Generalized Ridge regression, the latter embedding smoothing splines fitting. For each of these LM fitters, we then deduce a computable estimate of the MSPE which turns out to coincide with Akaike's FPE. Using a slight variation, we similarly get a class of MSPE estimates coinciding with the classical GCV formula for those same LM fitters.
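For orientation, the classical textbook forms of the quantities named in this abstract are given below. They are the standard definitions, not the generalized expressions derived in the paper. The notation is assumed: $n$ observations, $p$ fitted parameters in the OLS case, design matrix $X$, and ridge parameter $\lambda$.

```latex
\begin{aligned}
\mathrm{MSPE} &= \mathbb{E}\!\left[ \left( Y_{\mathrm{new}} - \hat{Y}_{\mathrm{new}} \right)^{2} \right], \\
\mathrm{FPE} &= \frac{\mathrm{RSS}}{n} \cdot \frac{n + p}{n - p},
  \qquad \mathrm{RSS} = \lVert y - \hat{y} \rVert_2^2 , \\
\mathrm{GCV}(\lambda) &= \frac{\tfrac{1}{n} \lVert (I - H_{\lambda})\, y \rVert_2^2}
  {\left( 1 - \tfrac{1}{n} \operatorname{tr} H_{\lambda} \right)^{2}},
  \qquad H_{\lambda} = X \left( X^{\top} X + \lambda I \right)^{-1} X^{\top} .
\end{aligned}
```

For OLS with a full column rank design, $\lambda = 0$ and $\operatorname{tr} H_0 = p$, so GCV reduces to $n\,\mathrm{RSS}/(n - p)^2$, which is close to FPE when $p \ll n$.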
Nonconvex penalties, including the smoothly clipped absolute deviation penalty and the minimax concave penalty, enjoy the properties of unbiasedness, continuity and sparsity, and ridge regression can deal with the collinearity problem. Combining the strengths of nonconvex penalties and ridge regression (abbreviated as NPR), we study the oracle property of the NPR estimator in high-dimensional settings with highly correlated predictors, where the dimensionality of the covariates p_n is allowed to increase exponentially with the sample size n. Simulation studies and a real data example are presented to verify the performance of the NPR method.
Ridge regression is an effective tool for handling multicollinearity in regressions. It is also an essential shrinkage and regularization method and is widely used in big data and distributed data applications. The divide-and-conquer trick, which combines the estimators from each subset with equal weight, is commonly applied to distributed data. To overcome multicollinearity and improve estimation accuracy in the presence of distributed data, we propose a Mallows-type model averaging method for ridge regressions, which combines estimators from all subsets. Our method is proved to be asymptotically optimal while allowing the number of subsets and the dimension of the variables to diverge. The consistency of the resultant weight estimators, which tend to the theoretically optimal weights, is also derived. Furthermore, the asymptotic normality of the model averaging estimator is demonstrated. Our simulation study and real data analysis show that the proposed model averaging method often performs better than commonly used model selection and model averaging methods in distributed data cases.
We propose a novel indoor positioning algorithm based on the received signal strength (RSS) fingerprint. The proposed algorithm can be divided into three steps: an offline phase in which an advanced clustering (AC) strategy is used, an online phase of approximate localization in which cluster matching is used, and an online phase of precise localization with kernel ridge regression. Specifically, after offline fingerprint collection and similarity measurement, we employ an AC strategy based on the K-medoids clustering algorithm, using additional reference points that are geographically located at the outer cluster boundary to enrich the data of each cluster. During approximate localization, RSS measurements are compared with the cluster radio maps to determine which cluster the target most likely belongs to. Both the Euclidean distance of the RSSs and the Hamming distance of the coverage vectors between the observations and training records are explored for cluster matching. Then, a kernel-based ridge regression method is used to obtain the final position of the target. The performance of the proposed algorithm is evaluated in two typical indoor environments and compared with that of state-of-the-art algorithms. The experimental results demonstrate the effectiveness and advantages of the proposed algorithm in terms of positioning accuracy and complexity.
The kernel ridge regression (KRR) method and its extension with odd-even effects (KRRoe) are used to learn the nuclear mass table obtained by the relativistic continuum Hartree-Bogoliubov theory. With respect to the binding energies of 9035 nuclei, the KRR method achieves a root-mean-square deviation of 0.96 MeV, and the KRRoe method remarkably reduces the deviation to 0.17 MeV. By investigating the shell effects, one-nucleon and two-nucleon separation energies, odd-even mass differences, and empirical proton-neutron interactions extracted from the learned binding energies, the ability of the machine learning tool to grasp the known physics is discussed. It is found that the shell effects, evolutions of nucleon separation energies, and empirical proton-neutron interactions are well reproduced by both the KRR and KRRoe methods, although the odd-even mass differences can only be reproduced by the KRRoe method.
Using two nuclear models, (i) the relativistic continuum Hartree-Bogoliubov (RCHB) theory and (ii) the Weizsäcker-Skyrme (WS) model WS*, the performances of nine kinds of kernel functions in the kernel ridge regression (KRR) method are investigated by comparing their accuracy in describing the experimental nuclear charge radii and their extrapolation abilities. It is found that, except for the inverse power kernel, all other kernels reach the same level of around 0.015-0.016 fm for these two models with the KRR method. The extrapolation ability of each kernel for the neutron-rich region depends on the training data. Our investigation shows that the power kernel and the multiquadric kernel perform better in the RCHB+KRR calculation, and the Gaussian kernel performs better in the WS*+KRR calculation. In addition, the performance of different basis functions in the radial basis function method is also investigated for comparison. The results are similar to those of the KRR method. The influence of different kernels on the KRR reconstruction function is discussed by investigating the whole nuclear chart. Finally, the charge radii of some specific isotopic chains are investigated by RCHB+KRR with the power kernel and WS*+KRR with the Gaussian kernel. The charge radii and most of the specific features in these isotopic chains can be reproduced after considering the KRR method.
Soil moisture (SM) is a critical variable in terrestrial ecosystems, especially in arid and semi-arid areas where water sources are limited. Despite its importance, understanding of the spatiotemporal variations and influencing factors of SM in these areas remains insufficient. This study investigated the spatiotemporal variations and influencing factors of SM in arid and semi-arid areas of China by utilizing the extended triple collocation (ETC), the Mann-Kendall test, the Theil-Sen estimator, ridge regression analysis, and other relevant methods. The following findings were obtained: (1) at the pixel scale, the long-term monthly SM data from the European Space Agency Climate Change Initiative (ESA CCI) exhibited the highest correlation coefficient of 0.794 and the lowest root mean square error (RMSE) of 0.014 m^3/m^3; (2) from 2000 to 2022, the study area experienced a significant increase in annual average SM, with a rate of 0.408×10^(-3) m^3/(m^3·a). Moreover, higher altitudes showed a notable upward trend, with SM increasing at rates of 0.210×10^(-3) m^3/(m^3·a) between 1000 and 2000 m, 0.530×10^(-3) m^3/(m^3·a) between 2000 and 4000 m, and 0.760×10^(-3) m^3/(m^3·a) at altitudes above 4000 m; (3) land surface temperature (LST), root zone soil moisture (RSM) (10-40 cm depth), and normalized difference vegetation index (NDVI) were identified as the primary factors influencing annual average SM, with relative contributions of 34.37%, 24.16%, and 22.64%, respectively; and (4) the absolute contribution of LST was more significant in subareas at higher altitudes, with average absolute contributions of 0.800×10^(-3) m^3/(m^3·a) between 2000 and 4000 m and 0.500×10^(-2) m^3/(m^3·a) above 4000 m. This study reveals the spatiotemporal variations and main influencing factors of SM in Chinese arid and semi-arid areas, highlighting the more pronounced absolute contribution of LST to SM in high-altitude areas and providing valuable insights for ecological research and water resource management in these areas.
β-ray-induced X-ray spectroscopy (BIXS) is a promising method for tritium detection in solid materials because of its unique advantages, such as large detection depth, nondestructive testing capabilities, and low requirements for sample preparation. However, high-accuracy reconstruction of the tritium depth profile remains a significant challenge for this technique. In this study, a novel reconstruction method based on a backpropagation (BP) neural network algorithm that demonstrates high accuracy, broad applicability, and robust noise resistance is proposed. The average reconstruction error calculated using the BP network (8.0%) was much lower than that obtained using traditional numerical methods (26.5%). In addition, the BP method can accurately reconstruct BIX spectra of samples with an unknown range of tritium and exhibits wide applicability to spectra with various tritium distributions. Furthermore, the BP network demonstrates superior accuracy and stability compared to numerical methods when reconstructing the spectra, with a relative uncertainty ranging from 0 to 10%. This study highlights the advantages of BP networks in accurately reconstructing the tritium depth profile from BIXS and promotes their further application in tritium detection.
Ensuring the reliability of wind energy as a dependable source requires overcoming challenges posed by the inherent volatility and stochastic nature of wind patterns. Long-term forecasting provides strategic advantages in managing energy generation projects, enabling the development of effective portfolio management strategies. The primary objective of this study was the development of forecasting methods to support strategic decision-making within the scope of wind energy operations, specifically targeting the Pindaí Wind Complex and its commercial dispatch. The study integrated Big Data analytics, data engineering, and computational techniques through the application of machine learning algorithms, including eXtreme Gradient Boosting, Multilayer Perceptron, Support Vector Regression, Ridge Regression, and Random Forests, aiming to generate forward-looking projections of the complex's energy production for the year 2023. To this end, five supervised machine learning techniques were modeled and implemented. These techniques were grounded in their respective mathematical and structural formulations, and the empirical foundation for modeling was provided by historical power generation data from the Pindaí Wind Complex, combined with high-resolution realized and forecasted meteorological data retrieved via the Open-Meteo API. The models are trained using historical monthly generation data from the Pindaí Wind Complex, which has an installed capacity of 79.9 MW and is located in the northeastern region of Brazil, along with meteorological data from reanalysis models, such as air temperature, relative humidity, precipitation, surface pressure, wind speed at 10 m, wind speed at 100 m, and wind gusts. These methodologies are applied to forecast monthly wind generation for the year 2023, and the outputs are systematically compared using evaluation metrics to determine the most suitable modeling approach. The results highlight the superiority of the Multilayer Perceptron, Support Vector Regression, and eXtreme Gradient Boosting models, which achieved Kling-Gupta Efficiency (KGE) of 0.89, 0.89, and 0.90, mean absolute scaled error (MASE) of 0.29, 0.31, and 0.18, root mean square errors (RMSE) of 0.56, 0.59, and 0.35, and mean absolute errors (MAE) of 0.48, 0.52, and 0.29, respectively.
The capital structure decision is an important issue in corporate finance. Theories show that the corporate debt ratio is determined by many factors. This study conducts empirical work on capital structure theories, focusing on the corporate data of Chinese listed companies, by considering their intrinsic characteristics and utilizing principal factor analysis and the ridge regression method. Our results suggest that a firm's debt ratio has a positive relationship with its size, profitability and operating risk and a negative relationship with its growth and non-debt tax shield, while long-term leverage has a positive relationship with the collateral value of its assets.
Funding (coseismic surface rupture prediction study above): Foundation of China under Grant Nos. U2139207 and 52378517; the Natural Science Foundation of Hubei Province under Grant No. 2023AFB934.
Funding (Ridge-RFE laser-induced breakdown spectroscopy study above): supported by the National Key Research and Development Program of China (No. 2016YFF0102502); the Key Research Program of Frontier Sciences, CAS (No. QYZDJ-SSW-JSC037); the Youth Innovation Promotion Association, CAS, Liao Ning Revitalization Talents Program (No. XLYC1807110).
Funding (KRR neutron-capture cross-section study above): partly supported by the National Key R&D Program of China (Contracts No. 2018YFA0404400 and No. 2017YFE0116700); the National Natural Science Foundation of China (Grants No. 11875075, No. 11935003, No. 11975031, No. 12141501 and No. 12070131001); the China Postdoctoral Science Foundation under Grant No. 2021M700256; the High-performance Computing Platform of Peking University.
Funding (ridge regression channel estimation study above): sponsored by the National Natural Science Foundation of China & Civil Aviation Administration of China (Grant No. 61071104); the Science and Technology on Information Transmission and Dissemination in Communication Networks Laboratory (Grant No. ITD-U10006).
Abstract: At present, the prevalence of diabetes, a condition in which the human body cannot properly regulate its glucose level, is increasing. Accurate prediction of diabetes is an important research area, and many researchers have proposed techniques to predict the disease through data mining and machine learning methods. In prediction, feature selection is a key preprocessing step: only the features relevant to the disease are used, which improves prediction accuracy. Selecting the right features from the whole feature set is a complicated process, and many researchers are concentrating on it to produce a predictive model with high accuracy. In this work, a wrapper-based feature selection method called recursive feature elimination is combined with ridge regression (L2) to form a hybrid L2-regularized feature selection algorithm that overcomes the overfitting problem of the data set. Overfitting is a major problem in feature selection, where the model fits new data poorly because the training data set is small. Ridge regression is mainly used to overcome the overfitting problem. Features are selected with the proposed method, and a random forest classifier is used to classify the data on the basis of the selected features. This work uses the Pima Indians Diabetes data set, and the results are compared with those of existing algorithms to assess the accuracy of the proposed algorithm. The accuracy of the proposed algorithm in predicting diabetes is 100%, and its area under the curve is 97%. The proposed algorithm outperforms existing algorithms.
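The selection step described in the abstract above can be sketched as follows. This is a simplified stand-in, not the authors' algorithm: it uses a closed-form ridge fit on a continuous synthetic target, omits the random forest classification stage, and the function names, the penalty `lam`, and the data are assumptions.

```python
import numpy as np

def ridge_coef(x, y, lam):
    """Closed-form ridge solution (X^T X + lam * I)^{-1} X^T y."""
    p = x.shape[1]
    return np.linalg.solve(x.T @ x + lam * np.eye(p), x.T @ y)

def ridge_rfe(x, y, n_keep, lam=1.0):
    """Recursively drop the feature with the smallest |ridge coefficient|."""
    # Standardize so that coefficient magnitudes are comparable across features.
    x = (x - x.mean(axis=0)) / x.std(axis=0)
    y = y - y.mean()
    kept = list(range(x.shape[1]))
    while len(kept) > n_keep:
        w = ridge_coef(x[:, kept], y, lam)
        kept.pop(int(np.argmin(np.abs(w))))  # eliminate the weakest feature
    return kept

# Synthetic data: only features 0 and 3 carry signal (assumption).
rng = np.random.default_rng(1)
x = rng.normal(size=(200, 6))
y = 3.0 * x[:, 0] - 2.0 * x[:, 3] + 0.1 * rng.normal(size=200)

selected = ridge_rfe(x, y, n_keep=2)
print(selected)
```

The returned column indices would then be passed to whatever classifier is used downstream.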
Funding: This work was supported by the Science Foundation of the Education Department of Zhejiang Province (20000064).
Abstract: Ridge regression spectrophotometry (LHG) is used for the simultaneous determination of five components (acetaminophen, p-aminophenol, caffeine, chlorphenamine maleate, and guaifenesin) in cough syrup. The LHG computer program is written in VB. The procedure overcomes the difficulty posed by the overlapping absorption spectra of the five compounds. The experimental results show that the recovery of each component ranges from 97.9% to 103.3%, and satisfactory results are obtained for each component without any pre-separation.
Funding: This work was supported by the National Natural Science Foundation of China (Nos. 11875027, 11975096).
Abstract: The extended kernel ridge regression (EKRR) method with odd-even effects was adopted to improve the description of the nuclear charge radius using five commonly used nuclear models: (i) the isospin-dependent A^(1/3) formula, (ii) relativistic continuum Hartree-Bogoliubov (RCHB) theory, (iii) the Hartree-Fock-Bogoliubov (HFB) model HFB25, (iv) the Weizsäcker-Skyrme (WS) model WS*, and (v) the HFB25* model. In the last two models, the charge radii were calculated using a five-parameter formula with the nuclear shell corrections and deformations obtained from the WS and HFB25 models, respectively. For each model, the resultant root-mean-square deviation for the 1014 nuclei with proton number Z≥8 can be significantly reduced to 0.009-0.013 fm after considering the modification with the EKRR method. The best among them was the RCHB model, with a root-mean-square deviation of 0.0092 fm. The extrapolation abilities of the KRR and EKRR methods for the neutron-rich region were examined, and it was found that after considering the odd-even effects, the extrapolation power was improved compared with that of the original KRR method. The strong odd-even staggering of nuclear charge radii of Ca and Cu isotopes and the abrupt kinks across the neutron N=126 and 82 shell closures were also calculated and could be reproduced quite well using the EKRR method.
Abstract: With the development of UAV technology, UAV aeromagnetic surveying plays an important role in airborne geophysical prospecting. In an aeromagnetic survey, the magnetic interference generated by the magnetic components of the aircraft greatly affects the accuracy of the survey results. It is therefore necessary to use aeromagnetic compensation technology to eliminate the interfering magnetic field. So far, the aeromagnetic compensation methods in use are mainly linear regression methods based on the T-L equation. Least squares is one of the most commonly used methods for solving multiple linear regressions. However, because correlation between the data may make the algorithm unstable, we use the ridge regression algorithm to address the multicollinearity problem in the T-L equation. The method is then applied to aeromagnetic survey data, and the standard deviation is used as the index for evaluating the compensation effect and verifying the effectiveness of the method.
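The instability mentioned in the abstract above can be seen in a two-column toy problem. The sketch below is not the T-L compensation model itself; it only shows, on synthetic and nearly collinear regressors (an assumption), how adding the ridge term `lam * I` to the Gram matrix improves its conditioning. The value `lam = 1.0` is likewise arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100
a = rng.normal(size=n)
b = a + 1e-4 * rng.normal(size=n)        # nearly a copy of a: collinear columns
x = np.column_stack([a, b])
y = a + b + 0.1 * rng.normal(size=n)     # true coefficients are (1, 1)

gram = x.T @ x
lam = 1.0

# Ordinary least squares via the normal equations (ill-conditioned here).
w_ls = np.linalg.solve(gram, x.T @ y)
# Ridge regression: the same system with lam added to the diagonal.
w_ridge = np.linalg.solve(gram + lam * np.eye(2), x.T @ y)

cond_ls = np.linalg.cond(gram)
cond_ridge = np.linalg.cond(gram + lam * np.eye(2))
print(w_ls, w_ridge, cond_ls, cond_ridge)
```

The ridge coefficients stay close to the true values, while the least squares coefficients can swing widely because the Gram matrix is close to singular.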
Funding: Supported by the Fundamental Research Funds for the Central Universities (Grant No. 10822041A2038).
Abstract: In view of the difficulty of calculating the atomic structure parameters of high-Z elements, the Hartree-Fock with relativistic corrections (HFR) theory is combined with the ridge regression (RR) algorithm, in place of the least squares fitting (LSF) method of the Cowan code. By analyzing the energy level structure parameters of the HFR theory and extrapolating from fits to experimental energy levels, some excited-state energy levels of the Yb I (Z=70) atom, including the 4f open shell, are calculated. The advantages of the ridge regression algorithm are demonstrated by comparison with the Cowan code's LSF results. In addition, the results obtained with the new method are compared with experimental results and other theoretical results to demonstrate the reliability and accuracy of our approach.
Abstract: Based on an in-depth matrix-theoretic analysis of the model structure of the influence coefficient method, this paper explains why the LS influence coefficient method yields unreasonable and unstable correction masses with a larger MSE when there are correlated planes in dynamic balancing. It also presents a new ridge regression method for solving for the correction masses according to Tikhonov regularization theory, and describes why ridge regression can eliminate the disadvantage of the LS method. Applying the new method to the dynamic balancing of a gas turbine shows that it is superior to the LS method when the influence coefficient matrix is ill-conditioned: minimal correction masses and residual vibration are obtained in the dynamic balancing of rotors.
Abstract: In regression, although both aim at estimating the Mean Squared Prediction Error (MSPE), Akaike's Final Prediction Error (FPE) and the Generalized Cross Validation (GCV) selection criteria are usually derived from two quite different perspectives. Here, settling on the most commonly accepted definition of the MSPE as the expectation of the squared prediction error loss, we provide theoretical expressions for it, valid for any linear model (LM) fitter, under random or non-random designs. Specializing these MSPE expressions, we derive closed formulas of the MSPE for some of the most popular LM fitters: Ordinary Least Squares (OLS), with or without a full column rank design matrix; and Ordinary and Generalized Ridge regression, the latter embedding smoothing splines fitting. For each of these LM fitters, we then deduce a computable estimate of the MSPE which turns out to coincide with Akaike's FPE. Using a slight variation, we similarly get a class of MSPE estimates coinciding with the classical GCV formula for those same LM fitters.
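For readers unfamiliar with the two criteria named in the abstract above, the sketch below evaluates their textbook forms: the classical GCV score for a ridge fit with hat matrix H, GCV = n·RSS / (n − tr H)², and Akaike's FPE for an OLS fit with p parameters, FPE = (RSS/n)·(n + p)/(n − p). It does not reproduce the paper's derivations; the data, the grid of `lam` values, and the function names are assumptions.

```python
import numpy as np

def ridge_hat_matrix(x, lam):
    """Hat matrix H = X (X^T X + lam * I)^{-1} X^T of a ridge fit."""
    p = x.shape[1]
    return x @ np.linalg.solve(x.T @ x + lam * np.eye(p), x.T)

def gcv_score(x, y, lam):
    """Classical GCV: n * RSS / (n - trace(H))^2."""
    n = len(y)
    h = ridge_hat_matrix(x, lam)
    rss = float(np.sum((y - h @ y) ** 2))
    return n * rss / (n - np.trace(h)) ** 2

def fpe_ols(x, y):
    """Akaike's FPE for OLS: (RSS / n) * (n + p) / (n - p)."""
    n, p = x.shape
    w, *_ = np.linalg.lstsq(x, y, rcond=None)
    rss = float(np.sum((y - x @ w) ** 2))
    return (rss / n) * (n + p) / (n - p)

# Synthetic linear model (assumption).
rng = np.random.default_rng(3)
x = rng.normal(size=(60, 5))
y = x @ np.array([1.0, 0.0, -2.0, 0.5, 0.0]) + rng.normal(size=60)

lams = [0.01, 0.1, 1.0, 10.0, 100.0]
scores = [gcv_score(x, y, lam) for lam in lams]
best_lam = lams[int(np.argmin(scores))]   # penalty with the lowest GCV score
fpe = fpe_ols(x, y)
print(best_lam, scores, fpe)
```

As `lam` tends to zero the ridge fit becomes OLS, and the two criteria then differ only by the factor n²/(n² − p²), which is close to 1 when p is small relative to n.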
Funding: Supported by the National Natural Science Foundation of China (Grant No. 11401340), the China Postdoctoral Science Foundation (Grant No. 2014M561892), and the Foundation of Qufu Normal University (Grant Nos. bsqd2012041, xkj201304).
Abstract: Nonconvex penalties, including the smoothly clipped absolute deviation penalty and the minimax concave penalty, enjoy the properties of unbiasedness, continuity, and sparsity, and ridge regression can deal with the collinearity problem. Combining the strengths of nonconvex penalties and ridge regression (abbreviated as NPR), we study the oracle property of the NPR estimator in high-dimensional settings with highly correlated predictors, where the dimensionality of covariates p_n is allowed to increase exponentially with the sample size n. Simulation studies and a real data example are presented to verify the performance of the NPR method.
Funding: Partially supported by the Research Foundation of Shenzhen Polytechnic University (Grant No. 6023312034K), the Post-doctoral Later-stage of Shenzhen Polytechnic University (Grant No. 6023271021K), the National Natural Science Foundation of China (Grant Nos. 71973116, 11971323, and 12031016), and the Beijing Natural Science Foundation (Grant No. Z210003).
Abstract: Ridge regression is an effective tool for handling multicollinearity in regression. It is also an essential shrinkage and regularization method and is widely used in big data and distributed data applications. The divide-and-conquer trick, which combines the estimators from each subset with equal weight, is commonly applied to distributed data. To overcome multicollinearity and improve estimation accuracy in the presence of distributed data, we propose a Mallows-type model averaging method for ridge regressions, which combines estimators from all subsets. Our method is proved to be asymptotically optimal while allowing the number of subsets and the dimension of variables to diverge. The consistency of the resultant weight estimators, which tend to the theoretically optimal weights, is also derived. Furthermore, the asymptotic normality of the model averaging estimator is demonstrated. Our simulation study and real data analysis show that the proposed model averaging method often performs better than commonly used model selection and model averaging methods in distributed data cases.
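The equal-weight divide-and-conquer baseline mentioned in the abstract above can be sketched in a few lines. This shows only that baseline, not the Mallows-type weighting the paper proposes; the number of subsets, the penalty `lam`, and the synthetic data are assumptions.

```python
import numpy as np

def ridge(x, y, lam):
    """Closed-form ridge estimator on one data subset."""
    p = x.shape[1]
    return np.linalg.solve(x.T @ x + lam * np.eye(p), x.T @ y)

def dc_ridge(x, y, n_subsets, lam):
    """Fit ridge on each subset, then average the estimates with equal weight."""
    parts = zip(np.array_split(x, n_subsets), np.array_split(y, n_subsets))
    estimates = [ridge(xs, ys, lam) for xs, ys in parts]
    return np.mean(estimates, axis=0)

# Synthetic "distributed" data set (assumption).
rng = np.random.default_rng(4)
beta = np.array([1.0, -1.0, 2.0])
x = rng.normal(size=(600, 3))
y = x @ beta + 0.5 * rng.normal(size=600)

beta_dc = dc_ridge(x, y, n_subsets=4, lam=1.0)
print(beta_dc)
```

A weighted variant would replace `np.mean` with a weighted sum whose weights are chosen by some criterion; how those weights are computed is the subject of the paper and is not sketched here.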
Funding: Project supported by the National Natural Science Foundation of China (Nos. 51705324 and 61702332).
Abstract: We propose a novel indoor positioning algorithm based on the received signal strength (RSS) fingerprint. The proposed algorithm can be divided into three steps: an offline phase in which an advanced clustering (AC) strategy is used, an online phase of approximate localization in which cluster matching is used, and an online phase of precise localization with kernel ridge regression. Specifically, after offline fingerprint collection and similarity measurement, we employ an AC strategy based on the K-medoids clustering algorithm, using additional reference points that are geographically located at the outer cluster boundary to enrich the data of each cluster. During approximate localization, RSS measurements are compared with the cluster radio maps to determine the cluster to which the target most likely belongs. Both the Euclidean distance of the RSSs and the Hamming distance of the coverage vectors between the observations and training records are explored for cluster matching. Then, a kernel-based ridge regression method is used to obtain the final position of the target. The performance of the proposed algorithm is evaluated in two typical indoor environments and compared with that of state-of-the-art algorithms. The experimental results demonstrate the effectiveness and advantages of the proposed algorithm in terms of positioning accuracy and complexity.
Funding: Supported by the National Natural Science Foundation of China (11875075, 11935003, 11975031, 12141501, 12070131001), the China Postdoctoral Science Foundation (2021M700256), the State Key Laboratory of Nuclear Physics and Technology, Peking University (NPT2023ZX01, NPT2023KFY02), and the President's Undergraduate Research Fellowship (PURF) of Peking University.
Abstract: The kernel ridge regression (KRR) method and its extension with odd-even effects (KRRoe) are used to learn the nuclear mass table obtained by the relativistic continuum Hartree-Bogoliubov theory. With respect to the binding energies of 9035 nuclei, the KRR method achieves a root-mean-square deviation of 0.96 MeV, and the KRRoe method remarkably reduces the deviation to 0.17 MeV. By investigating the shell effects, one-nucleon and two-nucleon separation energies, odd-even mass differences, and empirical proton-neutron interactions extracted from the learned binding energies, the ability of the machine learning tool to grasp the known physics is discussed. It is found that the shell effects, evolutions of nucleon separation energies, and empirical proton-neutron interactions are well reproduced by both the KRR and KRRoe methods, although the odd-even mass differences can only be reproduced by the KRRoe method.
Abstract: Using two nuclear models, (i) the relativistic continuum Hartree-Bogoliubov (RCHB) theory and (ii) the Weizsäcker-Skyrme (WS) model WS*, the performances of nine kinds of kernel functions in the kernel ridge regression (KRR) method are investigated by comparing their accuracy in describing the experimental nuclear charge radii and their extrapolation abilities. It is found that, except for the inverse power kernel, all kernels reach the same level of around 0.015-0.016 fm for these two models with the KRR method. The extrapolation ability of each kernel for the neutron-rich region depends on the training data. Our investigation shows that the power kernel and the multiquadric kernel perform better in the RCHB+KRR calculation, and the Gaussian kernel performs better in the WS*+KRR calculation. In addition, the performance of different basis functions in the radial basis function method is investigated for comparison; the results are similar to those of the KRR method. The influence of different kernels on the KRR reconstruction function is discussed by investigating the whole nuclear chart. Finally, the charge radii of some specific isotopic chains are investigated using RCHB+KRR with the power kernel and WS*+KRR with the Gaussian kernel. The charge radii and most of the specific features in these isotopic chains can be reproduced after applying the KRR method.
Funding: Supported by the Natural Science Foundation of Henan Province (252300421290), the National Natural Science Foundation of China (41771438), the Program for Innovative Research Team (in Science and Technology) of Henan University (22IRTSTHN010), and the Postgraduate Education Reform and Quality Improvement Project of Henan Province (HNYJS2020JD14).
Abstract: Soil moisture (SM) is a critical variable in terrestrial ecosystems, especially in arid and semi-arid areas where water sources are limited. Despite its importance, understanding of the spatiotemporal variations and influencing factors of SM in these areas remains insufficient. This study investigated the spatiotemporal variations and influencing factors of SM in arid and semi-arid areas of China using extended triple collocation (ETC), the Mann-Kendall test, the Theil-Sen estimator, ridge regression analysis, and other relevant methods. The following findings were obtained: (1) at the pixel scale, the long-term monthly SM data from the European Space Agency Climate Change Initiative (ESA CCI) exhibited the highest correlation coefficient of 0.794 and the lowest root mean square error (RMSE) of 0.014 m^(3)/m^(3); (2) from 2000 to 2022, the study area experienced a significant increase in annual average SM, at a rate of 0.408×10^(-3) m^(3)/(m^(3)·a). Moreover, higher altitudes showed a notable upward trend, with SM increasing at rates of 0.210×10^(-3) m^(3)/(m^(3)·a) between 1000 and 2000 m, 0.530×10^(-3) m^(3)/(m^(3)·a) between 2000 and 4000 m, and 0.760×10^(-3) m^(3)/(m^(3)·a) at altitudes above 4000 m; (3) land surface temperature (LST), root zone soil moisture (RSM) (10-40 cm depth), and normalized difference vegetation index (NDVI) were identified as the primary factors influencing annual average SM, with relative contributions of 34.37%, 24.16%, and 22.64%, respectively; and (4) the absolute contribution of LST was more significant in subareas at higher altitudes, with average absolute contributions of 0.800×10^(-3) m^(3)/(m^(3)·a) between 2000 and 4000 m and 0.500×10^(-2) m^(3)/(m^(3)·a) above 4000 m. This study reveals the spatiotemporal variations and main influencing factors of SM in Chinese arid and semi-arid areas, highlighting the more pronounced absolute contribution of LST to SM in high-altitude areas, and provides valuable insights for ecological research and water resource management in these areas.
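Two of the trend tools named in the abstract above have compact textbook definitions, sketched below: the Theil-Sen slope is the median of all pairwise slopes, and the Mann-Kendall statistic S counts concordant minus discordant pairs, which is then standardized to a Z score. The series is synthetic (an assumption, not the ESA CCI data), and the variance formula used here is the version without a correction for tied values.

```python
import math
from itertools import combinations
from statistics import median

def theil_sen_slope(t, y):
    """Median of the slopes over all pairs of points."""
    pairs = combinations(range(len(t)), 2)
    return median((y[j] - y[i]) / (t[j] - t[i]) for i, j in pairs)

def mann_kendall_z(y):
    """Standardized Mann-Kendall statistic (no tie correction)."""
    n = len(y)
    s = sum((y[j] > y[i]) - (y[j] < y[i])
            for i, j in combinations(range(n), 2))
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        return (s - 1) / math.sqrt(var_s)
    if s < 0:
        return (s + 1) / math.sqrt(var_s)
    return 0.0

# Synthetic annual series, 2000-2022: linear trend plus a small alternating term.
years = list(range(2000, 2023))
sm = [0.150 + 0.0004 * k + 0.0001 * (-1) ** k for k in range(len(years))]

slope = theil_sen_slope(years, sm)
z = mann_kendall_z(sm)
print(slope, z)
```

A |Z| above 1.96 is conventionally read as a significant monotonic trend at the 5% level; the Theil-Sen slope then gives its magnitude per year.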
Funding: Supported by the National Key Research and Development Program of China (No. 2022YFE03170003) and the National Natural Science Foundation of China (Nos. 12305403 and 12275243).
Abstract: β-ray-induced X-ray spectroscopy (BIXS) is a promising method for tritium detection in solid materials because of its unique advantages, such as large detection depth, nondestructive testing capabilities, and low requirements for sample preparation. However, high-accuracy reconstruction of the tritium depth profile remains a significant challenge for this technique. In this study, a novel reconstruction method based on a backpropagation (BP) neural network algorithm that demonstrates high accuracy, broad applicability, and robust noise resistance is proposed. The average reconstruction error calculated using the BP network (8.0%) was much lower than that obtained using traditional numerical methods (26.5%). In addition, the BP method can accurately reconstruct BIX spectra of samples with an unknown range of tritium and exhibits wide applicability to spectra with various tritium distributions. Furthermore, the BP network demonstrates superior accuracy and stability compared to numerical methods when reconstructing the spectra, with a relative uncertainty ranging from 0 to 10%. This study highlights the advantages of BP networks in accurately reconstructing the tritium depth profile from BIXS and promotes their further application in tritium detection.
Abstract: Ensuring the reliability of wind energy as a dependable source requires overcoming challenges posed by the inherent volatility and stochastic nature of wind patterns. Long-term forecasting provides strategic advantages in managing energy generation projects, enabling the development of effective portfolio management strategies. The primary objective of this study was the development of forecasting methods to support strategic decision-making within the scope of wind energy operations, specifically targeting the Pindaí Wind Complex and its commercial dispatch. The study integrated Big Data analytics, data engineering, and computational techniques through the application of machine learning algorithms, including eXtreme Gradient Boosting, Multilayer Perceptron, Support Vector Regression, Ridge Regression, and Random Forests, aiming to generate forward-looking projections of the complex's energy production for the year 2023. To this end, five supervised machine learning techniques were modeled and implemented. These techniques were grounded in their respective mathematical and structural formulations, and the empirical foundation for modeling was provided by historical power generation data from the Pindaí Wind Complex, combined with high-resolution realized and forecasted meteorological data retrieved via the Open-Meteo API. The models are trained using historical monthly generation data from the Pindaí Wind Complex, which has an installed capacity of 79.9 MW and is located in the northeastern region of Brazil, along with meteorological data from reanalysis models, such as air temperature, relative humidity, precipitation, surface pressure, wind speed at 10 m, wind speed at 100 m, and wind gusts. These methodologies are applied to forecast monthly wind generation for the year 2023, and the outputs are systematically compared using evaluation metrics to determine the most suitable modeling approach. The results highlight the superiority of the Multilayer Perceptron, Support Vector Regression, and eXtreme Gradient Boosting models, which achieved Kling-Gupta Efficiency (KGE) of 0.89, 0.89, and 0.90, mean absolute scaled error (MASE) of 0.29, 0.31, and 0.18, root mean square errors (RMSE) of 0.56, 0.59, and 0.35, and mean absolute errors (MAE) of 0.48, 0.52, and 0.29, respectively.
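The four evaluation metrics quoted in the abstract above can be computed as in the sketch below. The abstract does not state which variants were used, so the definitions here are assumptions: KGE in its original form, 1 − sqrt((r − 1)² + (α − 1)² + (β − 1)²) with α the ratio of standard deviations and β the ratio of means, and MASE scaled by the mean absolute one-step naive forecast error on a training series. The arrays are toy values, not the study's data.

```python
import numpy as np

def rmse(obs, sim):
    """Root mean square error."""
    return float(np.sqrt(np.mean((sim - obs) ** 2)))

def mae(obs, sim):
    """Mean absolute error."""
    return float(np.mean(np.abs(sim - obs)))

def mase(obs, sim, train):
    """MAE scaled by the in-sample one-step naive forecast error."""
    scale = float(np.mean(np.abs(np.diff(train))))
    return mae(obs, sim) / scale

def kge(obs, sim):
    """Kling-Gupta Efficiency: 1 is a perfect match."""
    r = np.corrcoef(obs, sim)[0, 1]          # linear correlation
    alpha = np.std(sim) / np.std(obs)        # variability ratio
    beta = np.mean(sim) / np.mean(obs)       # bias ratio
    return float(1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2))

# Toy observed and forecast values (assumption).
train = np.array([0.0, 1.0, 3.0, 6.0])
obs = np.array([1.0, 2.0, 3.0, 4.0])
sim = np.array([1.0, 2.0, 3.0, 6.0])

print(rmse(obs, sim), mae(obs, sim), mase(obs, sim, train), kge(obs, sim))
```

Lower RMSE, MAE, and MASE are better, while KGE is better the closer it is to 1; a MASE below 1 means the forecast beats the naive one-step benchmark on average.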
Abstract: The capital structure decision is an important issue in corporate finance. Theories show that the corporate debt ratio is determined by many factors. This study conducts empirical work on capital structure theories, focusing on the corporate data of Chinese listed companies and considering their intrinsic characteristics, using principal factor analysis and the ridge regression method. Our results suggest that a firm's debt ratio has a positive relationship with its size, profitability, and operating risk and a negative relationship with its growth and non-debt tax shield, while long-term leverage has a positive relationship with the collateral value of its assets.