111 articles found
Classification of aviation incident causes using LGBM with improved cross-validation (Cited by: 1)
1
Authors: NI Xiaomei, WANG Huawei, CHEN Lingzi, LIN Ruiguan. Journal of Systems Engineering and Electronics (SCIE, CSCD), 2024, No. 2, pp. 396-405, 10 pages.
Aviation accidents are currently one of the leading causes of significant injuries and deaths worldwide. This motivates researchers to investigate aircraft safety using data analysis approaches based on advanced machine learning algorithms. To assess aviation safety and identify the causes of incidents, a classification model with light gradient boosting machine (LGBM) based on the aviation safety reporting system (ASRS) has been developed. It is improved by k-fold cross-validation with a hybrid sampling model (HSCV), which may boost classification performance and maintain data balance. The results show that employing the LGBM-HSCV model can significantly improve accuracy while alleviating data imbalance. The comparative approach comprises a vertical comparison with other cross-validation (CV) methods and a lateral comparison with different fold times. Aside from the comparison, two further CV approaches based on the improved method in this study are discussed: one with a different sampling and folding order, and the other with more CV. According to the assessment indices with different methods, the LGBM-HSCV model proposed here is effective at detecting incident causes. The improved model for imbalanced data categorization proposed here may serve as a point of reference for similar data processing, and the model's accurate identification of civil aviation incident causes can help improve civil aviation safety.
Keywords: aviation safety; imbalance data; light gradient boosting machine (LGBM); cross-validation (CV)
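The idea of combining k-fold cross-validation with hybrid sampling can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not specify the sampler, so `hybrid_resample` is a hypothetical stand-in that undersamples large classes and oversamples small ones to the mean class size, and it is applied to the training part of each fold only. No LGBM model is fitted here; the sketch covers the splitting and balancing step.

```python
import random
from collections import Counter

def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def hybrid_resample(indices, labels, seed=0):
    """Hypothetical hybrid sampler (an assumption, not the paper's method):
    bring every class to the mean class size by undersampling large classes
    and oversampling small ones with replacement."""
    rng = random.Random(seed)
    by_class = {}
    for i in indices:
        by_class.setdefault(labels[i], []).append(i)
    target = round(len(indices) / len(by_class))
    balanced = []
    for members in by_class.values():
        if len(members) >= target:
            balanced.extend(rng.sample(members, target))
        else:
            extra = rng.choices(members, k=target - len(members))
            balanced.extend(members + extra)
    return balanced

def hybrid_sampling_cv(labels, k=5, seed=0):
    """Yield (train, validation) index lists; only the training part is resampled."""
    folds = k_fold_indices(len(labels), k, seed)
    for j, val in enumerate(folds):
        train = [i for m, fold in enumerate(folds) if m != j for i in fold]
        yield hybrid_resample(train, labels, seed + j), val

# Toy imbalanced labels: 90 majority samples, 10 minority samples.
labels = [0] * 90 + [1] * 10
splits = list(hybrid_sampling_cv(labels, k=5))
```

Leaving the validation fold untouched keeps the evaluation on the original class distribution, which is one common design choice; the abstract notes that the order of sampling and folding is itself one of the variants the authors compare.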
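An Adaptive Wavelet Denoising Method Based on Cross-Validation (Cited by: 5)
2
Authors: 黄文清, 戴瑜兴, 李加升. 《湖南大学学报(自然科学版)》 (EI, CAS, CSCD, PKU Core), 2008, No. 11, pp. 40-43, 4 pages.
Threshold selection is critical in wavelet denoising algorithms. An adaptive threshold selection algorithm is proposed. The algorithm first uses cross-validation to split the noisy signal into two sub-signals, one used for thresholding and the other as a reference signal; it then applies a steepest-gradient method to search for an optimal denoising threshold. Simulation and experimental results show that, in the mean-squared-error sense, the proposed algorithm denoises better than the VisuShrink and SureShrink algorithms proposed by Donoho et al., requires no prior information about the noisy signal, and is suited to denoising real signals.
Keywords: wavelet transform; cross-validation; adaptive filtering; threshold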
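The split-and-compare idea in this abstract can be illustrated with a small sketch. The following is an assumption-laden toy, not the authors' algorithm: it uses a Haar transform, splits the signal into even- and odd-indexed samples, soft-thresholds the even half, scores it against the odd half, and picks the threshold by a plain grid search instead of the gradient search the abstract mentions. All function names are hypothetical.

```python
import math

def haar_forward(x):
    """Full Haar decomposition of a list whose length is a power of two."""
    approx, details = list(x), []
    while len(approx) > 1:
        pairs = range(0, len(approx), 2)
        s = [(approx[i] + approx[i + 1]) / math.sqrt(2) for i in pairs]
        d = [(approx[i] - approx[i + 1]) / math.sqrt(2) for i in pairs]
        details.append(d)          # finest level first
        approx = s
    return approx, details

def haar_inverse(approx, details):
    """Invert haar_forward, rebuilding from the coarsest level upward."""
    approx = list(approx)
    for d in reversed(details):
        nxt = []
        for s, w in zip(approx, d):
            nxt += [(s + w) / math.sqrt(2), (s - w) / math.sqrt(2)]
        approx = nxt
    return approx

def soft(w, t):
    """Soft thresholding: shrink the coefficient towards zero by t."""
    return math.copysign(max(abs(w) - t, 0.0), w)

def denoise(x, t):
    a, ds = haar_forward(x)
    return haar_inverse(a, [[soft(w, t) for w in d] for d in ds])

def cv_error(signal, t):
    """Threshold the even-indexed half and score it against the odd-indexed half."""
    even, odd = signal[0::2], signal[1::2]
    est = denoise(even, t)
    return sum((e - o) ** 2 for e, o in zip(est, odd)) / len(odd)

def choose_threshold(signal, candidates):
    """Grid search over candidate thresholds (a simplification)."""
    return min(candidates, key=lambda t: cv_error(signal, t))

# Deterministic toy signal of length 16: a step plus an alternating disturbance.
noisy = [(0.0 if i < 8 else 1.0) + 0.2 * (-1) ** (i // 2) for i in range(16)]
best_t = choose_threshold(noisy, [0.0, 0.1, 0.5])
```

The reference half never enters the thresholding step, which is why no separate estimate of the noise level is needed to score a candidate threshold.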
Cross-Validation, Shrinkage and Variable Selection in Linear Regression Revisited (Cited by: 3)
3
Authors: Hans C. van Houwelingen, Willi Sauerbrei. Open Journal of Statistics, 2013, No. 2, pp. 79-102, 24 pages.
In deriving a regression model, analysts often have to use variable selection, despite problems introduced by data-dependent model building. Resampling approaches are proposed to handle some of the critical issues. In order to assess and compare several strategies, we will conduct a simulation study with 15 predictors and a complex correlation structure in the linear regression model. Using sample sizes of 100 and 400 and estimates of the residual variance corresponding to R² of 0.50 and 0.71, we consider 4 scenarios with varying amount of information. We also consider two examples with 24 and 13 predictors, respectively. We will discuss the value of cross-validation, shrinkage and backward elimination (BE) with varying significance level. We will assess whether 2-step approaches using global or parameterwise shrinkage (PWSF) can improve selected models and will compare results to models derived with the LASSO procedure. Besides MSE, we will use model sparsity and further criteria for model assessment. The amount of information in the data has an influence on the selected models and the comparison of the procedures. None of the approaches was best in all scenarios. The performance of backward elimination with a suitably chosen significance level was not worse compared to the LASSO, and the BE models selected were much sparser, an important advantage for interpretation and transportability. Compared to global shrinkage, PWSF had better performance. Provided that the amount of information is not too small, we conclude that BE followed by PWSF is a suitable approach when variable selection is a key part of data analysis.
Keywords: cross-validation; LASSO; shrinkage; simulation study; variable selection
Using Multiple Risk Factors and Generalized Linear Mixed Models with 5-Fold Cross-Validation Strategy for Optimal Carotid Plaque Progression Prediction
4
Authors: Qingyu Wang, Dalin Tang, Liang Wang, Gador Canton, Zheyang Wu, Thomas S Hatsukami, Kristen L Billiar, Chun Yuan. 《医用生物力学》 (EI, CAS, CSCD, PKU Core), 2019, No. A01, pp. 74-75, 2 pages.
Background: Cardiovascular diseases are closely linked to atherosclerotic plaque development and rupture. Plaque progression prediction is of fundamental significance to cardiovascular research and disease diagnosis, prevention, and treatment. Generalized linear mixed models (GLMM) are an extension of the linear model for categorical responses that takes the correlation among observations into account. Methods: Magnetic resonance image (MRI) data of carotid atherosclerotic plaques were acquired from 20 patients with consent obtained, and 3D thin-layer models were constructed to calculate plaque stress and strain for plaque progression prediction. Data for ten morphological and biomechanical risk factors, including wall thickness (WT), lipid percent (LP), minimum cap thickness (MinCT), plaque area (PA), plaque burden (PB), lumen area (LA), maximum plaque wall stress (MPWS), maximum plaque wall strain (MPWSn), average plaque wall stress (APWS), and average plaque wall strain (APWSn), were extracted from all slices for analysis. Wall thickness increase (WTI), plaque burden increase (PBI) and plaque area increase (PAI) were chosen as three measures for plaque progression. GLMM with a 5-fold cross-validation strategy were used to calculate prediction accuracy for each predictor and identify the optimal predictor with the highest prediction accuracy, defined as the sum of sensitivity and specificity. All 201 MRI slices were randomly divided into 4 training subgroups and 1 verification subgroup. The training subgroups were used for model fitting, and the verification subgroup was used to evaluate the model. All combinations (1023 in total) of the 10 risk factors were fed to GLMM, and the prediction accuracy of each predictor was selected from the point on the ROC (receiver operating characteristic) curve with the highest sum of specificity and sensitivity. Results: LA was the best single predictor for PBI with the highest prediction accuracy (1.3601), and the area under the ROC curve (AUC) was 0.6540, followed by APWSn (1.3363) with AUC = 0.6342. The optimal predictor among all possible combinations for PBI was the combination of LA, PA, LP, WT, MPWS and MPWSn with prediction accuracy = 1.4146 (AUC = 0.7158). LA was once again the best single predictor for PAI with the highest prediction accuracy (1.1846) with AUC = 0.6064, followed by MPWSn (1.1832) with AUC = 0.6084. The combination of PA, PB, WT, MPWS, MPWSn and APWSn gave the best prediction accuracy (1.3025) for PAI, with AUC = 0.6657. PA was the best single predictor for WTI with the highest prediction accuracy (1.2887) with AUC = 0.6415, followed by WT (1.2540) with AUC = 0.6097. The combination of PA, PB, WT, LP, MinCT, MPWS and MPWS was the best predictor for WTI with prediction accuracy 1.3140 and AUC = 0.6552. This indicated that PBI was a more predictable measure than WTI and PAI. The combinational predictors improved prediction accuracy by 9.95%, 4.01% and 1.96% over the best single predictors for PAI, PBI and WTI (AUC values improved by 9.78%, 9.45%, and 2.14%), respectively. Conclusions: The use of GLMM with a 5-fold cross-validation strategy combining both morphological and biomechanical risk factors could potentially improve the accuracy of carotid plaque progression prediction. This study suggests that a linear combination of multiple predictors can provide potential improvement to existing plaque assessment schemes.
Keywords: multiple risk factors; generalized linear mixed models; 5-fold cross-validation strategy; AUC
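Two pieces of arithmetic in this abstract are easy to check with a short sketch: the count of 1023 predictor combinations (all non-empty subsets of 10 risk factors, 2^10 - 1), and the "prediction accuracy" defined as the largest sum of sensitivity and specificity along the ROC curve. The helper names below are hypothetical, and the scores are toy values, not study data; no GLMM is fitted.

```python
from itertools import combinations

FACTORS = ["WT", "LP", "MinCT", "PA", "PB", "LA", "MPWS", "MPWSn", "APWS", "APWSn"]

def all_predictor_subsets(factors):
    """Every non-empty subset of the risk factors."""
    return [c for r in range(1, len(factors) + 1) for c in combinations(factors, r)]

def best_sens_plus_spec(scores, labels):
    """Scan each observed score as a cut-off (predict positive when score >= cut-off)
    and return the largest sensitivity + specificity found."""
    pos = sum(labels)
    neg = len(labels) - pos
    best = 0.0
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= t)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < t)
        best = max(best, tp / pos + tn / neg)
    return best

n_subsets = len(all_predictor_subsets(FACTORS))              # 2**10 - 1
overlapping = best_sens_plus_spec([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
separable = best_sens_plus_spec([0.1, 0.2, 0.8, 0.9], [0, 0, 1, 1])
```

Under this definition the measure ranges up to 2.0 for a perfectly separating predictor, which is why the reported values such as 1.3601 exceed 1.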
ON THE CONSISTENCY OF CROSS-VALIDATION IN NONLINEAR WAVELET REGRESSION ESTIMATION
5
Authors: 张双林, 郑忠国. Acta Mathematica Scientia (SCIE, CSCD), 2000, No. 1, pp. 1-11, 11 pages.
For the nonparametric regression model $Y_{ni} = g(x_{ni}) + \varepsilon_{ni}$, $i = 1, \ldots, n$, with regularly spaced nonrandom design, the authors study the behavior of the nonlinear wavelet estimator of $g(x)$. When the threshold and truncation parameters are chosen by cross-validation on the average squared error, strong consistency for the case of dyadic sample size and moment consistency for arbitrary sample size are established under some regularity conditions.
Keywords: consistency; cross-validation; nonparametric regression; threshold; truncation; wavelet estimator
Augmented robustness in home demand prediction: Integrating statistical loss function with enhanced cross-validation in machine learning hyperparameter optimisation
6
Authors: Banafshe Parizad, Ali Jamali, Hamid Khayyam. Energy and AI, 2025, No. 3, pp. 776-787, 12 pages.
Sustainable forecasting of home energy demand (SFHED) is crucial for promoting energy efficiency, minimizing environmental impact, and optimizing resource allocation. Machine learning (ML) supports SFHED by identifying patterns and forecasting demand. However, conventional hyperparameter tuning methods often rely solely on minimizing average prediction errors, typically through fixed k-fold cross-validation, which overlooks error variability and limits model robustness. To address this limitation, we propose the Optimized Robust Hyperparameter Tuning for Machine Learning with Enhanced Multi-fold Cross-Validation (ORHT-ML-EMCV) framework. This method integrates statistical analysis of k-fold validation errors by incorporating their mean and variance into the optimization objective, enhancing robustness and generalizability. A weighting factor is introduced to balance accuracy and robustness, and its impact is evaluated across a range of values. A novel Enhanced Multi-Fold Cross-Validation (EMCV) technique is employed to automatically evaluate model performance across varying fold configurations without requiring a predefined k value, thereby reducing sensitivity to data splits. Using three evolutionary algorithms, Genetic Algorithm (GA), Particle Swarm Optimization (PSO), and Differential Evolution (DE), we optimize two ensemble models: XGBoost and LightGBM. The optimization process minimizes both mean error and variance, with robustness assessed through cumulative distribution function (CDF) analyses. Experiments on three real-world residential datasets show the proposed method reduces worst-case Root Mean Square Error (RMSE) by up to 19.8% and narrows confidence intervals by up to 25%. Cross-household validations confirm strong generalization, achieving coefficients of determination (R²) of 0.946 and 0.972 on unseen homes. The framework offers a statistically grounded and efficient solution for robust energy forecasting.
Keywords: demand forecast; enhanced k-fold cross-validation; XGBoost; LightGBM; optimisation; robust
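A tuning objective that accounts for both the mean and the variance of fold errors can be sketched as below. The exact functional form is not given in the abstract, so the convex combination used here, the `weight` parameter, and the candidate names are all assumptions made for illustration; the error values are toy numbers.

```python
from statistics import mean, pvariance

def robust_objective(fold_errors, weight=0.5):
    """Assumed form: (1 - weight) * mean error + weight * variance of fold errors.
    weight = 0 recovers the conventional 'minimise the average error' criterion."""
    return (1 - weight) * mean(fold_errors) + weight * pvariance(fold_errors)

def pick_candidate(candidates, weight):
    """Return the name of the hyperparameter candidate with the lowest objective."""
    return min(candidates, key=lambda name: robust_objective(candidates[name], weight))

# Toy per-fold RMSE values for two hypothetical hyperparameter settings.
candidates = {
    "steady": [0.9, 1.0, 1.1],     # slightly higher mean, low spread
    "erratic": [0.5, 1.0, 1.4],    # lower mean, high spread
}
by_mean_only = pick_candidate(candidates, weight=0.0)
by_robust = pick_candidate(candidates, weight=0.5)
```

With these toy numbers the mean-only criterion prefers the erratic setting, while the weighted criterion prefers the steady one, which is the trade-off the weighting factor is meant to expose.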
Kriging Model Averaging Based on Leave-One-Out Cross-Validation Method (Cited by: 1)
7
Authors: FENG Ziheng, ZONG Xianpeng, XIE Tianfa, ZHANG Xinyu. Journal of Systems Science & Complexity (SCIE, EI, CSCD), 2024, No. 5, pp. 2132-2156, 25 pages.
In recent years, the Kriging model has gained wide popularity in various fields such as space geology, econometrics, and computer experiments. As a result, research on this model has proliferated. In this paper, the authors propose a model averaging estimation based on the best linear unbiased prediction of the Kriging model and the leave-one-out cross-validation method, with consideration for model uncertainty. The authors present a weight selection criterion for the model averaging estimation and provide two theoretical justifications for the proposed method. First, the estimated weight based on the proposed criterion is asymptotically optimal in achieving the lowest possible prediction risk. Second, the proposed method asymptotically assigns all weights to the correctly specified models when the candidate model set includes these models. The effectiveness of the proposed method is verified through numerical analyses.
Keywords: asymptotic optimality; best linear unbiased prediction; cross-validation; Kriging model; model averaging
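Credit Evaluation Based on V-fold Cross-Validation and Elman Neural Networks (Cited by: 20)
8
Authors: 吴德胜, 梁樑. 《系统工程理论与实践》 (EI, CSCD, PKU Core), 2004, No. 4, pp. 92-98, 7 pages.
The paper reviews the state of research on corporate credit evaluation and points out the shortcomings of general neural networks when applied to credit evaluation. On this basis, a set of screening principles is proposed for selecting key credit-scoring indicators; a credit evaluation model for Chinese enterprises is then built from these indicators using an Elman recurrent neural network. The scoring performance of the model is studied empirically with the V-fold cross-validation technique.
Keywords: Elman neural network; V-fold cross-validation technique; credit scoring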
On Splitting Training and Validation Set: A Comparative Study of Cross-Validation, Bootstrap and Systematic Sampling for Estimating the Generalization Performance of Supervised Learning (Cited by: 11)
9
Authors: Yun Xu, Royston Goodacre. Journal of Analysis and Testing (EI), 2018, No. 3, pp. 249-262, 14 pages.
Model validation is the most important part of building a supervised model. For building a model with good generalization performance one must have a sensible data splitting strategy, and this is crucial for model validation. In this study, we conducted a comparative study on various reported data splitting methods. The MixSim model was employed to generate nine simulated datasets with different probabilities of misclassification and variable sample sizes. Then partial least squares for discriminant analysis and support vector machines for classification were applied to these datasets. Data splitting methods tested included variants of cross-validation, bootstrapping, bootstrapped Latin partition, the Kennard-Stone algorithm (K-S) and the sample set partitioning based on joint X-Y distances algorithm (SPXY). These methods were employed to split the data into training and validation sets. The estimated generalization performances from the validation sets were then compared with the ones obtained from the blind test sets, which were generated from the same distribution but were unseen by the training/validation procedure used in model construction. The results showed that the size of the data is the deciding factor for the quality of the generalization performance estimated from the validation set. We found that there was a significant gap between the performance estimated from the validation set and the one from the test set for all the data splitting methods employed on small datasets. Such disparity decreased when more samples were available for training/validation, and this is because the models were then moving towards approximations of the central limit theory for the simulated datasets used. We also found that having too many or too few samples in the training set had a negative effect on the estimated model performance, suggesting that it is necessary to have a good balance between the sizes of the training set and validation set to have a reliable estimation of model performance. We also found that systematic sampling methods such as K-S and SPXY generally gave very poor estimates of the model performance, most likely because they are designed to take the most representative samples first and thus leave a rather poorly representative sample set for model performance estimation.
Keywords: cross-validation; bootstrapping; bootstrapped Latin partition; Kennard-Stone algorithm; SPXY; model selection; model validation; partial least squares for discriminant analysis; support vector machines
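To make the remark about systematic sampling concrete, here is a minimal sketch of Kennard-Stone selection as it is commonly described: start from the two most distant samples, then repeatedly add the sample whose nearest already-selected neighbour is farthest away. The function name and the toy one-dimensional data are illustrative assumptions, and the sketch expects `n_select` to be at least 2.

```python
def kennard_stone(points, n_select):
    """Return indices picked by Kennard-Stone selection (squared Euclidean distance)."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    n = len(points)
    # Seed with the pair of points that are farthest apart.
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda ij: dist2(points[ij[0]], points[ij[1]]),
    )
    selected = [first, second]
    remaining = [i for i in range(n) if i not in selected]
    while len(selected) < n_select and remaining:
        # Max-min rule: pick the point farthest from its closest selected point.
        nxt = max(remaining,
                  key=lambda i: min(dist2(points[i], points[s]) for s in selected))
        selected.append(nxt)
        remaining.remove(nxt)
    return selected

points = [(0.0,), (1.0,), (2.0,), (3.0,), (10.0,)]
train_idx = kennard_stone(points, 3)
validation_idx = [i for i in range(len(points)) if i not in train_idx]
```

Because the training set is filled with the samples that span the data first, whatever is left for validation tends to cluster in the interior, which is consistent with the abstract's explanation of why such splits estimate performance poorly.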
PPP-RTK considering the ionosphere uncertainty with cross-validation (Cited by: 5)
10
Authors: Pan Li, Bobin Cui, Jiahuan Hu, Xuexi Liu, Xiaohong Zhang, Maorong Ge, Harald Schuh. Satellite Navigation, 2022, No. 1, pp. 34-46, I0002, 14 pages.
With the high-precision products of satellite orbit and clock, uncalibrated phase delay, and the atmosphere delay corrections, Precise Point Positioning (PPP) based on a Real-Time Kinematic (RTK) network can rapidly achieve centimeter-level positioning accuracy. In the ionosphere-weighted PPP-RTK model, not only the a priori value of the ionosphere but also its precision affect the convergence and accuracy of positioning. This study proposes a method to determine the precision of the interpolated slant ionospheric delay by cross-validation. The new method takes the high temporal and spatial variation into consideration. A distance-dependent function is built to represent the stochastic model of the slant ionospheric delay derived from each reference station, and an error model is built for each reference station on a five-minute piecewise basis. The user can interpolate the ionospheric delay correction and the corresponding precision with an error function related to the distance and time of each reference station. With the European Reference Frame (EUREF) Permanent GNSS (Global Navigation Satellite Systems) network (EPN) and SONEL (Système d'Observation du Niveau des Eaux Littorales) GNSS stations covering most of Europe, the effectiveness of our wide-area ionosphere constraint method for PPP-RTK is validated and compared with the method with a fixed ionosphere precision threshold. It is shown that although the Root Mean Square (RMS) of the interpolated ionosphere error is within 5 cm in most of the areas, it exceeds 10 cm for some areas with sparse reference stations during some periods of time. The convergence time of the 90th percentile is 4.0 and 20.5 min for horizontal and vertical directions using the Global Positioning System (GPS) kinematic solution, respectively, with the proposed method. This convergence is faster than those with the fixed ionosphere precision values of 1, 8, and 30 cm. The improvement with respect to the latter three solutions ranges from 10 to 60%. After integrating the Galileo navigation satellite system (Galileo), the convergence time of the 90th percentile for combined kinematic solutions is 2.0 and 9.0 min, with an improvement of 50.0% and 56.1% for horizontal and vertical directions, respectively, compared with the GPS-only solution. The average convergence times of GPS PPP-RTK for horizontal and vertical directions are 2.0 and 5.0 min, and those of GPS+Galileo PPP-RTK are 1.4 and 3.0 min, respectively.
Keywords: PPP-RTK; ionosphere precision; cross-validation; rapid ambiguity resolution
Convergence rate of cross-validation in nonlinear wavelet regression estimation (Cited by: 1)
11
Authors: Zhang Shuanglin, Zheng Zhongguo. Chinese Science Bulletin (SCIE, EI, CAS), 1999, No. 10, pp. 898-901, 4 pages.
The cross-validation method is used to choose the three smoothing parameters in nonlinear wavelet regression estimators. The strong consistency and convergence rate of cross-validation nonlinear wavelet regression estimators are obtained.
Keywords: wavelet estimation; nonparametric regression estimators; cross-validation; strong consistency
Artificial neural network with a cross-validation approach to blast-induced ground vibration propagation modeling (Cited by: 1)
12
Authors: Gustavo Paneiro, Manuel Rafael. Underground Space (SCIE, EI), 2021, No. 3, pp. 281-289, 9 pages.
Given their technical and economic advantages, explosive substances are widely used in rock mass excavation. However, because of serious environmental restraints, there has been an increasing need to use complex tools to control environmental effects due to blast-induced ground vibrations. In the present study, an artificial neural network (ANN) with k-fold cross-validation was applied to a dataset containing 1114 observations that was obtained from published results; furthermore, quantitative and qualitative parameters were considered for ground vibration amplitude prediction. The best ANN model obtained has a maximum coefficient of determination of 0.840 and a mean absolute error of 5.59, and it comprises 17 input parameters, 12 neurons in a single hidden layer, and a sigmoid transfer function. Compared with the traditional models, the model obtained using the proposed methodology demonstrated better generalization ability. Furthermore, the proposed methodology offers an ANN model with higher prediction ability.
Keywords: rock blasting; excavation; ground vibrations; artificial neural network; k-fold cross-validation; modeling
Robust U-type test for high dimensional regression coefficients using refitted cross-validation variance estimation (Cited by: 1)
13
Authors: GUO WenWen, CHEN YongShuai, CUI HengJian. Science China Mathematics (SCIE, CSCD), 2016, No. 12, pp. 2319-2334, 16 pages.
This paper aims to develop a new robust U-type test for high dimensional regression coefficients using the estimated U-statistic of order two and refitted cross-validation error variance estimation. It is proved that the limiting null distribution of the proposed new test is normal under two kinds of ordinary models. We further study the local power of the proposed test and compare it with other competitive tests for high dimensional data. The idea of the refitted cross-validation approach is utilized to reduce the bias of the sample variance in the estimation of the test statistic. Our theoretical results indicate that the proposed test can have even more substantial power gain than the test by Zhong and Chen (2011) when testing a hypothesis with outlying observations and heavy tailed distributions. We assess the finite-sample performance of the proposed test by examining its size and power via Monte Carlo studies. We also illustrate the application of the proposed test by an empirical analysis of a real data example.
Keywords: high dimension regression; large p small n; refitted cross-validation; variance estimation; U-type test; robust
M-Cross-Validation in Local Median Estimation
14
Authors: Ying YANG. Acta Mathematica Sinica, English Series (SCIE, CSCD), 2006, No. 5, pp. 1565-1582, 18 pages.
An M-cross-validation criterion is proposed for selecting a smoothing parameter in a nonparametric median regression model, for which a uniform weak convergence rate for the M-cross-validated local median estimate and the upper and lower bounds of the smoothing parameter selected by the proposed criterion are established. The main contribution of this study shows a drastic difference from the results obtained with the classical L2- and L1-cross-validation techniques, which lead only to consistency in the sense of the average. Our results are novel and nontrivial from the point of view of mathematics and statistics, and they provide insight and a possibility for practitioners to substitute maximum deviation for average deviation when evaluating the performance of the data-driven technique.
Keywords: local median estimate; cross-validation; nonparametric median regression; smoothing parameter; uniform weak convergence rate
A Note on the Optimality of Generalized Cross-validation Bandwidth Selection in Partially Linear Models with Kernel Smoothing Estimator
15
Authors: Wang-li Xu. Acta Mathematicae Applicatae Sinica (SCIE, CSCD), 2006, No. 2, pp. 345-352, 8 pages.
The issue of bandwidth selection in the kernel smoothing method is considered within the context of partially linear models. In this paper, we study the asymptotic behavior of the bandwidth choice based on the generalized cross-validation (GCV) approach and prove that this bandwidth choice is asymptotically optimal. Numerical simulations are also conducted to investigate the empirical performance of generalized cross-validation.
Keywords: generalized cross-validation; partially linear model; kernel smoothing; bandwidth selection
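For orientation, the generalized cross-validation criterion is usually written in the following textbook form. The notation is an assumption, since the abstract gives no formulas: the partially linear model is taken as $Y_i = X_i^\top \beta + g(T_i) + \varepsilon_i$, and $A(h)$ denotes the hat matrix that maps the responses to the fitted values for bandwidth $h$.

```latex
% Fitted values for bandwidth h (A(h) is the assumed hat matrix):
\hat{Y}(h) = A(h)\, Y
% Generalized cross-validation score and the selected bandwidth:
\mathrm{GCV}(h) = \frac{n^{-1} \lVert (I - A(h))\, Y \rVert^{2}}
                       {\bigl[\, n^{-1} \operatorname{tr}\bigl(I - A(h)\bigr) \bigr]^{2}},
\qquad
\hat{h}_{\mathrm{GCV}} = \arg\min_{h} \mathrm{GCV}(h)
```

The numerator is the average squared residual and the denominator penalises the effective number of parameters $\operatorname{tr} A(h)$, so very small bandwidths that nearly interpolate the data are not automatically preferred.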
Detection and analysis of Spartina alterniflora in Chongming East Beach using Sentinel-2 imagery and image texture features
16
Authors: Xinyu Mei, Zhongbiao Chen, Runxia Sun, Yijun He. Acta Oceanologica Sinica, 2025, No. 2, pp. 80-90, 11 pages.
Spartina alterniflora is now listed among the world's 100 most dangerous invasive species, severely affecting the ecological balance of coastal wetlands. Remote sensing technologies based on deep learning enable large-scale monitoring of Spartina alterniflora, but they require large datasets and have poor interpretability. A new method is proposed to detect Spartina alterniflora from Sentinel-2 imagery. First, to capture the high canopy cover and dense community characteristics of Spartina alterniflora, multi-dimensional shallow features are extracted from the imagery. Second, to detect different objects from satellite imagery, index features are extracted, and the statistical features of the Gray-Level Co-occurrence Matrix (GLCM) are derived using principal component analysis. Then, ensemble learning methods, including random forest, extreme gradient boosting, and light gradient boosting machine models, are employed for image classification. Meanwhile, Recursive Feature Elimination with Cross-Validation (RFECV) is used to select the best feature subset. Finally, to enhance the interpretability of the models, the best features are utilized to classify multi-temporal images, and SHapley Additive exPlanations (SHAP) is combined with these classifications to explain the model prediction process. The method is validated using Sentinel-2 imagery and previous observations of Spartina alterniflora on Chongming Island. It is found that the model combining image texture features such as GLCM covariance can improve the detection accuracy of Spartina alterniflora by about 8% compared with the model without image texture features. Through multiple model comparisons and feature selection via RFECV, the selected model and eight features demonstrated good classification accuracy when applied to data from different time periods, proving that feature reduction can effectively enhance model generalization. Additionally, visualizing model decisions using SHAP revealed that the image texture feature component_1_GLCMVariance is particularly important for identifying each land cover type.
Keywords: texture features; Recursive Feature Elimination with Cross-Validation (RFECV); SHapley Additive exPlanations (SHAP); Sentinel-2 time-series imagery; multi-model comparison
Risk assessment of rockburst using SMOTE oversampling and integration algorithms under GBDT framework (Cited by: 2)
17
Authors: WANG Jia-chuang, DONG Long-jun. Journal of Central South University (SCIE, EI, CAS, CSCD), 2024, No. 8, pp. 2891-2915, 25 pages.
Rockburst is a common geological disaster in underground engineering, which seriously threatens the safety of personnel, equipment and property. Utilizing machine learning models to evaluate the risk of rockburst is gradually becoming a trend. In this study, the integrated algorithms under the Gradient Boosting Decision Tree (GBDT) framework were used to evaluate and classify rockburst intensity. First, a total of 301 rockburst data samples were obtained from a case database, and the data were preprocessed using the synthetic minority over-sampling technique (SMOTE). Then, rockburst evaluation models including GBDT, eXtreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LightGBM), and Categorical Features Gradient Boosting (CatBoost) were established, and the optimal hyperparameters of the models were obtained through random search grid and five-fold cross-validation. Afterwards, the optimal hyperparameter configuration was used to fit the evaluation models, and these models were analyzed using the test set. In order to evaluate the performance, metrics including accuracy, precision, recall, and F1-score were selected to analyze and compare with other machine learning models. Finally, the trained models were used to conduct rockburst risk assessment on rock samples from a mine in Shanxi Province, China, providing theoretical guidance for the mine's safe production work. The models under the GBDT framework perform well in the evaluation of rockburst levels, and the proposed methods can provide a reliable reference for rockburst risk level analysis and safety management.
Keywords: rockburst evaluation; SMOTE oversampling; random search grid; k-fold cross-validation; confusion matrix
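The core step of SMOTE, as it is commonly described, creates a synthetic minority sample by interpolating between a minority sample and one of its nearest minority neighbours. The sketch below illustrates only that interpolation; the function name, the neighbour count `k`, and the toy points are assumptions and not the configuration used in the study.

```python
import random

def smote(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic points by interpolating between a randomly chosen
    minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)

    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: dist2(base, p))[:k]
        nb = rng.choice(neighbours)
        u = rng.random()  # position along the segment from base to neighbour
        synthetic.append(tuple(b + u * (q - b) for b, q in zip(base, nb)))
    return synthetic

# Toy minority class: the four corners of the unit square.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote(minority, n_new=5)
```

Every synthetic point lies on a segment between two existing minority samples, so the oversampled class stays inside the region the original samples already occupy.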
Multi-environment BSA-seq using large F3 populations is able to achieve reliable QTL mapping with high power and resolution: An experimental demonstration in rice
18
作者 Yan Zheng Ei Ei Khine +9 位作者 Khin Mar Thi Ei Ei Nyein Likun Huang Lihui Lin Xiaofang Xie Min Htay Wai Lin Khin Than Oo Myat Myat Moe San San Aye Weiren Wu 《The Crop Journal》 SCIE CSCD 2024年第2期549-557,共9页
Bulked-segregant analysis by deep sequencing(BSA-seq) is a widely used method for mapping QTL(quantitative trait loci) due to its simplicity, speed, cost-effectiveness, and efficiency. However, the ability of BSA-seq ... Bulked-segregant analysis by deep sequencing(BSA-seq) is a widely used method for mapping QTL(quantitative trait loci) due to its simplicity, speed, cost-effectiveness, and efficiency. However, the ability of BSA-seq to detect QTL is often limited by inappropriate experimental designs, as evidenced by numerous practical studies. Most BSA-seq studies have utilized small to medium-sized populations, with F2populations being the most common choice. Nevertheless, theoretical studies have shown that using a large population with an appropriate pool size can significantly enhance the power and resolution of QTL detection in BSA-seq, with F_(3)populations offering notable advantages over F2populations. To provide an experimental demonstration, we tested the power of BSA-seq to identify QTL controlling days from sowing to heading(DTH) in a 7200-plant rice F_(3)population in two environments, with a pool size of approximately 500. Each experiment identified 34 QTL, an order of magnitude greater than reported in most BSA-seq experiments, of which 23 were detected in both experiments, with 17 of these located near41 previously reported QTL and eight cloned genes known to control DTH in rice. These results indicate that QTL mapping by BSA-seq in large F_(3)populations and multi-environment experiments can achieve high power, resolution, and reliability. 展开更多
Keywords: BSA-seq; QTL mapping; Large F3 population; Multi-environment experiment; cross-validation
A Novel Optimized Deep Convolutional Neural Network for Efficient Seizure Stage Classification
19
Authors: Umapathi Krishnamoorthy, Shanmugam Jagan, Mohammed Zakariah, Abdulaziz S. Almazyad, K. Gurunathan. Computers, Materials & Continua (SCIE, EI), 2024, Issue 12, pp. 3903-3926 (24 pages)
Brain signal analysis from electroencephalogram (EEG) recordings is the gold standard for diagnosing various neural disorders, especially epileptic seizures. Seizure signals are highly chaotic compared to normal brain signals and thus can be identified from EEG recordings. In the current seizure detection and classification landscape, most models focus on binary classification, distinguishing between seizure and non-seizure states. While effective for basic detection, these models fail to address the nuanced stages of seizures and the intervals between them. Accurate identification of pre-seizure or interictal stages and the timing between seizures is crucial for an effective seizure alert system. This granularity is essential for improving patient-specific interventions and developing proactive seizure management strategies. This study addresses this gap by proposing a novel AI-based approach for seizure stage classification using a Deep Convolutional Neural Network (DCNN). The developed model goes beyond traditional binary classification by categorizing EEG recordings into three distinct classes, thus providing a more detailed analysis of seizure stages. To enhance the model's performance, we have optimized the DCNN using two advanced techniques: the Stochastic Gradient Algorithm (SGA) and the evolutionary Genetic Algorithm (GA). These optimization strategies are designed to fine-tune the model's accuracy and robustness. Moreover, k-fold cross-validation is used to assess the model's reliability and generalizability across different data sets. Trained and validated on the Bonn EEG data sets, the proposed optimized DCNN model achieved a test accuracy of 93.2%, demonstrating its ability to accurately classify EEG signals. In summary, the key advancement of the present research lies in addressing the limitations of existing models by providing a more detailed seizure classification system, thus potentially enhancing the effectiveness of real-time seizure prediction and management systems in clinical settings. With its inherent classification performance, the proposed approach represents a significant step forward in improving patient outcomes through advanced AI techniques.
Keywords: Bonn EEG dataset; cross-validation; genetic algorithm; batch normalization; seizure classification; stochastic gradient
Height-diameter models for King Boris fir (Abies borisii regis Mattf.) and Scots pine (Pinus sylvestris L.) in Olympus and Pieria Mountains, Greece
20
Authors: Dimitrios I. RAPTIS, Dimitra PAPADOPOULOU, Angeliki PSARRA, Athanasios A. FALLIAS, Aristides G. TSITSANIS, Vassiliki KAZANA. Journal of Mountain Science (SCIE, CSCD), 2024, Issue 5, pp. 1475-1490 (16 pages)
In forest science and practice, the total tree height is one of the basic morphometric attributes at the tree level and it has been closely linked with important stand attributes. In the current research, sixteen nonlinear functions for height prediction were tested in terms of their fitting ability against samples of Abies borisii regis and Pinus sylvestris trees from mountainous forests in central Greece. The fitting procedure was based on generalized nonlinear weighted regression. At the final stage, a five-quantile nonlinear height-diameter model was developed for both species through a quantile regression approach, to estimate the entire conditional distribution of tree height, enabling the evaluation of the diameter impact at various quantiles and providing a comprehensive understanding of the proposed relationship across the distribution. The results showed that, with diameter as the sole independent variable, the 3-parameter Hossfeld function and the 2-parameter Näslund function explained approximately 84.0% and 81.7% of the total height variance for King Boris fir and Scots pine, respectively. Furthermore, the models exhibited low levels of error in both cases (2.310 m for the fir and 3.004 m for the pine), yielding unbiased predictions for both fir (-0.002 m) and pine (-0.004 m). Notably, all the required assumptions for homogeneity and normality of the associated residuals were met through the weighting procedure, while the quantile regression approach provided additional insights into the height-diameter allometry of the specific species. The proposed models can serve as valuable tools for operational forest management planning, particularly for wood production and conservation of mountainous forest ecosystems.
Keywords: Generalized nonlinear weighted regression; Monte Carlo cross-validation; Mountainous ecosystems; Quantile regression; Central Greece
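As a rough illustration of the 2-parameter Näslund function named in the abstract above, the sketch below uses the commonly cited form h = 1.3 + d^2 / (a + b*d)^2, where d is diameter at breast height. The abstract does not state the exact parameterization or the fitted coefficients, so the functional form, the 1.3 m breast height, and the values of `a` and `b` are assumptions chosen only for demonstration.

```python
def naslund_height(dbh_cm: float, a: float, b: float,
                   breast_height_m: float = 1.3) -> float:
    """Predict total tree height (m) from diameter at breast height (cm).

    Assumed form: h = breast_height + d^2 / (a + b*d)^2.
    The coefficients a and b are hypothetical, not taken from the paper.
    """
    if dbh_cm <= 0:
        raise ValueError("dbh_cm must be positive")
    denominator = a + b * dbh_cm
    if denominator == 0:
        raise ValueError("a + b * dbh_cm must be non-zero")
    return breast_height_m + (dbh_cm ** 2) / (denominator ** 2)


if __name__ == "__main__":
    # Illustrative coefficients only; real values come from fitting field data.
    a_demo, b_demo = 2.0, 0.2
    for dbh in (10.0, 20.0, 40.0):
        print(f"dbh = {dbh:5.1f} cm -> height = "
              f"{naslund_height(dbh, a_demo, b_demo):.2f} m")
```

With positive coefficients, this form rises monotonically with diameter and levels off toward 1.3 + 1/b^2, which is one reason two-parameter curves of this kind are popular for height-diameter work.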