Journal literature search: 7,132 articles found
Variable Selection and Parameter Estimation in Distributed High-Dimensional Quantile Regression with Responses Missing at Random
1
Authors: CHEN Dan, CHEN Ruijing, TANG Jiarui, LI Huimin. Journal of Systems Science & Complexity, 2026, No. 1, pp. 385-409 (25 pages)
Quantile regression (QR) has become an important tool for measuring the dependence of a response variable's quantiles on a number of predictors for heterogeneous data, especially data with heavy tails and outliers. However, statistical inference for distributed high-dimensional QR with missing data is quite challenging because of the distributed nature, sparsity and missingness of the data and the non-differentiable quantile loss function. To overcome this challenge, this paper develops a communication-efficient method to select variables and estimate parameters by using a smooth function to approximate the non-differentiable quantile loss function and by incorporating inverse probability weighting and a penalty function. The proposed approach has three merits. First, it is both computationally and communicationally efficient because only the first- and second-order information of the approximate objective function is communicated at each iteration. Second, the proposed estimators possess the oracle property after a limited number of iterations, without any constraint on the number of machines. Third, the proposed method simultaneously selects variables and estimates parameters within a distributed framework, ensuring robustness to the specified response probability or propensity score function of the missing-data mechanism. Simulation studies and a real example illustrate the effectiveness of the proposed methodologies.
Keywords: distributed estimator; high-dimensional model; missing at random; quantile regression; variable selection
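Note on entry 1 (an illustrative sketch, not the authors' implementation): the abstract does not say which smooth function replaces the quantile loss. The Python below assumes one common choice, the logistic smoothing tau*u + h*log(1 + exp(-u/h)) with bandwidth h, and weights each observed response by the inverse of an assumed known propensity. The function names and the bandwidth are hypothetical; the penalty term and the distributed updates are omitted.

```python
import math

def check_loss(u, tau):
    """Standard (non-differentiable) quantile check loss."""
    return u * (tau - (1.0 if u < 0 else 0.0))

def smoothed_check_loss(u, tau, h):
    """Logistic smoothing of the check loss (an assumed choice of smoother).

    Equals tau*u + h*log(1 + exp(-u/h)), computed in a numerically stable way.
    """
    x = -u / h
    softplus = max(x, 0.0) + math.log1p(math.exp(-abs(x)))
    return tau * u + h * softplus

def ipw_smoothed_objective(residuals, observed, propensities, tau, h):
    """Average smoothed loss with inverse probability weights.

    observed[i] is True when response i is available; propensities[i] is the
    (assumed known) probability that response i is observed.
    """
    total = 0.0
    for r, d, p in zip(residuals, observed, propensities):
        if d:  # missing responses contribute nothing
            total += smoothed_check_loss(r, tau, h) / p
    return total / len(residuals)
```

As h shrinks, the smoothed loss approaches the check loss, while staying differentiable at zero, which is what makes gradient and Hessian information available to communicate.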
Test for Varying-Coefficient Models with High-Dimensional Data
2
Authors: YANG Lin, GAO Yuzhao, QU Lianqiang. Journal of Systems Science & Complexity, 2026, No. 1, pp. 203-229 (27 pages)
The authors consider hypothesis testing in varying-coefficient regression models with high-dimensional data. Using kernel smoothing techniques, the authors propose a locally concerned U-statistic method to assess the overall significance of the coefficients. The authors establish that the proposed test is asymptotically normal under both the null hypothesis and local alternatives. Based on the locally concerned U-statistic, the authors further develop a globally concerned U-statistic to test whether the coefficient function is zero. A stochastic perturbation method is employed to approximate the distribution of the globally concerned test statistic. Monte Carlo simulations demonstrate the validity of the proposed test in finite samples.
Keywords: hypothesis testing; high-dimensional data; kernel smoothing; U-statistic; varying-coefficient models
Adaptive feature selection method for high-dimensional imbalanced data classification
3
Authors: WU Jianzhen, XUE Zhen, ZHANG Liangliang, YANG Xu. Journal of Measurement Science and Instrumentation, 2025, No. 4, pp. 612-624 (13 pages)
Data collected in fields such as cybersecurity and biomedicine are often high-dimensional and class-imbalanced. To address the low classification accuracy for minority class samples caused by numerous irrelevant and redundant features in high-dimensional imbalanced data, we proposed a novel feature selection method named AMF-SGSK, based on adaptive multi-filter and subspace-based gaining sharing knowledge. Firstly, a balanced dataset was obtained by random under-sampling. Secondly, combining the feature importance score with the AUC score for each filter method, we proposed a concept called feature hardness to judge the importance of each feature, which could adaptively select the essential features. Finally, the optimal feature subset was obtained by gaining sharing knowledge in multiple subspaces. This approach effectively achieved dimensionality reduction for high-dimensional imbalanced data. The experimental results on 30 benchmark imbalanced datasets showed that AMF-SGSK performed better than eight other commonly used algorithms, including BGWO and IG-SSO, in terms of F1-score, AUC, and G-mean. The mean values of F1-score, AUC, and G-mean for AMF-SGSK are 0.950, 0.967, and 0.965, respectively, the highest among all algorithms. The mean G-mean is higher than those of IG-PSO, ReliefF-GWO, and BGOA by 3.72%, 11.12%, and 20.06%, respectively. Furthermore, the selected feature ratio is below 0.01 across the selected ten datasets, further demonstrating the proposed method’s overall superiority over competing approaches. AMF-SGSK could adaptively remove irrelevant and redundant features and effectively improve the classification accuracy of high-dimensional imbalanced data, providing scientific and technological references for practical applications.
Keywords: high-dimensional imbalanced data; adaptive feature selection; adaptive multi-filter; feature hardness; gaining sharing knowledge based algorithm; metaheuristic algorithm
Ecological Dynamics of a Logistic Population Model with Impulsive Age-selective Harvesting
4
Authors: DAI Xiangjun, JIAO Jianjun. 《应用数学》 (PKU Core journal), 2026, No. 1, pp. 72-79 (8 pages)
In this paper, we establish and study a single-species logistic model with impulsive age-selective harvesting. First, we prove the ultimate boundedness of the solutions of the system. Then, we obtain conditions for the asymptotic stability of the trivial solution and the positive periodic solution. Finally, numerical simulations are presented to validate our results. Our results show that age-selective harvesting is more conducive to sustainable population survival than non-age-selective harvesting.
Keywords: logistic population model; selective harvesting; asymptotic stability; extinction
Influence of different data selection criteria on internal geomagnetic field modeling (cited by: 4)
5
Authors: HongBo Yao, JuYuan Xu, Yi Jiang, Qing Yan, Liang Yin, PengFei Liu. Earth and Planetary Physics, 2025, No. 3, pp. 541-549 (9 pages)
Earth’s internal core and crustal magnetic fields, as measured by geomagnetic satellites such as MSS-1 (Macao Science Satellite-1) and Swarm, are vital for understanding core dynamics and tectonic evolution. To model these internal magnetic fields accurately, data selection based on specific criteria is often employed to minimize the influence of rapidly changing current systems in the ionosphere and magnetosphere. However, the quantitative impact of various data selection criteria on internal geomagnetic field modeling is not well understood. This study aims to address this issue and provide a reference for constructing and applying geomagnetic field models. First, we collect the latest MSS-1 and Swarm satellite magnetic data and summarize widely used data selection criteria in geomagnetic field modeling. Second, we briefly describe the method used to co-estimate the core, crustal, and large-scale magnetospheric fields from satellite magnetic data. Finally, we conduct a series of field modeling experiments with different data selection criteria to quantitatively estimate their influence. Our numerical experiments confirm that without selecting data from dark regions and geomagnetically quiet times, the resulting internal field differences at the Earth’s surface can range from tens to hundreds of nanotesla (nT). Additionally, we find that the uncertainties introduced into field models by different data selection criteria are significantly larger than the measurement accuracy of modern geomagnetic satellites. These uncertainties should be considered when using constructed magnetic field models for scientific research and applications.
Keywords: Macao Science Satellite-1; Swarm; geomagnetic field modeling; data selection; core field; crustal field
A Unified Feature Selection Framework Combining Mutual Information and Regression Optimization for Multi-Label Learning
6
Author: Hyunki Lim. Computers, Materials & Continua, 2026, No. 4, pp. 1262-1281 (20 pages)
High-dimensional data causes difficulties in machine learning because of high time consumption and large memory requirements. In particular, in a multi-label environment, complexity grows with the number of labels. Moreover, an optimization problem that fully considers all dependencies between features and labels is difficult to solve. In this study, we propose a novel regression-based multi-label feature selection method that integrates mutual information to better exploit the underlying data structure. By incorporating mutual information into the regression formulation, the model captures not only linear relationships but also complex non-linear dependencies. The proposed objective function simultaneously considers three types of relationships: (1) feature redundancy, (2) feature-label relevance, and (3) inter-label dependency. These three quantities are computed using mutual information, allowing the proposed formulation to capture nonlinear dependencies among variables. These three types of relationships are key factors in multi-label feature selection, and our method expresses them within a unified formulation, enabling efficient optimization while accounting for all of them at once. To efficiently solve the proposed optimization problem under non-negativity constraints, we develop a gradient-based optimization algorithm with fast convergence. The experimental results on seven multi-label datasets show that the proposed method outperforms existing multi-label feature selection techniques.
Keywords: feature selection; multi-label learning; regression model optimization; mutual information
Engine Failure Prediction on Large-Scale CMAPSS Data Using Hybrid Feature Selection and Imbalance-Aware Learning
7
Authors: Ahmad Junaid, Abid Iqbal, Abuzar Khan, Ghassan Husnain, Abdul-Rahim Ahmad, Mohammed Al-Naeem. Computers, Materials & Continua, 2026, No. 4, pp. 1485-1508 (24 pages)
Most predictive maintenance studies have emphasized accuracy but have paid very little attention to interpretability or deployment readiness. This study improves on prior methods by developing a small yet robust system that can predict when turbofan engines will fail. It uses the NASA CMAPSS dataset, which has over 200,000 engine cycles from 260 engines. The process begins with systematic preprocessing, which includes imputation, outlier removal, scaling, and labelling of the remaining useful life. Dimensionality is reduced using a hybrid selection method that combines variance filtering, recursive elimination, and gradient-boosted importance scores, yielding a stable set of 10 informative sensors. To mitigate class imbalance, minority cases are oversampled, and class-weighted losses are applied during training. Benchmarking is carried out with logistic regression, gradient boosting, and a recurrent design that integrates gated recurrent units with long short-term memory networks. The Long Short-Term Memory–Gated Recurrent Unit (LSTM–GRU) hybrid achieved the strongest performance, with an F1 score of 0.92, precision of 0.93, recall of 0.91, Receiver Operating Characteristic–Area Under the Curve (ROC-AUC) of 0.97, and minority recall of 0.75. Interpretability testing using permutation importance and Shapley values indicates that sensors 13, 15, and 11 are the most important indicators of engine wear. The proposed system combines imbalance handling, feature reduction, and interpretability into a practical design suitable for real industrial settings.
Keywords: predictive maintenance; CMAPSS dataset; feature selection; class imbalance; LSTM-GRU hybrid model; interpretability; industrial deployment
A Novel Hybrid Sine Cosine-Flower Pollination Algorithm for Optimized Feature Selection
8
Authors: Sumbul Azeem, Shazia Javed, Farheen Ibraheem, Uzma Bashir, Nazar Waheed, Khursheed Aurangzeb. Computers, Materials & Continua, 2026, No. 5, pp. 1916-1930 (15 pages)
Data serves as the foundation for training and testing machine learning and artificial intelligence models. The most fundamental part of data is its attributes, or features. The feature set size changes from one dataset to another. Only the relevant features contribute meaningfully to classification accuracy. The presence of irrelevant features reduces the system’s effectiveness. Classification performance often deteriorates on high-dimensional datasets because of the large search space. Thus, one of the significant obstacles affecting the performance of the learning process in the majority of machine learning and data mining techniques is the dimensionality of the datasets. Feature selection (FS) is an effective preprocessing step in classification tasks. The aim of applying FS is to exclude redundant and unrelated features while retaining the most informative ones, in order to improve classification capability and reduce computational complexity. In this paper, a novel hybrid binary metaheuristic algorithm, termed hSC-FPA, is proposed by hybridizing the Flower Pollination Algorithm (FPA) and the Sine Cosine Algorithm (SCA). Hybridization combines the exploration capacity of SCA with the exploitation behavior of FPA to maintain a balanced search process. SCA guides the global search in the early iterations, while FPA’s local pollination refines promising solutions in later stages. A binary conversion mechanism using a threshold function is implemented to handle the discrete nature of the feature selection problem. The proposed hSC-FPA is validated on fourteen standard datasets from the UCI repository using the K-Nearest Neighbors (K-NN) classifier. Experimental results are benchmarked against the standalone SCA and FPA algorithms. The hSC-FPA consistently achieves higher classification accuracy, selects a more compact feature subset, and demonstrates superior convergence behavior. These findings support the stability and superior performance of the hybrid feature selection method presented.
Keywords: classification algorithms; feature selection process; flower pollination algorithm; hybrid model; metaheuristics; multi-objective optimization; search algorithm; sine cosine algorithm
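Note on entry 8 (an illustrative sketch, not the authors' implementation): the abstract mentions a threshold function that turns continuous search positions into feature subsets but gives no details. The Python below assumes positions in [0, 1] and a fixed cut-off of 0.5. The weighted fitness and its weight alpha are a common wrapper-style convention and are purely an assumption here, as are all function names.

```python
def to_binary_mask(position, threshold=0.5):
    """Map a continuous search-agent position to a 0/1 feature mask.

    threshold=0.5 is an assumed cut-off, not a value taken from the paper.
    """
    return [1 if x > threshold else 0 for x in position]

def selected_features(mask):
    """Indices of the features kept by a mask."""
    return [i for i, bit in enumerate(mask) if bit]

def fitness(error_rate, mask, alpha=0.99):
    """Weighted sum of classifier error and fraction of features kept.

    Lower is better; alpha is a hypothetical trade-off weight.
    """
    return alpha * error_rate + (1.0 - alpha) * sum(mask) / len(mask)
```

In a wrapper setting, error_rate would come from a classifier such as K-NN trained only on the columns returned by selected_features.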
Robust Gini covariance matrix estimation for portfolio selection based on a factor model
9
Authors: Yongda Zhu, Lei Shu. 《中国科学技术大学学报》 (PKU Core journal), 2025, No. 8, pp. 59-67, I0002 (10 pages)
Portfolio theory has been extensively studied and applied in finance. To determine the optimal portfolio weight under the global minimum variance strategy, it is necessary to estimate both the covariance matrix and its inverse. However, the high dimensionality and heavy-tailed nature of financial data pose significant challenges to this estimation. In this study, we propose a method to estimate the Gini covariance matrix by introducing a low-rank and sparse correlation structure, as an alternative to the traditional sample covariance matrix. Our approach employs a factor model to capture the low-rank structure, combined with thresholding rules to achieve the final estimation. We demonstrate the consistency of our estimators and validate our approach through simulation experiments and empirical portfolio analyses. Simulation results show that our method is highly applicable across a variety of distributional scenarios. Furthermore, empirical portfolio analysis indicates that our method can construct portfolios with superior performance.
Keywords: elliptical distribution; factor model; Gini covariance matrix; portfolio selection
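Note on entry 9 (an illustrative sketch): under the global minimum variance strategy, the fully invested weights are w = Σ⁻¹1 / (1ᵀΣ⁻¹1), which is why the abstract stresses estimating the inverse covariance. The Python below shows only this last step for a given covariance matrix. It does not implement the Gini covariance estimator, and it assumes the input matrix is positive definite; the helper names are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def gmv_weights(cov):
    """Global minimum variance weights: inv(cov) 1, rescaled to sum to one."""
    z = solve_linear(cov, [1.0] * len(cov))
    total = sum(z)
    return [v / total for v in z]
```

Solving the linear system avoids forming the inverse explicitly; any estimated covariance matrix, including a robust one, could be passed in.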
Generalized Functional Linear Models: Efficient Modeling for High-dimensional Correlated Mixture Exposures
10
Authors: Bingsong Zhang, Haibin Yu, Xin Peng, Haiyi Yan, Siran Li, Shutong Luo, Renhuizi Wei, Zhujiang Zhou, Yalin Kuang, Yihuan Zheng, Chulan Ou, Linhua Liu, Yuehua Hu, Jindong Ni. Biomedical and Environmental Sciences, 2025, No. 8, pp. 961-976 (16 pages)
Objective: Humans are exposed to complex mixtures of environmental chemicals and other factors that can affect their health. Analysis of these mixture exposures presents several key challenges for environmental epidemiology and risk assessment, including high dimensionality, correlated exposure, and subtle individual effects. Methods: We proposed a novel statistical approach, the generalized functional linear model (GFLM), to analyze the health effects of exposure mixtures. GFLM treats the effect of mixture exposures as a smooth function by reordering exposures based on specific mechanisms and capturing internal correlations to provide a meaningful estimation and interpretation. Its robustness and efficiency were evaluated under various scenarios through extensive simulation studies. Results: We applied the GFLM to two datasets from the National Health and Nutrition Examination Survey (NHANES). In the first application, we examined the effects of 37 nutrients on BMI (2011–2016 cycles). The GFLM identified a significant mixture effect, with fiber and fat emerging as the nutrients with the greatest negative and positive effects on BMI, respectively. In the second application, we investigated the association between four per- and polyfluoroalkyl substances (PFAS) and gout risk (2007–2018 cycles). Unlike traditional methods, the GFLM indicated no significant association, demonstrating its robustness to multicollinearity. Conclusion: The GFLM framework is a powerful tool for mixture exposure analysis, offering improved handling of correlated exposures and interpretable results. It demonstrates robust performance across various scenarios and real-world applications, advancing our understanding of complex environmental exposures and their health impacts in environmental epidemiology and toxicology.
Keywords: mixture exposure modeling; functional data analysis; high-dimensional data; correlated exposures; environmental epidemiology
Machine learning for patient selection in corticosteroid decision making in knee osteoarthritis: A feasibility model
11
Authors: Omar Musbahi, Kyriacos Pouris, Savvas Hadjixenophontos, Ahmed Al-Saadawi, Iris Soteriou, Justin Peter Cobb, Gareth G Jones. World Journal of Methodology, 2025, No. 4, pp. 232-240 (9 pages)
BACKGROUND: Relieving pain is central to the early management of knee osteoarthritis, with a plethora of pharmacological agents licensed for this purpose. Intra-articular corticosteroid injections are a widely used option, albeit with variable efficacy. AIM: To develop a machine learning (ML) model that predicts which patients will benefit from corticosteroid injections. METHODS: Data from two prospective cohort studies [Osteoarthritis (OA) Initiative and Multicentre OA Study] were combined. The primary outcome was patient-reported pain score following corticosteroid injection, assessed using the Western Ontario and McMaster Universities OA pain scale, with significant change defined using minimally clinically important difference and meaningful within-person change. An ML algorithm was developed, using linear discriminant analysis, to predict symptomatic improvement and to examine the association between pain scores and patient factors by calculating the sensitivity, specificity, positive predictive value, negative predictive value, accuracy, and F2 score. RESULTS: A total of 330 patients were included, with a mean age of 63.4 (SD: 8.3). The mean Western Ontario and McMaster Universities OA pain score was 5.2 (SD: 4.1), with only 25.5% of patients achieving significant improvement in pain following corticosteroid injection. The ML model generated an accuracy of 67.8% (95% confidence interval: 64.6%-70.9%), F1 score of 30.8%, and an area under the curve score of 0.60. CONCLUSION: The model demonstrated feasibility to assist clinicians with decision-making in patient selection for corticosteroid injections. Further studies are required to improve the model prior to testing in clinical settings.
Keywords: knee osteoarthritis; machine learning; predictive modelling; corticosteroid injection; patient selection
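Note on entry 11 (an illustrative sketch): the evaluation measures named in the abstract all follow from the four cells of a binary confusion matrix. The Python below computes them using the standard definitions, with the F-beta score reducing to F1 at beta=1 and F2 at beta=2. The function name is hypothetical, the counts in the usage line are made up for illustration and are not taken from the study, and the sketch assumes no denominator is zero.

```python
def diagnostic_metrics(tp, fp, tn, fn, beta=2.0):
    """Standard binary-classification measures from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # recall, true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    ppv = tp / (tp + fp)                  # positive predictive value (precision)
    npv = tn / (tn + fn)                  # negative predictive value
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    b2 = beta * beta
    f_beta = (1 + b2) * ppv * sensitivity / (b2 * ppv + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "accuracy": accuracy,
        "f_beta": f_beta,
    }

# Hypothetical counts, for illustration only.
example = diagnostic_metrics(tp=20, fp=10, tn=60, fn=10)
```

With beta above 1 the score weights sensitivity more heavily than precision, which matters when responders are the minority class, as they are here.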
Subgroup Analysis of a Single-Index Threshold Penalty Quantile Regression Model Based on Variable Selection
12
Authors: QI Hui, XUE Yaxin. Wuhan University Journal of Natural Sciences, 2025, No. 2, pp. 169-183 (15 pages)
In clinical research, subgroup analysis can help identify patient groups that respond better or worse to specific treatments and can improve therapeutic effect and safety, so it is of great significance in precision medicine. This article considers subgroup analysis methods for longitudinal data containing multiple covariates and biomarkers. We divide subgroups based on whether a linear combination of these biomarkers exceeds a predetermined threshold, and we assess the heterogeneity of treatment effects across subgroups using the interaction between subgroups and exposure variables. Quantile regression is used to better characterize the global distribution of the response variable, and sparsity penalties are imposed to achieve variable selection of covariates and biomarkers. The effectiveness of our proposed methodology for both variable selection and parameter estimation is verified through random simulations. Finally, we demonstrate the application of this method by analyzing data from the PA.3 trial, further illustrating the practicality of the method proposed in this paper.
Keywords: longitudinal data; subgroup analysis; threshold model; quantile regression; variable selection
Prediction model of mechanical properties of hot-rolled strip based on improved feature selection method
13
Authors: Zhi-wei Gao, Guang-ming Cao, Si-wei Wu, Deng Luo, Hou-xin Wang, Zhen-yu Liu. Journal of Iron and Steel Research International, 2025, No. 6, pp. 1627-1640 (14 pages)
Selecting proper descriptors (also known as feature selection, FS) is key when establishing a mechanical property prediction model for hot-rolled microalloyed steels with machine learning (ML) algorithms. Data-driven FS methods can reduce the redundancy of data features and improve the prediction accuracy of mechanical properties. Based on the collected data of hot-rolled microalloyed steels, association rules are used to mine the correlation information in the data. High-quality feature subsets are selected by the proposed FS method (an FS method based on genetic algorithm embedding, GAMIC). Compared with common FS methods, GAMIC is shown on the dataset to select more appropriate feature subsets. Six different ML algorithms are trained and tested for mechanical property prediction. The results show that, with the extreme gradient boosting (XGBoost) algorithm, the root-mean-square errors of yield strength, tensile strength and elongation are 21.95 MPa, 20.85 MPa and 1.96%, the coefficients of determination (R^2) are 0.969, 0.968 and 0.830, and the mean absolute errors are 16.84 MPa, 15.83 MPa and 1.48%, respectively, the best prediction performance among the algorithms tested. Finally, SHapley Additive exPlanation is used to further explore the influence of feature variables on mechanical properties. The proposed GAMIC feature selection method is universal and provides a basis for the development of high-precision mechanical property prediction models.
Keywords: feature selection; data-driven model; hot-rolled microalloyed steel; mechanical property; machine learning
Feature selection for determining input parameters in antenna modeling
14
Authors: LIU Zhixian, SHAO Wei, CHENG Xi, OU Haiyan, DING Xiao. Journal of Systems Engineering and Electronics, 2025, No. 1, pp. 15-23 (9 pages)
In this paper, a feature selection method for determining input parameters in antenna modeling is proposed. In antenna modeling, the input features of the artificial neural network (ANN) are geometric parameters. The selection criteria comprise the correlation and the sensitivity between each geometric parameter and the electromagnetic (EM) response. The maximal information coefficient (MIC), an exploratory data mining tool, is introduced to evaluate both linear and nonlinear correlations. The EM response range is used to evaluate the sensitivity. A wide response range over the varying values of a parameter implies that the parameter is highly sensitive, and a narrow response range suggests that the parameter is insensitive. Only parameters that are both highly correlated and sensitive are selected as inputs of the ANN, so the sampling space of the model is greatly reduced. The modeling of a wideband, circularly polarized antenna is studied as an example to verify the effectiveness of the proposed method. The number of input parameters decreases from 8 to 4. The testing errors of |S11| and axial ratio are reduced by 8.74% and 8.95%, respectively, compared with the ANN with no feature selection.
Keywords: antenna modeling; artificial neural network (ANN); feature selection; maximal information coefficient (MIC)
FSFS: A Novel Statistical Approach for Fair and Trustworthy Impactful Feature Selection in Artificial Intelligence Models
15
Authors: Ali Hamid Farea, Iman Askerzade, Omar H. Alhazmi, Savas Takan. Computers, Materials & Continua, 2025, No. 7, pp. 1457-1484 (28 pages)
Feature selection (FS) is a pivotal pre-processing step in developing data-driven models, influencing reliability, performance and optimization. Although existing FS techniques can yield high-performance metrics for certain models, they do not invariably guarantee the extraction of the most critical or impactful features. Prior literature underscores the significance of equitable FS practices and has proposed diverse methodologies for the identification of appropriate features. However, the challenge of discerning the most relevant and influential features persists, particularly in the context of the exponential growth and heterogeneity of big data, a challenge that is increasingly salient in modern artificial intelligence (AI) applications. In response, this study introduces an innovative, automated statistical method termed Farea Similarity for Feature Selection (FSFS). The FSFS approach computes a similarity metric for each feature by benchmarking it against the record-wise mean, thereby finding feature dependencies and mitigating the influence of outliers that could potentially distort evaluation outcomes. Features are subsequently ranked according to their similarity scores, with the threshold established at the average similarity score. Notably, lower FSFS values indicate higher similarity and stronger data correlations, whereas higher values suggest lower similarity. The FSFS method is designed not only to yield reliable evaluation metrics but also to reduce data complexity without compromising model performance. Comparative analyses were performed against several established techniques, including Chi-squared (CS), Correlation Coefficient (CC), Genetic Algorithm (GA), Exhaustive Approach, Greedy Stepwise Approach, Gain Ratio, and Filtered Subset Eval, using a variety of datasets such as the Experimental Dataset, Breast Cancer Wisconsin (Original), KDD CUP 1999, NSL-KDD, UNSW-NB15, and Edge-IIoT. In the absence of the FSFS method, the highest classifier accuracies observed were 60.00%, 95.13%, 97.02%, 98.17%, 95.86%, and 94.62% for the respective datasets. When the FSFS technique was integrated with data normalization, encoding, balancing, and feature importance selection processes, accuracies changed to 100.00%, 97.81%, 98.63%, 98.94%, 94.27%, and 98.46%, respectively, an improvement on every dataset except UNSW-NB15. The FSFS method, with a computational complexity of O(fn log n), demonstrates robust scalability and is well-suited for datasets of large size, ensuring efficient processing even when the number of features is substantial. By automatically eliminating outliers and redundant data, FSFS reduces computational overhead, resulting in faster training and improved model performance. Overall, the FSFS framework not only optimizes performance but also enhances the interpretability and explainability of data-driven models, thereby facilitating more trustworthy decision-making in AI applications.
Keywords: artificial intelligence; big data; feature selection; FSFS; models; trustworthy; similarity-based feature ranking; explainable artificial intelligence (XAI)
A three-stage series model predictive torque and flux control system based on fast optimal voltage vector selection for more electric aircraft
16
作者 Zhaoyang FU Lixian PENG +2 位作者 Shuangrui PING Lefei GE Weilin LI 《Chinese Journal of Aeronautics》 2025年第11期315-328,共14页
With the development of More Electric Aircraft(MEA),the Permanent Magnet Synchronous Motor(PMSM)is widely used in the MEA field.The PMSM control system of MEA needs to consider the system reliability,and the inverter ... With the development of More Electric Aircraft(MEA),the Permanent Magnet Synchronous Motor(PMSM)is widely used in the MEA field.The PMSM control system of MEA needs to consider the system reliability,and the inverter switching frequency of the inverter is one of the impacting factors.At the same time,the control accuracy of the system also needs to be considered,and the torque ripple and flux ripple are usually considered to be its important indexes.This paper proposes a three-stage series Model Predictive Torque and Flux Control system(three-stage series MPTFC)based on fast optimal voltage vector selection to reduce switching frequency and suppress torque ripple and flux ripple.Firstly,the analytical model of the PMSM is established and the multi-stage series control method is used to reduce the switching frequency.Secondly,selectable voltage vectors are extended from 8 to 26 and a fast selection method for optimal voltage vector sectors is designed based on the hysteresis comparator,which can suppress the torque ripple and flux ripple to improve the control accuracy.Thirdly,a three-stage series control is obtained by expanding the two-stage series control using the P-Q torque decomposition theory.Finally,a model predictive torque and flux control experimental platform is built,and the feasibility and effectiveness of this method are verified through comparison experiments. 展开更多
Keywords: Fast optimal voltage vector selection; model predictive control; permanent magnet synchronous motor; ripple suppression; switching frequency
A Length-Adaptive Non-Dominated Sorting Genetic Algorithm for Bi-Objective High-Dimensional Feature Selection (cited by: 2)
17
Authors: Yanlu Gong, Junhai Zhou, Quanwang Wu, MengChu Zhou, Junhao Wen. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2023, No. 9, pp. 1834-1844 (11 pages)
As a crucial data preprocessing method in data mining, feature selection (FS) can be regarded as a bi-objective optimization problem that aims to maximize classification accuracy and minimize the number of selected features. Evolutionary computing (EC) is promising for FS owing to its powerful search capability. However, in traditional EC-based methods, feature subsets are represented via a length-fixed individual encoding. It is ineffective for high-dimensional data, because it results in a huge search space and prohibitive training time. This work proposes a length-adaptive non-dominated sorting genetic algorithm (LA-NSGA) with a length-variable individual encoding and a length-adaptive evolution mechanism for bi-objective high-dimensional FS. In LA-NSGA, an initialization method based on correlation and redundancy is devised to initialize individuals of diverse lengths, and a Pareto dominance-based length change operator is introduced to guide individuals to explore promising search space adaptively. Moreover, a dominance-based local search method is employed for further improvement. The experimental results based on 12 high-dimensional gene datasets show that the Pareto front of feature subsets produced by LA-NSGA is superior to those of existing algorithms.
Keywords: Bi-objective optimization; feature selection (FS); genetic algorithm; high-dimensional data; length-adaptive
Multi-Objective Equilibrium Optimizer for Feature Selection in High-Dimensional English Speech Emotion Recognition
18
Authors: Liya Yue, Pei Hu, Shu-Chuan Chu, Jeng-Shyang Pan. Computers, Materials & Continua (SCIE, EI), 2024, No. 2, pp. 1957-1975 (19 pages)
Speech emotion recognition (SER) uses acoustic analysis to find features for emotion recognition and examines variations in voice that are caused by emotions. The number of features acquired with acoustic analysis is extremely high, so we introduce a hybrid filter-wrapper feature selection algorithm based on an improved equilibrium optimizer for constructing an emotion recognition system. The proposed algorithm implements multi-objective emotion recognition with the minimum number of selected features and maximum accuracy. First, we use the information gain and Fisher Score to sort the features extracted from signals. Then, we employ a multi-objective ranking method to evaluate these features and assign different importance to them. Features with high rankings have a large probability of being selected. Finally, we propose a repair strategy to address the problem of duplicate solutions in multi-objective feature selection, which can improve the diversity of solutions and avoid falling into local traps. Using random forest and K-nearest neighbor classifiers, four English speech emotion datasets are employed to test the proposed algorithm (MBEO) as well as other multi-objective emotion identification techniques. The results illustrate that it performs well in inverted generational distance, hypervolume, Pareto solutions, and execution time, and MBEO is appropriate for high-dimensional English SER.
Keywords: Speech emotion recognition; filter-wrapper; high-dimensional; feature selection; equilibrium optimizer; multi-objective
Boosted Spider Wasp Optimizer for High-dimensional Feature Selection
19
Authors: Elfadil A. Mohamed, Malik Sh. Braik, Mohammed Azmi Al-Betar, Mohammed A. Awadallah. Journal of Bionic Engineering (CSCD), 2024, No. 5, pp. 2424-2459 (36 pages)
With the increasing dimensionality of the data, High-dimensional Feature Selection (HFS) becomes an increasingly difficult task. It is not simple to find the best subset of features due to the breadth of the search space and the intricacy of the interactions between features. Many of the Feature Selection (FS) approaches now in use for these problems perform significantly less well when faced with such intricate situations involving high-dimensional search spaces. It is demonstrated that meta-heuristic algorithms can provide sub-optimal results in an acceptable amount of time. This paper presents a new binary Boosted version of the Spider Wasp Optimizer (BSWO) called Binary Boosted SWO (BBSWO), which combines a number of successful and promising strategies, in order to deal with HFS. The shortcomings of the original BSWO, including early convergence, settling into local optima, limited exploration and exploitation, and lack of population diversity, were addressed by the proposal of this new variant of SWO. The concept of chaos optimization is introduced in BSWO, where initialization is consistently produced by utilizing the properties of sine chaos mapping. A new convergence parameter was then incorporated into BSWO to achieve a promising balance between exploration and exploitation. Multiple exploration mechanisms were then applied in conjunction with several exploitation strategies to effectively enrich the search process of BSWO within the search space. Finally, quantum-based optimization was added to enhance the diversity of the search agents in BSWO. The proposed BBSWO not only offers the most suitable subset of features located, but it also lessens the data's redundancy structure. BBSWO was evaluated using the k-Nearest Neighbor (k-NN) classifier on 23 HFS problems from the biomedical domain taken from the UCI repository. The results were compared with those of traditional BSWO and other well-known meta-heuristics-based FS. The findings indicate that, in comparison to other competing techniques, the proposed BBSWO can, on average, identify the least significant subsets of features with efficient classification accuracy of the k-NN classifier.
Keywords: High-dimensional features; SWO algorithm; feature selection; optimization; machine learning
Variable selection-based SPC procedures for high-dimensional multistage processes (cited by: 2)
20
Author: KIM Sangahn. Journal of Systems Engineering and Electronics (SCIE, EI, CSCD), 2019, No. 1, pp. 144-153 (10 pages)
Monitoring high-dimensional multistage processes becomes crucial to ensure the quality of the final product in modern industry environments. Few statistical process monitoring (SPC) approaches for monitoring and controlling quality in high-dimensional multistage processes are studied. We propose a deviance residual-based multivariate exponentially weighted moving average (MEWMA) control chart with a variable selection procedure. We demonstrate that it outperforms the existing multivariate SPC charts in terms of out-of-control average run length (ARL) for the detection of process mean shift.
Keywords: diagnosis procedure deviance residual fault identification model-based control chart multistage process monitoring variable selection