Casing damage resulting from sand production in unconsolidated sandstone reservoirs can significantly impact the average production of oil wells. However, predicting it remains challenging because of the complex damage mechanism caused by sand production. This paper presents an approach that combines feature selection (FS) with boosting algorithms to predict casing damage in unconsolidated sandstone reservoirs. A novel TriScore FS technique is developed, combining mRMR, Random Forest, and the F-test. The approach integrates three distinct feature selection approaches (TriScore, wrapper, and hybrid TriScore-wrapper) and four interpretable boosting models (AdaBoost, XGBoost, LightGBM, CatBoost). Moreover, Shapley additive explanations (SHAP) were used to identify the most significant features across engineering, geological, and production features. The CatBoost model, using the hybrid TriScore-wrapper G_1G_2 FS method, showed exceptional performance in analyzing data from the Gangxi Oilfield. It achieved the highest accuracy (95.5%) and recall rate (89.7%) among the tested models. Casing service time, casing wall thickness, and perforation density were selected as the top three most important features. This framework enhances predictive robustness and is an effective tool for policymakers and energy analysts, confirming its capability to deliver reliable casing damage forecasts.
In recent years, particle swarm optimization (PSO) has received widespread attention in feature selection due to its simplicity and potential for global search. However, in traditional PSO, particles primarily update based on two extreme values: personal best and global best, which limits the diversity of information. Ideally, particles should learn from multiple advantageous particles to enhance interactivity and optimization efficiency. Accordingly, this paper proposes a PSO that simulates the evolutionary dynamics of species survival in mountain peak ecology (PEPSO) for feature selection. Based on the pyramid topology, the algorithm simulates the features of mountain peak ecology in nature and the competitive-cooperative strategies among species. According to the principles of the algorithm, the population is first adaptively divided into many subgroups based on the fitness level of particles. Then, particles within each subgroup are divided into three different types based on their evolutionary levels, employing different adaptive inertia weight rules and dynamic learning mechanisms to define distinct learning modes. Consequently, all particles play their respective roles in promoting the global optimization performance of the algorithm, similar to different species in the ecological pattern of mountain peaks. Experimental validation of the PEPSO performance was conducted on 18 public datasets. The experimental results demonstrate that PEPSO outperforms other PSO variant-based feature selection methods and mainstream feature selection methods based on intelligent optimization algorithms in terms of overall performance in global search capability, classification accuracy, and reduction of feature space dimensions. The Wilcoxon signed-rank test also confirms the excellent performance of PEPSO.
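The traditional update that the PSO abstract above contrasts against, in which each particle learns only from its personal best and the global best, can be sketched as follows. This is a minimal illustration of the canonical PSO step, not of PEPSO itself; the function names, the parameter values, and the 0.5 threshold used to turn a position into a feature mask are assumptions made for the example.

```python
from typing import List, Tuple


def pso_step(
    x: List[float],
    v: List[float],
    pbest: List[float],
    gbest: List[float],
    w: float,
    c1: float,
    c2: float,
    r1: float,
    r2: float,
) -> Tuple[List[float], List[float]]:
    """One canonical PSO update for a single particle.

    r1 and r2 are normally drawn uniformly from [0, 1] at every step; they are
    passed in here so that the example is deterministic.
    """
    # Velocity: inertia term + pull toward personal best + pull toward global best.
    new_v = [
        w * vi + c1 * r1 * (pb - xi) + c2 * r2 * (gb - xi)
        for xi, vi, pb, gb in zip(x, v, pbest, gbest)
    ]
    # Position: move by the new velocity.
    new_x = [xi + vi for xi, vi in zip(x, new_v)]
    return new_x, new_v


def to_feature_mask(x: List[float], threshold: float = 0.5) -> List[bool]:
    """Assumed encoding: a feature is selected when its coordinate exceeds the threshold."""
    return [xi > threshold for xi in x]
```

Because only `pbest` and `gbest` enter the velocity term, every particle in the swarm is drawn toward the same two reference points, which is the limited information diversity the abstract refers to.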
Software defect prediction (SDP) aims to find a reliable method to predict defects in specific software projects and help software engineers allocate limited resources to release high-quality software products. Software defect prediction can be effectively performed using traditional features, but some of these features are redundant or irrelevant (their presence or absence has little effect on the prediction results). These problems can be solved using feature selection. However, existing feature selection methods have shortcomings such as an insignificant dimensionality reduction effect and low classification accuracy of the selected optimal feature subset. To reduce the impact of these shortcomings, this paper proposes a new feature selection method, the Cubic TraverseMa Beluga whale optimization algorithm (CTMBWO), based on an improved Beluga whale optimization algorithm (BWO). The goal of this study is to determine how well CTMBWO can extract the features that are most important for correctly predicting software defects, improve the accuracy of fault prediction, reduce the number of selected features, and mitigate the risk of overfitting, thereby achieving more efficient resource utilization and better distribution of the test workload. CTMBWO comprises three main stages: preprocessing the dataset, selecting relevant features, and evaluating the classification performance of the model. The novel feature selection method can effectively improve the performance of SDP. This study performs experiments on two software defect datasets (PROMISE, NASA) and reports the method's classification performance using the evaluation metrics Accuracy, F1-score, MCC, AUC, and Recall. The results indicate that the approach presented in this paper achieves outstanding classification performance on both datasets and a significant improvement over the baseline models.
In recent years, feature selection (FS) optimization of high-dimensional gene expression data has become one of the most promising approaches for cancer prediction and classification. This work reviews FS and classification methods that utilize evolutionary algorithms (EAs) for gene expression profiles in cancer or medical applications based on research motivations, challenges, and recommendations. Relevant studies were retrieved from four major academic databases (IEEE, Scopus, Springer, and ScienceDirect) using the keywords 'cancer classification', 'optimization', 'FS', and 'gene expression profile'. A total of 67 papers were finally selected, with key advancements identified as follows: (1) The majority of papers (44.8%) focused on developing algorithms and models for FS and classification. (2) The second category encompassed studies on biomarker identification by EAs, including 20 papers (30%). (3) The third category comprised works that applied FS to cancer data for decision support system purposes, addressing high-dimensional data and the formulation of chromosome length. These studies accounted for 12% of the total number of studies. (4) The remaining three papers (4.5%) were reviews and surveys focusing on models and developments in prediction and classification optimization for cancer classification under current technical conditions. This review highlights the importance of optimizing FS in EAs to manage high-dimensional data effectively. Despite recent advancements, significant limitations remain: the dynamic formulation of chromosome length is still underexplored. Thus, further research is needed on dynamic-length chromosome techniques for more sophisticated biomarker gene selection. The findings suggest that further advancements in dynamic chromosome length formulations and adaptive algorithms could enhance cancer classification accuracy and efficiency.
With the birth of Software-Defined Networking (SDN), the integration of SDN and traditional architectures has become the development trend of computer networks. Network intrusion detection faces challenges in dealing with complex attacks in SDN environments. To address these network security issues from the viewpoint of Artificial Intelligence (AI), this paper introduces the Crayfish Optimization Algorithm (COA) to the field of intrusion detection for both SDN and traditional network architectures. Based on the characteristics of the original COA, an Improved Crayfish Optimization Algorithm (ICOA) is proposed by integrating the strategies of elite reverse learning, Levy flight, crowding factor, and parameter modification. The ICOA is then used for AI-integrated feature selection in intrusion detection for both SDN and traditional network architectures, to reduce the dimensionality of the data and improve the performance of network intrusion detection. Finally, performance is evaluated not only on the NSL-KDD and UNSW-NB15 datasets for traditional networks but also on the InSDN dataset for SDN-based networks. Experimental results show that ICOA improves the accuracy by 0.532% and 2.928% compared with GWO and COA, respectively, in traditional networks. In SDN networks, the accuracy of ICOA is 0.25% and 0.3% higher than that of COA and PSO, respectively. These findings collectively indicate that AI-integrated feature selection based on the proposed ICOA can improve network intrusion detection for both SDN and traditional architectures.
The radial basis function (RBF), a type of neural network algorithm, is adopted to select cluster heads. It has many advantages, such as simple parallel distributed computation, distributed storage, and fast learning. Analysis identifies four factors related to a node becoming a cluster head: energy (the energy available in each node), number (the number of neighboring nodes), centrality (a value that classifies nodes by how central each node is to the cluster), and location (the distance between the base station and the node). These factors are the input variables of the neural network, and the output variable is suitability, that is, the degree to which a node is suited to becoming a cluster head. A group of cluster heads is selected according to the size of the network. The base station then broadcasts a message containing the list of cluster-head IDs to all nodes. After that, each cluster head announces its new status to all its neighbors and sets up a new cluster. If a nearby node receives the message, it registers itself as a member of the cluster. After identifying all the members, the cluster head manages them and carries out data aggregation in its cluster. Thus the data flowing in the network decreases, and the energy consumption of the nodes decreases accordingly. Experimental results show that, compared with other algorithms, the proposed algorithm can significantly increase the lifetime of the sensor network.
To ease congestion and ground delays at major hub airports, an aircraft taxiing scheduling optimization model is proposed with schedule time as the objective function. The new model adopts the idea of the classical job-shop scheduling problem and considers three types of special aircraft-taxi conflicts in its constraints. To solve such NP-complete problems, the immune clonal selection algorithm (ICSA) is introduced. Simulation results for a congested hour at Beijing Capital International Airport show that, compared with the first-come-first-served (FCFS) strategy, the optimization-planning strategy reduces the total scheduling time by 13.6 min and the taxiing time per aircraft by 45.3 s, which improves the capacity of the runway and the efficiency of airport operations.
In microarray-based cancer classification, gene selection is an important issue owing to the large number of variables, the small number of samples, and the non-linearity of the data. It is difficult to get satisfying results with conventional linear statistical methods. Recursive feature elimination based on the support vector machine (SVM RFE) is an effective algorithm for gene selection and cancer classification, which it integrates into a consistent framework. In this paper, we propose a new method for selecting the parameters of this algorithm when it is implemented with Gaussian kernel SVMs: a genetic algorithm searches for a pair of optimal parameters, as a better alternative to the common practice of selecting the apparently best parameters. Fast implementation issues for this method are also discussed for pragmatic reasons. The proposed method was tested on two representative datasets, hereditary breast cancer and acute leukaemia. The experimental results indicate that the proposed method performs well in selecting genes and achieves high classification accuracies with these genes.
Refineries often need to find similar crude oil to replace scarce crude oil in order to stabilize the feedstock properties. We first introduce a method for calculating the properties of blended crude, and then create a crude oil selection and blending optimization model based on crude oil property data. The model is a constrained mixed-integer nonlinear programming (MINLP) problem whose objective is to maximize the similarity between the blended crude oil and the target crude oil. Furthermore, the model takes into account the selection of crude oils and their blending ratios simultaneously, transforming the problem of finding a similar crude oil into a crude oil selection and blending optimization problem. We applied the Improved Cuckoo Search (ICS) algorithm to solve the model. In simulations, ICS was compared with the genetic algorithm, the particle swarm optimization algorithm, and the CPLEX solver. The results show that ICS has very good optimization efficiency. The blending solution can serve as a reference for refineries seeking similar crude oil, and the proposed method can also inform the selection and blending optimization of other materials.
In this paper, we explore a novel ensemble method for spectral clustering. In contrast to traditional clustering ensemble methods, which combine all the obtained clustering results, we propose an adaptive spectral clustering ensemble method to achieve a better clustering solution. This method can adaptively determine the number of component members, a capability that many other algorithms lack. The component clusterings of the ensemble system are generated by spectral clustering (SC), which has characteristics well suited to producing diverse committees. The selection process evaluates the generated component spectral clusterings using a resampling technique and the population-based incremental learning (PBIL) algorithm. Experimental results on UCI datasets demonstrate that the proposed algorithm achieves better results than traditional clustering ensemble methods, especially when the number of component clusterings is large.
A clonal selection based memetic algorithm is proposed in this paper for solving job shop scheduling problems. In the proposed algorithm, the clonal selection and the local search mechanisms are designed to enhance exploration and exploitation. In the clonal selection mechanism, clonal selection, hypermutation, and receptor edit theories are used to construct an evolutionary search mechanism for exploration. In the local search mechanism, a simulated annealing local search algorithm based on Nowicki and Smutnicki's neighborhood is used to exploit local optima. The proposed algorithm is examined using some well-known benchmark problems. Numerical results validate the effectiveness of the proposed algorithm.
Objective: To explore core acupoints and acupoint selection principles in acupuncture and moxibustion for obesity, based on the syndrome differentiation prescriptions of acupuncture-moxibustion therapy in 808 obesity prescriptions, using node centrality and cluster analysis methods from complex networks. Methods: First, an acupoint network model is established, and the acupoint nodes are assessed in multiple respects by introducing the node centrality analysis of complex networks, in order to identify core acupoint nodes. Second, a cluster analysis is carried out on the acupoint network with Q-PSO, a clustering algorithm for complex networks, to investigate the acupoint combination principles. Results: Zusanli (足三里 ST36), Tianshu (天枢 ST25), Fenglong (丰隆 ST40), Zhongwan (中脘 CV12), Qihai (气海 CV6), etc., were included in the community of the core acupoint Sanyinjiao (三阴交 SP6). Zhigou (支沟 TE6), Neiting (内庭 ST44), Shangjuxu (上巨虚 ST37), Pishu (脾俞 BL20), etc., were included in the community of the core acupoint Yinlingquan (阴陵泉 SP9). Baihuanshu (白环俞 BL30) and Zhiyang (至阳 GV9) were included in the community of the core acupoint Dachangshu (大肠俞 BL25). Biguan (髀关 ST31) formed a single-core community. Among all the acupoint nodes, SP6, ST25, SP9, ST36, CV6, Quchi (曲池 LI11), and Guanyuan (关元 CV4) had high degree centrality and eigenvector centrality, directly reflecting their importance in acupoint selection prescriptions. Conclusion: The Q-PSO algorithm is characterized by high precision and high efficiency. The core acupoints and their combination principles explored by this algorithm are in accordance with clinical experience.
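The degree centrality used in the acupoint study above to rank nodes can be illustrated with a short sketch. It assumes, for illustration only, that two acupoints are linked when they appear in the same prescription (the abstract does not specify how the network is built), and the toy prescriptions in the usage note are hypothetical, not data from the study.

```python
from itertools import combinations
from typing import Dict, Iterable, List, Set


def build_cooccurrence_graph(prescriptions: Iterable[List[str]]) -> Dict[str, Set[str]]:
    """Link every pair of acupoints that occur together in one prescription (assumed model)."""
    graph: Dict[str, Set[str]] = {}
    for points in prescriptions:
        unique = sorted(set(points))
        for p in unique:
            graph.setdefault(p, set())  # keep points that never co-occur with others
        for a, b in combinations(unique, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def degree_centrality(graph: Dict[str, Set[str]]) -> Dict[str, float]:
    """Normalized degree centrality: number of neighbors divided by (n - 1)."""
    n = len(graph)
    if n <= 1:
        return {node: 0.0 for node in graph}
    return {node: len(neighbors) / (n - 1) for node, neighbors in graph.items()}
```

For two hypothetical prescriptions, `[["ST36", "ST25", "SP6"], ["SP6", "ST40"]]`, SP6 is connected to all three other points and therefore receives the maximum centrality of 1.0, while ST40 receives 1/3.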
Since traditional whale optimization algorithms converge slowly, have low accuracy, and are prone to falling into local optima, an improved whale optimization algorithm based on mirror selection (WOA-MS) is proposed. The specific improvements include: (1) an adaptive nonlinear inertia weight based on the Branin function is introduced to balance global search and local exploitation; (2) a mirror selection method is proposed to improve individual quality and speed up convergence. By optimizing several test functions and comparing the experimental results with those of three other algorithms, this study verifies that WOA-MS has excellent optimization performance.
To study the effect of the selection method in multi-objective genetic algorithm problems on the optimization result, the method is analyzed theoretically and discussed using an autonomous underwater vehicle (AUV) as the object. A changing-weight-value method is put forward, and a selection formula is modified. Experiments were carried out on an AUV, TwinBurger. The results show that this method is effective and feasible.
The advent of Big Data has rendered Machine Learning tasks more intricate, as they frequently involve higher-dimensional data. Feature Selection (FS) methods can abate the complexity of the data and enhance the accuracy, generalizability, and interpretability of models. Meta-heuristic algorithms are often utilized for FS tasks due to their low requirements and efficient performance. This paper introduces an augmented Forensic-Based Investigation algorithm (DCFBI) that incorporates a Dynamic Individual Selection (DIS) and crisscross (CC) mechanism to improve the pursuit phase of the FBI. Moreover, a binary version of DCFBI (BDCFBI) is applied to FS. Experiments conducted on IEEE CEC 2017 with other metaheuristics demonstrate that DCFBI surpasses them in search capability. The influence of different mechanisms on the original FBI is analyzed on benchmark functions, while its scalability is verified by comparing it with the original FBI on benchmarks with varied dimensions. BDCFBI is then applied to 18 real datasets from the UCI machine learning database and the Wieslaw dataset to select near-optimal features, which are then compared with six renowned binary metaheuristics. The results show that BDCFBI can be more competitive than similar methods and acquire a subset of features with superior classification accuracy.
Prediction plays a vital role in decision making. Correct prediction leads to the right decisions, saving lives, energy, effort, money, and time. The right decision prevents physical and material losses, and prediction is practiced in all fields, including medicine, finance, environmental studies, engineering, and emerging technologies. Prediction is carried out by a model called a classifier. The predictive accuracy of the classifier depends heavily on the training dataset used to train it. Irrelevant and redundant features in the training dataset reduce the accuracy of the classifier. Hence, they must be removed from the training dataset through the process known as feature selection. This paper proposes a feature selection algorithm, unsupervised learning with ranking based feature selection (FSULR). It removes redundant features by clustering and eliminates irrelevant features by statistical measures to select the most significant features from the training dataset. The performance of the proposed algorithm is compared with that of seven other feature selection algorithms using the well-known classifiers naive Bayes (NB), instance-based (IB1), and the tree-based J48. Experimental results show that the proposed algorithm yields better prediction accuracy for the classifiers.
Screening biomolecular markers from high-dimensional biological data is one of the long-standing tasks of biomedical translational research. With its advantages in both feature shrinkage and biological interpretability, the Least Absolute Shrinkage and Selection Operator (LASSO) algorithm is one of the most popular methods for clinical biomarker development. However, in practice, applying LASSO to omics-based data with high dimensions and a low sample size often yields an excess number of predictive variables, leading to overfitting of the model. Here, we present VSOLassoBag, a wrapped LASSO approach that integrates an ensemble learning strategy to help select efficient and stable variables with high confidence from omics-based data. Using a bagging strategy in combination with a parametric method or an inflection point search method, VSOLassoBag can integrate and vote on variables generated from multiple LASSO models to determine the optimal candidates. The application of VSOLassoBag to both simulation datasets and real-world datasets shows that the algorithm can effectively identify markers for either case-control binary classification or prognosis prediction. In addition, in comparison with multiple existing algorithms, VSOLassoBag shows comparable performance under different scenarios while producing fewer features than the others. In summary, VSOLassoBag, which is available at https://seqworld.com/VSOLassoBag/ under the GPL v3 license, provides an alternative strategy for selecting reliable biomarkers from high-dimensional omics data. For users' convenience, we implement VSOLassoBag as an R package that provides multithreading computing configurations.
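The voting step described in the VSOLassoBag abstract above, where variables selected by multiple LASSO models fitted on bootstrap samples are pooled and filtered, can be sketched as follows. This is not the VSOLassoBag implementation: the function name is hypothetical, the per-run selections are assumed to be already available, and a fixed frequency cutoff stands in for the parametric or inflection-point cutoff methods that the abstract mentions.

```python
from collections import Counter
from typing import Iterable, List, Set


def vote_variables(selections: Iterable[Set[str]], min_frequency: float) -> List[str]:
    """Keep variables chosen in at least `min_frequency` of the bootstrap runs.

    `selections` holds one set of selected variable names per LASSO fit.
    The fixed cutoff is an assumption made for this sketch.
    """
    runs = [set(s) for s in selections]
    if not runs:
        return []
    counts = Counter(var for run in runs for var in run)
    total = len(runs)
    kept = [var for var, c in counts.items() if c / total >= min_frequency]
    # Most frequently selected first; ties broken alphabetically for stable output.
    return sorted(kept, key=lambda var: (-counts[var], var))
```

Variables that appear in only a few bootstrap fits are treated as unstable and dropped, which is how an ensemble of LASSO models can end up with fewer features than a single fit.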
The planetary gear train is a prominent component of the helicopter transmission system, and its health is of great significance for the flight safety of the helicopter. During health condition monitoring, selecting a fault-sensitive feature subset is important for fault diagnosis of the helicopter planetary gear train. This paper proposes a multi-criteria fusion feature selection algorithm (MCFFSA) to identify an optimal feature subset from the high-dimensional original feature space. In MCFFSA, a fault feature set spanning multiple domains, including the time domain, frequency domain, and wavelet domain, is first extracted from the raw vibration dataset. Four targeted criteria are then fused by a multi-objective evolutionary algorithm based on decomposition (MOEA/D) to find Pareto-efficient subsets, wherein two criteria for measuring diagnostic performance are assessed by a sparse Bayesian extreme learning machine (SBELM). Further, the F-measure is adopted to identify the optimal feature subset, which is employed for subsequent fault diagnosis. The effectiveness of MCFFSA is validated on six fault recognition datasets from a real helicopter transmission platform. The experimental results illustrate the superiority of combining MOEA/D and SBELM in MCFFSA, and comparative analysis demonstrates that the optimal feature subset provided by MCFFSA achieves better diagnostic performance than those of other algorithms.
To address the deficiencies in time convergence and variability of Web services selection for services composition supporting cross-enterprise collaboration, an algorithm, QCDSS (QoS constraints of dynamic Web services selection), was proposed to solve dynamic Web services selection with a QoS global optimal path. The essence of the algorithm is that the problem of dynamic Web services selection with a QoS global optimal path is transformed into a multi-objective services composition optimization problem with QoS constraints. The crossover and mutation operations of the genetic algorithm were brought into the particle swarm optimization algorithm (PSOA), forming an improved algorithm (IPSOA) to solve the QoS global optimization problem. Theoretical analysis and experimental results indicate that the algorithm satisfies the time convergence requirement of Web services composition supporting cross-enterprise collaboration better than traditional algorithms.
To overcome the limitations of traditional monitoring methods, this paper presents an online abnormality monitoring method for rotating machinery that is based on vibration parameter images and uses the negative selection mechanism of the biological immune system. The method uses biological cloning and learning mechanisms to improve the negative selection algorithm so that it generates detectors with different monitoring radii, covers the abnormality space effectively, and avoids problems such as low detector-generation efficiency. An application example shows that the method eases the difficulty of obtaining fault samples and extracts the turbine state characteristics effectively; it also detects abnormalities caused by various turbine faults and obtains the degree of abnormality accurately. The monitoring precision achieved indicates that the method is feasible and offers good online performance, accuracy, and robustness.
Funding: This work was funded by the National Natural Science Foundation Project (Grant No. 52274015) and the National Science and Technology Major Project (Grant No. 2025ZD1402205).
文摘Casing damage resulting from sand production in unconsolidated sandstone reservoirs can significantly impact the average production of oil wells.However,the prediction task remains challenging due to the complex damage mechanism caused by sand production.This paper presents an innovative approach that combines feature selection(FS)with boosting algorithms to accurately predict casing damage in unconsolidated sandstone reservoirs.A novel TriScore FS technique is developed,combining mRMR,Random Forest,and F-test.The approach integrates three distinct feature selection approaches—TriScore,wrapper,and hybrid TriScore-wrapper and four interpretable Boosting models(AdaBoost,XGBoost,LightGBM,CatBoost).Moreover,shapley additive explanations(SHAP)was used to identify the most significant features across engineering,geological,and production features.The CatBoost model,using the Hybrid TriScore-rapper G_(1)G_(2)FS method,showed exceptional performance in analyzing data from the Gangxi Oilfield.It achieved the highestaccuracy(95.5%)and recall rate(89.7%)compared to other tested models.Casing service time,casing wall thickness,and perforation density were selected as the top three most important features.This framework enhances predictive robustness and is an effective tool for policymakers and energy analysts,confirming its capability to deliver reliable casing damage forecasts.
文摘In recent years, particle swarm optimization (PSO) has received widespread attention in feature selection due to its simplicity and potential for global search. However, in traditional PSO, particles primarily update based on two extreme values: personal best and global best, which limits the diversity of information. Ideally, particles should learn from multiple advantageous particles to enhance interactivity and optimization efficiency. Accordingly, this paper proposes a PSO that simulates the evolutionary dynamics of species survival in mountain peak ecology (PEPSO) for feature selection. Based on the pyramid topology, the algorithm simulates the features of mountain peak ecology in nature and the competitive-cooperative strategies among species. According to the principles of the algorithm, the population is first adaptively divided into many subgroups based on the fitness level of particles. Then, particles within each subgroup are divided into three different types based on their evolutionary levels, employing different adaptive inertia weight rules and dynamic learning mechanisms to define distinct learning modes. Consequently, all particles play their respective roles in promoting the global optimization performance of the algorithm, similar to different species in the ecological pattern of mountain peaks. Experimental validation of the PEPSO performance was conducted on 18 public datasets. The experimental results demonstrate that the PEPSO outperforms other PSO variant-based feature selection methods and mainstream feature selection methods based on intelligent optimization algorithms in terms of overall performance in global search capability, classification accuracy, and reduction of feature space dimensions. Wilcoxon signed-rank test also confirms the excellent performance of the PEPSO.
Abstract: Software defect prediction (SDP) aims to find a reliable method to predict defects in specific software projects and help software engineers allocate limited resources to release high-quality software products. Software defect prediction can be effectively performed using traditional features, but some of these features are redundant or irrelevant (the presence or absence of such a feature has little effect on the prediction results). These problems can be solved using feature selection. However, existing feature selection methods have shortcomings such as an insignificant dimensionality reduction effect and low classification accuracy of the selected optimal feature subset. To reduce the impact of these shortcomings, this paper proposes a new feature selection method, the Cubic TraverseMa Beluga whale optimization algorithm (CTMBWO), based on an improved Beluga whale optimization algorithm (BWO). The goal of this study is to determine how well the CTMBWO can extract the features that are most important for correctly predicting software defects, improve the accuracy of fault prediction, reduce the number of selected features, and mitigate the risk of overfitting, thereby achieving more efficient resource utilization and better distribution of test workload. The CTMBWO comprises three main stages: preprocessing the dataset, selecting relevant features, and evaluating the classification performance of the model. The novel feature selection method can effectively improve the performance of SDP. This study performs experiments on two software defect datasets (PROMISE, NASA) and shows the method’s classification performance using detailed evaluation metrics: Accuracy, F1-score, MCC, AUC, and Recall. The results indicate that the approach presented in this paper achieves outstanding classification performance on both datasets and has significant improvement over the baseline models.
Funding: Funded by the Ministry of Higher Education of Malaysia, grant number FRGS/1/2022/ICT02/UPSI/02/1.
Abstract: In recent years, feature selection (FS) optimization of high-dimensional gene expression data has become one of the most promising approaches for cancer prediction and classification. This work reviews FS and classification methods that utilize evolutionary algorithms (EAs) for gene expression profiles in cancer or medical applications based on research motivations, challenges, and recommendations. Relevant studies were retrieved from four major academic databases (IEEE, Scopus, Springer, and ScienceDirect) using the keywords ‘cancer classification’, ‘optimization’, ‘FS’, and ‘gene expression profile’. A total of 67 papers were finally selected, with key advancements identified as follows: (1) The majority of papers (44.8%) focused on developing algorithms and models for FS and classification. (2) The second category encompassed studies on biomarker identification by EAs, including 20 papers (30%). (3) The third category comprised works that applied FS to cancer data for decision support system purposes, addressing high-dimensional data and the formulation of chromosome length. These studies accounted for 12% of the total number of studies. (4) The remaining three papers (4.5%) were reviews and surveys focusing on models and developments in prediction and classification optimization for cancer classification under current technical conditions. This review highlights the importance of optimizing FS in EAs to manage high-dimensional data effectively. Despite recent advancements, significant limitations remain: the dynamic formulation of chromosome length remains an underexplored area. Thus, further research is needed on dynamic-length chromosome techniques for more sophisticated biomarker gene selection. The findings suggest that further advancements in dynamic chromosome length formulations and adaptive algorithms could enhance cancer classification accuracy and efficiency.
Funding: Supported by the National Natural Science Foundation of China under Grant 61602162 and the Hubei Provincial Science and Technology Plan Project under Grant 2023BCB041.
Abstract: With the birth of Software-Defined Networking (SDN), the integration of SDN and traditional architectures has become the development trend of computer networks. Network intrusion detection faces challenges in dealing with complex attacks in SDN environments. To address these network security issues from the viewpoint of Artificial Intelligence (AI), this paper introduces the Crayfish Optimization Algorithm (COA) to the field of intrusion detection for both SDN and traditional network architectures. Based on the characteristics of the original COA, an Improved Crayfish Optimization Algorithm (ICOA) is proposed by integrating strategies of elite reverse learning, Levy flight, crowding factor, and parameter modification. The ICOA is then utilized for AI-integrated feature selection in intrusion detection for both SDN and traditional network architectures, to reduce the dimensionality of the data and improve the performance of network intrusion detection. Finally, the performance evaluation is performed by testing not only the NSL-KDD dataset and the UNSW-NB15 dataset for traditional networks but also the InSDN dataset for SDN-based networks. Experimental results show that ICOA improves the accuracy by 0.532% and 2.928% compared with GWO and COA, respectively, in traditional networks. In SDN networks, the accuracy of ICOA is 0.25% and 0.3% higher than that of COA and PSO, respectively. These findings collectively indicate that AI-integrated feature selection based on the proposed ICOA can promote network intrusion detection for both SDN and traditional architectures.
Funding: The National Natural Science Foundation of China (No. 60472053), the Natural Science Foundation of Jiangsu Province (No. BK2003055), and the Specialized Research Fund for the Doctoral Program of Higher Education (No. 20030286017).
Abstract: The radial basis function (RBF), a kind of neural network algorithm, is adopted to select cluster-heads. It has many advantages such as simple parallel distributed computation, distributed storage, and fast learning. Four factors related to a node becoming a cluster-head are drawn by analysis: energy (the energy available in each node), number (the number of neighboring nodes), centrality (a value that classifies nodes by how central they are to the cluster), and location (the distance between the base station and the node). These factors serve as the input variables of the neural network, and the output variable is suitability, the degree to which a node is suited to become a cluster-head. A group of cluster-heads is selected according to the size of the network. Then the base station broadcasts a message containing the list of cluster-heads' IDs to all nodes. After that, each cluster-head announces its new status to all its neighbors and sets up a new cluster. If a node around it receives the message, it registers itself as a member of the cluster. After identifying all the members, the cluster-head manages them and carries out data aggregation in each cluster. Thus the data flowing in the network decreases, and the energy consumption of nodes decreases accordingly. Experimental results show that, compared with other algorithms, the proposed algorithm can significantly increase the lifetime of the sensor network.
Funding: Supported by the Basic Scientific Research Projects of the Central University of China (ZXH2010D010) and the National Natural Science Foundation of China (60979021/F01).
Abstract: In order to ease congestion and ground delays in major hub airports, an aircraft taxiing scheduling optimization model is proposed with schedule time as the objective function. In the new model, the idea of a classical job shop-schedule problem is adopted and three types of special aircraft-taxi conflicts are considered in the constraints. To solve such nondeterministic polynomial time-complex problems, the immune clonal selection algorithm (ICSA) is introduced. The simulation results for a congested hour at Beijing Capital International Airport show that, compared with the first-come-first-served (FCFS) strategy, the optimization-planning strategy reduces the total scheduling time by 13.6 min and the taxiing time per aircraft by 45.3 s, which improves the capacity of the runway and the efficiency of airport operations.
Funding: Project supported by the National Basic Research Program (973) of China (No. 2002CB312200) and the Center for Bioinformatics Program Grant of Harvard Center of Neurodegeneration and Repair, Harvard Medical School, Harvard University, Boston, USA.
Abstract: In microarray-based cancer classification, gene selection is an important issue owing to the large number of variables, the small number of samples, and the non-linearity of the data. It is difficult to get satisfying results by using conventional linear statistical methods. Recursive feature elimination based on support vector machine (SVM RFE) is an effective algorithm for gene selection and cancer classification, which are integrated into a consistent framework. In this paper, we propose a new method for selecting the parameters of this algorithm when it is implemented with Gaussian kernel SVMs: a genetic algorithm searches for a pair of optimal parameters, as a better alternative to the common practice of selecting the apparently best parameters. Fast implementation issues for this method are also discussed for pragmatic reasons. The proposed method was tested on two representative datasets, hereditary breast cancer and acute leukaemia. The experimental results indicate that the proposed method performs well in selecting genes and achieves high classification accuracies with these genes.
Funding: Supported by the National Natural Science Foundation of China (No. 21365008) and the Science Foundation of Guangxi Province of China (No. 2012GXNSFAA053230).
Abstract: Refineries often need to find similar crude oil to replace scarce crude oil in order to stabilize the feedstock properties. We first introduce the method for calculating the properties of blended crude, and then create a crude oil selection and blending optimization model based on crude oil property data. The model is a constrained mixed-integer nonlinear programming (MINLP) problem, and the target is to maximize the similarity between the blended crude oil and the objective crude oil. Furthermore, the model takes into account the selection of crude oils and their blending ratios simultaneously, and transforms the problem of looking for similar crude oil into a crude oil selection and blending optimization problem. We applied the Improved Cuckoo Search (ICS) algorithm to solve the model. In simulations, ICS was compared with the genetic algorithm, the particle swarm optimization algorithm, and the CPLEX solver. The results show that ICS has very good optimization efficiency. The blending solution can provide a reference for refineries seeking similar crude oil, and the proposed method can also inform the selection and blending optimization of other materials.
Funding: Supported by the National Natural Science Foundation of China (60661003) and the Research Project of the Department of Education of Jiangxi Province (GJJ10566).
Abstract: In this paper, we explore a novel ensemble method for spectral clustering. In contrast to the traditional clustering ensemble methods that combine all the obtained clustering results, we propose the adaptive spectral clustering ensemble method to achieve a better clustering solution. This method can adaptively assess the number of component members, a capability that many other algorithms lack. The component clusterings of the ensemble system are generated by spectral clustering (SC), which has good characteristics for producing diverse committees. The selection process works by evaluating the generated component spectral clusterings through a resampling technique and the population-based incremental learning algorithm (PBIL). Experimental results on UCI datasets demonstrate that the proposed algorithm can achieve better results compared with traditional clustering ensemble methods, especially when the number of component clusterings is large.
Abstract: A clonal selection based memetic algorithm is proposed for solving job shop scheduling problems in this paper. In the proposed algorithm, the clonal selection and the local search mechanism are designed to enhance exploration and exploitation. In the clonal selection mechanism, clonal selection, hypermutation and receptor edit theories are presented to construct an evolutionary searching mechanism which is used for exploration. In the local search mechanism, a simulated annealing local search algorithm based on Nowicki and Smutnicki's neighborhood is presented to exploit local optima. The proposed algorithm is examined using some well-known benchmark problems. Numerical results validate the effectiveness of the proposed algorithm.
Funding: Supported by Hubei Health & Family Planning Commission Notice (No. [2017]20); the Wuhan training project of the sixth batch of young and middle-aged medical talents, Wuhan Health & Family Planning Commission (Wuhan Health & Family Planning Commission Notice No. [2018]116); and the training project of the first batch of Tanhualin famous doctors and students (Hubei TCM Hospital No. [2018]72).
Abstract: Objective: To explore core acupoints and acupoint selection principles in acupuncture and moxibustion for obesity, based on the syndrome differentiation prescriptions of acupuncture-moxibustion therapy in 808 obesity prescriptions, by using node centrality and cluster analysis methods from complex network theory. Methods: Firstly, an acupoint network model is established, and acupoint nodes are assessed and calculated in multiple aspects by introducing the node centrality analysis of complex networks, to identify core acupoint nodes. Secondly, a cluster analysis is carried out on the acupoint network with the complex-network clustering algorithm Q-PSO, to investigate the acupoint combination principles. Results: Zusanli (足三里 ST36), Tianshu (天枢 ST25), Fenglong (丰隆 ST40), Zhongwan (中脘 CV12), Qihai (气海 CV6), etc., were included in the core acupoint Sanyinjiao (三阴交 SP6) community. Zhigou (支沟 TE6), Neiting (内庭 ST44), Shangjuxu (上巨虚 ST37), Pishu (脾俞 BL20), etc., were included in the core acupoint Yinlingquan (阴陵泉 SP9) community. Baihuanshu (白环俞 BL30) and Zhiyang (至阳 GV9) were included in the core acupoint Dachangshu (大肠俞 BL25) community. Biguan (髀关 ST31) was a single core community. Among all the acupoint nodes, SP6, ST25, SP9, ST36, CV6, Quchi (曲池 LI11), and Guanyuan (关元 CV4) had high degree centrality and eigenvector centrality, directly reflecting their importance in acupoint selection prescriptions. Conclusion: The Q-PSO algorithm is characterized by high precision and high efficiency. The core acupoints and their combination principles explored by this algorithm are in accordance with clinical experience.
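The node centrality step can be illustrated with degree centrality on a co-occurrence network. This is a minimal sketch under the assumption that two acupoints are linked whenever they share a prescription; the abstract does not specify how the study builds or weights its network, and the function name and example prescriptions are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def degree_centrality(prescriptions):
    """Degree centrality of each acupoint in a co-occurrence network.

    Two acupoints are linked if they appear together in at least one
    prescription. Centrality is degree / (n - 1), with n the node count.
    """
    neighbors = defaultdict(set)
    for points in prescriptions:
        for p in points:
            neighbors[p]  # register every acupoint as a node, even if isolated
        for a, b in combinations(set(points), 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {p: 0.0 for p in neighbors}
    return {p: len(nb) / (n - 1) for p, nb in neighbors.items()}


# Hypothetical prescriptions, not taken from the 808 analyzed in the study.
demo = [["SP6", "ST25", "ST36"], ["SP6", "SP9"], ["SP6", "CV6", "ST36"]]
centrality = degree_centrality(demo)
```

In this toy network SP6 co-occurs with every other acupoint, so it receives the maximum centrality of 1.0.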
Funding: Supported by the Natural Science Foundation of Jiangsu Province (No. BK20151479) and the Open Foundation of Graduate Innovation Base in Nanjing University of Aeronautics and Astronautics (No. kfjj20190736).
Abstract: Since traditional whale optimization algorithms have slow convergence speed and low accuracy and are prone to falling into local optima, an improved whale optimization algorithm based on mirror selection (WOA-MS) is proposed. Specific improvements include: (1) An adaptive nonlinear inertia weight based on the Branin function is introduced to balance global search and local exploitation. (2) A mirror selection method is proposed to improve individual quality and speed up convergence. By optimizing several test functions and comparing the experimental results with those of three other algorithms, this study verifies that WOA-MS has excellent optimization performance.
Abstract: To study the effect of the selection method in multi-objective genetic algorithm problems on the optimization result, the selection method is analyzed theoretically and discussed using an autonomous underwater vehicle (AUV) as the test object. A changing weight value method is put forward and the selection formula is modified. Experiments were carried out on an AUV, TwinBurger. The results show that this method is effective and feasible.
Funding: Supported by the Special Fund of Fundamental Scientific Research Business Expense for Higher School of Central Government (ZY20180119), the Natural Science Foundation of Zhejiang Province (LZ22F020005), the Natural Science Foundation of Hebei Province (D2022512001), and the National Natural Science Foundation of China (42164002, 62076185).
Abstract: The advent of Big Data has rendered Machine Learning tasks more intricate as they frequently involve higher-dimensional data. Feature Selection (FS) methods can abate the complexity of the data and enhance the accuracy, generalizability, and interpretability of models. Meta-heuristic algorithms are often utilized for FS tasks due to their low requirements and efficient performance. This paper introduces an augmented Forensic-Based Investigation algorithm (DCFBI) that incorporates a Dynamic Individual Selection (DIS) and crisscross (CC) mechanism to improve the pursuit phase of the FBI. Moreover, a binary version of DCFBI (BDCFBI) is applied to FS. Experiments conducted on IEEE CEC 2017 with other metaheuristics demonstrate that DCFBI surpasses them in search capability. The influence of different mechanisms on the original FBI is analyzed on benchmark functions, while its scalability is verified by comparing it with the original FBI on benchmarks with varied dimensions. BDCFBI is then applied to 18 real datasets from the UCI machine learning database and the Wieslaw dataset to select near-optimal features, which are then compared with six renowned binary metaheuristics. The results show that BDCFBI can be more competitive than similar methods and acquire a subset of features with superior classification accuracy.
Abstract: Prediction plays a vital role in decision making. Correct prediction leads to the right decisions, saving life, energy, effort, money, and time. The right decision prevents physical and material losses, and prediction is practiced in all fields, including medicine, finance, environmental studies, engineering, and emerging technologies. Prediction is carried out by a model called a classifier. The predictive accuracy of the classifier depends heavily on the training dataset used to train it. Irrelevant and redundant features in the training dataset reduce the accuracy of the classifier. Hence, the irrelevant and redundant features must be removed from the training dataset through the process known as feature selection. This paper proposes a feature selection algorithm, namely unsupervised learning with ranking based feature selection (FSULR). It removes redundant features by clustering and eliminates irrelevant features by statistical measures to select the most significant features from the training dataset. The performance of the proposed algorithm is compared with that of seven other feature selection algorithms using well-known classifiers, namely naive Bayes (NB), instance based (IB1), and tree based J48. Experimental results show that the proposed algorithm yields better prediction accuracy for classifiers.
Funding: Supported by the National Key R&D Program of China (2021YFA1302100 to Q.Z), the National Natural Science Foundation of China (82172861 to Q.Z), the Guangdong Basic and Applied Basic Research Foundation (2021A1515011743 to Q.Z), and the National Key Clinical Discipline (to D.Z).
Abstract: Screening biomolecular markers from high-dimensional biological data is one of the long-standing tasks for biomedical translational research. With its advantages in both feature shrinkage and biological interpretability, the Least Absolute Shrinkage and Selection Operator (LASSO) algorithm is one of the most popular methods for clinical biomarker development. However, in practice, applying LASSO to omics-based data with high dimensions and low sample size often results in an excess number of predictive variables, leading to overfitting of the model. Here, we present VSOLassoBag, a wrapped LASSO approach that integrates an ensemble learning strategy to help select efficient and stable variables with high confidence from omics-based data. Using a bagging strategy in combination with a parametric method or an inflection point search method, VSOLassoBag can integrate and vote on variables generated from multiple LASSO models to determine the optimal candidates. The application of VSOLassoBag to both simulation datasets and real-world datasets shows that the algorithm can effectively identify markers for either case-control binary classification or prognosis prediction. In addition, in comparison with multiple existing algorithms, VSOLassoBag shows comparable performance under different scenarios while yielding fewer features than the others. In summary, VSOLassoBag, which is available at https://seqworld.com/VSOLassoBag/ under the GPL v3 license, provides an alternative strategy for selecting reliable biomarkers from high-dimensional omics data. For user convenience, we implement VSOLassoBag as an R package that provides multithreading computing configurations.
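The voting idea can be sketched as counting how often each variable is selected across bagged fits. This is only an illustration and does not use the VSOLassoBag R package: the fixed frequency cutoff stands in for the parametric and inflection-point methods the abstract mentions, and the function name, parameter, and variable names are hypothetical.

```python
from collections import Counter


def vote_variables(selections, min_frequency):
    """Keep variables chosen in at least `min_frequency` of the bagged fits.

    `selections` is a list of iterables, one per bootstrap LASSO fit, each
    holding the variables with non-zero coefficients in that fit.
    """
    counts = Counter(var for chosen in selections for var in set(chosen))
    n_fits = len(selections)
    return sorted(v for v, c in counts.items() if c / n_fits >= min_frequency)


# Hypothetical selections from four bootstrap fits.
fits = [
    {"gene_a", "gene_b"},
    {"gene_a", "gene_c"},
    {"gene_a", "gene_b", "gene_d"},
    {"gene_a"},
]
stable = vote_variables(fits, min_frequency=0.5)
```

Variables that appear in only one resample, such as `gene_c` and `gene_d` here, are dropped, which is how voting reduces the excess variables a single LASSO fit may retain.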
Funding: Co-supported by the Equipment Pre-research Foundation Project of China (No. JZX7Y20190243016301), the Helicopter Transmission Technology Key Laboratory Foundation of China (No. KY-52-2018-0024), and the Fundamental Research Funds for the Central Universities & Funding of Jiangsu Innovation Program for Graduate Education under Grant (No. KYLX16_0336).
Abstract: The planetary gear train is a prominent component of the helicopter transmission system, and its health is of great significance for the flight safety of the helicopter. During health condition monitoring, the selection of a fault-sensitive feature subset is meaningful for fault diagnosis of the helicopter planetary gear train. This paper proposes a multi-criteria fusion feature selection algorithm (MCFFSA) to identify an optimal feature subset from the high-dimensional original feature space. In MCFFSA, a fault feature set of multiple domains, including the time domain, frequency domain, and wavelet domain, is first extracted from the raw vibration dataset. Four targeted criteria are then fused by the multi-objective evolutionary algorithm based on decomposition (MOEA/D) to find Pareto-efficient subsets, wherein two criteria for measuring diagnostic performance are assessed by the sparse Bayesian extreme learning machine (SBELM). Further, the F-measure is adopted to identify the optimal feature subset, which is employed for subsequent fault diagnosis. The effectiveness of MCFFSA is validated on six fault recognition datasets from a real helicopter transmission platform. The experimental results illustrate the superiority of combining MOEA/D and SBELM in MCFFSA, and comparative analysis demonstrates that the optimal feature subset provided by MCFFSA can achieve better diagnosis performance than other algorithms.
Funding: Project (70631004) supported by the Key Project of the National Natural Science Foundation of China; Project (20080440988) supported by the Postdoctoral Science Foundation of China; Project (09JJ4030) supported by the Natural Science Foundation of Hunan Province, China; Project supported by the Postdoctoral Science Foundation of Central South University, China.
Abstract: To address the deficiencies in time convergence and variability of Web services selection for services composition supporting cross-enterprise collaboration, an algorithm named QCDSS (QoS constraints of dynamic Web services selection) was proposed to resolve dynamic Web services selection with a QoS global optimal path. The essence of the algorithm is that the problem of dynamic Web services selection with a QoS global optimal path is transformed into a multi-objective services composition optimization problem with QoS constraints. The crossover and mutation operations of the genetic algorithm were brought into the particle swarm optimization algorithm (PSOA), forming an improved algorithm (IPSOA) to solve the QoS global optimization problem. Theoretical analysis and experimental results indicate that the algorithm can better satisfy the time convergence requirement for Web services composition supporting cross-enterprise collaboration than traditional algorithms.
Funding: Sponsored by the National Natural Science Foundation of China (Grant No. 50875056).
Abstract: To overcome the limitations of traditional monitoring methods, this paper presents an online abnormality monitoring method for rotating machinery, based on vibration parameter images and the negative selection mechanism of the biological immune system. This method uses biological cloning and learning mechanisms to improve the negative selection algorithm so that it generates detectors with different monitoring radii, covers the abnormality space effectively, and avoids problems such as low efficiency in generating detectors. An example application shows that the method overcomes the difficulty of obtaining fault samples and extracts the turbine state characteristics effectively. It can also detect abnormalities caused by various turbine faults and obtain the degree of abnormality accurately. The monitoring precision achieved indicates that this method is feasible and offers good online performance, accuracy, and robustness.
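The core of negative selection with detectors of different radii can be sketched as follows. This is a plain variable-radius scheme trained only on normal samples. It omits the cloning and learning improvements the abstract describes, and the function names, parameters, and sample values are illustrative assumptions.

```python
import math
import random


def train_detectors(normal, self_radius, n_candidates, dim, seed=0):
    """Generate detectors that do not cover any normal ("self") sample.

    Each detector is (center, radius); its radius is the distance to the
    nearest self sample minus self_radius, so radii vary across detectors.
    """
    rng = random.Random(seed)
    detectors = []
    for _ in range(n_candidates):
        center = [rng.random() for _ in range(dim)]
        radius = min(math.dist(center, s) for s in normal) - self_radius
        if radius > 0:  # discard candidates that fall inside self space
            detectors.append((center, radius))
    return detectors


def is_abnormal(sample, detectors):
    """A sample is flagged when it falls inside any detector."""
    return any(math.dist(sample, c) < r for c, r in detectors)


# Hypothetical normalized 2-D state features; only normal data is needed.
detectors = train_detectors([[0.5, 0.5]], self_radius=0.1,
                            n_candidates=200, dim=2)
```

Because detectors are built purely from normal operating data, no fault samples are required for training, which matches the motivation given in the abstract.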