Enhancing Multi-Class Cyberbullying Classification with Hybrid Feature Extraction and Transformer-Based Models
1
Authors: Suliman Mohamed Fati Mohammed A.Mahdi Mohamed A.G.Hazber Shahanawaj Ahamad Sawsan A.Saad Mohammed Gamal Ragab Mohammed Al-Shalabi. Source: Computer Modeling in Engineering & Sciences, 2025, Issue 5, pp. 2109-2131 (23 pages).
Cyberbullying on social media poses significant psychological risks, yet most detection systems oversimplify the task by focusing on binary classification, ignoring nuanced categories like passive-aggressive remarks or indirect slurs. To address this gap, we propose a hybrid framework combining Term Frequency-Inverse Document Frequency (TF-IDF), word-to-vector (Word2Vec), and Bidirectional Encoder Representations from Transformers (BERT) based models for multi-class cyberbullying detection. Our approach integrates TF-IDF for lexical specificity and Word2Vec for semantic relationships, fused with BERT's contextual embeddings to capture syntactic and semantic complexities. We evaluate the framework on a publicly available dataset of 47,000 annotated social media posts across five cyberbullying categories: age, ethnicity, gender, religion, and indirect aggression. Among the BERT variants tested, BERT Base Un-Cased achieved the highest performance with 93% accuracy (standard deviation ±1% across 5-fold cross-validation) and an average AUC of 0.96, outperforming standalone TF-IDF (78%) and Word2Vec (82%) models. Notably, it achieved near-perfect AUC scores (0.99) for age- and ethnicity-based bullying. A comparative analysis with state-of-the-art benchmarks, including Generative Pre-trained Transformer 2 (GPT-2) and Text-to-Text Transfer Transformer (T5) models, highlights BERT's superiority in handling ambiguous language. This work advances cyberbullying detection by demonstrating how hybrid feature extraction and transformer models improve multi-class classification, offering a scalable solution for moderating nuanced harmful content.
Keywords: Cyberbullying classification multi-class classification BERT models machine learning TF-IDF Word2Vec social media analysis transformer models
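The fusion of lexical and dense features described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the `tf * log(N / df)` weighting, the whitespace tokenizer, the simple concatenation in `fuse`, and the two-dimensional `dense` vectors (standing in for averaged Word2Vec or BERT embeddings) are all assumptions made for the example.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocab, vectors) using an assumed tf * log(N / df) weighting."""
    tokenized = [d.lower().split() for d in docs]
    vocab = sorted({t for doc in tokenized for t in doc})
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(t for doc in tokenized for t in set(doc))
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        vectors.append(
            [(counts[t] / len(doc)) * math.log(n_docs / df[t]) for t in vocab]
        )
    return vocab, vectors


def fuse(sparse_vec, dense_vec):
    """Concatenate a lexical vector with a dense embedding (early fusion)."""
    return list(sparse_vec) + list(dense_vec)


docs = ["you are so old", "old people are slow", "nice photo"]
vocab, tfidf = tfidf_vectors(docs)
# Hypothetical 2-d stand-ins for averaged Word2Vec or BERT embeddings.
dense = [[0.1, 0.9], [0.2, 0.8], [0.7, 0.1]]
fused = [fuse(s, d) for s, d in zip(tfidf, dense)]
```

Each fused vector would then be passed to a multi-class classifier; concatenation is only one of several possible fusion strategies.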
Optimizing Cancer Classification and Gene Discovery with an Adaptive Learning Search Algorithm for Microarray Analysis
2
Authors: Chiwen Qu Heng Yao Tingjiang Pan Zenghui Lu. Source: Journal of Bionic Engineering, 2025, Issue 2, pp. 901-930 (30 pages).
DNA microarrays, a cornerstone in biomedicine, measure gene expression across thousands to tens of thousands of genes. Identifying the genes vital for accurate cancer classification is a key challenge. Here, we present Fs-LSA (F-score based Learning Search Algorithm), a novel gene selection algorithm designed to enhance the precision and efficiency of target gene identification from microarray data for cancer classification. The algorithm is divided into two phases: the first leverages F-score values to prioritize and select feature genes with the most significant differential expression; the second introduces our Learning Search Algorithm (LSA), which harnesses swarm intelligence to identify the optimal subset among the remaining genes. Inspired by human social learning, LSA integrates historical data and collective intelligence for a thorough search, with a dynamic control mechanism that balances exploration and refinement, thereby enhancing the gene selection process. We conducted a rigorous validation of Fs-LSA's performance using eight publicly available cancer microarray expression datasets. Fs-LSA achieved accuracy, precision, sensitivity, and F1-score values of 0.9932, 0.9923, 0.9962, and 0.994, respectively. Comparative analyses with state-of-the-art algorithms revealed Fs-LSA's superior performance in terms of simplicity and efficiency. Additionally, we validated the algorithm's efficacy independently using glioblastoma data from the GEO and TCGA databases, where its performance was significantly superior to that of the comparison algorithms. Importantly, the driver genes identified by Fs-LSA were instrumental in developing a predictive model as an independent prognostic indicator for glioblastoma, underscoring Fs-LSA's transformative potential in genomics and personalized medicine.
Keywords: Gene selection Learning search algorithm Gene expression data classification
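The first phase described above, ranking genes by F-score, might look like the sketch below. It assumes the common two-class F-score (between-class separation divided by the sum of within-class sample variances); the abstract does not state which variant the paper uses, and the gene names and expression values are hypothetical.

```python
from statistics import mean


def f_score(pos, neg):
    """Two-class F-score of one feature (higher means more discriminative).

    Assumes each class has at least two samples and non-zero variance.
    """
    m_all = mean(pos + neg)
    m_pos, m_neg = mean(pos), mean(neg)
    between = (m_pos - m_all) ** 2 + (m_neg - m_all) ** 2
    within = (
        sum((x - m_pos) ** 2 for x in pos) / (len(pos) - 1)
        + sum((x - m_neg) ** 2 for x in neg) / (len(neg) - 1)
    )
    return between / within


def rank_genes(expr, labels):
    """expr maps a gene name to per-sample values; labels are 1 or 0."""
    scores = {}
    for gene, values in expr.items():
        pos = [v for v, y in zip(values, labels) if y == 1]
        neg = [v for v, y in zip(values, labels) if y == 0]
        scores[gene] = f_score(pos, neg)
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical expression values: geneA separates the classes, geneB does not.
expr = {
    "geneA": [5.0, 5.2, 4.9, 1.0, 1.1, 0.9],
    "geneB": [2.0, 3.0, 1.0, 2.1, 2.9, 1.2],
}
labels = [1, 1, 1, 0, 0, 0]
ranking = rank_genes(expr, labels)
```

A filter like this only prunes the candidate list; the swarm-based subset search of the second phase is not shown.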
Feature Selection Optimisation for Cancer Classification Based on Evolutionary Algorithms:An Extensive Review
3
Authors: Siti Ramadhani Lestari Handayani Theam Foo Ng Sumayyah Dzulkifly Roziana Ariffin Haldi Budiman Shir Li Wang. Source: Computer Modeling in Engineering & Sciences, 2025, Issue 6, pp. 2711-2765 (55 pages).
In recent years, feature selection (FS) optimization of high-dimensional gene expression data has become one of the most promising approaches for cancer prediction and classification. This work reviews FS and classification methods that utilize evolutionary algorithms (EAs) for gene expression profiles in cancer or medical applications based on research motivations, challenges, and recommendations. Relevant studies were retrieved from four major academic databases (IEEE, Scopus, Springer, and ScienceDirect) using the keywords ‘cancer classification’, ‘optimization’, ‘FS’, and ‘gene expression profile’. A total of 67 papers were finally selected, with key advancements identified as follows: (1) The majority of papers (44.8%) focused on developing algorithms and models for FS and classification. (2) The second category encompassed studies on biomarker identification by EAs, including 20 papers (30%). (3) The third category comprised works that applied FS to cancer data for decision support system purposes, addressing high-dimensional data and the formulation of chromosome length. These studies accounted for 12% of the total number of studies. (4) The remaining three papers (4.5%) were reviews and surveys focusing on models and developments in prediction and classification optimization for cancer classification under current technical conditions. This review highlights the importance of optimizing FS in EAs to manage high-dimensional data effectively. Despite recent advancements, significant limitations remain: the dynamic formulation of chromosome length remains an underexplored area. Thus, further research is needed on dynamic-length chromosome techniques for more sophisticated biomarker gene selection. The findings suggest that further advancements in dynamic chromosome length formulations and adaptive algorithms could enhance cancer classification accuracy and efficiency.
Keywords: Feature selection (FS) gene expression profile (GEP) cancer classification evolutionary algorithms (EAs) dynamic-length chromosome
Variety classification and identification of maize seeds based on hyperspectral imaging method (cited by: 1)
4
Authors: XUE Hang XU Xiping MENG Xiang. Source: Optoelectronics Letters, 2025, Issue 4, pp. 234-241 (8 pages).
In this study, eight different varieties of maize seeds were used as the research objects. Eighty-one combinations of preprocessing methods were applied to the original spectra. Through comparison, Savitzky-Golay (SG)-multivariate scattering correction (MSC)-maximum-minimum normalization (MN) was identified as the optimal preprocessing technique. Competitive adaptive reweighted sampling (CARS), the successive projections algorithm (SPA), and their combination were employed to extract feature wavelengths. Classification models based on back propagation (BP), support vector machine (SVM), random forest (RF), and partial least squares (PLS) were established using full-band data and feature wavelengths. Among all models, the (CARS-SPA)-BP model achieved the highest accuracy, 98.44%. This study offers novel insights and methodologies for the rapid and accurate identification of corn seeds as well as other crop seeds.
Keywords: feature extraction extract feature wavelengths classification models variety classification hyperspectral imaging combined preprocessing competitive adaptive reweighted sampling (CARS) successive projections algorithm (SPA) preprocessing maize seeds
Enhancing Cancer Classification through a Hybrid Bio-Inspired Evolutionary Algorithm for Biomarker Gene Selection (cited by: 1)
5
Authors: Hala AlShamlan Halah AlMazrua. Source: Computers, Materials & Continua (SCIE, EI), 2024, Issue 4, pp. 675-694 (20 pages).
In this study, our aim is to address the problem of gene selection by proposing a hybrid bio-inspired evolutionary algorithm that combines Grey Wolf Optimization (GWO) with Harris Hawks Optimization (HHO) for feature selection. The motivation for utilizing GWO and HHO stems from their bio-inspired nature and their demonstrated success in optimization problems. We aim to leverage the strengths of these algorithms to enhance the effectiveness of feature selection in microarray-based cancer classification. We selected leave-one-out cross-validation (LOOCV) to evaluate the performance of two widely used classifiers, k-nearest neighbors (KNN) and support vector machine (SVM), on high-dimensional cancer microarray data. The proposed method is extensively tested on six publicly available cancer microarray datasets, and a comprehensive comparison with recently published methods is conducted. Our hybrid algorithm demonstrates its effectiveness in improving classification performance, surpassing alternative approaches in terms of precision. The outcomes confirm the capability of our method to substantially improve both the precision and efficiency of cancer classification, thereby advancing the development of more efficient treatment strategies. The proposed hybrid method offers a promising solution to the gene selection problem in microarray-based cancer classification. It improves the accuracy and efficiency of cancer diagnosis and treatment, and its superior performance compared to other methods highlights its potential applicability in real-world cancer classification tasks. By harnessing the complementary search mechanisms of GWO and HHO, we leverage their bio-inspired behavior to identify informative genes relevant to cancer diagnosis and treatment.
Keywords: Bio-inspired algorithms bioinformatics cancer classification evolutionary algorithm feature selection gene expression grey wolf optimizer harris hawks optimization k-nearest neighbor support vector machine
A Precipitation Classification Scheme for China with Special Consideration of Extreme Events
6
Authors: MA Ya-yu WANG Jing-song ZHAO Liang LI Wen-juan YANG Hao WEN Wu ZHOU Hang CHEN Min. Source: Journal of Tropical Meteorology, 2025, Issue 5, pp. 497-510 (14 pages).
Due to global warming, extreme weather and climate events are becoming more frequent, highlighting the need to explore the changing characteristics of precipitation in China, including extreme precipitation. A clustering algorithm was developed to classify summer (June, July, and August) daily precipitation in China from 1961 to 2020, considering spatial distribution, standard deviations, and frequency of extreme precipitation events. The results reveal six distinct precipitation climate zones, a classification that differs from previous divisions. While overall precipitation has decreased in most regions, the frequency of extreme precipitation events has increased across all clusters, indicating a shift in precipitation distribution patterns. Analysis shows that the weakened Lake Baikal blocking high and the strengthened Mongolian cyclone influence the arid region in northwest China (Cluster 1), which is characterized by the lowest precipitation. The transition zone between the monsoon and arid regions (Cluster 2) is affected by the Mongolian cyclone, water vapor transport from the Indian Ocean, and shifts in the monsoon boundary. Clusters 3 and 4 represent areas associated with the advancement and retreat of the summer monsoon. In the Meiyu region, two distinct subregions have been identified. Cluster 4 is primarily influenced by the East Asia-Pacific wave train. Despite sharing similar climate drivers and proximity, Clusters 4 and 5 differ significantly due to topographic variations and disparate levels of urbanization. Cluster 5 exhibits higher average precipitation, greater variability, and more frequent extreme events. Cluster 6 exhibits the highest overall precipitation in the coastal areas of Guangdong and Guangxi, where abundant water vapor contributes to a higher frequency of extreme precipitation. In addition, anthropogenic activities and urbanization significantly influence precipitation in the Beijing-Tianjin-Hebei and Yangtze River Delta regions. This research proposes a precipitation classification scheme integrating multiple precipitation parameters, providing support for risk management and mitigation strategies in the face of increasing extreme precipitation events.
Keywords: precipitation characteristics extreme precipitation K-means clustering algorithm precipitation classification scheme risk management
Marine Predators Algorithm with Deep Learning-Based Leukemia Cancer Classification on Medical Images
7
Authors: Sonali Das Saroja Kumar Rout Sujit Kumar Panda Pradyumna Kumar Mohapatra Abdulaziz S.Almazyad Muhammed Basheer Jasser Guojiang Xiong Ali Wagdy Mohamed. Source: Computer Modeling in Engineering & Sciences (SCIE, EI), 2024, Issue 10, pp. 893-916 (24 pages).
Leukemia is a form of cancer of the blood or bone marrow. A person with leukemia has an expansion of white blood cells (WBCs). It primarily affects children and rarely affects adults. Treatment depends on the type of leukemia and the extent to which the cancer has spread throughout the body. Identifying leukemia at an early stage is vital to providing timely patient care. Approaches based on medical image analysis offer safer, quicker, and less costly solutions while avoiding the difficulties of invasive procedures. Computer vision (CV)-based and image-processing techniques are simple to generalize and can eliminate human error. Many researchers have implemented computer-aided diagnostic methods and machine learning (ML) for laboratory image analysis, hoping to overcome the limitations of late leukemia detection and to determine its subgroups. This study establishes a Marine Predators Algorithm with Deep Learning Leukemia Cancer Classification (MPADL-LCC) algorithm on medical images. The proposed MPADL-LCC system uses a bilateral filtering (BF) technique to pre-process medical images. It uses Faster SqueezeNet for feature extraction, with the Marine Predators Algorithm (MPA) as a hyperparameter optimizer. Lastly, a denoising autoencoder (DAE) is applied to detect and classify leukemia cancer. The hyperparameter tuning process using MPA helps enhance leukemia cancer classification performance. Simulation results are compared with other recent approaches on various measures, and the MPADL-LCC algorithm exhibits the best results.
Keywords: Leukemia cancer medical imaging image classification deep learning marine predators algorithm
Classification and reconstructive algorithm for nasal alar defect in Asians
8
Authors: Renpeng Zhou Dongze Lyu Chen Wang Danru Wang. Source: Chinese Journal of Plastic and Reconstructive Surgery, 2024, Issue 1, pp. 22-27 (6 pages).
Background: The nasal alar defect in Asians remains a challenging issue, and clear classification and algorithmic guidance are still lacking despite numerous previously described surgical techniques. The aim of this study is to propose a surgical algorithm that addresses the appropriate surgical procedures for different types of nasal alar defects in Asian patients. Methods: A retrospective case note review was conducted on 32 patients with nasal alar defects who underwent reconstruction between 2008 and 2022. Based on careful analysis and our clinical experience, we proposed a classification system for nasal alar defects and presented a reconstructive algorithm. Patient data, including age, sex, diagnosis, surgical options, and complications, were assessed. The extent of surgical scar formation was evaluated using standard photography based on a 4-grade scar scale. Results: Among the 32 patients with nasal alar defects, there were 20 males and 12 females. The predominant cause of trauma in China was industrial factors. The majority of alar defects were type Ⅰ (18 cases, 56.2%), with type ⅠC the most common subtype (n=8, 25%); there were 5 cases (15.6%) of type Ⅱ defect, 7 (21.9%) of type Ⅲ defect, and 2 (6.3%) of type Ⅳ defect. The most common surgical option was auricular composite graft (n=8, 25%), followed by bilobed flap (n=6, 18.8%), free auricular composite flap (n=4, 12.5%), and primary closure (n=3, 9.4%). Satisfactory improvements were observed postoperatively. Conclusion: Factors contributing to classifications were analyzed and defined, providing a framework for the proposed classification system. The reconstructive algorithm offers surgeons appropriate procedures for treating nasal alar defects in Asians.
Keywords: Nasal alar defect classification algorithm Surgical methods
A combined algorithm of K-means and MTRL for multi-class classification (cited by: 2)
9
Authors: XUE Mengfan HAN Lei PENG Dongliang. Source: Journal of Systems Engineering and Electronics (SCIE, EI, CSCD), 2019, Issue 5, pp. 875-885 (11 pages).
The basic idea of multi-class classification is a disassembly method, which decomposes a multi-class classification task into several binary classification tasks. In order to improve the accuracy of multi-class classification when samples are insufficient, this paper proposes a multi-class classification method combining K-means and multi-task relationship learning (MTRL). The method first uses the One vs. Rest split to disassemble the multi-class classification task into binary classification tasks. K-means is then used to downsample the dataset of each task, which can prevent over-fitting of the model while reducing training costs. Finally, the sampled dataset is applied to the MTRL, and multiple binary classifiers are trained together. With the help of MTRL, this method can utilize the inter-task association to train the model and thereby improve the classification accuracy of each binary classifier. The effectiveness of the proposed approach is demonstrated by experimental results on the Iris dataset, Wine dataset, Multiple Features dataset, Wireless Indoor Localization dataset, and Avila dataset.
Keywords: machine learning multi-class classification K-means multi-task relationship learning (MTRL) over-fitting
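The first two steps of this method, the One vs. Rest split followed by K-means downsampling, can be illustrated with the sketch below. It is a rough approximation under stated assumptions: the "rest" samples of one task are replaced by their cluster centroids, the centroids are seeded with the first `k` points, and the data are hypothetical. The abstract does not say how the paper selects samples from each cluster, and the MTRL training step is not shown.

```python
def one_vs_rest(labels):
    """Map multi-class labels to one binary (+1/-1) label list per class."""
    return {c: [1 if y == c else -1 for y in labels] for c in sorted(set(labels))}


def kmeans(points, k, iters=20):
    """Plain Lloyd's algorithm; the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[dists.index(min(dists))].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


# Hypothetical 2-d samples from three classes.
X = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (9.0, 0.0), (9.1, 0.0)]
y = ["a", "a", "b", "b", "c", "c"]
tasks = one_vs_rest(y)
# For the task "a vs. rest", summarise the four "rest" samples by 2 centroids.
rest = [x for x, t in zip(X, tasks["a"]) if t == -1]
reduced_rest = kmeans(rest, k=2)
```

Shrinking the "rest" side this way also reduces the class imbalance that One vs. Rest introduces in each binary task.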
Power Quality Disturbance Classification Method Based on Wavelet Transform and SVM Multi-class Algorithms (cited by: 1)
10
Authors: Xiao Fei. Source: Energy and Power Engineering, 2013, Issue 4, pp. 561-565 (5 pages).
The accurate identification and classification of various power quality disturbances are keys to ensuring high-quality electrical energy. In this study, statistical characteristics of the disturbance signal's wavelet transform coefficients, together with the wavelet transform energy distribution, constitute the feature vectors. These vectors are then used to train and test SVM multi-class algorithms. Experimental results demonstrate that SVM multi-class algorithms using the Gaussian radial basis function, exponential radial basis function, and hyperbolic tangent function as basis functions are suitable methods for power quality disturbance classification.
Keywords: Power quality disturbance classification wavelet transform SVM multi-class algorithms
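A wavelet energy distribution of the kind mentioned in this abstract can be computed as in the sketch below. The Haar wavelet, the four decomposition levels, and the synthetic voltage-sag signal are assumptions chosen to keep the example self-contained; the abstract does not name the wavelet or the sampling setup used in the paper.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar DWT: (approximation, detail)."""
    s = 1 / math.sqrt(2)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail


def wavelet_energy_features(signal, levels):
    """Detail-coefficient energy per level, then the final approximation energy."""
    features, approx = [], list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        features.append(sum(d * d for d in detail))
    features.append(sum(a * a for a in approx))
    return features


# Hypothetical test signal: a 50 Hz sine sampled at 1600 Hz whose amplitude
# sags to 0.5 in the middle third of a 256-sample window.
n = 256
signal = [
    (0.5 if n // 3 <= i < 2 * n // 3 else 1.0)
    * math.sin(2 * math.pi * 50 * i / 1600)
    for i in range(n)
]
features = wavelet_energy_features(signal, levels=4)
```

Because the Haar transform is orthonormal, the entries of `features` sum to the energy of the original signal, so the vector describes how that energy is distributed across scales. Such a vector would then be fed to the multi-class SVM.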
FLIGHT CLASSIFICATION MODEL BASED ON TRANSITIVE CLOSURE ALGORITHM AND APPLICATION TO FLIGHT SEQUENCING PROBLEM (cited by: 3)
11
Authors: 李雄 徐肖豪 李冬宾. Source: Transactions of Nanjing University of Aeronautics and Astronautics (EI), 2007, Issue 1, pp. 31-35 (5 pages).
A new arrival and departure flight classification method based on the transitive closure algorithm (TCA) is proposed. Firstly, fuzzy set theory and the transitive closure algorithm are introduced. Then four different factors are selected to establish the flight classification model, and a method is given to calculate the delay cost for each class. Finally, the proposed method is applied to flight sequencing problems in a terminal area, and the results are compared with those of the traditional classification method (TCM). Results show that the new classification model is effective in reducing the expenses of flight delays, thus optimizing the sequences of arrival and departure flights and improving the efficiency of air traffic control.
Keywords: air traffic control transitive closure algorithm cost of flight delay classification model
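Classification by fuzzy transitive closure, as named in this abstract, is commonly done by repeated max-min composition followed by a threshold (lambda) cut. The sketch below follows that standard recipe; the 4x4 similarity matrix is hypothetical and is not derived from the four flight factors used in the paper, which the abstract does not list.

```python
def max_min_compose(r, s):
    """Max-min composition of two fuzzy relations given as square matrices."""
    n = len(r)
    return [
        [max(min(r[i][k], s[k][j]) for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def transitive_closure(r):
    """Square a reflexive, symmetric relation until it stops changing."""
    while True:
        r2 = max_min_compose(r, r)
        if r2 == r:
            return r
        r = r2


def lambda_cut_classes(closure, lam):
    """Group items whose closure similarity is at least lam."""
    classes, seen = [], set()
    for i in range(len(closure)):
        if i not in seen:
            group = [j for j in range(len(closure)) if closure[i][j] >= lam]
            seen.update(group)
            classes.append(group)
    return classes


# Hypothetical pairwise similarities between four flights.
similarity = [
    [1.0, 0.8, 0.2, 0.1],
    [0.8, 1.0, 0.3, 0.1],
    [0.2, 0.3, 1.0, 0.7],
    [0.1, 0.1, 0.7, 1.0],
]
closure = transitive_closure(similarity)
classes = lambda_cut_classes(closure, lam=0.7)
```

Lowering `lam` merges classes, so the threshold controls how coarse the resulting flight classification is.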
LEARNING ALGORITHM OF FEEDFORWARD NEURAL NETWORK WITH HARD LIMITER USED FOR CLASSIFICATION
12
Authors: 张兆宁 孙雅明 毛鹏. Source: Transactions of Tianjin University (EI, CAS), 1999, Issue 2, pp. 14-18 (5 pages).
A learning algorithm based on a hard limiter for feedforward neural networks (NN) is presented and applied to classification problems on separable convex sets and disjoint sets. Simulations show that the algorithm has stronger classification ability than the back propagation (BP) algorithm for feedforward NNs using the sigmoid function. Moreover, the models can be implemented with lower-cost hardware than the BP NN.
Keywords: hard limiter separable convex sets hyperplane feedforward NN classification learning algorithm
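To show what a hard-limiter unit does when it separates two sets with a hyperplane, the sketch below uses the classic single-unit perceptron rule. This is a stand-in chosen for illustration only: the abstract does not describe the paper's own learning algorithm, and the function names, learning rate, and toy data here are assumptions.

```python
def hard_limiter(z):
    """Threshold activation: +1 on or above the hyperplane, -1 below it."""
    return 1 if z >= 0 else -1


def train_perceptron(samples, targets, lr=1.0, max_epochs=100):
    """Classic perceptron rule (not the paper's algorithm) for one unit."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(max_epochs):
        errors = 0
        for x, t in zip(samples, targets):
            y = hard_limiter(sum(wi * xi for wi, xi in zip(w, x)) + b)
            if y != t:
                # Move the hyperplane toward the misclassified sample.
                w = [wi + lr * t * xi for wi, xi in zip(w, x)]
                b += lr * t
                errors += 1
        if errors == 0:
            break
    return w, b


# Linearly separable toy problem (logical AND with +1/-1 targets).
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
targets = [-1, -1, -1, 1]
w, b = train_perceptron(samples, targets)
predictions = [
    hard_limiter(sum(wi * xi for wi, xi in zip(w, x)) + b) for x in samples
]
```

A single unit can only separate sets that a hyperplane divides; disjoint non-convex sets require several such units combined in layers.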
Application of QPSO-KM Algorithm in Wine Quality Classification
13
Authors: 邱靖 彭莞云 吴瑞武 张海涛. Source: Agricultural Science & Technology (CAS), 2015, Issue 9, pp. 2045-2047 (3 pages).
Since many factors affect the quality of wine, a total of 17 factors were screened out using principal component analysis. A difference test was conducted on the evaluation data of the two groups of testers. The results showed that the evaluation data of the second group were more reliable than those of the first group. The KM algorithm was then optimized using the QPSO algorithm, and a wine classification model was established. Compared with the other two algorithms, the QPSO-KM algorithm was more capable of finding the globally optimal solution, and it could be used to classify the wine samples. In addition, the QPSO-KM algorithm could also be applied to clustering problems.
Keywords: QPSO KM algorithm Wine sample classification model
BACNN: Multi-scale feature fusion-based bilinear attention convolutional neural network for wood NIR classification (cited by: 2)
14
Authors: Zihao Wan Hong Yang Jipan Xu Hongbo Mu Dawei Qi. Source: Journal of Forestry Research (SCIE, EI, CAS, CSCD), 2024, Issue 4, pp. 202-214 (13 pages).
Effective development and utilization of wood resources is critical. Wood modification research has become an integral dimension of wood science research; however, the similarities between modified wood and original wood make accurate identification and classification challenging for conventional image classification techniques. The development of efficient and accurate wood classification techniques is therefore necessary. This paper presents a one-dimensional convolutional neural network (BACNN) that combines near-infrared spectroscopy and deep learning techniques to classify poplar, tung, and balsa woods, as well as poplar woods modified with PVA, nano-silica sol, and PVA-nano-silica sol. The results show that BACNN achieves an accuracy of 99.3% on the test set, higher than the traditional machine learning methods (52.9% for the BP neural network and 98.7% for the Support Vector Machine) and the other deep learning methods (97.6% for LeNet, 98.7% for AlexNet, and 99.1% for VGGNet-11). The proposed classification method therefore offers potential applications in wood classification, especially for homogeneous modified wood, and it also provides a basis for subsequent studies of wood properties.
Keywords: Wood classification Near infrared spectroscopy Bilinear network SE module Anti-noise algorithm
An Imbalanced Data Classification Method Based on Hybrid Resampling and Fine Cost Sensitive Support Vector Machine (cited by: 2)
15
Authors: Bo Zhu Xiaona Jing Lan Qiu Runbo Li. Source: Computers, Materials & Continua (SCIE, EI), 2024, Issue 6, pp. 3977-3999 (23 pages).
When building a classification model, the scenario where the samples of one class significantly outnumber those of the other class is called data imbalance. Data imbalance biases the trained classification model toward the majority class (usually defined as the negative class), which may harm the accuracy on the minority class (usually defined as the positive class) and lead to poor overall performance of the model. This article proposes a method called MSHR-FCSSVM for imbalanced data classification, based on a new hybrid resampling approach (MSHR) and a new fine cost-sensitive support vector machine (CS-SVM) classifier (FCSSVM). The MSHR measures the separability of each negative sample through its Silhouette value, calculated from the Mahalanobis distance between samples. Based on this, the so-called pseudo-negative samples are screened out, used to generate new positive samples through linear interpolation (over-sampling step), and finally deleted (under-sampling step). This approach replaces pseudo-negative samples with newly generated positive samples one by one to clear up the inter-class overlap on the borderline without changing the overall scale of the dataset. The FCSSVM is an improved version of the traditional CS-SVM. It considers the influence of both the imbalance in sample numbers and the class distribution on classification, and it finely tunes the class cost weights to adjust the classification borderline accurately, using the rime-ice (RIME) optimization algorithm, which is based on the physical phenomenon of rime ice, with cross-validation accuracy as the fitness function. To verify the effectiveness of the proposed method, a series of experiments is carried out on 20 imbalanced datasets, including both mildly and extremely imbalanced datasets. The experimental results show that the MSHR-FCSSVM method performs better than the comparison methods in most cases, and that both the MSHR and the FCSSVM play significant roles.
Keywords: Imbalanced data classification Silhouette value Mahalanobis distance RIME algorithm CS-SVM
Ensemble Filter-Wrapper Text Feature Selection Methods for Text Classification (cited by: 1)
16
Authors: Oluwaseun Peter Ige Keng Hoon Gan. Source: Computer Modeling in Engineering & Sciences (SCIE, EI), 2024, Issue 11, pp. 1847-1865 (19 pages).
Feature selection is a crucial technique in text classification for improving the efficiency and effectiveness of classifiers or machine learning techniques by reducing the dataset's dimensionality. This involves eliminating irrelevant, redundant, and noisy features to streamline the classification process. Various methods, from single feature selection techniques to ensemble filter-wrapper methods, have been used in the literature. Metaheuristic algorithms have become popular due to their ability to handle optimization complexity and the continuous influx of text documents. Feature selection is inherently multi-objective, balancing the enhancement of feature relevance and accuracy against the reduction of redundant features. This research presents a two-fold objective for feature selection. The first objective is to identify the top-ranked features using an ensemble of three multi-univariate filter methods: Information Gain (Infogain), Chi-Square (Chi²), and Analysis of Variance (ANOVA). This aims to maximize feature relevance while minimizing redundancy. The second objective involves reducing the number of selected features and increasing accuracy through a hybrid approach combining Artificial Bee Colony (ABC) and Genetic Algorithms (GA). This hybrid method operates in a wrapper framework to identify the most informative subset of text features. Support Vector Machine (SVM) was employed as the performance evaluator for the proposed model, tested on two high-dimensional multiclass datasets. The experimental results demonstrated that the ensemble filter combined with the ABC+GA hybrid approach is a promising solution for text feature selection, offering superior performance compared to other existing feature selection algorithms.
Keywords: Metaheuristic algorithms text classification multi-univariate filter feature selection ensemble filter-wrapper techniques
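One simple way to combine several filter rankings into an ensemble is to average each feature's rank across the filters, as sketched below. The mean-rank rule is an assumption made for illustration, since the abstract does not state how the three filters are merged, and the scores are hypothetical rather than computed from a corpus.

```python
def ranks(scores):
    """Rank features by descending score (0 is best)."""
    order = sorted(scores, key=scores.get, reverse=True)
    return {feature: position for position, feature in enumerate(order)}


def ensemble_top_k(filter_scores, k):
    """Average each feature's rank across filters and keep the k best."""
    per_filter = [ranks(s) for s in filter_scores.values()]
    features = list(per_filter[0])
    mean_rank = {
        f: sum(r[f] for r in per_filter) / len(per_filter) for f in features
    }
    # Ties on mean rank are broken alphabetically for determinism.
    return sorted(features, key=lambda f: (mean_rank[f], f))[:k]


# Hypothetical filter scores for four terms; real values would come from
# information gain, chi-square, and ANOVA F computed on a labelled corpus.
filter_scores = {
    "infogain": {"goal": 0.9, "vote": 0.7, "the": 0.1, "match": 0.6},
    "chi2": {"goal": 8.0, "vote": 9.5, "the": 0.2, "match": 5.0},
    "anova": {"goal": 7.1, "vote": 6.0, "the": 0.3, "match": 6.5},
}
selected = ensemble_top_k(filter_scores, k=2)
```

Working with ranks rather than raw scores avoids having to rescale filters whose outputs live on different scales. The selected subset would then be refined by the wrapper stage.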
Multi-Class Classification Methods of Cost-Conscious LS-SVM for Fault Diagnosis of Blast Furnace 被引量:15
17
作者 LIU Li-mei WANG An-na SHA Mo ZHAO Feng-yun 《Journal of Iron and Steel Research International》 SCIE EI CAS CSCD 2011年第10期17-23,33,共8页
Aiming at the limitations of rapid fault diagnosis of blast furnace, a novel strategy based on cost-conscious least squares support vector machine (LS-SVM) is proposed to solve this problem. Firstly, modified discre... Aiming at the limitations of rapid fault diagnosis of blast furnace, a novel strategy based on cost-conscious least squares support vector machine (LS-SVM) is proposed to solve this problem. Firstly, modified discrete particle swarm optimization is applied to optimize the feature selection and the LS-SVM parameters. Secondly, cost-con- scious formula is presented for fitness function and it contains in detail training time, recognition accuracy and the feature selection. The CLS-SVM algorithm is presented to increase the performance of the LS-SVM classifier. The new method can select the best fault features in much shorter time and have fewer support vectbrs and better general- ization performance in the application of fault diagnosis of the blast furnace. Thirdly, a gradual change binary tree is established for blast furnace faults diagnosis. It is a multi-class classification method based on center-of-gravity formula distance of cluster. A gradual change classification percentage ia used to select sample randomly. The proposed new metbod raises the sped of diagnosis, optimizes the classifieation scraraey and has good generalization ability for fault diagnosis of the application of blast furnace. 展开更多
Keywords: blast furnace fault diagnosis cost-conscious LS-SVM multi-class classification
Basic Tenets of Classification Algorithms K-Nearest-Neighbor, Support Vector Machine, Random Forest and Neural Network: A Review (cited by: 13)
18
Authors: Ernest Yeboah Boateng, Joseph Otoo, Daniel A. Abaye. Journal of Data Analysis and Information Processing, 2020, Issue 4, pp. 341-357 (17 pages)
In this paper, sixty-eight research articles published between 2000 and 2017, as well as textbooks, which employed four classification algorithms as the main statistical tools were reviewed: K-Nearest-Neighbor (KNN), Support Vector Machines (SVM), Random Forest (RF), and Neural Network (NN). The aim was to examine and compare these nonparametric classification methods on the following attributes: robustness to training data, sensitivity to changes, data fitting, stability, ability to handle large data sizes, sensitivity to noise, time invested in parameter tuning, and accuracy. The performances, strengths, and shortcomings of each algorithm were examined, and finally a conclusion was reached on which one has higher performance. It was evident from the literature reviewed that RF is too sensitive to small changes in the training dataset, is occasionally unstable, and tends to overfit. KNN is easy to implement and understand but has the major drawback of becoming significantly slower as the size of the data in use grows, while the ideal value of K for the KNN classifier is difficult to set. SVM and RF are insensitive to noise or overtraining, which shows their ability to deal with unbalanced data. Larger input datasets will lengthen classification times for NN and KNN more than for SVM and RF.
Among these nonparametric classification methods, NN has the potential to become a more widely used classification algorithm, but because of its time-consuming parameter tuning procedure, the high level of complexity in computational processing, the numerous types of NN architectures to choose from, and the high number of algorithms used for training, most researchers recommend SVM and RF as easier and widely used methods which repeatedly achieve results with high accuracies and are often faster to implement.
Keywords: classification algorithms non-parametric K-Nearest-Neighbor Neural Networks Random Forest Support Vector Machines
Prediction of geological characteristics from shield operational parameters by integrating grid search and K-fold cross validation into stacking classification algorithm (cited by: 12)
19
Authors: Tao Yan, Shui-Long Shen, Annan Zhou, Xiangsheng Chen. Journal of Rock Mechanics and Geotechnical Engineering (SCIE, CSCD), 2022, Issue 4, pp. 1292-1303 (12 pages)
This study presents a framework for predicting geological characteristics based on integrating a stacking classification algorithm (SCA) with a grid search (GS) and K-fold cross validation (K-CV). The SCA includes two learner layers: a primary learner layer and a meta-classifier layer. The accuracy of the SCA can be improved by using the GS and K-CV. The GS was developed to match the hyper-parameters and optimise complicated problems. K-CV is commonly applied to rotate the validation set within a training set. In general, GS is combined with K-CV to produce a corresponding evaluation index and select the best hyper-parameters. The torque penetration index (TPI) and field penetration index (FPI) are proposed based on shield parameters to express the geological characteristics. The elbow method (EM) and silhouette coefficient (Si) are employed to determine the number of types of geological characteristics (K) in a K-means++ algorithm. A case study on mixed ground in Guangzhou is adopted to validate the applicability of the developed model. The results show that with the developed framework, the four selected parameters, i.e. thrust, advance rate, cutterhead rotation speed, and cutterhead torque, can be used to effectively predict the corresponding geological characteristics.
Keywords: Geological characteristics Stacking classification algorithm (SCA) K-fold cross-validation (K-CV) K-means++
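The combination of grid search with K-fold cross validation mentioned in this abstract can be illustrated with a small sketch. It is not the authors' implementation: a plain k-nearest-neighbour classifier stands in for the stacking classifier, the data are a made-up two-cluster toy set rather than shield parameters, and every name here (`k_fold_indices`, `knn_predict`, `cv_accuracy`, `grid_search`, the `n_neighbors` and `p` hyper-parameters) is an assumption chosen for illustration.

```python
from collections import Counter
from itertools import product


def k_fold_indices(n_samples, n_folds):
    """Yield (train_idx, val_idx) pairs; each sample is validated exactly once."""
    folds = [list(range(i, n_samples, n_folds)) for i in range(n_folds)]
    for i, val_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, val_idx


def knn_predict(train_X, train_y, x, n_neighbors, p):
    """Majority vote among the n_neighbors closest points (Minkowski order p)."""
    def dist(a, b):
        # The p-th root is omitted because it does not change the ordering.
        return sum(abs(u - v) ** p for u, v in zip(a, b))

    nearest = sorted(range(len(train_X)),
                     key=lambda i: dist(train_X[i], x))[:n_neighbors]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def cv_accuracy(X, y, n_folds, **params):
    """Accuracy pooled over all validation folds for one hyper-parameter set."""
    correct = 0
    for train_idx, val_idx in k_fold_indices(len(X), n_folds):
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        correct += sum(knn_predict(train_X, train_y, X[i], **params) == y[i]
                       for i in val_idx)
    return correct / len(X)


def grid_search(X, y, grid, n_folds):
    """Try every combination in the grid; keep the best cross-validated score."""
    names = sorted(grid)
    best_params, best_score = None, -1.0
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = cv_accuracy(X, y, n_folds, **params)
        if score > best_score:  # strict: the first best combination wins ties
            best_params, best_score = params, score
    return best_params, best_score


# Toy data: two well-separated clusters, six points each.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (0.2, 0.8),
     (5, 5), (5, 6), (6, 5), (6, 6), (5.5, 5.5), (5.2, 5.8)]
y = [0] * 6 + [1] * 6
grid = {"n_neighbors": [1, 3], "p": [1, 2]}
best_params, best_score = grid_search(X, y, grid, n_folds=3)
print(best_params, best_score)
```

The folds here are built by striding through the sample indices without shuffling or stratification, which is enough for the toy set; on real, ordered data a shuffled or stratified split would usually be preferable.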
Multi-class Classification Methods of Enhanced LS-TWSVM for Strip Steel Surface Defects (cited by: 4)
20
Authors: Mao-xiang CHU, An-na WANG, Rong-fen GONG, Mo SHA. Journal of Iron and Steel Research International (SCIE, EI, CAS, CSCD), 2014, Issue 2, pp. 174-180 (7 pages)
Considering strip steel surface defect samples, a multi-class classification method was proposed based on enhanced least squares twin support vector machines (ELS-TWSVMs) and a binary tree. Firstly, a pruning region samples center method with adjustable pruning scale was used to prune data samples. This method could reduce the classifier's training time and testing time. Secondly, ELS-TWSVM was proposed to classify the data samples. By introducing an error variable contribution parameter and a weight parameter, ELS-TWSVM could restrain the impact of noise samples and achieve better classification accuracy. Finally, multi-class classification algorithms of ELS-TWSVM were proposed by combining ELS-TWSVM and a complete binary tree. Experiments were conducted on two-dimensional datasets and strip steel surface defect datasets. The experiments showed that the multi-class classification methods of ELS-TWSVM had higher classification speed and accuracy for datasets with large-scale, unbalanced, and noisy samples.
Keywords: multi-class classification least squares twin support vector machine error variable contribution weight binary tree strip steel surface