Cyberbullying on social media poses significant psychological risks, yet most detection systems oversimplify the task by focusing on binary classification, ignoring nuanced categories such as passive-aggressive remarks or indirect slurs. To address this gap, we propose a hybrid framework combining Term Frequency-Inverse Document Frequency (TF-IDF), word-to-vector (Word2Vec), and Bidirectional Encoder Representations from Transformers (BERT) based models for multi-class cyberbullying detection. Our approach integrates TF-IDF for lexical specificity and Word2Vec for semantic relationships, fused with BERT's contextual embeddings to capture syntactic and semantic complexities. We evaluate the framework on a publicly available dataset of 47,000 annotated social media posts across five cyberbullying categories: age, ethnicity, gender, religion, and indirect aggression. Among the BERT variants tested, BERT Base Uncased achieved the highest performance, with 93% accuracy (standard deviation of ±1% across 5-fold cross-validation) and an average AUC of 0.96, outperforming standalone TF-IDF (78%) and Word2Vec (82%) models. Notably, it achieved near-perfect AUC scores (0.99) for age- and ethnicity-based bullying. A comparative analysis with state-of-the-art benchmarks, including Generative Pre-trained Transformer 2 (GPT-2) and Text-to-Text Transfer Transformer (T5) models, highlights BERT's superiority in handling ambiguous language. This work advances cyberbullying detection by demonstrating how hybrid feature extraction and transformer models improve multi-class classification, offering a scalable solution for moderating nuanced harmful content.
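The lexical part of such a hybrid feature set can be illustrated with a minimal TF-IDF sketch. The abstract does not state which TF-IDF variant is used, so the weighting here (term count divided by document length, times the natural log of N/df) is an assumption, as are the function name `tf_idf` and the toy posts.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors

# Hypothetical toy posts, not taken from the paper's dataset.
posts = ["you are so old", "you are great", "so old and slow"]
weights = tf_idf(posts)
```

Terms that occur in few posts receive larger weights than terms shared by many posts, which is the "lexical specificity" the abstract attributes to TF-IDF. In a fused model, vectors like these would be combined with Word2Vec and BERT embeddings before classification.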
To address the limitations of rapid fault diagnosis for blast furnaces, a novel strategy based on a cost-conscious least squares support vector machine (LS-SVM) is proposed. Firstly, modified discrete particle swarm optimization is applied to optimize the feature selection and the LS-SVM parameters. Secondly, a cost-conscious formula is presented for the fitness function; it accounts for training time, recognition accuracy, and feature selection. The CLS-SVM algorithm is presented to increase the performance of the LS-SVM classifier. The new method can select the best fault features in a much shorter time, and it has fewer support vectors and better generalization performance when applied to blast furnace fault diagnosis. Thirdly, a gradual change binary tree is established for blast furnace fault diagnosis. It is a multi-class classification method based on the center-of-gravity formula distance of clusters. A gradual change classification percentage is used to select samples randomly. The proposed method raises the speed of diagnosis, optimizes the classification accuracy, and has good generalization ability for blast furnace fault diagnosis.
Considering strip steel surface defect samples, a multi-class classification method was proposed based on enhanced least squares twin support vector machines (ELS-TWSVMs) and a binary tree. Firstly, a pruning region samples center method with an adjustable pruning scale was used to prune data samples. This method could reduce the classifier's training time and testing time. Secondly, ELS-TWSVM was proposed to classify the data samples. By introducing an error variable contribution parameter and a weight parameter, ELS-TWSVM could restrain the impact of noise samples and achieve better classification accuracy. Finally, multi-class classification algorithms of ELS-TWSVM were proposed by combining ELS-TWSVM with a complete binary tree. Experiments were conducted on two-dimensional datasets and strip steel surface defect datasets. The experiments showed that the multi-class classification methods of ELS-TWSVM had higher classification speed and accuracy for large-scale, unbalanced, and noisy datasets.
The basic idea of multi-class classification is decomposition: a multi-class classification task is broken down into several binary classification tasks. To improve the accuracy of multi-class classification when samples are insufficient, this paper proposes a multi-class classification method combining K-means and multi-task relationship learning (MTRL). The method first uses the One-vs-Rest split to decompose the multi-class classification task into binary classification tasks. K-means is then used to downsample the dataset of each task, which can prevent over-fitting of the model while reducing training costs. Finally, the sampled dataset is fed to MTRL, and multiple binary classifiers are trained together. With the help of MTRL, this method can exploit the association between tasks to train the model and thereby improve the classification accuracy of each binary classifier. The effectiveness of the proposed approach is demonstrated by experimental results on the Iris dataset, Wine dataset, Multiple Features dataset, Wireless Indoor Localization dataset, and Avila dataset.
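The One-vs-Rest split described above can be sketched in a few lines. This is only an illustration of the decomposition step; the function names `one_vs_rest_tasks` and `predict` and the sample labels are hypothetical, and the K-means downsampling and MTRL training stages are not shown.

```python
def one_vs_rest_tasks(labels):
    """Map each class to a binary label vector: 1 for that class, 0 for the rest."""
    classes = sorted(set(labels))
    return {c: [1 if y == c else 0 for y in labels] for c in classes}

def predict(scores):
    """Pick the class whose binary classifier reports the highest score."""
    return max(scores, key=scores.get)

# Hypothetical labels in the style of the Iris dataset.
labels = ["setosa", "versicolor", "virginica", "setosa"]
tasks = one_vs_rest_tasks(labels)
```

A problem with K classes yields K binary tasks, one per class. At prediction time each binary classifier scores the sample and the highest-scoring class is returned, as in `predict({"setosa": 0.2, "versicolor": 0.7})`.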
Focusing on strip steel surface defect classification, a novel support vector machine with an adjustable hyper-sphere (AHSVM) is formulated, and a new multi-class classification method is proposed. Originating from support vector data description, AHSVM adopts a hyper-sphere to solve the classification problem. AHSVM follows two principles: margin maximization and inner-class dispersion minimization. Moreover, the hyper-sphere of AHSVM is adjustable, which makes the final classification hyper-sphere optimal for the training dataset. In addition, AHSVM is combined with a binary tree to solve multi-class classification of steel surface defects. A scheme for pruning samples in the mapped feature space is provided, which reduces the number of training samples while preserving classification accuracy, thereby improving classification speed. Finally, experiments are carried out on eight types of strip steel surface defects. Experimental results show that the multi-class AHSVM classifier achieves satisfactory classification accuracy and efficiency.
Defect classification is the key task of a steel surface defect detection system. Current defect classification algorithms do not take feature noise into consideration. To reduce the adverse impact of feature noise, an anti-noise multi-class classification method was proposed for steel surface defects. On the one hand, a novel anti-noise support vector hyper-spheres (ASVHs) classifier was formulated. For N types of defects, the ASVHs classifier built N hyper-spheres. These hyper-spheres were insensitive to feature and label noise. On the other hand, to reduce the costs of online time and storage space, the defect samples were pruned by support vector data description with a parameter iteration adjustment strategy. In the end, the ASVHs classifier was built with a sparse defect sample set and auxiliary information. Experimental results show that the novel multi-class classification method has high efficiency and accuracy for corrupted steel surface defect samples.
The accurate identification and classification of various power quality disturbances are key to ensuring high-quality electrical energy. In this study, the statistical characteristics of the wavelet transform coefficients of the disturbance signal and the wavelet transform energy distribution constitute the feature vectors. These vectors are then used to train and test SVM multi-class algorithms. Experimental results demonstrate that the SVM multi-class algorithms, which use the Gaussian radial basis function, exponential radial basis function, and hyperbolic tangent function as basis functions, are suitable methods for power quality disturbance classification.
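The three basis functions named above can be written out as follows. The abstract gives no parameter values or exact forms, so these are commonly used definitions and should be read as assumptions, as are the parameter names `sigma`, `scale`, and `offset`.

```python
import math

def sq_dist(x, y):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))

def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))

def gaussian_rbf(x, y, sigma=1.0):
    # exp(-||x - y||^2 / (2 sigma^2))
    return math.exp(-sq_dist(x, y) / (2 * sigma ** 2))

def exponential_rbf(x, y, sigma=1.0):
    # exp(-||x - y|| / (2 sigma^2)): uses the distance, not its square
    return math.exp(-math.sqrt(sq_dist(x, y)) / (2 * sigma ** 2))

def tanh_kernel(x, y, scale=1.0, offset=0.0):
    # tanh(scale * <x, y> + offset)
    return math.tanh(scale * dot(x, y) + offset)
```

Both radial basis functions equal 1 when the two feature vectors coincide and decay with distance, the exponential form more slowly at large distances. The hyperbolic tangent function is not positive semi-definite for every choice of `scale` and `offset`, so its parameters usually need more care.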
A holistic analysis of problem and incident tickets in a real production cloud service environment is presented in this paper. By extracting different bags of words, we use principal component analysis (PCA) to examine the clustering characteristics of these tickets. Then K-means and latent Dirichlet allocation (LDA) are applied to show the potential clusters within this cloud environment. The second part of our study uses a pre-trained bidirectional encoder representations from transformers (BERT) model to classify the tickets, with the goal of predicting the optimal dispatching department for a given ticket. Experimental results show that, due to the unique characteristics of ticket descriptions, pre-processing with domain knowledge turns out to be critical in both clustering and classification. Our classification model yields 86% accuracy when predicting the target dispatching department.
Inverse problems for the motion of dynamic systems described by systems of ordinary differential equations are examined. A classification of this type of inverse problem is given. It is shown that inverse problems can be divided into two types: synthesis inverse problems and inverse problems of measurement (recognition). Each type requires a separate approach to problem statement and solution methods. A regularization method for obtaining stable solutions of inverse problems is suggested. In some cases, instead of recognizing the solution of an inverse problem, an estimate of the solution can be used. Within the framework of this approach, two practical inverse problems of measurement are considered.
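The abstract does not say which regularization scheme is used. As an illustration only, assume a linear measurement model $A z = u_\delta$, where $z$ is the unknown, $u_\delta$ is the noisy measurement, and $\alpha$ is the regularization parameter (all symbols introduced here, not taken from the paper). Tikhonov regularization then replaces the unstable direct inversion with a penalized least-squares problem that has a closed-form solution:

```latex
z_\alpha = \arg\min_{z} \left( \lVert A z - u_\delta \rVert^2 + \alpha \lVert z \rVert^2 \right),
\qquad
z_\alpha = \left( A^{\top} A + \alpha I \right)^{-1} A^{\top} u_\delta,
\quad \alpha > 0 .
```

For $\alpha > 0$ the matrix $A^{\top} A + \alpha I$ is always invertible, so small perturbations of $u_\delta$ produce bounded changes in $z_\alpha$, which is the sense in which the solution is stable.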
Big data is a term that refers to a set of data that, due to its size or complexity, cannot be stored or processed with the usual tools or applications for data management, and it has become a prominent term in recent years owing to the massive development of technology. Almost immediately thereafter, the term “big data mining” emerged, i.e., mining from big data, as an emerging and interconnected field of research. Classification is an important stage in data mining since it helps people make better decisions in a variety of situations, including scientific endeavors, biomedical research, and industrial applications. The probabilistic neural network (PNN) is a commonly used and successful method for handling classification and pattern recognition problems. In this study, the authors propose combining the probabilistic neural network (PNN), a data mining technique, with the vibrating particles system (VPS), a metaheuristic algorithm, in a method named “VPS-PNN”, to solve classification problems more effectively. The suggested method was tested on eleven common benchmark medical datasets from the machine-learning library. The suggested VPS-PNN mechanism outperforms the PNN, biogeography-based optimization, the enhanced water cycle algorithm (E-WCA), and the firefly algorithm (FA) in terms of convergence speed and classification accuracy.
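For readers unfamiliar with the PNN, a minimal sketch of its standard Parzen-window form is given below. It is not the VPS-PNN method itself: the abstract does not state which PNN parameters VPS adjusts, and the function name `pnn_predict`, the smoothing parameter `sigma`, and the toy data are assumptions.

```python
import math

def pnn_predict(x, train, sigma=0.5):
    """Classify x with a basic PNN; train maps class label -> list of vectors."""
    scores = {}
    for label, patterns in train.items():
        # Pattern layer: one Gaussian unit per training vector.
        activations = [
            math.exp(-sum((a - b) ** 2 for a, b in zip(x, p)) / (2 * sigma ** 2))
            for p in patterns
        ]
        # Summation layer: average activation per class.
        scores[label] = sum(activations) / len(activations)
    # Decision layer: class with the largest estimated density.
    return max(scores, key=scores.get)

# Hypothetical two-class toy data, not one of the benchmark datasets.
train = {
    "healthy": [[0.0, 0.1], [0.2, 0.0]],
    "sick": [[1.0, 1.1], [0.9, 1.0]],
}
```

The smoothing parameter `sigma` controls how far each training pattern's influence extends and strongly affects accuracy, which makes it a natural candidate for tuning by a metaheuristic search.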
The establishment of a unified land use classification system is the basis for realizing the unified management of land and sea, urban and rural areas, and aboveground and underground space. In November 2020, the Ministry of Natural Resources of the People's Republic of China issued the Classification Guide for Land and Space Survey, Planning and Use Control of Land and Sea (for Trial Implementation), which aims to establish a nationally unified land and sea use classification system, lay an important foundation for the scientific planning, unified management, rational use, and protection of natural resources, and speed up the construction of a new pattern of land and space development and protection. However, there are still some obvious shortcomings in the Classification Guide. This paper analyzes problems in this classification standard from three aspects (logicality, rigorousness, and comprehensiveness) and puts forward suggestions for further improvement. This has important practical significance for better guiding the practice of land use and land resources management, and in turn for achieving the goal of unified management of natural resources.
Human activity recognition is a significant area of research in artificial intelligence for surveillance, healthcare, sports, and human-computer interaction applications. The article benchmarks the performance of a You Only Look Once version 11-based (YOLOv11-based) architecture for multi-class human activity recognition. The dataset consists of 14,186 images across 19 activity classes, from dynamic activities such as running and swimming to static activities such as sitting and sleeping. Preprocessing included resizing all images to 512×512 pixels, annotating them in YOLO's bounding box format, and applying data augmentation methods such as flipping, rotation, and cropping to enhance model generalization. The proposed model was trained for 100 epochs with adaptive learning rate methods and hyperparameter optimization for performance improvement, achieving a mAP@0.5 of 74.93% and a mAP@0.5-0.95 of 64.11%, outperforming previous versions of YOLO (v10, v9, and v8) and general-purpose architectures such as ResNet50 and EfficientNet. It exhibited improved precision and recall for all activity classes, with high precision values of 0.76 for running, 0.79 for swimming, 0.80 for sitting, and 0.81 for sleeping, and was tested for real-time deployment with an inference time of 8.9 ms per image, being computationally light. The proposed YOLOv11's improvements are attributed to architectural advancements such as a more complex feature extraction process, better attention modules, and an anchor-free detection mechanism. While YOLOv10 was extremely stable in static activity recognition, YOLOv9 performed well in dynamic environments but suffered from overfitting, and YOLOv8, while being a decent baseline, failed to differentiate between overlapping static activities. The experimental results show the proposed YOLOv11 to be the most appropriate model, providing an ideal balance between accuracy, computational efficiency, and robustness for real-world deployment. Nevertheless, certain issues remain to be addressed, particularly discriminating between visually similar activities and the reliance on publicly available datasets. Future research will entail the inclusion of 3D data and multimodal sensor inputs, such as depth and motion information, to enhance recognition accuracy and generalizability to challenging real-world environments.
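The mAP figures above rest on intersection over union (IoU): at mAP@0.5 a predicted box counts as correct when its IoU with a ground-truth box of the same class is at least 0.5, and mAP@0.5-0.95 averages the result over IoU thresholds from 0.5 to 0.95. The sketch below shows IoU and the conversion from YOLO's normalized box format to pixel corners; the function names are hypothetical, and the 512-pixel defaults simply mirror the image size stated in the abstract.

```python
def iou(box_a, box_b):
    """IoU of two boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def yolo_to_corners(cx, cy, w, h, img_w=512, img_h=512):
    """Convert a normalized YOLO box (center x, center y, width, height) to pixel corners."""
    return (
        (cx - w / 2) * img_w,
        (cy - h / 2) * img_h,
        (cx + w / 2) * img_w,
        (cy + h / 2) * img_h,
    )
```

Two 2×2 boxes offset by one unit in each direction overlap in a 1×1 square, so their IoU is 1/7 and the prediction would not count as a match at the 0.5 threshold.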
With the development of deep learning and Convolutional Neural Networks (CNNs), the accuracy of automatic food recognition based on visual data has significantly improved. Some research studies have shown that the deeper the model is, the higher the accuracy is. However, very deep neural networks are prone to overfitting and also consume huge computing resources. In this paper, a new classification scheme is proposed for automatic food-ingredient recognition based on deep learning. We construct an up-to-date combinational convolutional neural network (CBNet) with a subnet merging technique. Firstly, two different neural networks are utilized for learning features of interest. Then, a well-designed feature fusion component aggregates the features from the subnetworks, further extracting richer and more precise features for image classification. In order to learn more complementary features, corresponding fusion strategies are also proposed, including auxiliary classifiers and hyperparameter settings. Finally, CBNet based on the well-known VGGNet, ResNet, and DenseNet is evaluated on a dataset comprising 41 major categories of food ingredients with 100 images per category. Theoretical analysis and experimental results demonstrate that CBNet achieves promising accuracy for multi-class classification and improves the performance of convolutional neural networks.
Based on the framework of support vector machines (SVM) using the one-against-one (OAO) strategy, a new multi-class kernel method based on a directed acyclic graph (DAG) and probabilistic distance is proposed to raise multi-class classification accuracy. The topology of the DAG is constructed by rearranging the sequence of nodes in the graph. The DAG is equivalent to operating SVMs on a list in a guided manner, and the classification performance depends on the sequence of nodes in the graph. The Jeffries-Matusita distance (JMD) is introduced to estimate the separability of each class, and the implementation list is initialized with all classes organized in a certain sequence. To verify the effectiveness of the proposed method, numerical analysis is conducted on UCI data and hyperspectral data. Comparative studies using standard OAO and DAG classification methods are also conducted, and the results illustrate the better performance and higher accuracy of the proposed JMD-DAG method.
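The statement that a DAG is equivalent to operating SVMs on a list can be made concrete with the usual list-elimination view: the classifier for the first and last classes in the list is evaluated, the losing class is removed, and this repeats until one class remains. The sketch below assumes that view; `dag_classify`, `pairwise_winner`, and the nearest-mean stand-in for trained pairwise SVMs are hypothetical, and the JMD-based ordering of the list is not reproduced.

```python
def dag_classify(x, class_order, pairwise_winner):
    """Evaluate a DAG of one-against-one classifiers as a list elimination.

    class_order: initial class list (for example, ordered by a separability measure).
    pairwise_winner(x, a, b): returns a or b, whichever the pair's classifier prefers.
    """
    remaining = list(class_order)
    while len(remaining) > 1:
        first, last = remaining[0], remaining[-1]
        if pairwise_winner(x, first, last) == first:
            remaining.pop()    # the last class is eliminated
        else:
            remaining.pop(0)   # the first class is eliminated
    return remaining[0]

# Toy stand-in for trained pairwise SVMs: prefer the class with the closer 1-D mean.
means = {"A": 0.0, "B": 5.0, "C": 10.0}

def nearest_mean(x, a, b):
    return a if abs(x - means[a]) <= abs(x - means[b]) else b
```

With N classes, only N - 1 pairwise classifiers are evaluated per sample instead of all N(N - 1)/2, and the order of the initial list decides which pairs are consulted, which is why the node sequence affects accuracy.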
A modified version of the multisurface proximal support vector machine classifier via generalized eigenvalues (GEPSVM) was proposed. By defining a new principle, we designed a new classification approach based on GEPSVM, namely maximum or minimum plane distance GEPSVM (MPDGEPSVM). Unlike GEPSVM, our approach obtains two planes by solving two simple eigenvalue problems, thereby avoiding singularity problems. Compared with GEPSVM, our approach has better classification performance. Moreover, MPDGEPSVM is over one order of magnitude faster than GEPSVM, and almost two orders of magnitude faster than SVM. Computational results on public datasets from the UCI database illustrated the efficiency of MPDGEPSVM.
In this paper, a new multiclass classification algorithm is proposed based on the idea of Locally Linear Embedding (LLE), to avoid the defect of traditional manifold learning algorithms, which cannot deal with new sample points. The algorithm defines an error criterion by computing a sample's reconstruction weights using LLE. Furthermore, the existence and characteristics of a low-dimensional manifold in range-profile time-frequency information are explored using a manifold learning algorithm, targeting the problem of target recognition for high range resolution MilliMeter-Wave (MMW) radar. The new algorithm is applied to radar target recognition. The experimental results show that the algorithm is efficient. Compared with other classification algorithms, our method improves recognition precision, and the result is not sensitive to input parameters.
The double parallel forward neural network (DPFNN) model is a mixed structure of a single-layer perceptron and a single-hidden-layer forward neural network (SLFN). In this paper, by applying the idea of the online sequential extreme learning machine (OS-ELM) to DPFNN, we derive the online sequential double parallel extreme learning machine algorithm (OS-DPELM). Compared to other similar algorithms, our algorithm can achieve comparable learning performance with fewer hidden units and fewer parameters to be determined. The experimental results show that the proposed algorithm has good generalization performance for real-world classification problems, and thus can be a necessary and beneficial complement to OS-ELM.
Datasets with an imbalanced class distribution are difficult to handle with standard classification algorithms. In supervised learning, dealing with class imbalance is still considered a challenging research problem. Various machine learning techniques are designed to operate on balanced datasets; therefore, different under-sampling, over-sampling, and hybrid strategies have been proposed in the state of the art to deal with imbalanced datasets, but highly skewed datasets still pose problems of generalization and noise generation during resampling. To overcome these problems, this paper proposes a majority clustering model for the classification of imbalanced datasets, known as MCBC-SMOTE (Majority Clustering for Balanced Classification-SMOTE). The model provides a method to convert a binary classification problem into a multi-class problem. In the proposed algorithm, the number of clusters for the majority class is calculated using the elbow method, and the minority class is over-sampled as an average of the clustered majority classes to generate a symmetrical class distribution. The proposed technique is cost-effective, reduces noise generation, and successfully removes the imbalances present between and within classes. Evaluations on diverse real datasets show better classification results than existing state-of-the-art methodologies on several performance metrics.
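Two ingredients of this description can be sketched under stated assumptions. The first is the standard SMOTE interpolation step, which places a synthetic minority point on the segment between a minority sample and one of its neighbors. The second is one possible reading of "over-sampled as an average of the clustered majority classes", namely that the minority class is grown to the average size of the majority clusters; that reading, and the names `smote_sample` and `target_minority_size`, are assumptions rather than details confirmed by the abstract.

```python
import random

def smote_sample(point, neighbor, rng):
    """Create one synthetic point on the segment between point and neighbor."""
    gap = rng.random()  # uniform in [0, 1)
    return [p + gap * (n - p) for p, n in zip(point, neighbor)]

def target_minority_size(cluster_sizes):
    """Average majority-cluster size, used here as the oversampling target (assumed reading)."""
    return round(sum(cluster_sizes) / len(cluster_sizes))

rng = random.Random(0)  # fixed seed so the sketch is reproducible
synthetic = smote_sample([0.0, 0.0], [2.0, 4.0], rng)
target = target_minority_size([40, 50, 60])
```

Splitting the majority class into clusters turns the two-class problem into a multi-class one, and matching the minority size to the cluster sizes gives every resulting class a similar number of samples.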
This paper offers a symbiosis-based hybrid modified DNA-ABC optimization algorithm, which combines modified DNA concepts and the artificial bee colony (ABC) algorithm to aid hierarchical fuzzy classification. According to the literature, the ABC algorithm is traditionally applied to constrained and unconstrained problems, but in the present research it is combined with modified DNA concepts and implemented for fuzzy classification. Moreover, to the best of our knowledge, previous research on the ABC algorithm has not combined it with DNA computing for hierarchical fuzzy classification to explore the merits of cooperative coevolution. Therefore, this paper is the first to apply the mechanism of symbiosis to create a hybrid modified DNA-ABC algorithm for hierarchical fuzzy classification applications. In this study, the partition number and the shape of the membership function are extracted by the symbiosis-based hybrid modified DNA-ABC optimization algorithm, which provides both sufficient global exploration and adequate local exploitation for hierarchical fuzzy classification. The proposed optimization algorithm is applied to five benchmark University of California, Irvine (UCI) data sets, and the results demonstrate the efficiency of the algorithm.
Purpose: A text generation-based multidisciplinary problem identification method is proposed, which does not rely on a large amount of data annotation. Design/methodology/approach: The proposed method first identifies the research objective types and disciplinary labels of papers using a text classification technique; second, it generates abstractive titles for each paper based on the abstract and research objective types using a generative pre-trained language model; third, it extracts problem phrases from the generated titles according to regular expression rules; fourth, it creates problem relation networks and identifies identical problems by exploiting a weighted community detection algorithm; finally, it identifies multidisciplinary problems based on the disciplinary labels of papers. Findings: Experiments in the “Carbon Peaking and Carbon Neutrality” field show that the proposed method can effectively identify multidisciplinary research problems. The disciplinary distribution of the identified problems is consistent with our understanding of multidisciplinary collaboration in the field. Research limitations: It is necessary to apply the proposed method in other multidisciplinary fields to validate its effectiveness. Practical implications: Multidisciplinary problem identification helps governments gather multidisciplinary forces to solve complex real-world problems, helps research management authorities fund valuable multidisciplinary problems, and helps researchers borrow ideas from other disciplines. Originality/value: This work proposes a novel multidisciplinary problem identification method based on text generation, which identifies multidisciplinary problems from generated abstractive titles of papers without the data annotation required by standard sequence labeling techniques.
Funding (multi-class cyberbullying detection study): Funded by the Scientific Research Deanship at the University of Hail, Saudi Arabia, through Project Number RG-23092.
Funding (blast furnace fault diagnosis study): Sponsored by the National Natural Science Foundation of China (60843007, 61050006).
Funding (ELS-TWSVM strip steel surface defect study): Sponsored by the National Natural Science Foundation of China (61050006).
Funding: Supported by the National Natural Science Foundation of China (61703131, 61703129, 61701148, 61703128)
Abstract: The basic idea of multi-class classification is decomposition: a multi-class classification task is decomposed into several binary classification tasks. To improve the accuracy of multi-class classification when samples are insufficient, this paper proposes a multi-class classification method combining K-means and multi-task relationship learning (MTRL). The method first uses the One vs. Rest split to decompose the multi-class classification task into binary classification tasks. K-means is then used to down-sample the dataset of each task, which can prevent over-fitting of the model while reducing training costs. Finally, the sampled dataset is applied to the MTRL, and multiple binary classifiers are trained together. With the help of MTRL, this method can use the inter-task association to train the model and thereby improve the classification accuracy of each binary classifier. The effectiveness of the proposed approach is demonstrated by experimental results on the Iris dataset, Wine dataset, Multiple Features dataset, Wireless Indoor Localization dataset, and Avila dataset.
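The One vs. Rest split mentioned in this abstract can be illustrated with a minimal Python sketch. The function names and the toy labels are hypothetical, chosen only for illustration; the K-means down-sampling and the joint MTRL training described in the abstract are not reproduced here.

```python
def one_vs_rest_tasks(labels):
    """Split multi-class labels into one binary task per class.

    For class c, a sample is labeled 1 if it belongs to c and 0 otherwise.
    """
    classes = sorted(set(labels))
    return {c: [1 if y == c else 0 for y in labels] for c in classes}


def predict_from_scores(scores_by_class):
    """Combine the binary classifiers: pick the class with the highest score."""
    return max(scores_by_class, key=scores_by_class.get)


# Toy labels (illustrative only, not taken from the paper's experiments).
labels = ["setosa", "versicolor", "virginica", "setosa"]
tasks = one_vs_rest_tasks(labels)
print(tasks["setosa"])
print(predict_from_scores({"setosa": 0.2, "versicolor": 0.9, "virginica": 0.4}))
```

Each binary task keeps every sample, so the "rest" side is usually much larger than the positive side; this is one reason a down-sampling step, such as the K-means step in the abstract, is a natural companion to the split.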
Abstract: Focusing on strip steel surface defect classification, a novel support vector machine with adjustable hyper-sphere (AHSVM) is formulated, and a new multi-class classification method is proposed. Originating from support vector data description, AHSVM adopts a hyper-sphere to solve the classification problem. AHSVM obeys two principles: margin maximization and inner-class dispersion minimization. Moreover, the hyper-sphere of AHSVM is adjustable, which makes the final classification hyper-sphere optimal for the training dataset. In addition, AHSVM is combined with a binary tree to solve multi-class classification for steel surface defects. A scheme for pruning samples in the mapped feature space is provided, which reduces the number of training samples while preserving classification accuracy and thus improves classification speed. Finally, testing experiments are carried out on eight types of strip steel surface defects. Experimental results show that the multi-class AHSVM classifier achieves satisfactory classification accuracy and efficiency.
Funding: This work was supported by the National Natural Science Foundation of China (No. 51674140); the Natural Science Foundation of Liaoning Province, China (No. 20180550067); the Department of Education of Liaoning Province, China (Nos. 2017LNQN11 and 2020LNZD06); University of Science and Technology Liaoning Talent Project Grants (No. 601011507-20); and University of Science and Technology Liaoning Team Building Grants (No. 601013360-17).
Abstract: Defect classification is the key task of a steel surface defect detection system. Current defect classification algorithms have not taken feature noise into consideration. To reduce the adverse impact of feature noise, an anti-noise multi-class classification method was proposed for steel surface defects. On the one hand, a novel anti-noise support vector hyper-spheres (ASVHs) classifier was formulated. For N types of defects, the ASVHs classifier built N hyper-spheres. These hyper-spheres were insensitive to feature and label noise. On the other hand, to reduce the costs of online time and storage space, the defect samples were pruned by support vector data description with a parameter iteration adjustment strategy. In the end, the ASVHs classifier was built with a sparse defect sample set and auxiliary information. Experimental results show that the novel multi-class classification method has high efficiency and accuracy for corrupted steel surface defect samples.
Abstract: The accurate identification and classification of various power quality disturbances are key to ensuring high-quality electrical energy. In this study, the statistical characteristics of the wavelet transform coefficients of the disturbance signal, together with the wavelet transform energy distribution, constitute the feature vectors. These vectors are then used to train and test SVM multi-class algorithms. Experimental results demonstrate that the SVM multi-class algorithms, which use the Gaussian radial basis function, exponential radial basis function, and hyperbolic tangent function as basis functions, are suitable methods for power quality disturbance classification.
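As a rough illustration of a wavelet energy distribution feature vector, the following Python sketch computes the relative energy in each band of a multi-level decomposition. It assumes the orthonormal Haar wavelet, which the abstract does not specify, and the function names and the sample signal are hypothetical. The SVM stage is not shown.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail


def wavelet_energy_features(signal, levels):
    """Relative energy of each detail band, followed by the final approximation band."""
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    energies = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        energies.append(sum(d * d for d in detail))
    energies.append(sum(a * a for a in approx))
    total = sum(energies)
    # Normalize so the features describe how energy is distributed across bands.
    return [e / total for e in energies] if total > 0 else energies


# Hypothetical signal with a single spike (assumption: Haar wavelet, 2 levels).
features = wavelet_energy_features([1.0, 1.0, 1.0, 1.0, 5.0, 1.0, 1.0, 1.0], levels=2)
print(features)
```

Because the Haar transform used here is orthonormal, the band energies sum to the energy of the original signal, so the normalized features sum to one. A constant signal puts all of its energy in the approximation band, while a short transient shifts energy into the detail bands.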
Abstract: A holistic analysis of problem and incident tickets in a real production cloud service environment is presented in this paper. By extracting different bags of words, we use principal component analysis (PCA) to examine the clustering characteristics of these tickets. Then K-means and latent Dirichlet allocation (LDA) are applied to show the potential clusters within this cloud environment. The second part of our study uses a pre-trained bidirectional encoder representations from transformers (BERT) model to classify the tickets, with the goal of predicting the optimal dispatching department for a given ticket. Experimental results show that, owing to the unique characteristics of ticket descriptions, pre-processing with domain knowledge turns out to be critical in both clustering and classification. Our classification model yields 86% accuracy when predicting the target dispatching department.
Abstract: Inverse problems for the motion of dynamic systems described by systems of ordinary differential equations are examined. A classification of such inverse problems is given. It is shown that inverse problems can be divided into two types: synthesis inverse problems and inverse problems of measurement (recognition). Each type requires a separate approach to problem statement and solution method. A regularization method for obtaining stable solutions of inverse problems is suggested. In some cases, an estimate of the solution can be used instead of recovering the solution of the inverse problem itself. Within the framework of this approach, two practical inverse problems of measurement are considered.
Abstract: Big data is a term that refers to a set of data that, owing to its size or complexity, cannot be stored or processed with the usual tools or applications for data management, and it has become a prominent term in recent years because of the massive development of technology. Almost immediately thereafter, the term "big data mining" emerged, i.e., mining from big data, as an emerging and interconnected field of research. Classification is an important stage in data mining since it helps people make better decisions in a variety of situations, including scientific endeavors, biomedical research, and industrial applications. The probabilistic neural network (PNN) is a commonly used and successful method for handling classification and pattern recognition issues. In this study, the authors propose combining the probabilistic neural network (PNN), a data mining technique, with the vibrating particles system (VPS), a metaheuristic algorithm, into a method named "VPS-PNN" to solve classification problems more effectively. The suggested method was tested on eleven common benchmark medical datasets from the machine-learning library. The suggested VPS-PNN mechanism outperforms the PNN, biogeography-based optimization, the enhanced water cycle algorithm (E-WCA), and the firefly algorithm (FA) in terms of convergence speed and classification accuracy.
Abstract: The establishment of a unified land use classification system is the basis for realizing the unified management of land and sea, urban and rural areas, and aboveground and underground space. In November 2020, the Ministry of Natural Resources of the People's Republic of China issued the Classification Guide for Land and Space Survey, Planning and Use Control of Land and Sea (for Trial Implementation), which aims to establish a national unified land and sea use classification system, lay an important foundation for scientific planning and unified management of natural resources, support the rational use and protection of natural resources, and speed up the construction of a new pattern of land and space development and protection. However, there are still some obvious shortcomings in the Classification Guide. This paper analyzes problems in this classification standard from three aspects, namely logicality, rigor, and comprehensiveness, and puts forward suggestions for further improvement. This has important practical significance for better guiding the practice of land use and land resources management, and in turn for achieving the goal of unified management of natural resources.
Funding: Supported by King Saud University, Riyadh, Saudi Arabia, under the Ongoing Research Funding Program (ORF-2025-951).
Abstract: Human activity recognition is a significant area of research in artificial intelligence for surveillance, healthcare, sports, and human-computer interaction applications. The article benchmarks the performance of a You Only Look Once version 11-based (YOLOv11-based) architecture for multi-class human activity recognition. The dataset consists of 14,186 images across 19 activity classes, from dynamic activities such as running and swimming to static activities such as sitting and sleeping. Preprocessing included resizing all images to 512×512 pixels, annotating them in YOLO's bounding box format, and applying data augmentation methods such as flipping, rotation, and cropping to enhance model generalization. The proposed model was trained for 100 epochs with adaptive learning rate methods and hyperparameter optimization, reaching a mAP@0.5 of 74.93% and a mAP@0.5-0.95 of 64.11%, outperforming previous versions of YOLO (v10, v9, and v8) and general-purpose architectures like ResNet50 and EfficientNet. It exhibited improved precision and recall for all activity classes, with precision values of 0.76 for running, 0.79 for swimming, 0.80 for sitting, and 0.81 for sleeping, and was tested for real-time deployment with an inference time of 8.9 ms per image, making it computationally light. The proposed YOLOv11's improvements are attributed to architectural advancements such as a more complex feature extraction process, better attention modules, and an anchor-free detection mechanism. While YOLOv10 was extremely stable in static activity recognition, YOLOv9 performed well in dynamic environments but suffered from overfitting, and YOLOv8, while a decent baseline, failed to differentiate between overlapping static activities. The experimental results indicate that the proposed YOLOv11 is the most appropriate model, providing an ideal balance between accuracy, computational efficiency, and robustness for real-world deployment. Nevertheless, certain issues remain to be addressed, particularly discriminating between visually similar activities and the reliance on publicly available datasets. Future research will include 3D data and multimodal sensor inputs, such as depth and motion information, to enhance recognition accuracy and generalizability to challenging real-world environments.
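For readers unfamiliar with the mAP@0.5 figure quoted above: under the usual detection-evaluation convention, a predicted box counts as a match when its intersection over union (IoU) with a ground-truth box is at least 0.5, and mAP@0.5-0.95 averages the result over IoU thresholds from 0.5 to 0.95. The Python sketch below shows only the IoU test. It assumes boxes in corner format (x_min, y_min, x_max, y_max), whereas YOLO annotation files typically store center coordinates with width and height; the function names and sample boxes are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    # Clamp at zero so disjoint boxes have no intersection area.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def matches_at_threshold(pred_box, truth_box, threshold=0.5):
    """True when the prediction overlaps the ground truth enough to count as a match."""
    return iou(pred_box, truth_box) >= threshold


# Hypothetical boxes: the prediction is shifted by half the box width.
overlap = iou((0.0, 0.0, 2.0, 2.0), (1.0, 0.0, 3.0, 2.0))
print(overlap, matches_at_threshold((0.0, 0.0, 2.0, 2.0), (1.0, 0.0, 3.0, 2.0)))
```

A box shifted by half its width overlaps the original with an IoU of only one third, so it would not count as a match at the 0.5 threshold even though half of each box is shared.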
Funding: This paper is partially supported by the National Natural Science Foundation of China (Grant No. 61772561); the Key Research & Development Plan of Hunan Province (Grant No. 2018NK2012); the Postgraduate Research and Innovative Project of Central South University of Forestry and Technology (Grant No. 20183012); the Graduate Education and Teaching Reform Project of Central South University of Forestry and Technology (Grant No. 2018JG005); and the Teaching Reform Project of Central South University of Forestry and Technology (Grant No. 20180682).
Abstract: With the development of deep learning and Convolutional Neural Networks (CNNs), the accuracy of automatic food recognition based on visual data has significantly improved. Some research studies have shown that the deeper the model is, the higher the accuracy is. However, very deep neural networks are affected by the overfitting problem and also consume huge computing resources. In this paper, a new classification scheme is proposed for automatic food-ingredient recognition based on deep learning. We construct an up-to-date combinational convolutional neural network (CBNet) with a subnet merging technique. Firstly, two different neural networks are used to learn features of interest. Then, a well-designed feature fusion component aggregates the features from the subnetworks, further extracting richer and more precise features for image classification. To learn more complementary features, corresponding fusion strategies are also proposed, including auxiliary classifiers and hyperparameter settings. Finally, CBNet based on the well-known VGGNet, ResNet, and DenseNet is evaluated on a dataset including 41 major categories of food ingredients and 100 images for each category. Theoretical analysis and experimental results demonstrate that CBNet achieves promising accuracy for multi-class classification and improves the performance of convolutional neural networks.
Funding: Sponsored by the National Natural Science Foundation of China (Grant No. 61201310); the Fundamental Research Funds for the Central Universities (Grant No. HIT.NSRIF.201160); and the China Postdoctoral Science Foundation (Grant No. 20110491067)
Abstract: Based on the framework of support vector machines (SVM) using the one-against-one (OAO) strategy, a new multi-class kernel method based on a directed acyclic graph (DAG) and probabilistic distance is proposed to raise multi-class classification accuracy. The topology of the DAG is constructed by rearranging the sequence of nodes in the graph. DAG is equivalent to operating SVMs on a list in a guided order, and the classification performance depends on the sequence of nodes in the graph. The Jeffries-Matusita distance (JMD) is introduced to estimate the separability of each class, and the implementation list is initialized with all classes arranged in a sequence determined by this separability. To verify the effectiveness of the proposed method, numerical analysis is conducted on UCI data and hyperspectral data. Comparative studies using standard OAO and DAG classification methods are also conducted, and the results illustrate the better performance and higher accuracy of the proposed JMD-DAG method.
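The Jeffries-Matusita distance referred to in this abstract is commonly computed from the Bhattacharyya distance B as JM = 2(1 - exp(-B)), which ranges from 0 (identical classes) to 2 (fully separable). The Python sketch below makes two simplifying assumptions that are not taken from the paper: each class is modeled as a one-dimensional Gaussian, and class pairs are simply ranked by JM value. How the paper turns separability into the DAG node sequence is not detailed in the abstract, so the ranking function and the class statistics are hypothetical.

```python
import math


def bhattacharyya_1d(mean_a, var_a, mean_b, var_b):
    """Bhattacharyya distance between two univariate Gaussians."""
    term_mean = (mean_a - mean_b) ** 2 / (4.0 * (var_a + var_b))
    term_var = 0.5 * math.log((var_a + var_b) / (2.0 * math.sqrt(var_a * var_b)))
    return term_mean + term_var


def jm_distance(mean_a, var_a, mean_b, var_b):
    """Jeffries-Matusita distance, using the convention JM = 2 * (1 - exp(-B))."""
    return 2.0 * (1.0 - math.exp(-bhattacharyya_1d(mean_a, var_a, mean_b, var_b)))


def rank_pairs_by_separability(class_stats):
    """Return class pairs sorted from most to least separable."""
    names = sorted(class_stats)
    pairs = [(a, b) for i, a in enumerate(names) for b in names[i + 1:]]
    return sorted(
        pairs,
        key=lambda p: jm_distance(*class_stats[p[0]], *class_stats[p[1]]),
        reverse=True,
    )


# Hypothetical (mean, variance) statistics for three classes.
stats = {"water": (0.0, 1.0), "grass": (1.0, 1.0), "roof": (10.0, 1.0)}
ranking = rank_pairs_by_separability(stats)
print(ranking)
```

The saturation of JM near 2 is a useful property for ordering classes: pairs that are already well separated contribute almost equally, so the ranking is driven mainly by the pairs that are hard to tell apart.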
Funding: The National Defence Basic Research Program in China (No. S0500A001); the National High Technology Research and Development Program of China (863 Program) (No. 2002AA411030); and the Scientific and Technological Innovation Foundation of Jiangsu Province in China
Abstract: A modified version of the multisurface proximal support vector machine classifier via generalized eigenvalues (GEPSVM for short) was proposed. By defining a new principle, we designed a new classification approach based on GEPSVM, namely, maximum or minimum plane distance GEPSVM (MPDGEPSVM). Unlike GEPSVM, our approach obtains two planes by solving two simple eigenvalue problems, so that it avoids the occurrence of singular problems. Compared with GEPSVM, our approach has better classification performance. Moreover, MPDGEPSVM is over one order of magnitude faster than GEPSVM, and almost two orders of magnitude faster than SVM. Computational results on public datasets from the UCI database illustrated the efficiency of MPDGEPSVM.
Funding: Supported by the National Defense Pre-Research Foundation of China (Grant No. 9140A05070107BQ0204)
Abstract: In this paper, a new multiclass classification algorithm is proposed based on the idea of Locally Linear Embedding (LLE), to avoid the defect of traditional manifold learning algorithms, which cannot deal with new sample points. The algorithm defines an error as a criterion by computing a sample's reconstruction weight using LLE. Furthermore, the existence and characteristics of a low-dimensional manifold in range-profile time-frequency information are explored using a manifold learning algorithm, aiming at the problem of target recognition for high range resolution millimeter-wave (MMW) radar. The new algorithm is applied to radar target recognition. The experimental results show that the algorithm is efficient. Compared with other classification algorithms, our method improves the recognition precision, and the result is not sensitive to input parameters.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 11401076, 61473328, 11171367, 61473059)
Abstract: The double parallel forward neural network (DPFNN) model is a mixture structure of a single-layer perceptron and a single-hidden-layer forward neural network (SLFN). In this paper, by applying the idea of the online sequential extreme learning machine (OS-ELM) to DPFNN, we derive the online sequential double parallel extreme learning machine algorithm (OS-DPELM). Compared to other similar algorithms, our algorithm can achieve comparable learning performance with fewer hidden units and fewer parameters to be determined. The experimental results show that the proposed algorithm has good generalization performance for real-world classification problems, and thus can be a necessary and beneficial complement to OS-ELM.
Funding: This research was supported by Taif University Researchers Supporting Project number (TURSP-2020/254), Taif University, Taif, Saudi Arabia.
Abstract: Datasets with an imbalanced class distribution are difficult to handle with standard classification algorithms. In supervised learning, dealing with class imbalance is still considered a challenging research problem. Various machine learning techniques are designed to operate on balanced datasets; therefore, different state-of-the-art undersampling, oversampling, and hybrid strategies have been proposed to deal with imbalanced datasets, but highly skewed datasets still pose the problems of generalization and noise generation during resampling. To overcome these problems, this paper proposes a majority clustering model for the classification of imbalanced datasets, known as MCBC-SMOTE (Majority Clustering for Balanced Classification-SMOTE). The model provides a method to convert the problem of binary classification into a multi-class problem. In the proposed algorithm, the number of clusters for the majority class is calculated using the elbow method, and the minority class is over-sampled as an average of the clustered majority classes to generate a symmetrical class distribution. The proposed technique is cost-effective, reduces the problem of noise generation, and successfully removes the imbalances present between and within classes. Evaluations on diverse real datasets showed better classification results than existing state-of-the-art methodologies on several performance metrics.
Abstract: This paper offers a symbiosis-based hybrid modified DNA-ABC optimization algorithm, which combines modified DNA concepts and the artificial bee colony (ABC) algorithm to aid hierarchical fuzzy classification. According to the literature, the ABC algorithm is traditionally applied to constrained and unconstrained problems; in the present research it is combined with modified DNA concepts and implemented for fuzzy classification. Moreover, to the best of our knowledge, previous research on the ABC algorithm has not combined it with DNA computing for hierarchical fuzzy classification to explore the merits of cooperative coevolution. Therefore, this paper is the first to apply the mechanism of symbiosis to create a hybrid modified DNA-ABC algorithm for hierarchical fuzzy classification applications. In this study, the partition number and the shape of the membership function are extracted by the symbiosis-based hybrid modified DNA-ABC optimization algorithm, which provides both sufficient global exploration and adequate local exploitation for hierarchical fuzzy classification. The proposed optimization algorithm is applied to five benchmark University of California, Irvine (UCI) data sets, and the results demonstrate the efficiency of the algorithm.
Funding: Supported by the General Projects of ISTIC Innovation Foundation "Problem innovation solution mining based on text generation model" (MS2024-03).
Abstract: Purpose: A text generation-based multidisciplinary problem identification method is proposed, which does not rely on a large amount of data annotation. Design/methodology/approach: The proposed method first identifies the research objective types and disciplinary labels of papers using a text classification technique; second, it generates abstractive titles for each paper based on the abstract and research objective types using a generative pre-trained language model; third, it extracts problem phrases from the generated titles according to regular expression rules; fourth, it creates problem relation networks and identifies identical problems by exploiting a weighted community detection algorithm; finally, it identifies multidisciplinary problems based on the disciplinary labels of papers. Findings: Experiments in the "Carbon Peaking and Carbon Neutrality" field show that the proposed method can effectively identify multidisciplinary research problems. The disciplinary distribution of the identified problems is consistent with our understanding of multidisciplinary collaboration in the field. Research limitations: It is necessary to apply the proposed method in other multidisciplinary fields to validate its effectiveness. Practical implications: Multidisciplinary problem identification helps governments gather multidisciplinary forces to solve complex real-world problems, helps research management authorities fund valuable multidisciplinary problems, and helps researchers borrow ideas from other disciplines. Originality/value: This paper proposes a novel multidisciplinary problem identification method based on text generation, which identifies multidisciplinary problems from generated abstractive titles of papers without the data annotation required by standard sequence labeling techniques.