Funding: The authors would like to thank all anonymous reviewers for their suggestions and feedback. This work was supported by the National Natural Science Foundation of China (Grant No. 61379103).
Abstract: Logistic regression is often used to solve linear binary classification problems in areas such as machine vision, speech recognition, and handwriting recognition. However, it usually fails on certain nonlinear multi-classification problems, such as those with non-equilibrium samples. Scholars have proposed several alternative methods, including neural networks, least squares support vector machines, and the AdaBoost meta-algorithm, all of which essentially belong to the machine learning category. In this work, based on probability theory and statistical principles, we propose an improved logistic regression algorithm based on kernel density estimation for solving nonlinear multi-classification problems. We compared our approach with other methods on non-equilibrium samples; the results show that our approach guarantees sample integrity and achieves superior classification.
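To make the role of kernel density estimation concrete, the following is a minimal sketch of a generic KDE-based Bayes classifier on one-dimensional data. It is not the improved logistic regression algorithm from the abstract, whose details are not given here; the Gaussian kernel, the fixed bandwidth, the function names, and the toy data are all assumptions chosen for illustration.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a 1-D Gaussian kernel density estimate built from `samples`."""
    norm = len(samples) * bandwidth * math.sqrt(2.0 * math.pi)

    def density(x):
        return sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples) / norm

    return density


def kde_posteriors(x, class_samples, bandwidth=0.5):
    """Class posteriors P(class | x) from per-class KDEs and empirical priors."""
    total = sum(len(s) for s in class_samples.values())
    joint = {
        label: (len(s) / total) * gaussian_kde(s, bandwidth)(x)
        for label, s in class_samples.items()
    }
    evidence = sum(joint.values())
    return {label: value / evidence for label, value in joint.items()}


# Hypothetical non-equilibrium data: class "b" has only one sample.
data = {"a": [0.0, 0.2, -0.1, 0.1], "b": [3.0], "c": [6.0, 6.1]}
posterior = kde_posteriors(3.0, data)
print(max(posterior, key=posterior.get))
```

Because every sample contributes its own kernel, no minority-class sample has to be discarded or resampled, which is one plausible reading of "sample integrity" in the abstract.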
Funding: Supported by the Key Research and Development Program of Hainan Province (Grant Nos. ZDYF2024GXJS014, ZDYF2023GXJS163), the National Natural Science Foundation of China (NSFC) (Grant Nos. 62162022, 62162024), and the Collaborative Innovation Project of Hainan University (XTCX2022XXB02).
Abstract: With the rapid development of Internet of Things technology, the sharp increase in network devices, set against their inherent security vulnerabilities, brings unprecedented challenges to network security, especially in identifying malicious attacks. However, network traffic data is unevenly distributed: attack traffic and normal traffic are imbalanced, as are minority-class and majority-class attacks. As a result, traditional machine learning detection algorithms have significant limitations when dealing with sparse network traffic data. To tackle this challenge, we design a lightweight intrusion detection model based on diffusion mechanisms, named Diff-IDS, whose core objective is to make the model more efficient at parsing complex network traffic features and thereby improve its detection speed and training efficiency. The model first filters the network traffic features and converts them into grayscale images, using image flipping for data augmentation. These preprocessed images are then fed into a diffusion model based on the Unet architecture for training. Once the model is trained, we fix the weights of the Unet network and propose a feature enhancement algorithm based on feature masking to further boost the model's expressiveness. Finally, we devise an end-to-end lightweight detection strategy to streamline the model, enabling efficient lightweight detection of imbalanced samples. We evaluated the method in multiple experiments on well-known network intrusion detection benchmarks, including CICIDS 2017, KDD 99, and NSL-KDD. The experimental results indicate that Diff-IDS leads current state-of-the-art models in detection accuracy, training efficiency, and lightweight metrics, demonstrating strong detection capability and robustness.
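The preprocessing step described above (features to grayscale image, then flipping for augmentation) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how features are scaled or laid out, so the min-max scaling to 0..255, the zero padding, the row-major layout, and all function names are hypothetical.

```python
def features_to_grayscale(features, side):
    """Min-max scale a feature vector to 0..255 and lay it out as a side x side image."""
    if len(features) > side * side:
        raise ValueError("feature vector does not fit in the image")
    lo, hi = min(features), max(features)
    span = hi - lo
    # A constant vector has no contrast, so it maps to all-zero pixels.
    pixels = [0 if span == 0 else round(255 * (v - lo) / span) for v in features]
    pixels += [0] * (side * side - len(pixels))  # zero-pad unused pixels
    return [pixels[r * side:(r + 1) * side] for r in range(side)]


def flip_horizontal(image):
    """Mirror each row (left-right flip)."""
    return [row[::-1] for row in image]


def flip_vertical(image):
    """Reverse the row order (up-down flip)."""
    return image[::-1]


# Hypothetical 4-feature flow record turned into a 2 x 2 image plus two flipped copies.
image = features_to_grayscale([0, 1, 2, 3], side=2)
augmented = [image, flip_horizontal(image), flip_vertical(image)]
```

Flipping produces extra training images for a class without collecting new traffic, which is why it is a natural fit for the minority attack classes mentioned in the abstract.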
Funding: Supported by the National Natural Science Foundation of China (60873086), the Aeronautical Science Foundation of China (20085153013), and the Fundamental Research Fund of Northwestern Polytechnical University (JC200942).
Abstract: An improved particle swarm optimization (PSO) algorithm is proposed to train the fuzzy support vector machine (FSVM) for multi-class pattern classification. In the improved algorithm, each particle learns not only from itself and the best particle but also from the mean value of some other particles. In addition, adaptive mutation is introduced to reduce the rate of premature convergence. Experimental results on synthetic aperture radar (SAR) target recognition with the moving and stationary target acquisition and recognition (MSTAR) dataset and on character recognition with the MNIST database show that the improved algorithm is feasible and effective for fuzzy multi-class SVM training.
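The two modifications named in the abstract, a third learning term drawn from the mean of some other particles and an adaptive mutation, can be sketched on a toy problem. Everything specific here is an assumption: the coefficients `w`, `c1`, `c2`, `c3`, averaging the personal bests of three random peers, the stall-based mutation rate, and the sphere objective standing in for FSVM training.

```python
import random


def sphere(x):
    """Toy objective (assumed for illustration): minimum 0 at the origin."""
    return sum(v * v for v in x)


def pso_mean_mutation(objective, dim, n_particles=12, iters=60, seed=0,
                      w=0.6, c1=1.2, c2=1.2, c3=0.6, base_rate=0.05):
    rng = random.Random(seed)
    pos = [[rng.uniform(-5.0, 5.0) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    history = [min(pbest_val)]
    stall = 0  # iterations since the global best last improved
    for _ in range(iters):
        g = min(range(n_particles), key=pbest_val.__getitem__)
        gbest = pbest[g][:]
        # Adaptive mutation (assumed rule): mutate more often while the swarm stalls.
        rate = min(0.5, base_rate * (1 + stall))
        for i in range(n_particles):
            # Mean of the personal bests of three random other particles (assumed choice).
            others = rng.sample([j for j in range(n_particles) if j != i], 3)
            mean = [sum(pbest[j][d] for j in others) / 3.0 for d in range(dim)]
            for d in range(dim):
                vel[i][d] = (w * vel[i][d]
                             + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                             + c2 * rng.random() * (gbest[d] - pos[i][d])
                             + c3 * rng.random() * (mean[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            if rng.random() < rate:
                pos[i][rng.randrange(dim)] += rng.gauss(0.0, 1.0)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
        best_now = min(pbest_val)
        stall = 0 if best_now < history[-1] else stall + 1
        history.append(best_now)
    g = min(range(n_particles), key=pbest_val.__getitem__)
    return pbest[g], history


best, history = pso_mean_mutation(sphere, dim=3)
```

In this sketch `history` records the best objective value after each iteration, so it can only stay flat or decrease; the mutation step perturbs positions but never overwrites a stored personal best.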
Funding: This work was supported in part by the National Natural Science Foundation of China (NSFC) (Nos. 11927901 and 12175084), the National Key Research and Development Program of China (Nos. 2020YFE0202002 and 2022YFA1604900), and the Fundamental Research Funds for the Central Universities (No. CCNU22QN005).
Abstract: The zero-degree calorimeter (ZDC) plays a crucial role in determining the centrality in the Cooling-Storage-Ring External-target Experiment (CEE) at the Heavy Ion Research Facility in Lanzhou. A boosted decision tree (BDT) multi-classification algorithm was employed to classify the centrality of collision events based on raw ZDC features such as the number of fired channels and the deposited energy. Data from simulated ²³⁸U + ²³⁸U collisions at 500 MeV/u, generated by the IQMD event generator and subsequently modeled using the GEANT4 package, were used to train and test the BDT model. The results show that the multi-classification model adopted in the ZDC determines centrality with high accuracy and is robust against variations in detector geometry and response. This study demonstrates the good performance of the CEE-ZDC in determining the centrality in nucleus-nucleus collisions.
Funding: Supported by NSFC (No. 61702429) and the Sichuan Science and Technology Program (No. 19yyjc1656).
Abstract: Multi-purpose forensics is an important tool for forged image detection. In this paper, we propose a universal feature set for multi-purpose forensics that can simultaneously identify several typical image manipulations, including spatial low-pass Gaussian blurring, median filtering, re-sampling, and JPEG compression. To eliminate the influence of diverse image contents on the effectiveness and robustness of the feature, a residual group containing several high-pass filtered residuals is introduced. The partial correlation coefficient is extracted from the residual group to measure neighborhood correlations in a purely linear way. In addition, we combine the autoregressive coefficient and the transition probability to form the proposed composite feature, which measures how manipulations change neighborhood relationships in both linear and non-linear ways. After a series of dimension reductions, the proposed feature set accelerates training and testing for multi-purpose forensics. The feature set is then fed into a multi-classifier to train a multi-purpose detector. Experimental results show that the proposed detector can identify several typical image manipulations and is superior to complicated deep CNN-based methods in detection accuracy and time efficiency for low-resolution JPEG compressed images.
Funding: Supported by the National Earthquake Major Project of China (201008007) and the Fundamental Research Funds for Central University of China (216275645).
Abstract: This paper proposes a model for analyzing massive electricity data. The feature subset is determined by correlation-based feature selection and data-driven methods. The attribute season can be classified successfully by five classifiers using the selected feature subset, from which the best model is then determined. The effects of the other three attributes (months, businesses, and meters) on electricity consumption can be estimated using the chosen model. The data used for the project were provided by the Beijing Power Supply Bureau. We use WEKA as the machine learning tool. The models we built are promising for electricity scheduling and power theft detection.
Funding: Supported by the National Natural Science Foundation of China (No. 60675049), the National Creative Research Groups Science Foundation of China (No. 60721062), and the Natural Science Foundation of Zhejiang Province, China (No. Y106414).
Abstract: Currently there are two approaches to building a multi-class support vector classifier (SVC). One is to construct and combine several binary classifiers; the other is to consider all classes of data directly in one optimization formulation. For a K-class problem (K>2), the first approach has to construct at least K classifiers, and the second approach, with the algorithms developed so far, has to solve a much larger optimization problem whose size is proportional to K. In this paper, following the second approach, we present a novel multi-class large margin classifier (MLMC). This new machine can solve K-class problems in one optimization formulation without increasing the size of the quadratic programming (QP) problem in proportion to K. This property allows us to construct just one classifier, with as few variables in the QP problem as possible, to classify multi-class data, and it yields a speed advantage, especially when K is large. Our experiments indicate that MLMC works almost as well as (and sometimes better than) many other multi-class SVCs on some benchmark data classification problems, and obtains reasonable performance in a face recognition application on the AR face database.
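For contrast with the single-formulation approach, the first approach (combining binary classifiers) can be sketched as a one-vs-rest decision rule. This does not implement MLMC; the linear scorers, their weights, and the function names are hypothetical, and the classifier counts follow the standard one-vs-rest and one-vs-one schemes.

```python
def one_vs_rest_predict(x, scorers):
    """Pick the class whose binary scorer returns the largest decision value."""
    scores = [sum(w_k * x_k for w_k, x_k in zip(w, x)) + b for w, b in scorers]
    return max(range(len(scores)), key=scores.__getitem__)


def n_binary_classifiers(k, scheme):
    """Number of binary classifiers needed for a k-class problem."""
    if scheme == "one-vs-rest":
        return k
    if scheme == "one-vs-one":
        return k * (k - 1) // 2
    raise ValueError("unknown scheme: " + scheme)


# Hypothetical linear scorers (weights, bias) for a 3-class problem in 2-D.
scorers = [((1.0, 0.0), 0.0), ((0.0, 1.0), 0.0), ((-1.0, -1.0), 0.0)]
label = one_vs_rest_predict((2.0, 0.5), scorers)
```

With K = 10, one-vs-rest trains 10 binary classifiers and one-vs-one trains 45, which illustrates why a single classifier whose QP size does not grow with K is attractive when K is large.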
Funding: Supported by the Deanship of Scientific Research, Vice Presidency for Graduate Studies and Scientific Research, King Faisal University, Saudi Arabia [Grant No. 3,363].
Abstract: As ocular computer-aided diagnostic (CAD) tools become more widely accessible, many researchers are developing deep learning (DL) methods to aid in ocular disease (OHD) diagnosis. This study focuses on using DL to identify common eye diseases: cataracts (CATR), glaucoma (GLU), and age-related macular degeneration (AMD). Data imbalance and outliers are widespread in fundus images, which makes it difficult to apply many DL algorithms to this task. Efficient and reliable DL algorithms are therefore seen as the key to further improving detection performance. Based on the analysis of color retinal fundus images, this study offers a DL model combined with a novel concoction loss function (CLF) for the automated identification of OHD. The CLF combines focal loss (FL) and correntropy-induced loss functions (CILF) in the proposed DL model to improve the recognition performance of classifiers for biomedical data, because these two types of loss generalize well and are robust on complex datasets with class imbalance and outliers. The classification performance of the DL model with the proposed loss function is compared with that of baseline models using accuracy (ACU), recall (REC), specificity (SPF), Kappa, and area under the receiver operating characteristic curve (AUC) as evaluation metrics. The tests show that the method is reliable and efficient.
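A per-sample sketch of combining the two loss types may help show why they complement each other. The abstract does not give the CLF formula, so the convex weighting `weight`, the unnormalized correntropy form, and the default `gamma` and `sigma` values below are assumptions; only the focal loss expression follows its standard definition.

```python
import math


def focal_loss(p_true, gamma=2.0, alpha=1.0, eps=1e-12):
    """Focal loss for the probability assigned to the true class."""
    p = min(max(p_true, eps), 1.0)  # clamp to avoid log(0)
    return -alpha * (1.0 - p) ** gamma * math.log(p)


def correntropy_loss(p_true, sigma=1.0):
    """Correntropy-induced loss on the error (1 - p_true); bounded above by 1."""
    err = 1.0 - p_true
    return 1.0 - math.exp(-(err * err) / (2.0 * sigma * sigma))


def combined_loss(p_true, weight=0.5, gamma=2.0, sigma=1.0):
    """Assumed combination: convex mix of the focal and correntropy terms."""
    return (weight * focal_loss(p_true, gamma)
            + (1.0 - weight) * correntropy_loss(p_true, sigma))


# Loss for a confident correct prediction versus a badly wrong one.
easy, hard = combined_loss(0.9), combined_loss(0.1)
```

The focal term down-weights easy, well-classified samples, which helps with class imbalance, while the correntropy term saturates for large errors, which limits the influence of outliers.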
Abstract: This article takes the companies that publicly issued corporate bonds on the Shanghai and Shenzhen Stock Exchanges from 2006 to 2018 as the research objects and selects 17 financial variables that comprehensively reflect six aspects of a company: profitability, operating ability, bond repayment ability, development ability, cash flow, and market value. Principal component analysis and factor analysis are used to extract the principal factors of these financial indicator variables. An ordered multi-classification logistic regression model is then constructed to test the impact of the financial status of companies listed on the Shanghai and Shenzhen Stock Exchanges on their corporate bond credit ratings. The results show that the financial status of these companies has an important impact on the credit rating of their corporate bonds. Financial status has a greater impact on corporate bonds with credit ratings of A- and AA-, and a smaller impact on corporate bonds with credit ratings above AA. The results of this article can help individual and institutional investors guard against investment risk.
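How an ordered logistic model turns a single score into rating probabilities can be sketched with the common proportional-odds form, P(Y <= j | x) = sigmoid(theta_j - score). The abstract does not state the parametrization used, so this form, the threshold values, and the idea that `score` is a weighted sum of extracted factors are assumptions for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def ordered_logit_probs(score, thresholds):
    """Category probabilities under a proportional-odds (cumulative logit) model."""
    if any(b <= a for a, b in zip(thresholds, thresholds[1:])):
        raise ValueError("thresholds must be strictly increasing")
    # Cumulative probabilities P(Y <= j), with the last category closing at 1.
    cumulative = [sigmoid(t - score) for t in thresholds] + [1.0]
    probs, previous = [], 0.0
    for c in cumulative:
        probs.append(c - previous)
        previous = c
    return probs


# Hypothetical thresholds separating three ordered rating bands (low, middle, high).
probs = ordered_logit_probs(0.0, [-1.0, 1.0])
stronger = ordered_logit_probs(2.0, [-1.0, 1.0])
```

A larger score shifts probability mass toward the higher rating bands while the thresholds stay fixed, which is what lets one set of coefficients describe the effect of financial status across all rating levels.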
Funding: Supported by the Science and Technology Innovation Program of Hunan Province (No. 2022GK5002, 2024JK2015, 2024JJ5440), the Special Foundation for Distinguished Young Scientists of Changsha (No. kq2209003), the Foreign Expert Project of China (No. G2023041039L), the 111 Project (No. D23006), and in part by the High Performance Computing Center of Central South University.
Abstract: In recent years, defending against adversarial examples has gained significant importance, leading to a growing body of research in this area. Among these studies, pre-processing defense approaches have emerged as a prominent research direction. However, existing adversarial example pre-processing techniques often employ a single pre-processing model to counter different types of adversarial attacks. Such a strategy may miss the nuances between different types of attacks, limiting the comprehensiveness and effectiveness of the defense. To address this issue, we propose a divide-and-conquer reconstruction pre-processing algorithm, based on multi-classification and multi-network training, to defend more effectively against different types of mainstream adversarial attacks. The premise and challenge of the divide-and-conquer reconstruction defense is to distinguish between multiple types of adversarial attacks. Our method designs an adversarial attack classification module that exploits the differences in high-frequency information between types of adversarial examples to classify them, which existing adversarial example detection methods can hardly achieve. In addition, we construct a divide-and-conquer reconstruction module that uses a separately trained image reconstruction model for each type of adversarial attack, ensuring optimal defense effectiveness. Extensive experiments show that the proposed divide-and-conquer defense algorithm outperforms state-of-the-art pre-processing methods.
Funding: Project supported by the National Key R&D Project of China (No. 2018YFB1700700) and the National Natural Science Foundation of China (No. 51675478).
Abstract: We present an exploratory study to improve the performance of a knowledge push system in product design. We focus on the domain of knowledge matching, where traditional matching algorithms need repeated calculations that result in a long response time and where accuracy needs to be improved. The goal of our approach is to meet designers' knowledge demands with a quick response and quality service in the knowledge push system. To improve on previous work, two methods are investigated to augment the limited training set available in practical operations, namely, oscillating the feature weight and revising the case feature in the case feature vectors. In addition, we propose a multi-classification radial basis function neural network that can match the knowledge from the knowledge base in a single pass and ensure the accuracy of the pushed results. We apply our approach, for both training and testing, to a training set drawn from the design of guides for computer numerical control machine tools, and the results demonstrate the benefit of the augmented training set. Moreover, experimental results reveal that our approach outperforms other matching approaches.