Funding: Supported by the National Natural Science Foundation of China (50675167) and a Foundation for the Author of National Excellent Doctoral Dissertation of China (200535).
Abstract: The support vector classifier (SVC) is well suited to small-sample learning problems with high dimensions and generalizes particularly well. However, the high-dimensional features of the original samples are partly redundant, so extracting the main features first may improve the performance of the SVC. Principal component analysis (PCA) is employed to reduce the feature dimensions of the original samples and to pre-select the main features efficiently, and an SVC is then constructed in the selected feature space to improve its learning speed and identification rate. Furthermore, a heuristic genetic-algorithm-based automatic model selection is proposed to determine the hyperparameters of the SVC and to evaluate the performance of the learning machines. Experiments on the Heart and Adult benchmark data sets demonstrate that the proposed PCA-based SVC not only reduces the test time drastically but also improves the identification rate effectively.
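The dimensionality-reduction step can be illustrated with a minimal sketch. It assumes plain Python, a power-iteration estimate of the first principal component only, and a made-up four-point data set; all function names are hypothetical and nothing here is taken from the paper's implementation. The SVC itself and the genetic-algorithm model selection are not shown.

```python
import math

def column_means(rows):
    """Mean of each feature (column)."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]

def covariance(rows, means):
    """Sample covariance matrix of the centered data."""
    n, d = len(rows), len(means)
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def first_component(cov, iters=100):
    """Power iteration: converges to the leading eigenvector of cov,
    provided the start vector is not orthogonal to it."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def project(rows, means, v):
    """Coordinate of each centered sample along direction v."""
    return [sum((r[j] - means[j]) * v[j] for j in range(len(v))) for r in rows]

# Made-up data lying exactly on the line y = 2x, so one component suffices.
data = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
means = column_means(data)
pc1 = first_component(covariance(data, means))
reduced = project(data, means, pc1)  # 1-D features a classifier could be trained on
```

In this toy case the first component points along (1, 2), and a classifier would be trained on `reduced` instead of the original two columns.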
Abstract: The multi-voxel pattern analysis technique is applied to fMRI data to classify high-level brain functions using pattern information distributed over multiple voxels. In this paper, we propose a classifier ensemble for multiclass classification in fMRI analysis, exploiting the fact that specific neighboring voxels can contain spatial pattern information. The proposed method converts the multiclass classification into a pairwise classifier ensemble, and each pairwise classifier consists of multiple sub-classifiers using an adaptive feature set for each class pair. Simulated and real fMRI data were used to verify the proposed method. Intra- and inter-subject analyses were performed to compare the proposed method with several well-known classifiers, including single and ensemble classifiers. The comparison results showed that the proposed method can be generally applied to multiclass classification in both simulations and real fMRI analyses.
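The idea of a separate feature set for each class pair can be sketched as follows. This is only an illustration under stated assumptions: features are ranked by the absolute difference of the two class means, which is a stand-in criterion and not the paper's voxel-neighborhood scheme, and the function name and toy data are hypothetical.

```python
def select_pair_features(samples_a, samples_b, k):
    """Return indices of the k features whose class means differ most
    between class A and class B (stand-in selection criterion)."""
    d = len(samples_a[0])

    def mean(samples, j):
        return sum(s[j] for s in samples) / len(samples)

    gaps = [abs(mean(samples_a, j) - mean(samples_b, j)) for j in range(d)]
    return sorted(range(d), key=lambda j: gaps[j], reverse=True)[:k]

# Made-up samples: only feature 1 separates the two classes.
class_a = [[1.0, 0.0, 5.0], [1.0, 0.0, 6.0]]
class_b = [[1.0, 3.0, 5.0], [1.0, 4.0, 6.0]]
chosen = select_pair_features(class_a, class_b, k=1)
```

Each pairwise classifier would then see only the columns in `chosen` for its own class pair.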
Funding: This research work is supported by Princess Nourah bint Abdulrahman University Researchers Supporting Project Number (PNURSP2024R411), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.
Abstract: Malware attacks on Windows machines pose significant cybersecurity threats, necessitating effective detection and prevention mechanisms. Supervised machine learning classifiers have emerged as promising tools for malware detection. Although numerous studies have explored malware detection using machine learning techniques, a systematic comparison of supervised classifiers specifically for Windows malware detection is still lacking. Understanding the relative effectiveness of these classifiers can inform the selection of detection methods and improve overall security measures. This study aims to bridge that gap by conducting a comparative analysis of supervised machine learning classifiers for detecting malware on Windows systems. The objectives are to investigate the performance of classifiers such as Gaussian Naïve Bayes, K Nearest Neighbors (KNN), Stochastic Gradient Descent Classifier (SGDC), and Decision Tree in detecting Windows malware; to evaluate the accuracy, efficiency, and suitability of each classifier for real-world malware detection scenarios; to identify the strengths and limitations of the classifiers for cybersecurity practitioners and researchers; and to offer recommendations, based on empirical evidence, for selecting the most effective classifier for Windows malware detection. The study employs a structured methodology consisting of several phases: exploratory data analysis, data preprocessing, model training, and evaluation. Exploratory data analysis involves understanding the dataset's characteristics and identifying preprocessing requirements. Data preprocessing includes cleaning, feature encoding, dimensionality reduction, and optimization to prepare the data for training. Model training uses various supervised classifiers, and their performance is evaluated using metrics such as accuracy, precision, recall, and F1 score. The study's outcomes comprise a comparative analysis of supervised machine learning classifiers for Windows malware detection. Results reveal the effectiveness and efficiency of each classifier in detecting different types of malware. Additionally, insights into their strengths and limitations provide practical guidance for enhancing cybersecurity defenses. Overall, this research contributes to advancing malware detection techniques and bolstering the security posture of Windows systems against evolving cyber threats.
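The four evaluation metrics named in the abstract can be computed from a classifier's predictions as in the minimal sketch below. It assumes a binary malware/benign labeling with 1 as the positive (malware) class; the function name and the label vectors are made up for illustration and are not results from the study.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

# Hypothetical labels: 1 = malware, 0 = benign.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
scores = binary_metrics(y_true, y_pred)
```

Running the same function on each classifier's predictions over a shared test set gives directly comparable rows for a comparison table.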
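Abstract: Open-set classification requires a classifier not only to identify test samples from known classes but also to reject test samples from unknown classes effectively; research on and applications of it in spectral analysis remain relatively scarce. This work modifies the classical closed-set fuzzy rule-based multiclass classifier proposed by Ishibuchi and applies it to open-set classification. First, principal component analysis is used to reduce the spectral dimension of the original spectral curve vectors, yielding spectral feature vectors of 4 to 6 dimensions. Second, Ishibuchi's fuzzy rule-based multiclass classifier is simplified to a binary version, and 1-vs-1 binary classifiers are used to classify the test sample and determine its votes for the corresponding classes. Finally, the votes of all binary classifiers are tallied: if a known class receives the most votes and that count exceeds a predetermined threshold τ, the test sample is assigned to that class; otherwise it is rejected as belonging to an unknown class, which achieves multiclass open-set classification. In the experimental validation, grouped comparison experiments were carried out on wood and mango spectral data sets. The results show that the method outperforms other mainstream open-set classification methods, including an improved open-set fuzzy rule-based multiclass classifier based on generalized basic probability assignment (GBPA), and achieves the best F-score, Kappa coefficient, and overall recognition rate. In addition, a two-tailed McNemar's test on the mango comparison experiments further indicates that the method's advantage over the other open-set classification methods is statistically significant.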
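The vote-and-threshold decision rule described above can be sketched as follows. This is a hedged illustration, not the authors' classifier: the fuzzy rule-based binary classifiers are replaced by a hypothetical nearest-centroid stand-in on 1-D features, the stand-in is assumed to abstain when the sample is far from both classes, and the class names, centroids, radius, and threshold value are all made up.

```python
from itertools import combinations

def make_pairwise(centroids, radius):
    """Hypothetical stand-in for a 1-vs-1 binary classifier: votes for the
    nearer centroid, or abstains (None) if even that one is beyond radius."""
    def pairwise(a, b, x):
        da, db = abs(x - centroids[a]), abs(x - centroids[b])
        winner, dist = (a, da) if da <= db else (b, db)
        return winner if dist <= radius else None
    return pairwise

def open_set_vote(x, classes, pairwise, tau):
    """Tally 1-vs-1 votes; accept the top class only if its vote count
    exceeds tau, otherwise reject the sample as unknown (None)."""
    votes = {c: 0 for c in classes}
    for a, b in combinations(classes, 2):
        winner = pairwise(a, b, x)
        if winner is not None:
            votes[winner] += 1
    best = max(votes, key=votes.get)
    return best if votes[best] > tau else None

# Made-up known classes on a 1-D feature axis.
centroids = {"class_A": 0.0, "class_B": 5.0, "class_C": 10.0}
clf = make_pairwise(centroids, radius=2.0)
known = open_set_vote(0.4, list(centroids), clf, tau=1)     # near class_A
unknown = open_set_vote(20.0, list(centroids), clf, tau=1)  # far from all classes
```

With three known classes each class takes part in two pairwise contests, so a threshold of 1 accepts a sample only when both contests involving the top class vote for it.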
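Abstract: To extract useful information effectively from disordered time series, this paper proposes a new fault identification method based on hierarchical differential symbol entropy (HDSE) and a hierarchical prototype (HP) classifier. First, to address the shortcomings of differential symbol entropy (DSE), the paper proposes HDSE and uses it to extract features comprehensively from the raw vibration signals. Second, to keep redundant information in the samples from lowering fault identification accuracy, and to extract feature information at a deeper level, linear discriminant analysis (LDA) is used to reduce the dimensionality of the feature vectors. Finally, the feature vectors are used to train the HP classifier and test its performance. To verify the effectiveness of the proposed method, experiments were carried out on bearing test-rig data; they show a fault identification accuracy of 95.423%.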
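The "hierarchical" part of such a feature extractor can be sketched as below. The sketch assumes the averaging/differencing decomposition commonly used in hierarchical entropy methods, in which each node is split into a low-frequency and a high-frequency child; the abstract does not state that HDSE uses exactly this scheme, and the entropy computed on each node is left out because DSE is not defined here. Function names and the signal are hypothetical.

```python
def split(signal):
    """Split a signal into a low-frequency child (pairwise averages) and
    a high-frequency child (pairwise half-differences)."""
    pairs = list(zip(signal[0::2], signal[1::2]))
    low = [(a + b) / 2 for a, b in pairs]
    high = [(a - b) / 2 for a, b in pairs]
    return low, high

def hierarchy(signal, levels):
    """Return the 2**levels node signals at the given depth, ordered
    from the all-low branch to the all-high branch."""
    nodes = [list(signal)]
    for _ in range(levels):
        nodes = [child for node in nodes for child in split(node)]
    return nodes

# Made-up 8-sample signal decomposed over two levels.
nodes = hierarchy([1, 3, 5, 7, 2, 4, 6, 8], levels=2)
```

An entropy value computed on each of the four node signals would give a four-element feature vector for this depth, which a reduction step such as LDA could then compress.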
Funding: This project was supported by the Shanghai Shu Guang Project.
Abstract: Support vector machine (SVM), as a novel approach in pattern recognition, has demonstrated success in face detection and face recognition. In this paper, a face recognition approach based on the SVM classifier with the nearest neighbor classifier (NNC) is proposed. Principal component analysis (PCA) is used to reduce the dimension and extract features. Then a one-against-all strategy is used to train the SVM classifiers. At the testing stage, we propose an al-
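The one-against-all strategy mentioned above can be illustrated with a minimal sketch. It assumes each class already has a trained linear decision function, represented here by hypothetical weights and a bias, and it picks the class with the largest decision value. The testing-stage procedure that combines SVM with NNC is cut off in the abstract and is not reproduced.

```python
def one_vs_all_predict(x, models):
    """Score x with every class-vs-rest linear model and return the label
    with the largest decision value, together with all scores."""
    scores = {
        label: sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
        for label, (w, b) in models.items()
    }
    return max(scores, key=scores.get), scores

# Hypothetical (weights, bias) pairs, one per identity; not trained models.
models = {
    "person_A": ([1.0, 0.0], -0.5),
    "person_B": ([0.0, 1.0], -0.5),
    "person_C": ([-1.0, -1.0], 0.5),
}
label, scores = one_vs_all_predict([0.9, 0.1], models)
```

Here `x` stands for a face image after PCA projection, so the per-class models operate on the reduced feature vector rather than on raw pixels.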