Multi-voxel pattern analysis is applied to fMRI data to classify high-level brain functions using pattern information distributed over multiple voxels. In this paper, we propose a classifier ensemble for multiclass classification in fMRI analysis, exploiting the fact that specific neighboring voxels can contain spatial pattern information. The proposed method converts the multiclass classification to a pairwise classifier ensemble, and each pairwise classifier consists of multiple sub-classifiers using an adaptive feature set for each class-pair. Simulated and real fMRI data were used to verify the proposed method. Intra- and inter-subject analyses were performed to compare the proposed method with several well-known classifiers, including single and ensemble classifiers. The comparison results showed that the proposed method can be generally applied to multiclass classification in both simulations and real fMRI analyses.
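The pairwise decomposition described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the nearest-centroid sub-classifier, the rule that keeps the features whose class means differ most (a hypothetical stand-in for the adaptive feature set), and all function names are assumptions made for illustration.

```python
from itertools import combinations
from collections import Counter


def centroid(rows):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def train_pairwise(X, y, n_features=2):
    """Train one nearest-centroid classifier per class pair.

    For each pair, keep only the n_features features whose class means
    differ most (hypothetical stand-in for an adaptive feature set).
    """
    by_class = {}
    for row, label in zip(X, y):
        by_class.setdefault(label, []).append(row)
    models = {}
    for a, b in combinations(sorted(by_class), 2):
        ca, cb = centroid(by_class[a]), centroid(by_class[b])
        ranked = sorted(range(len(ca)), key=lambda j: -abs(ca[j] - cb[j]))
        keep = ranked[:n_features]
        models[(a, b)] = (keep, [ca[j] for j in keep], [cb[j] for j in keep])
    return models


def predict(models, x):
    """Let every pairwise classifier vote; return the most-voted class."""
    votes = Counter()
    for (a, b), (keep, ca, cb) in models.items():
        xs = [x[j] for j in keep]
        da = sum((u - v) ** 2 for u, v in zip(xs, ca))
        db = sum((u - v) ** 2 for u, v in zip(xs, cb))
        votes[a if da <= db else b] += 1
    return votes.most_common(1)[0][0]


# Toy data (not fMRI): three classes, each active on a different feature.
X = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0],
     [0.0, 1.0, 0.0], [0.1, 0.9, 0.0],
     [0.0, 0.0, 1.0], [0.0, 0.1, 0.9]]
y = ["A", "A", "B", "B", "C", "C"]
models = train_pairwise(X, y)
print(predict(models, [0.8, 0.2, 0.1]))
```

With K classes this trains K(K-1)/2 pairwise models, so each model only has to separate two classes and can use features chosen for that pair alone.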
Support vector machine (SVM), as a novel approach in pattern recognition, has demonstrated success in face detection and face recognition. In this paper, a face recognition approach based on the SVM classifier with the nearest neighbor classifier (NNC) is proposed. Principal component analysis (PCA) is used to reduce the dimension and extract features. Then a one-against-all strategy is used to train the SVM classifiers. At the testing stage, we propose an al-
Biometric recognition refers to the identification of individuals through their unique behavioral features (e.g., fingerprint, face, and iris). We need distinguishing characteristics to identify people, such as fingerprints, which are world-renowned as the most reliable method to identify people. The recognition of fingerprints has become a standard procedure in forensics, and different techniques are available for this purpose. Most current techniques neglect image enhancement and rely on high-dimensional features to generate classification models. Therefore, we propose an effective fingerprint classification method for classifying a fingerprint image as authentic or altered, since criminals and hackers routinely alter their fingerprints to generate fake ones. To improve fingerprint classification accuracy, the proposed method uses the most effective texture features and classifiers. Discriminant Analysis (DCA) and Gaussian Discriminant Analysis (GDA) are employed as classifiers, with Histogram of Oriented Gradient (HOG) and Segmentation-based Feature Texture Analysis (SFTA) feature vectors as inputs. The performance of the classifiers is determined by assessing a range of feature sets, and the most accurate results are obtained. The proposed method is tested on the Sokoto Coventry Fingerprint Dataset (SOCOFing). SOCOFing includes 6,000 fingerprint images collected from 600 African people whose fingerprints were taken ten times. Three distinct degrees of obliteration, central rotation, and z-cut were applied to obtain synthetically altered replicas of the genuine fingerprints. The proposed method achieved a classification accuracy reaching 99%. The experimental results indicate that the proposed method for fingerprint classification is feasible and effective. The experiments also showed that the proposed SFTA-based GDA method outperformed state-of-the-art approaches in feature dimension and classification accuracy.
Objective: The annual influenza epidemic is a heavy burden on the health care system, and has increasingly become a major public health problem in some areas, such as Hong Kong (China). Therefore, based on a variety of machine learning methods, and considering the seasonal influenza in Hong Kong, the study aims to establish a Combinatorial Judgment Classifier (CJC) model to classify the epidemic trend and improve the accuracy of influenza epidemic early warning.
Sentiment analysis is the computational study of how opinions, attitudes, emotions, and perspectives are expressed in language, and has been an important task in natural language processing. Sentiment analysis is highly valuable for both research and practical applications. This work focuses on the difficulty of constructing sentiment classifiers, which normally need large amounts of labeled domain training data, and proposes a novel unsupervised framework that uses Chinese idiom resources to develop a general sentiment classifier. Furthermore, the domain adaptation of the general sentiment classifier was improved by taking the general classifier as the base of a self-training procedure to obtain a domain self-training sentiment classifier. To validate the effect of the unsupervised framework, several experiments were carried out on a publicly available dataset of Chinese online reviews. The experiments show that the proposed framework is effective and achieves encouraging results. Specifically, the general classifier outperforms two baselines (a naive 50% baseline and a cross-domain classifier), and the bootstrapping self-training classifier approximates the upper bound domain-specific classifier with the lowest accuracy of 81.5%, but the performance is more stable and the framework needs no labeled training dataset.
Support vector classifier (SVC) has advantages for small-sample learning problems with high dimensions, in particular better generalization ability. However, there is some redundancy among the high dimensions of the original samples, and the main features of the samples may be extracted first to improve the performance of SVC. A principal component analysis (PCA) is employed to reduce the feature dimensions of the original samples and pre-select the main features efficiently, and an SVC is constructed in the selected feature space to improve the learning speed and identification rate of SVC. Furthermore, a heuristic genetic algorithm-based automatic model selection is proposed to determine the hyperparameters of SVC and to evaluate the performance of the learning machines. Experiments performed on the Heart and Adult benchmark data sets demonstrate that the proposed PCA-based SVC not only reduces the test time drastically, but also improves the identification rate effectively.
A face recognition scheme is proposed, wherein a face image is preprocessed by pixel averaging and energy normalizing to reduce data dimension and the effect of brightness variation, followed by the Fourier transform to estimate the spectrum of the preprocessed image. Principal component analysis is conducted on the spectra of a face image to obtain eigen features. Combining eigen features with a Parzen classifier, experiments are conducted on the ORL face database.
Objective: To detect an unknown network worm at its early propagation stage. Methods: On the basis of the characteristics of network worm attacks, the concept of failed connection flow (FCT) was defined. Based on wavelet packet analysis of the FCT time series, the method computed the energy associated with each wavelet packet of the FCT time series and transformed the FCT time series into a series of energy distribution vectors in the frequency domain; a trained K-nearest neighbor (KNN) classifier was then applied to identify the worm. Results: The experiment showed that the method could identify a network worm when the worm started to scan. Compared to the theoretical value, the identification error ratio was 5.69%. Conclusion: The method can effectively detect an unknown network worm at its early propagation stage.
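The pipeline in this abstract (time series, wavelet packet energies, KNN) can be sketched as follows. This is only an illustration under stated assumptions: the Haar wavelet, the two decomposition levels, the function names, and the synthetic series labelled "normal" and "worm" are all hypothetical choices that do not come from the paper and do not model real traffic.

```python
import math
from collections import Counter


def haar_packet_energies(signal, levels):
    """Return the normalized energy of each Haar wavelet-packet leaf node.

    len(signal) must be divisible by 2**levels. Leaves are listed in
    filtering order, which is not the same as frequency order.
    """
    nodes = [list(signal)]
    for _ in range(levels):
        nxt = []
        for node in nodes:
            pairs = list(zip(node[0::2], node[1::2]))
            nxt.append([(a + b) / math.sqrt(2) for a, b in pairs])  # low-pass
            nxt.append([(a - b) / math.sqrt(2) for a, b in pairs])  # high-pass
        nodes = nxt
    energies = [sum(v * v for v in node) for node in nodes]
    total = sum(energies) or 1.0  # avoid dividing by zero for an all-zero signal
    return [e / total for e in energies]


def knn_predict(train_vecs, train_labels, query, k=3):
    """Majority label among the k training vectors closest to query."""
    order = sorted(range(len(train_vecs)),
                   key=lambda i: math.dist(train_vecs[i], query))
    return Counter(train_labels[i] for i in order[:k]).most_common(1)[0][0]


# Synthetic series only: slowly varying ones versus rapidly alternating ones.
series = [
    ([1, 1, 1, 1, 1, 1, 1, 1], "normal"),
    ([2, 2, 2, 2, 2, 2, 2, 2], "normal"),
    ([1, 1, 2, 2, 1, 1, 2, 2], "normal"),
    ([0, 4, 0, 4, 0, 4, 0, 4], "worm"),
    ([1, 5, 1, 5, 1, 5, 1, 5], "worm"),
    ([0, 3, 0, 3, 0, 5, 0, 5], "worm"),
]
train_vecs = [haar_packet_energies(s, levels=2) for s, _ in series]
train_labels = [label for _, label in series]

query = haar_packet_energies([0, 2, 0, 2, 0, 2, 0, 2], levels=2)
print(knn_predict(train_vecs, train_labels, query))
```

Because the Haar packet transform is orthonormal, the leaf energies sum to the energy of the original series, so normalizing them yields a distribution that does not depend on overall signal scale.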
As the popularity of digital images is rapidly increasing on the Internet, research on technologies for semantic image classification has become an important research topic. However, the well-known content-based image classification methods do not overcome the so-called semantic gap problem in which low-level visual features cannot represent the high-level semantic content of images. Image classification using visual and textual information often performs poorly since the extracted textual features are often too limited to accurately represent the images. In this paper, we propose a semantic image classification approach using multi-context analysis. For a given image, we model the relevant textual information as its multi-modal context, and regard the related images connected by hyperlinks as its link context. Two kinds of context analysis models, i.e., cross-modal correlation analysis and link-based correlation model, are used to capture the correlation among different modals of features and the topical dependency among images induced by the link structure. We propose a new collective classification model called relational support vector classifier (RSVC) based on the well-known Support Vector Machines (SVMs) and the link-based correlation model. Experiments showed that the proposed approach significantly improved classification accuracy over that of SVM classifiers using visual and/or textual features.
Today, mammography is the best method for early detection of breast cancer. Radiologists failed to detect evident cancerous signs in approximately 20% of false negative mammograms. False negatives have been identified as the inability of the radiologist to detect the abnormalities due to several reasons such as poor image quality, image noise, or eye fatigue. This paper presents a framework for a computer aided detection system that integrates Principal Component Analysis (PCA), Fisher Linear Discriminant (FLD), and Nearest Neighbor Classifier (KNN) algorithms for the detection of abnormalities in mammograms. Using normal and abnormal mammograms from the MIAS database, the integrated algorithm achieved 93.06% classification accuracy. Also in this paper, we present an analysis of the integrated algorithm’s parameters and suggest selection criteria.
Smartphone devices, particularly Android devices, are used by billions of people around the world. This growth attracts mobile botnet attacks; a botnet is a network of interconnected nodes operated through a command and control (C&C) method to spread malicious activities. At present, mobile botnets are used to launch distributed denial-of-service (DDoS) attacks, steal sensitive data, gain remote access, and generate spam. Consequently, various approaches have been defined in the literature to detect mobile botnet attacks using static or dynamic analysis. In this paper, a novel hybrid model is proposed that combines static and dynamic methods and relies on machine learning to detect Android botnet applications. Furthermore, results are evaluated using machine learning classifiers. The Random Forest (RF) classifier outperforms the other ML techniques, i.e., Naïve Bayes (NB), Support Vector Machine (SVM), and Simple Logistic (SL). Our proposed framework achieved 97.48% accuracy in the detection of botnet applications. Finally, some future research directions regarding botnet attack detection are highlighted for the community.
Web-blogging sites such as Twitter and Facebook are heavily influenced by emotions, sentiments, and data in the modern era. Twitter, a widely used microblogging site where individuals share their thoughts in the form of tweets, has become a major source for sentiment analysis. In recent years, there has been a significant increase in demand for sentiment analysis to identify and classify opinions or expressions in text or tweets. Opinions or expressions of people about a particular topic, situation, person, or product can be identified from sentences and divided into three categories: positive for good, negative for bad, and neutral for mixed or confusing opinions. The process of analyzing changes in sentiment and the combination of these categories is known as “sentiment analysis.” In this study, sentiment analysis was performed on a dataset of 90,000 tweets using both deep learning and machine learning methods. The deep learning-based long short-term memory (LSTM) model performed better than the machine learning approaches. LSTM achieved 87% accuracy, and the support vector machine (SVM) classifier achieved slightly worse results at 86%. The study also tested the binary classes of positive and negative, where LSTM and SVM both achieved 90% accuracy.
Deep learning is a powerful technique that is widely applied to image recognition and natural language processing tasks, among many others. In this work, we propose an efficient technique to utilize pre-trained Convolutional Neural Network (CNN) architectures to extract powerful features from images for object recognition. We build on the existing concept of extending the learning from pre-trained CNNs to new databases through activations by proposing to consider multiple deep layers. We exploit the progressive learning that happens at the various intermediate layers of the CNNs to construct Deep Multi-Layer (DM-L) based feature extraction vectors that achieve excellent object recognition performance. Two popular pre-trained CNN architectures, VGG_16 and VGG_19, are used in this work to extract feature sets from three deep fully connected layers inside the models, namely “fc6”, “fc7”, and “fc8”. Using Principal Component Analysis (PCA), the dimensionality of the DM-L feature vectors is reduced to form powerful feature vectors that are fed to an external classifier ensemble for classification, instead of the Softmax-based classification layers of the two original pre-trained CNN models. The proposed DM-L technique has been applied to the benchmark Caltech-101 object recognition database. Conventional wisdom may suggest that feature extraction based on the deepest layer, i.e., “fc8” rather than “fc6”, will result in the best recognition performance, but our results show otherwise for the two considered models. Our experiments revealed that, for the two models under consideration, the “fc6” based feature vectors achieved the best recognition performance. State-of-the-art recognition performances of 91.17% and 91.35% were achieved using the “fc6” based feature vectors for the VGG_16 and VGG_19 models, respectively. This recognition performance was achieved with 30 sample images per class, whereas the proposed system is capable of achieving improved performance by considering all sample images per class. Our research shows that for feature extraction based on CNNs, multiple layers should be considered, and then the layer that maximizes recognition performance can be selected.
Malware attacks on Windows machines pose significant cybersecurity threats, necessitating effective detection and prevention mechanisms. Supervised machine learning classifiers have emerged as promising tools for malware detection. However, there remains a need for comprehensive studies that compare the performance of different classifiers specifically for Windows malware detection. Addressing this gap can provide valuable insights for enhancing cybersecurity strategies. While numerous studies have explored malware detection using machine learning techniques, there is a lack of systematic comparison of supervised classifiers for Windows malware detection. Understanding the relative effectiveness of these classifiers can inform the selection of optimal detection methods and improve overall security measures. This study aims to bridge the research gap by conducting a comparative analysis of supervised machine learning classifiers for detecting malware on Windows systems. The objectives are to investigate the performance of various classifiers, such as Gaussian Naïve Bayes, K Nearest Neighbors (KNN), Stochastic Gradient Descent Classifier (SGDC), and Decision Tree, in detecting Windows malware; to evaluate the accuracy, efficiency, and suitability of each classifier for real-world malware detection scenarios; to identify the strengths and limitations of different classifiers to provide insights for cybersecurity practitioners and researchers; and to offer recommendations for selecting the most effective classifier for Windows malware detection based on empirical evidence. The study employs a structured methodology consisting of several phases: exploratory data analysis, data preprocessing, model training, and evaluation. Exploratory data analysis involves understanding the dataset’s characteristics and identifying preprocessing requirements. Data preprocessing includes cleaning, feature encoding, dimensionality reduction, and optimization to prepare the data for training. Model training utilizes various supervised classifiers, and their performance is evaluated using metrics such as accuracy, precision, recall, and F1 score. The study’s outcomes comprise a comparative analysis of supervised machine learning classifiers for Windows malware detection. Results reveal the effectiveness and efficiency of each classifier in detecting different types of malware. Additionally, insights into their strengths and limitations provide practical guidance for enhancing cybersecurity defenses. Overall, this research contributes to advancing malware detection techniques and bolstering the security posture of Windows systems against evolving cyber threats.
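The evaluation metrics named in this abstract (accuracy, precision, recall, and F1 score) follow standard definitions for a binary task. A minimal sketch is given below; the function name, the label encoding (1 for malware, 0 for benign), and the toy labels are assumptions for illustration and are not taken from the study.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    accuracy = (tp + tn) / len(pairs)
    # Guard against empty denominators when a class is never predicted or absent.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels: 1 = malware, 0 = benign.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Reporting precision and recall alongside accuracy matters for malware data, where benign samples often dominate and accuracy alone can hide missed detections.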
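Open-set classification requires a classifier not only to identify test samples from known classes but also to effectively reject test samples from unknown classes; related research and applications in spectral analysis are relatively few. The classical closed-set fuzzy rule-based multiclass classifier proposed by Ishibuchi is improved and applied to open-set classification. First, principal component analysis is used to reduce the spectral dimension of the raw spectral curve vectors to spectral feature vectors of 4 to 6 dimensions. Second, the fuzzy rule-based multiclass classifier proposed by Ishibuchi is simplified to a binary classifier version; 1-vs-1 binary classifiers are used for classification, and the votes that the test sample receives for the corresponding classes are determined. Finally, the votes of all binary classifiers are tallied: if a known class receives the most votes and that highest vote count is greater than a predetermined threshold τ, the test sample is assigned to that known class; otherwise it is rejected as an unknown class, which achieves multiclass open-set classification. In the experimental validation, grouped comparative experiments were conducted on wood and mango spectral datasets. The results show that the method outperforms other mainstream open-set classification methods, including an improved open-set fuzzy rule-based multiclass classifier based on generalized basic probability assignment (GBPA), and achieves the best F-score, Kappa coefficient, and overall recognition rate. In addition, a two-tailed McNemar's test was performed on the comparative experiments on the mango spectral dataset, further showing that the method has a statistically significant advantage over the other open-set classification methods.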
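The final voting step with the rejection threshold τ can be sketched as below. This covers only the vote tally: the fuzzy rule-based binary classifiers themselves are not reproduced, the list of pairwise winners is supplied by hand, and the function name, the "unknown" label, and the tie handling are assumptions for illustration.

```python
from collections import Counter


def open_set_vote(pairwise_winners, tau, unknown="unknown"):
    """Tally 1-vs-1 votes and reject the sample unless the top count exceeds tau.

    pairwise_winners holds the class chosen by each binary classifier.
    Ties for the top count are resolved by first-seen order, which is an
    arbitrary choice made for this sketch.
    """
    votes = Counter(pairwise_winners)
    best_class, best_count = votes.most_common(1)[0]
    return best_class if best_count > tau else unknown


# Three known classes give three 1-vs-1 classifiers; a class can win at most 2.
print(open_set_vote(["A", "A", "C"], tau=1))  # class A wins both of its contests
print(open_set_vote(["A", "C", "B"], tau=1))  # votes are split evenly
```

With K known classes a class can collect at most K-1 votes, so τ sets how close to a clean sweep a class must come before the sample is accepted as known.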
Machinery condition monitoring is beneficial to equipment maintenance and has been receiving much attention from academia and industry. Machine learning, especially deep learning, has become popular for machinery condition monitoring because it can make full use of available data and computational power. Since wrong fault alarms in machine condition monitoring might cause significant accidents, interpretable machine learning models, which integrate signal processing knowledge to enhance the trustworthiness of models, are gradually becoming a research hotspot. A spectrum-based, interpretable optimized weights method was previously proposed to indicate faulty and fundamental frequencies when the analyzed data contains only a healthy type and a fault type. Considering that multiclass fault types are naturally met in practice, this work aims to explore the interpretable optimized weights method for multiclass fault type scenarios. Therefore, a new multiclass optimized weights spectrum (OWS) is proposed and further studied theoretically and numerically. It is found that the multiclass OWS is capable of capturing the characteristic components associated with different conditions and clearly indicating the specific fault characteristic frequencies (FCFs) corresponding to each fault condition. This work can provide new insights into spectrum-based fault classification models, and the new multiclass OWS also shows great potential for practical applications.
Predictive maintenance is essential for the implementation of an innovative and efficient structural health monitoring (SHM) strategy. Models capable of accurately interpreting new data automatically collected by suitably placed sensors to assess the state of the infrastructure represent a fundamental step, particularly for the railway sector, whose safe and continuous operation plays a strategic role in the well-being and development of nations. In this scenario, the benefits of combining a digital twin of a bonded insulated rail joint (IRJ) with the predictive capabilities of advanced classification algorithms based on artificial intelligence have been explored. The digital model provides an accurate mechanical response of the infrastructure as a pair of wheels passes over the joint. As bolt preload conditions vary, four structural health classes were identified for the joint. Two parameters, i.e., gap value and vertical displacement, which are strongly correlated with bolt preload, are used in different combinations to train and test five predictive classifiers. Their classification effectiveness was assessed using several performance indicators. Finally, we compared the IRJ condition predictions of two trained classifiers with the available data, confirming their high accuracy. The approach presented provides an interesting solution for future predictive tools in SHM, especially for complex systems such as railways, where the vehicle-infrastructure interaction is complex and always time varying.
Funding (listed with the abstract on face recognition using SVM and the nearest neighbor classifier): This project was supported by Shanghai Shu Guang Project.
Funding (listed with the abstract on the Combinatorial Judgment Classifier for influenza early warning): This project was supported by grants from the Ministry of Education Humanities and Social Sciences Research Fund Project.
Funding (listed with the abstract on unsupervised sentiment classification using Chinese idiom resources): Projects (61170156, 60933005) supported by the National Natural Science Foundation of China.
Funding (listed with the abstract on the PCA-based support vector classifier): Supported by the National Natural Science Foundation of China (50675167) and a Foundation for the Author of National Excellent Doctoral Dissertation of China (200535).
Funding (listed with the abstract on early detection of unknown network worms): This work was supported by the National "863" Program of China (No. 2003AA148010) and the National Torch Project of China (No. 2005EB011484).
Funding: Project supported by the Hi-Tech Research and Development Program (863) of China (No. 2003AA119010), and the China-American Digital Academic Library (CADAL) Project (No. CADAL2004002).
Abstract: As the popularity of digital images increases rapidly on the Internet, semantic image classification has become an important research topic. However, the well-known content-based image classification methods do not overcome the so-called semantic gap problem, in which low-level visual features cannot represent the high-level semantic content of images. Image classification using visual and textual information often performs poorly because the extracted textual features are often too limited to represent the images accurately. In this paper, we propose a semantic image classification approach using multi-context analysis. For a given image, we model the relevant textual information as its multi-modal context, and regard the related images connected by hyperlinks as its link context. Two kinds of context analysis models, i.e., cross-modal correlation analysis and a link-based correlation model, are used to capture the correlation among different modalities of features and the topical dependency among images induced by the link structure. We propose a new collective classification model called the relational support vector classifier (RSVC), based on the well-known Support Vector Machines (SVMs) and the link-based correlation model. Experiments showed that the proposed approach significantly improved classification accuracy over that of SVM classifiers using visual and/or textual features.
Abstract: Today, mammography is the best method for early detection of breast cancer. Radiologists failed to detect evident cancerous signs in approximately 20% of false negative mammograms. False negatives have been attributed to the inability of the radiologist to detect abnormalities for several reasons, such as poor image quality, image noise, or eye fatigue. This paper presents a framework for a computer-aided detection system that integrates Principal Component Analysis (PCA), Fisher Linear Discriminant (FLD), and Nearest Neighbor Classifier (KNN) algorithms for the detection of abnormalities in mammograms. Using normal and abnormal mammograms from the MIAS database, the integrated algorithm achieved 93.06% classification accuracy. We also present an analysis of the integrated algorithm's parameters and suggest selection criteria.
Abstract: Smartphones, particularly Android devices, are used by billions of people around the world. This growth attracts mobile botnet attacks; a botnet is a network of interconnected nodes operated through the command and control (C&C) method to expand malicious activities. At present, mobile botnets are used to launch distributed denial-of-service (DDoS) attacks, steal sensitive data, gain remote access, and generate spam. Consequently, various approaches have been defined in the literature to detect mobile botnet attacks using static or dynamic analysis. In this paper, a novel hybrid model is proposed that combines static and dynamic methods and relies on machine learning to detect Android botnet applications. The results are evaluated using machine learning classifiers. The Random Forest (RF) classifier outperforms the other ML techniques, i.e., Naïve Bayes (NB), Support Vector Machine (SVM), and Simple Logistic (SL). The proposed framework achieved 97.48% accuracy in the detection of botnet applications. Finally, some future research directions on botnet attack detection are highlighted for the community.
Funding: The authors would like to thank the Deanship of Scientific Research at Umm Al-Qura University for supporting this work by Grant Code: (22UQU4400257DSR01).
Abstract: Web-blogging sites such as Twitter and Facebook are heavily influenced by emotions, sentiments, and data in the modern era. Twitter, a widely used microblogging site where individuals share their thoughts in the form of tweets, has become a major source for sentiment analysis. In recent years, there has been a significant increase in demand for sentiment analysis to identify and classify opinions or expressions in text or tweets. Opinions or expressions of people about a particular topic, situation, person, or product can be identified from sentences and divided into three categories: positive for good, negative for bad, and neutral for mixed or confusing opinions. The process of analyzing changes in sentiment and the combination of these categories is known as "sentiment analysis." In this study, sentiment analysis was performed on a dataset of 90,000 tweets using both deep learning and machine learning methods. The deep learning-based long short-term memory (LSTM) model performed better than the machine learning approaches: LSTM achieved 87% accuracy, and the support vector machine (SVM) classifier was slightly worse at 86%. The study also tested the binary classes of positive and negative, where LSTM and SVM both achieved 90% accuracy.
Abstract: Deep learning is a powerful technique that is widely applied to image recognition and natural language processing, among many other tasks. In this work, we propose an efficient technique that uses pre-trained Convolutional Neural Network (CNN) architectures to extract powerful features from images for object recognition. We build on the existing concept of extending the learning from pre-trained CNNs to new databases through activations by proposing to consider multiple deep layers. We exploit the progressive learning that happens at the various intermediate layers of the CNNs to construct Deep Multi-Layer (DM-L) based feature extraction vectors that achieve excellent object recognition performance. Two popular pre-trained CNN models, VGG_16 and VGG_19, are used to extract the feature sets from three deep fully connected layers, namely "fc6", "fc7" and "fc8", inside the models. Using Principal Component Analysis (PCA), the dimensionality of the DM-L feature vectors is reduced to form powerful feature vectors, which are fed to an external classifier ensemble for classification instead of the Softmax-based classification layers of the two original pre-trained CNN models. The proposed DM-L technique has been applied to the benchmark Caltech-101 object recognition database. Conventional wisdom may suggest that feature extraction based on the deepest layer, i.e., "fc8" rather than "fc6", will give the best recognition performance, but our results show otherwise for the two considered models: the "fc6" based feature vectors achieved the best recognition performance. State-of-the-art recognition performances of 91.17% and 91.35% were achieved using the "fc6" based feature vectors for the VGG_16 and VGG_19 models, respectively.
The recognition performance has been achieved by considering 30 sample images per class whereas the proposed system is capable of achieving improved performance by considering all sample images per class. Our research shows that for feature extraction based on CNNs, multiple layers should be considered and then the best layer can be selected that maximizes the recognition performance.
Funding: This research work is supported by Princess Nourah bint Abdulrahman University Researchers Supporting Project Number (PNURSP2024R411), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.
Abstract: Malware attacks on Windows machines pose significant cybersecurity threats, necessitating effective detection and prevention mechanisms. Supervised machine learning classifiers have emerged as promising tools for malware detection. However, while numerous studies have explored malware detection using machine learning techniques, there is a lack of systematic comparison of supervised classifiers specifically for Windows malware detection. Understanding the relative effectiveness of these classifiers can inform the selection of optimal detection methods and improve overall security measures. This study aims to bridge that gap by conducting a comparative analysis of supervised machine learning classifiers for detecting malware on Windows systems. The objectives are: to investigate the performance of various classifiers, such as Gaussian Naïve Bayes, K Nearest Neighbors (KNN), Stochastic Gradient Descent Classifier (SGDC), and Decision Tree, in detecting Windows malware; to evaluate the accuracy, efficiency, and suitability of each classifier for real-world malware detection scenarios; to identify the strengths and limitations of different classifiers to provide insights for cybersecurity practitioners and researchers; and to offer recommendations for selecting the most effective classifier for Windows malware detection based on empirical evidence. The study employs a structured methodology consisting of several phases: exploratory data analysis, data preprocessing, model training, and evaluation. Exploratory data analysis involves understanding the dataset's characteristics and identifying preprocessing requirements. Data preprocessing includes cleaning, feature encoding, dimensionality reduction, and optimization to prepare the data for training. Model training utilizes various supervised classifiers, and their performance is evaluated using metrics such as accuracy, precision, recall, and F1 score. The study's outcomes comprise a comparative analysis of supervised machine learning classifiers for Windows malware detection. Results reveal the effectiveness and efficiency of each classifier in detecting different types of malware. Additionally, insights into their strengths and limitations provide practical guidance for enhancing cybersecurity defenses. Overall, this research contributes to advancing malware detection techniques and bolstering the security posture of Windows systems against evolving cyber threats.
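The four evaluation metrics named in this abstract follow directly from the confusion-matrix counts. The sketch below shows the standard definitions for a binary task; the function name, the labels "malware" and "benign", and the toy predictions are hypothetical and are not taken from the study.

```python
def binary_metrics(y_true, y_pred, positive):
    """Compute accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives
    tn = len(pairs) - tp - fp - fn                               # true negatives

    accuracy = (tp + tn) / len(pairs)
    # Guard against empty denominators when a class is never predicted or absent.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical ground truth and predictions for eight samples.
y_true = ["malware"] * 3 + ["benign"] * 5
y_pred = ["malware", "malware", "benign", "malware"] + ["benign"] * 4
scores = binary_metrics(y_true, y_pred, positive="malware")
```

Here accuracy is 0.75 while precision, recall and F1 are each 2/3, which illustrates why a comparison on imbalanced malware data usually reports all four rather than accuracy alone.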
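Abstract: Open-set recognition requires a classifier not only to identify test samples from known classes but also to effectively reject test samples from unknown classes; related research and applications in spectral analysis are relatively scarce. This work improves the classical closed-set fuzzy rule-based multiclass classifier proposed by Ishibuchi and applies it to open-set recognition. First, principal component analysis is used to reduce the spectral dimension of the raw spectral curve vectors, yielding spectral feature vectors of 4 to 6 dimensions. Second, Ishibuchi's fuzzy rule-based multiclass classifier is simplified into a binary version, 1-vs-1 binary classifiers are applied, and the votes that the test sample receives for each class are determined. Finally, the votes of all binary classifiers are tallied: if a known class receives the most votes and that count exceeds a predetermined threshold τ, the test sample is assigned to that known class; otherwise it is rejected as an unknown class, thereby achieving multiclass open-set recognition. In the experimental validation, grouped comparative experiments were carried out on wood and mango spectral datasets. The results show that the method outperforms other mainstream open-set recognition methods, including an improved open-set fuzzy rule-based multiclass classifier based on generalized basic probability assignment (GBPA), and achieves the best F-score, Kappa coefficient, and overall recognition rate. In addition, a two-tailed McNemar's test was performed on the comparative experiments for the mango spectral dataset, further indicating that the method's advantage over the other open-set recognition methods is statistically significant.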
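The final decision rule in this abstract (tally the 1-vs-1 votes, accept the top class only if its count exceeds τ, otherwise reject) can be sketched as below. This is an illustration of the voting rule only: the pairwise deciders here are fixed-output stand-ins rather than fuzzy rule-based classifiers, and the names `open_set_vote`, `UNKNOWN`, and the class labels are hypothetical.

```python
from collections import Counter
from itertools import combinations

UNKNOWN = "unknown"  # hypothetical label for rejected samples

def open_set_vote(sample, classes, pairwise, tau):
    """Tally 1-vs-1 votes and accept the top class only if its count exceeds tau.

    pairwise maps each class pair (a, b) to a callable that returns a or b.
    Ties for the top count are broken by first-seen order.
    """
    votes = Counter()
    for pair in combinations(classes, 2):
        votes[pairwise[pair](sample)] += 1
    label, top = votes.most_common(1)[0]
    return label if top > tau else UNKNOWN

classes = ["A", "B", "C"]

# Stand-in deciders with fixed outputs (assumption, for illustration only).
consistent = {("A", "B"): lambda x: "A",
              ("A", "C"): lambda x: "A",
              ("B", "C"): lambda x: "B"}
cyclic = {("A", "B"): lambda x: "A",
          ("A", "C"): lambda x: "C",
          ("B", "C"): lambda x: "B"}

accepted = open_set_vote(None, classes, consistent, tau=1)  # A wins 2 votes
rejected = open_set_vote(None, classes, cyclic, tau=1)      # every class gets 1
```

With three known classes a class can collect at most two votes, so τ = 1 accepts only a class that wins both of its pairwise contests; when the pairwise decisions disagree, as in the cyclic case, no class clears the threshold and the sample is rejected.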
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 523B2043 and 52475112.
Abstract: Machinery condition monitoring benefits equipment maintenance and has received much attention from academia and industry. Machine learning, especially deep learning, has become popular for machinery condition monitoring because it can make full use of available data and computational power. Since wrong fault alarms in machine condition monitoring might cause significant accidents, interpretable machine learning models, which integrate signal processing knowledge to enhance the trustworthiness of models, are gradually becoming a research hotspot. A spectrum-based, interpretable optimized weights method was previously proposed to indicate faulty and fundamental frequencies when the analyzed data contain only one healthy type and one fault type. Because multiple fault types are naturally encountered in practice, this work explores the interpretable optimized weights method for multiclass fault scenarios. A new multiclass optimized weights spectrum (OWS) is proposed and studied theoretically and numerically. It is found that the multiclass OWS can capture the characteristic components associated with different conditions and clearly indicate the specific fault characteristic frequencies (FCFs) corresponding to each fault condition. This work provides new insights into spectrum-based fault classification models, and the new multiclass OWS also shows great potential for practical applications.
Funding: Supported by the National Recovery and Resilience Plan (NRRP), Mission 4 Component 2 Investment 1.4, Call for tender No. 3138 of 16/12/2021 of the Italian Ministry of University and Research, funded by the European Union-Next Generation EU. Award Number: Project code CN00000023, Concession Decree No. 1033 of 17/06/2022 adopted by the Italian Ministry of University and Research, CUP D93C22000400001, "Sustainable Mobility Center" (CNMS), Spoke 4-Rail Transportation.
Abstract: Predictive maintenance is essential for implementing an innovative and efficient structural health monitoring (SHM) strategy. Models that can accurately interpret new data, collected automatically by suitably placed sensors, to assess the state of the infrastructure are a fundamental step, particularly for the railway sector, whose safe and continuous operation plays a strategic role in the well-being and development of nations. In this scenario, the benefits of combining a digital twin of a bonded insulated rail joint (IRJ) with the predictive capabilities of advanced classification algorithms based on artificial intelligence have been explored. The digital model provides an accurate mechanical response of the infrastructure as a pair of wheels passes over the joint. As bolt preload conditions vary, four structural health classes were identified for the joint. Two parameters, i.e., gap value and vertical displacement, which are strongly correlated with bolt preload, are used in different combinations to train and test five predictive classifiers. Their classification effectiveness was assessed using several performance indicators. Finally, the IRJ condition predictions of two trained classifiers were compared with the available data, confirming their high accuracy. The approach offers a promising basis for future predictive tools in SHM, especially for complex systems such as railways, where the vehicle-infrastructure interaction is complex and always time-varying.