Reverse k nearest neighbor (RNNk) is a generalization of the reverse nearest neighbor problem and has received increasing attention in spatial data indexing and querying. An RNNk query retrieves all the data points that have the query point as one of their k nearest neighbors. To answer RNNk queries efficiently, the properties of the Voronoi cell and the space-dividing regions are applied. Using the rank Voronoi cell, the RNNk of a given point can be found without computing its nearest neighbors every time. Starting from the elementary RNNk query result, the candidate reverse nearest neighbors can be further limited by a sweepline approximation and a partial extension of the query region Q. The approximate minimum average distance (AMAD) can be calculated from the approximate RNNk without restriction on k. Experimental results indicate the efficiency and effectiveness of the algorithm and the approximate method in three different data distribution spaces. The approximate query and calculation method achieve high precision and recall by filtering data and pruning the search space.
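To make the RNNk definition above concrete, here is a minimal brute-force sketch in Python. It is not the Voronoi-based method of the abstract; it only illustrates which points a query must return. The function name `reverse_knn` and the tie rule (a tie with the k-th neighbor counts as a hit) are assumptions made for illustration.

```python
from math import dist

def reverse_knn(points, query, k):
    """Return the data points that would count `query` among their k nearest neighbors.

    Brute force: for each data point p, compare its distance to the query with
    the distance to its k-th nearest neighbor among the other data points.
    """
    result = []
    for i, p in enumerate(points):
        others = sorted(dist(p, q) for j, q in enumerate(points) if j != i)
        # With fewer than k other points, the query is trivially among the k nearest.
        kth = others[k - 1] if len(others) >= k else float("inf")
        if dist(p, query) <= kth:  # assumption: ties count as a hit
            result.append(p)
    return result

data = [(0, 0), (3, 0), (10, 0)]
print(reverse_knn(data, (4, 0), k=1))
print(reverse_knn(data, (4, 0), k=2))
```

This costs O(n² log n) per query, which is the kind of repeated nearest-neighbor computation that an index-based approach is meant to avoid.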
The interaction between humans and machines has become an issue of concern in recent years. Besides facial expressions and gestures, speech has proven to be one of the most promising modalities for automatic emotion recognition. Affective computing aims to support HCI (Human-Computer Interaction) at a psychological level, allowing computers to adjust their reactions to human requirements. The recognition of emotion is therefore pivotal in high-level interactions. Each emotion has distinctive properties that allow us to recognize it. The change in the acoustic signal produced for an identical expression or sentence is essentially a direct result of biophysical changes set off by emotions (for example, the stress-induced narrowing of the larynx). This connection between acoustic cues and emotions has made Speech Emotion Recognition one of the active topics in affective computing. The main purpose of a Speech Emotion Recognition algorithm is to detect the emotional condition of a speaker from recorded speech signals. This research presents the results of applying k-NN and OVA-SVM to MFCC features with and without a feature selection approach. First, the MFCC features were extracted from the audio signal to characterize the properties of emotional speech. Second, nine basic statistical measures were calculated from the MFCCs, yielding 117-dimensional features used to train the classifiers for seven classes of emotion (Anger, Happiness, Disgust, Fear, Sadness, Boredom and Neutral). Classification was then done in four steps. First, all 117 features were classified using both classifiers. Second, the best classifier was identified, and the features were scaled to [-1, 1] and classified again. Third, the better-performing option (with or without feature scaling) was taken from the results of the second step, and classification was done for each of the basic statistical measures separately. Finally, in the fourth step, the combination of statistical measures giving the best performance was derived using the forward feature selection method. Experiments were carried out using k-NN with different k values and a linear OVA-based SVM classifier with different optimal values. The Berlin emotional speech database for the German language was used to test the proposed methodology, and recognition rates as high as 60% were achieved for the recognition of emotion from the voice signal with the set of statistical measures (median, maximum, mean, inter-quartile range, skewness). OVA-SVM performs better than k-NN, and the use of the feature selection technique gives a high recognition rate.
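The feature construction and scaling steps described above can be sketched as follows. This is an illustrative Python sketch, not the authors' pipeline: it computes only the five statistics the abstract names (median, maximum, mean, inter-quartile range, skewness) rather than all nine, the MFCC frames are assumed to be given as rows of coefficients, and the function names are hypothetical. Note that 117 / 9 = 13, which suggests 13 MFCC values per frame, although the abstract does not state this.

```python
import statistics

def coefficient_stats(values):
    """Five summary statistics of one MFCC coefficient tracked across frames."""
    mean = statistics.fmean(values)
    q1, _, q3 = statistics.quantiles(values, n=4)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    third = sum((v - mean) ** 3 for v in values) / len(values)
    skew = third / var ** 1.5 if var > 0 else 0.0  # population skewness
    return [statistics.median(values), max(values), mean, q3 - q1, skew]

def utterance_features(mfcc_frames):
    """Concatenate the statistics of every coefficient (columns of mfcc_frames)."""
    features = []
    for coeff in zip(*mfcc_frames):
        features.extend(coefficient_stats(coeff))
    return features

def scale_columns(rows, lo=-1.0, hi=1.0):
    """Min-max scale each feature column of a dataset to [lo, hi]."""
    scaled_cols = []
    for col in zip(*rows):
        cmin, span = min(col), max(col) - min(col)
        # Assumption: a constant column is mapped to 0.0.
        scaled_cols.append(
            [lo + (hi - lo) * (v - cmin) / span if span else 0.0 for v in col]
        )
    return [list(r) for r in zip(*scaled_cols)]

frames = [[1.0, 2.0], [2.0, 4.0], [3.0, 9.0], [4.0, 1.0]]  # 4 frames, 2 coefficients
print(len(utterance_features(frames)))  # 2 coefficients x 5 statistics
print(scale_columns([[0, 10], [5, 20], [10, 30]]))
```

In practice the scaling bounds would be learned on the training set and reused on the test set, so that test data does not influence the scaling.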
The accurate estimation of road traffic states can support decision making for travelers and traffic managers. In this work, an algorithm based on kernel-k nearest neighbor (KNN) matching of road traffic spatial characteristics is presented to estimate road traffic states. Firstly, representative road traffic state data were extracted to establish the reference sequences of road traffic running characteristics (RSRTRC). Secondly, the spatial road traffic state data sequence was selected and the kernel function was constructed, with which the spatial road traffic data sequence could be mapped into a high-dimensional feature space. Thirdly, the reference and current spatial road traffic data sequences were extracted and the Euclidean distances between them in the feature space were obtained. Finally, the road traffic states were estimated from weighted averages of the selected k road traffic states, which corresponded to the nearest Euclidean distances. Several typical links in Beijing were adopted for case studies. The experimental results show that the accuracy of this algorithm for estimating speed and volume is 95.27% and 91.32% respectively, which shows that this road traffic state estimation approach based on kernel-KNN matching of road traffic spatial characteristics is feasible and can achieve high accuracy.
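The four steps above can be sketched in a few lines of Python. The abstract does not specify the kernel or the weighting scheme, so the Gaussian (RBF) kernel, the `gamma` value, and the inverse-distance weights below are assumptions for illustration only. The feature-space distance uses the standard kernel identity ||φ(x) − φ(y)||² = K(x, x) − 2K(x, y) + K(y, y).

```python
from math import exp, sqrt

def rbf_kernel(x, y, gamma=0.1):
    """Gaussian kernel (an assumed choice; the abstract does not name one)."""
    return exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def feature_space_distance(x, y, kernel=rbf_kernel):
    """Euclidean distance between x and y after the implicit kernel mapping."""
    d2 = kernel(x, x) - 2.0 * kernel(x, y) + kernel(y, y)
    return sqrt(max(d2, 0.0))  # guard against tiny negative rounding errors

def kernel_knn_estimate(current, references, k=2, kernel=rbf_kernel):
    """Estimate a traffic state from the k nearest reference sequences.

    `references` is a list of (spatial_sequence, state_value) pairs.
    Weights are inverse distances (an assumption), so closer matches count more.
    """
    scored = sorted(
        ((feature_space_distance(current, seq, kernel), state)
         for seq, state in references),
        key=lambda pair: pair[0],
    )[:k]
    weights = [1.0 / (d + 1e-9) for d, _ in scored]
    return sum(w * s for w, (_, s) in zip(weights, scored)) / sum(weights)

# Hypothetical reference sequences: speeds on neighbouring links -> target link speed
refs = [([30, 32, 31], 30.0), ([60, 62, 61], 60.0), ([58, 61, 60], 59.0)]
print(kernel_knn_estimate([59, 61, 60], refs, k=2))
```

With an RBF kernel, very distant sequences all end up at nearly the same feature-space distance (close to √2), so `gamma` needs to be tuned to the scale of the data.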
Malware attacks on Windows machines pose significant cybersecurity threats, necessitating effective detection and prevention mechanisms. Supervised machine learning classifiers have emerged as promising tools for malware detection. However, although numerous studies have explored malware detection using machine learning techniques, there is a lack of systematic comparison of supervised classifiers specifically for Windows malware detection. Understanding the relative effectiveness of these classifiers can inform the selection of optimal detection methods and improve overall security measures. This study aims to bridge the research gap by conducting a comparative analysis of supervised machine learning classifiers for detecting malware on Windows systems. The objectives are: (1) investigating the performance of various classifiers, such as Gaussian Naïve Bayes, K Nearest Neighbors (KNN), Stochastic Gradient Descent Classifier (SGDC), and Decision Tree, in detecting Windows malware; (2) evaluating the accuracy, efficiency, and suitability of each classifier for real-world malware detection scenarios; (3) identifying the strengths and limitations of different classifiers to provide insights for cybersecurity practitioners and researchers; and (4) offering recommendations for selecting the most effective classifier for Windows malware detection based on empirical evidence. The study employs a structured methodology consisting of several phases: exploratory data analysis, data preprocessing, model training, and evaluation. Exploratory data analysis involves understanding the dataset's characteristics and identifying preprocessing requirements. Data preprocessing includes cleaning, feature encoding, dimensionality reduction, and optimization to prepare the data for training. Model training uses various supervised classifiers, and their performance is evaluated using metrics such as accuracy, precision, recall, and F1 score. The study's outcomes comprise a comparative analysis of supervised machine learning classifiers for Windows malware detection. The results reveal the effectiveness and efficiency of each classifier in detecting different types of malware. Additionally, insights into their strengths and limitations provide practical guidance for enhancing cybersecurity defenses. Overall, this research contributes to advancing malware detection techniques and bolstering the security posture of Windows systems against evolving cyber threats.
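The evaluation metrics named above (accuracy, precision, recall, F1 score) have standard definitions for a binary malware/benign labelling. The Python sketch below computes them from true and predicted labels; the function name and the toy labels are made up for illustration and do not come from the study.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for one positive class (e.g. malware)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs) if pairs else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

# Toy labels: 1 = malware, 0 = benign
truth = [1, 1, 1, 0, 0, 0, 0, 1]
preds = [1, 1, 0, 0, 0, 1, 0, 1]
print(binary_metrics(truth, preds))
```

On imbalanced malware datasets, precision and recall are usually more informative than accuracy alone, since a classifier that labels everything benign can still score a high accuracy.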
Traffic flow prediction in urban areas is essential in the Intelligent Transportation System (ITS). Short-Term Traffic Flow (STTF) prediction works on traffic flow series and estimates the number of vehicles that will appear during the next time interval, here per hour. Precise STTF prediction is critical in the Intelligent Transportation System. Various existing systems aim at short-term traffic forecasts, and ensuring good precision has been a significant task over the past few years. The main objective of this paper is to propose a new model to predict STTF for every hour of a day. We propose a novel hybrid algorithm utilizing Principal Component Analysis (PCA), Stacked Auto-Encoder (SAE), Long Short Term Memory (LSTM), and K-Nearest Neighbors (KNN), named PALKNN. Firstly, PCA removes unwanted information from the dataset and selects essential features. Secondly, SAE is used to reduce the dimension of the input data using one-hot encoding so the model can be trained faster. Thirdly, LSTM takes the input from SAE, where the data is sorted in ascending order based on the important features, and generates the derived value. Finally, the KNN Regressor takes information from LSTM to predict traffic flow. The forecasting performance of the PALKNN model is investigated with the Open Road Traffic Statistics dataset, Great Britain, UK. This paper enhances the traffic flow prediction for every hour of a day with a minimal error value. An extensive experimental analysis was performed on the benchmark dataset. The evaluated results indicate a significant improvement of the proposed PALKNN model over recent approaches such as KNN, SARIMA, Logistic Regression, RNN, and LSTM in terms of root mean square error (RMSE) of 2.07%, mean square error (MSE) of 4.1%, and mean absolute error (MAE) of 2.04%.
It is a key challenge to exploit the label coupling relationship in multi-label classification (MLC) problems. Most previous work focused on pairwise label relations, in which generally only global statistical information is used to analyze the coupled label relationship. In this work, Bayesian and hypothesis testing methods are first applied to predict the label set size of testing samples within their k nearest neighbor samples, which combines global and local statistical information. The Apriori algorithm is then used to mine the label coupling relationship among multiple labels rather than pairs of labels, which can exploit the label coupling relations more accurately and comprehensively. The experimental results on text, biology and audio datasets show that, compared with the state-of-the-art algorithm, the proposed algorithm obtains better performance on 5 common criteria.
The continuous top-t most influential place (CTtMIP) query is defined formally and solved efficiently in this paper. A CTtMIP query continuously monitors the t places with the maximum influence from the set of places, where the influence of a place is defined as the number of its bichromatic reverse k nearest neighbors (BRkNNs). Two new metrics and their corresponding rules are introduced to shrink the search region and reduce the number of BRkNN candidates checked. Extensive experiments confirm that the proposed approach significantly outperforms the state-of-the-art competitor.
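The influence definition above can be illustrated with a brute-force snapshot computation in Python. An object is a bichromatic reverse k nearest neighbor of a place when that place is among the k places nearest to the object, so the influence of a place is the number of such objects. The sketch below recomputes everything from scratch for one snapshot; it does not implement the continuous monitoring or the pruning metrics of the paper, and the names and sample coordinates are hypothetical.

```python
from math import dist

def influence_counts(places, objects, k):
    """Number of BRkNNs of each place: objects having it among their k nearest places."""
    counts = {name: 0 for name in places}
    for obj in objects:
        nearest = sorted(places, key=lambda name: dist(obj, places[name]))[:k]
        for name in nearest:
            counts[name] += 1
    return counts

def top_t_places(places, objects, k, t):
    """The t most influential places; ties are broken alphabetically (an assumption)."""
    counts = influence_counts(places, objects, k)
    return sorted(counts, key=lambda name: (-counts[name], name))[:t]

places = {"A": (0, 0), "B": (10, 0), "C": (5, 5)}
objects = [(1, 0), (2, 1), (9, 0), (5, 4), (6, 5)]
print(influence_counts(places, objects, k=1))
print(top_t_places(places, objects, k=1, t=2))
```

A continuous query would have to keep this result up to date as objects move, which is where shrinking the search region pays off.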
This paper investigates the evaluation of, and query algorithms for, the influence of a spatial location based on RkNN (reverse k nearest neighbor). On the one hand, an object can contribute to multiple locations. However, under the existing measures for evaluating the influence of a spatial location, an object contributes to only one location, and the influence is usually measured by the number of spatial objects in the region. Therefore, a new measure for evaluating the influence of a spatial location based on RkNN is proposed. Since the weight of a contribution is determined by the distance between the object and the location, a definition of influence weight is given that fits actual applications. On the other hand, a query algorithm for the influence of a spatial location is introduced based on the proposed measure. Firstly, an algorithm named INCH (INtersection's Convex Hull) is applied to obtain candidate regions, in which all objects are candidates. Then, kNN and Range-k are used to refine the results. Next, according to the proposed measure, the weights of the objects in the RkNN results are computed and the influence of the location is accumulated. Experimental results on real data show that the optimized algorithms outperform the basic algorithm in efficiency. In addition, in order to provide the best customer service in the location problem and make the best use of all infrastructure, a location algorithm with the query is presented based on RkNN. The influence of each facility is calculated in the location plan, and the equilibrium coefficient is used to evaluate how reasonable the location is. The smaller the equilibrium coefficient, the more reasonable the plan. A practical application shows that basing the location on influence makes the location algorithm more reasonable and usable.
This paper deals with two new methods, based on the k-NN algorithm, for fault detection and classification in distance protection. In these methods, the fault occurrence time and the faulty phases are determined by finding the distance between each sample and its fifth nearest neighbor in a predefault window. In both the detection and the classification procedures, the maximum value of these distances is compared with pre-defined threshold values. The main advantages of these methods are simplicity, low calculation burden, acceptable accuracy, and speed. The performance of the proposed scheme is tested on a typical system in MATLAB Simulink. Various possible fault types with different fault resistances, fault inception angles, fault locations, short circuit levels, X/R ratios, and source load angles are simulated. In addition, the performance of six similar well-known classification techniques is compared with the proposed classification method using a large amount of simulation data.
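One possible reading of the detection rule above can be sketched in Python: within a sliding window of samples, compute each sample's distance to its fifth nearest neighbor, and flag a fault when the maximum of those distances exceeds a threshold. The window length, the threshold, the use of scalar samples, and the function names are all assumptions for illustration; the abstract does not give these details, and the sketch is not the authors' scheme.

```python
def kth_nn_distances(window, k=5):
    """Distance from each sample in the window to its k-th nearest neighbor."""
    out = []
    for i, x in enumerate(window):
        d = sorted(abs(x - y) for j, y in enumerate(window) if j != i)
        out.append(d[k - 1])
    return out

def detect_fault(samples, window_size=20, k=5, threshold=1.0):
    """Index of the first sample whose trailing window trips the threshold, else None.

    Requires window_size > k so every sample has at least k neighbors.
    """
    for end in range(window_size, len(samples) + 1):
        window = samples[end - window_size:end]
        if max(kth_nn_distances(window, k)) > threshold:
            return end - 1
    return None

# Hypothetical signal: a repeating normal pattern followed by an abrupt jump
normal = [0.0, 0.1, 0.2, 0.1] * 10
print(detect_fault(normal))           # no sample deviates from its neighbors
print(detect_fault(normal + [5.0]))   # the jump has no close neighbors
```

A sample produced by normal operation has several near-identical neighbors in the window, so its fifth-nearest-neighbor distance stays small; an abrupt change has none, which is why the maximum distance is a usable fault indicator.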
This paper proposes an efficient retrieval approach for iris using local features. The features are extracted from the segmented iris image using the scale invariant feature transform (SIFT). The keypoint descriptors extracted by SIFT are clustered into m groups using k-means. The idea is to index keypoints based on descriptor property. During the database indexing phase, a k-d tree (k-dimensional tree) is constructed for each cluster center taken from N iris images. Thus, for m clusters, m such k-d trees are generated, denoted as t_i, where 1 ≤ i ≤ m. During the retrieval phase, the keypoint descriptors from the probe iris image are clustered into m groups and the i-th cluster center is used to traverse the corresponding t_i for searching. A k nearest neighbor approach is used, which finds p neighbors from each tree (t_i) that fall within a certain radius r centered on the probe point in k-dimensional space. Finally, the p neighbors from the m trees are combined using the union operation and the top S matches (S ⊆ (m × p)) corresponding to the query iris image are retrieved. The proposed approach has been tested on publicly available databases and outperforms the existing approaches in terms of speed and accuracy.
Recently, negative databases (NDBs) have been proposed for privacy protection. As with traditional databases, some basic operations can be conducted over NDBs, such as select, intersection, update, delete and so on. However, neither classification nor clustering in negative databases has yet been studied. Therefore, two algorithms are proposed in this paper: a k nearest neighbor (kNN) classification algorithm and a k-means clustering algorithm in NDBs. The core of these two algorithms is a novel method for estimating the Hamming distance between a binary string and an NDB. Experimental results demonstrate that classifying and clustering in NDBs are promising.
Funding (RNNk query paper): Supported by the National Natural Science Foundation of China (60673136) and the Natural Science Foundation of Heilongjiang Province of China (F200601).
Funding (kernel-KNN road traffic state paper): Projects (LQ16E080012, LY14F030012) supported by the Zhejiang Provincial Natural Science Foundation, China; Project (61573317) supported by the National Natural Science Foundation of China; Project (2015001) supported by the Open Fund for a Key-Key Discipline of Zhejiang University of Technology, China.
Funding (Windows malware detection paper): This research work is supported by Princess Nourah bint Abdulrahman University Researchers Supporting Project Number (PNURSP2024R411), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.
Funding (multi-label classification paper): Supported by Australian Research Council Discovery (DP130102691), the National Science Foundation of China (61302157), the China National 863 Project (2012AA12A308), and the China Pre-research Project of Nuclear Industry (FZ1402-08).
Funding (CTtMIP query paper): Supported by the National Natural Science Foundation of China (61003049), the Natural Science Foundation of Zhejiang Province (Y110278, 2010QNA5051), and the Zheda Zijin Plan.
Funding (RkNN spatial influence paper): This work was supported by the National Natural Science Foundation of China (Grant Nos. 61602323, 61703288), the Natural Science Foundation of Liaoning Province (2019-MS-264, 201602604), and the Technology Research Project of Education Department of Liaoning (lnqn201913).
Funding (negative database paper): This work was partly supported by the National Natural Science Foundation of China (Grant No. 61175045).