Funding: The National Natural Science Foundation of China (No. 61201344, 61271312, 61401085, 11301074); the Research Fund for the Doctoral Program of Higher Education (No. 20120092120036); the Program for Special Talents in Six Fields of Jiangsu Province (No. DZXX-031); the Industry-University-Research Cooperation Project of Jiangsu Province (No. BY2014127-11); the "333" Project (No. BRA2015288); the High-End Foreign Experts Recruitment Program (No. GDT20153200043); the Open Fund of Jiangsu Engineering Center of Network Monitoring (No. KJR1404).
Abstract: To classify nonlinear features with a linear classifier and improve classification accuracy, a deep learning network named the kernel principal component analysis network (KPCANet) is proposed. First, the data are mapped into a higher-dimensional space with kernel principal component analysis so that they become linearly separable. Then a two-layer KPCANet is built to obtain the principal components of the image. Finally, the principal components are classified with a linear classifier. Experimental results show that the proposed KPCANet is effective in face recognition, object recognition and handwritten digit recognition, and that it generally outperforms the principal component analysis network (PCANet). In addition, KPCANet is invariant to illumination and stable under occlusion and slight deformation.
Funding: Supported by the Scientific Research Foundation of Liaoning Provincial Education Department (L2015240), the National Natural Science Foundation of China (61379116, 61503169), and the Joint Fund of the Science and Technology Department of Liaoning Province (20170540448).
Abstract: Existing recommendation algorithms have low robustness against shilling attacks. To address this problem, we present a robust recommendation algorithm based on kernel principal component analysis and fuzzy c-means clustering. First, we use kernel principal component analysis to reduce the dimensionality of the original rating matrix, which extracts the effective features of users and items. Then, based on the dimension-reduced rating matrix and the high correlation between attack profiles, we use fuzzy c-means clustering to cluster user profiles, which separates genuine profiles from attack profiles. Finally, we construct an indicator function from the attack detection results to reduce the influence of attack profiles on the recommendation, and incorporate it into matrix factorization to design the corresponding robust recommendation algorithm. Experimental results indicate that the proposed algorithm is superior to existing methods in both recommendation accuracy and robustness.
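The clustering step described above can be illustrated with the standard fuzzy c-means updates. This is a minimal sketch in plain Python, not the authors' algorithm: the function names, the fuzzifier `m = 2.0`, the initial centers and the toy points are all assumptions chosen for illustration.

```python
import math

def euclid(p, q):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def memberships_for(points, centers, m):
    """Standard FCM membership update: each row sums to 1."""
    rows = []
    for p in points:
        # Clamp distances so a point sitting on a center does not divide by zero.
        dists = [max(euclid(p, c), 1e-12) for c in centers]
        rows.append([
            1.0 / sum((d_i / d_j) ** (2.0 / (m - 1.0)) for d_j in dists)
            for d_i in dists
        ])
    return rows

def update_centers(points, u, m):
    """Standard FCM center update: mean weighted by membership**m."""
    dim = len(points[0])
    centers = []
    for i in range(len(u[0])):
        weights = [row[i] ** m for row in u]
        total = sum(weights)
        centers.append(tuple(
            sum(w * p[d] for w, p in zip(weights, points)) / total
            for d in range(dim)
        ))
    return centers

def fuzzy_c_means(points, init_centers, m=2.0, iterations=50):
    """Alternate the two updates for a fixed number of iterations."""
    centers = list(init_centers)
    for _ in range(iterations):
        u = memberships_for(points, centers, m)
        centers = update_centers(points, u, m)
    return memberships_for(points, centers, m), centers

# Hypothetical low-dimensional profiles: three of one kind, three of another.
points = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (5.0, 5.0), (5.2, 4.9), (4.9, 5.1)]
u, centers = fuzzy_c_means(points, [(0.0, 0.0), (5.0, 5.0)])
print(u[0], centers)
```

Each row of `u` holds the degrees of membership of one profile in every cluster, which is what lets a detector treat borderline profiles differently from clear-cut ones.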
Funding: Funded by the National Natural Science Foundation of China (42174131) and the Strategic Cooperation Technology Projects of CNPC and CUPB (ZLZX2020-03).
Abstract: In this research, an integrated classification method based on principal component analysis, a simulated annealing genetic algorithm and fuzzy c-means (PCA-SAGA-FCM) was proposed for the unsupervised classification of tight sandstone reservoirs that lack prior information and core experiments. A variety of evaluation parameters were selected, including lithology characteristic parameters, poro-permeability quality characteristic parameters, engineering quality characteristic parameters, and pore structure characteristic parameters. PCA was used to reduce the dimension of the evaluation parameters, and the low-dimensional data were used as input. The unsupervised classification of the tight sandstone reservoir was carried out by SAGA-FCM, and the characteristics of the reservoir in the different categories were analyzed and compared with the lithological profiles. The analysis of numerical simulation and actual logging data shows that: 1) compared with the FCM algorithm, SAGA-FCM has stronger stability and higher accuracy; 2) the proposed method can cluster the reservoir flexibly and effectively according to the degree of membership; 3) the results of the integrated reservoir classification match the lithologic profile well, which demonstrates the reliability of the classification method.
Abstract: With the increasing variety of application software in meteorological satellite ground systems, how to provide reasonable hardware resources and improve software efficiency is receiving more and more attention. In this paper, a software classification method based on software operating characteristics is proposed. The method uses run-time resource consumption to describe the running characteristics of the software. First, principal component analysis (PCA) is used to reduce the dimension of the running feature data and to interpret the characteristic information. Then a modified K-means algorithm is used to classify the meteorological data processing software. Finally, the classification is combined with the PCA results to explain the significance of the operating characteristics of each type of software. The result serves as the basis for optimizing the allocation of hardware resources and improving the efficiency of software operation.
Funding: This project is supported by the Special Foundation for Major State Basic Research of China (Project 973, No. G1998030415).
Abstract: In industrial processes, principal component analysis (PCA) is a general method for data reconciliation. However, PCA is sometimes unsuitable for nonlinear feature analysis and is limited in its application to nonlinear industrial processes. Kernel PCA (KPCA) is an extension of PCA that can be used for nonlinear feature analysis. A nonlinear data reconciliation method based on KPCA is proposed. The basic idea of this method is that the original data are first mapped to a high-dimensional feature space by a nonlinear function, and PCA is carried out in that feature space. Nonlinear feature analysis is then performed and the data are reconstructed by using the kernel. The KPCA-based data reconciliation method is applied to a ternary distillation column. Simulation results show that the method can filter the noise in measurements of a nonlinear process and that the reconciled data can represent the true information of the nonlinear process.
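The first two steps of this idea (implicit mapping through a kernel, then PCA in the feature space) can be sketched without any libraries. This is a generic kernel PCA illustration, not the reconciliation method of the paper: the RBF kernel, the value of `gamma`, the use of power iteration and the toy data are all assumptions.

```python
import math

def rbf_kernel(x, y, gamma):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def centered_kernel_matrix(data, gamma):
    """Kernel matrix of the data after centering them in feature space."""
    n = len(data)
    k = [[rbf_kernel(xi, xj, gamma) for xj in data] for xi in data]
    row_mean = [sum(row) / n for row in k]  # equals the column means (k is symmetric)
    total_mean = sum(row_mean) / n
    return [[k[i][j] - row_mean[i] - row_mean[j] + total_mean
             for j in range(n)] for i in range(n)]

def leading_eigenpair(matrix, iterations=200):
    """Largest eigenvalue and unit eigenvector by power iteration."""
    n = len(matrix)
    v = [float(i + 1) for i in range(n)]
    scale = math.sqrt(sum(x * x for x in v))
    v = [x / scale for x in v]
    eigenvalue = 0.0
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
        eigenvalue = norm  # valid because the centered kernel matrix is positive semidefinite
    return eigenvalue, v

def first_kpc_scores(data, gamma):
    """Projections of the training points onto the first kernel principal component."""
    eigenvalue, v = leading_eigenpair(centered_kernel_matrix(data, gamma))
    return [math.sqrt(eigenvalue) * x for x in v]

# Hypothetical measurements forming two well-separated groups.
data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (3.0, 3.0), (3.1, 3.0), (3.0, 3.1)]
print(first_kpc_scores(data, gamma=0.5))
```

The mapped data never appear explicitly: only kernel evaluations between pairs of samples are needed, which is why the kernel can also be used in the later reconstruction step.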
Funding: Supported by the Zhejiang Provincial Natural Science Foundation of China (LY19F030003), the Key Research and Development Project of Zhejiang Province (2021C04030), the National Natural Science Foundation of China (62003306), and the Educational Commission Research Program of Zhejiang Province (Y202044842).
Abstract: In practical process industries, a variety of online and offline sensors and measuring instruments are used for process control and monitoring, so measurements from different sources are collected at different sampling rates. To build a complete process monitoring strategy, all of these multi-rate measurements should be considered in data-based modeling and monitoring. In this paper, a novel kernel multi-rate probabilistic principal component analysis (K-MPPCA) model is proposed to extract the nonlinear correlations among different sampling rates. In the proposed model, the parameters are calibrated using the kernel trick and the expectation-maximization (EM) algorithm. The corresponding fault detection methods based on the nonlinear features are also developed. Finally, a simulated nonlinear case and an actual pre-decarburization unit in the ammonia synthesis process are tested to demonstrate the efficiency of the proposed method.
Abstract: Principal Component Analysis (PCA) is one of the most important feature extraction methods, and Kernel Principal Component Analysis (KPCA) is a nonlinear extension of PCA based on kernel methods. In the real world, an input sample may not be fully assigned to one class and may partially belong to other classes. Based on the theory of fuzzy sets, this paper presents Fuzzy Principal Component Analysis (FPCA) and its nonlinear extension, Kernel-based Fuzzy Principal Component Analysis (KFPCA). The experimental results indicate that the proposed algorithms perform well.
Funding: Supported by the National Natural Science Foundation under Grant No. 50875247 and the Shanxi Province Natural Science Foundation under Grant No. 2009011026-1.
Abstract: Particle swarm optimization (PSO) is an optimization algorithm based on the principle of swarm intelligence. In this paper, a modified PSO is applied to kernel principal component analysis (KPCA) to find an optimal kernel function parameter. We first consider both the within-class scatter and the between-class scatter of the sample features. Then a fitness function for the kernel function parameter is constructed, and the particle swarm optimization algorithm with adaptive acceleration (CPSO) is applied to optimize it. The method is used for gearbox condition recognition, and the result is compared with the recognition results based on principal component analysis (PCA). The results show that KPCA optimized by CPSO can effectively recognize fault conditions of the gearbox by reducing the blind set-up of the kernel function parameter, and that its fault recognition results outperform those of PCA. We conclude that KPCA based on CPSO has an advantage in nonlinear feature extraction of mechanical failures and is helpful for fault condition recognition of complicated machines.
Funding: The authors acknowledge financial support from the National Natural Science Foundation of China (Grant Nos. 52174303 and 51874084), the Fundamental Research Funds for the Central Universities (Grant No. 2125026), the Program of Introducing Talents of Discipline to Universities (Grant No. B21001), and the 111 Project (Grant No. B16009).
Abstract: A model combining kernel principal component analysis (KPCA) and eXtreme Gradient Boosting (XGBoost) was introduced for forecasting the final oxygen content of electroslag remelting. KPCA was employed to reduce the dimensionality of the factors influencing the endpoint oxygen content and to eliminate any existing correlations among these factors. The resulting principal components were then used as input variables for the XGBoost prediction model. The KPCA-XGBoost model was trained and validated using data obtained from companies. The model structure was adapted, and hyperparameters were optimized using grid search cross-validation. The performance of the KPCA-XGBoost model was compared with that of five machine learning models, including the support vector regression model. The findings demonstrated that the KPCA-XGBoost model exhibited the highest prediction accuracy, indicating that the incorporation of KPCA significantly enhanced the regression prediction performance of the model. The accuracy of the KPCA-XGBoost model was 82.4%, 97.1%, and 100% at errors of ±1.5×10^(-6), ±2.0×10^(-6), and ±3×10^(-6) for oxygen content, respectively.
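The accuracy figures quoted at fixed error bands presumably correspond to a hit rate, that is, the share of predictions whose absolute error stays within the tolerance. The abstract does not define the metric, so the sketch below is an assumption, and the function name and the sample values are hypothetical rather than data from the study.

```python
def hit_rate(actual, predicted, tolerance):
    """Fraction of predictions whose absolute error is within the tolerance."""
    hits = sum(1 for a, p in zip(actual, predicted) if abs(a - p) <= tolerance)
    return hits / len(actual)

# Hypothetical oxygen contents (mass fraction), not measurements from the paper.
actual = [12e-6, 15e-6, 18e-6, 20e-6]
predicted = [12.5e-6, 17.2e-6, 18.1e-6, 16.5e-6]

for tolerance in (1.5e-6, 2.0e-6, 3e-6):
    print(tolerance, hit_rate(actual, predicted, tolerance))
```

Under this reading the hit rate can only grow as the band widens, which is consistent with the increasing percentages reported for the three bands.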
Funding: Supported by the 973 Project of China (2013CB733600), the National Natural Science Foundation (21176073), the Doctoral Fund of the Ministry of Education (20090074110005), the New Century Excellent Talents in University program (NCET-09-0346), the "Shu Guang" Project (09SG29) and the Fundamental Research Funds for the Central Universities.
Abstract: The kernel principal component analysis (KPCA) method employs the first several kernel principal components (KPCs), which capture most of the variance of normal observations, for process monitoring, but these components may not reflect the fault information. In this study, sensitive kernel principal component analysis (SKPCA) is proposed to improve process monitoring performance, i.e., to deal with the discordance between the T² statistic and the squared prediction error (SPE) statistic and to reduce missed detection rates. The T² statistic can be used to measure the variation directly along each KPC, to analyze the detection performance, and to capture the most useful information in a process. By calculating the change rate of the T² statistic along each KPC, SKPCA selects the sensitive kernel principal components for process monitoring. A simple simulated system and the Tennessee Eastman process are employed to demonstrate the efficiency of SKPCA in online monitoring. The results indicate that the monitoring performance is improved significantly.
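The T² statistic is conventionally the sum of squared component scores, each divided by the variance (eigenvalue) of its component, so it splits naturally into one term per KPC. The sketch below shows that split. The ratio used to rank components is only one plausible stand-in for the "change rate" mentioned in the abstract, which is not defined there, and all names and numbers are hypothetical.

```python
def t2_terms(scores, eigenvalues):
    """Per-component contributions t_i**2 / lambda_i to the T2 statistic."""
    return [t * t / lam for t, lam in zip(scores, eigenvalues)]

def t2_statistic(scores, eigenvalues):
    """Hotelling-style T2: the sum of the per-component contributions."""
    return sum(t2_terms(scores, eigenvalues))

def most_sensitive_component(normal_scores, test_scores, eigenvalues):
    """Index of the component whose T2 term grows most relative to normal data.

    Assumption: 'change rate' is taken here as a simple ratio of T2 terms.
    """
    normal = t2_terms(normal_scores, eigenvalues)
    test = t2_terms(test_scores, eigenvalues)
    ratios = [t / max(n, 1e-12) for t, n in zip(test, normal)]
    return max(range(len(ratios)), key=ratios.__getitem__)

# Hypothetical scores on two KPCs with variances 2.0 and 0.5.
eigenvalues = [2.0, 0.5]
normal_scores = [1.0, 0.5]
test_scores = [1.1, 2.0]
print(t2_statistic(test_scores, eigenvalues))
print(most_sensitive_component(normal_scores, test_scores, eigenvalues))
```

In this toy case the second, lower-variance component reacts far more strongly than the first, which illustrates why the leading KPCs alone may miss a fault.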
Funding: Climbing Peak Discipline Project of Shanghai Dianji University, China (No. 15DFXK02); Hi-Tech Research and Development Program of China (No. 2007AA041600).
Abstract: Dimensionality reduction techniques play an important role in data mining. Kernel entropy component analysis (KECA) is a newly developed method for data transformation and dimensionality reduction. This paper presents a comparative study of KECA and five other dimensionality reduction methods: principal component analysis (PCA), kernel PCA (KPCA), locally linear embedding (LLE), Laplacian eigenmaps (LAE) and diffusion maps (DM). Three quality assessment criteria, the local continuity meta-criterion (LCMC), the trustworthiness and continuity measure (T&C), and the mean relative rank error (MRRE), are applied as direct performance indexes to assess these methods. In addition, clustering accuracy is used as an indirect performance index to evaluate the quality of the representations obtained by these methods. The comparisons are performed on six datasets, and the results are analyzed with the Friedman test and the corresponding post-hoc tests. The results indicate that KECA performs excellently on both the quality assessment criteria and clustering accuracy.
Funding: The National Defence Foundation of China (No. NEWL51435Qt220401).
Abstract: Kernel factor analysis (KFA) with varimax was proposed, using a Mercer kernel function that maps the data from the original space to a high-dimensional feature space, and was compared with kernel principal component analysis (KPCA). The results show that the best error rate in handwritten digit recognition achieved by kernel factor analysis with varimax (4.2%) was better than that of KPCA (4.4%). KFA with varimax could therefore recognize handwritten digits more accurately.
Abstract: To monitor malt quality in the malting industry despite yearly variations in barley quality, 394 barley samples were analysed using conventional methods (moisture, protein and β-glucan content) and mid-infrared Fourier transform infrared (FT-IR) spectroscopy. The experimental dataset included barley from three harvest years, two barley species, 77 barley varieties, and two-row and six-row barley, from 16 cultivation sites. For each sample, the malt quality indices were also assessed according to European Brewing Convention (EBC) standards. Principal component analysis (PCA) was carried out on mean-centred, normalized and derivative spectra using spectral bands 200 cm⁻¹ wide. The most informative spectral bands were observed in the 800-1,000 cm⁻¹ and 1,000-1,200 cm⁻¹ ranges. PCA revealed that barley harvested in 2010 and in 2011 had bands that were very close together, while the 2009 harvest clearly differed in quality. PCA made it possible to distinguish the two species and confirmed that two-row winter barley quality was closer to two-row spring barley quality than to six-row winter barley quality. The results indicate that mid-infrared (MIR) spectrometry could be a very useful and rapid analytical tool for assessing barley quality.