In this research, an integrated classification method based on principal component analysis-simulated annealing genetic algorithm-fuzzy cluster means (PCA-SAGA-FCM) was proposed for the unsupervised classification of tight sandstone reservoirs which lack prior information and core experiments. A variety of evaluation parameters were selected, including lithology characteristic parameters, poro-permeability quality characteristic parameters, engineering quality characteristic parameters, and pore structure characteristic parameters. PCA was used to reduce the dimension of the evaluation parameters, and the low-dimensional data were used as input. The unsupervised classification of the tight sandstone reservoir was carried out by SAGA-FCM, and the characteristics of reservoirs in different categories were analyzed and compared with the lithological profiles. The analysis results of numerical simulation and actual logging data show that: 1) compared with the FCM algorithm, SAGA-FCM has stronger stability and higher accuracy; 2) the proposed method can cluster the reservoir flexibly and effectively according to the degree of membership; 3) the results of integrated reservoir classification match well with the lithologic profile, which demonstrates the reliability of the classification method.
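As a rough sketch of the PCA-plus-fuzzy-clustering core of this pipeline, the snippet below reduces a synthetic evaluation-parameter matrix with PCA and clusters the scores with a minimal fuzzy c-means routine. The simulated annealing genetic initialization is omitted, and the synthetic data, variable names, and cluster count are assumptions rather than the authors' setup.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

def fuzzy_cmeans(X, n_clusters, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Minimal fuzzy c-means: returns cluster centers and the membership matrix U."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    U = rng.random((n, n_clusters))
    U /= U.sum(axis=1, keepdims=True)          # each row of memberships sums to 1
    for _ in range(n_iter):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        U_new = 1.0 / (dist ** (2 / (m - 1)))
        U_new /= U_new.sum(axis=1, keepdims=True)
        if np.abs(U_new - U).max() < tol:
            U = U_new
            break
        U = U_new
    return centers, U

# Hypothetical well-log evaluation parameters: rows = depth samples, columns = parameters.
rng = np.random.default_rng(42)
logs = rng.normal(size=(500, 8))

scores = PCA(n_components=3).fit_transform(StandardScaler().fit_transform(logs))
centers, U = fuzzy_cmeans(scores, n_clusters=4)
labels = U.argmax(axis=1)                      # hard labels from the membership degrees
print(labels[:10], U[:2].round(3))
```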
With the increasing variety of application software in meteorological satellite ground systems, how to provision hardware resources reasonably and improve software efficiency has received more and more attention. In this paper, a software classification method based on software operating characteristics is proposed. The method uses run-time resource consumption to describe the software running characteristics. Firstly, principal component analysis (PCA) is used to reduce the dimension of the software running feature data and to interpret the software characteristic information. Then a modified K-means algorithm is used to classify the meteorological data processing software. Finally, the PCA results are combined to explain the operating characteristics of each software class, which serves as the basis for optimizing the allocation of hardware resources and improving the efficiency of software operation.
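A minimal sketch of the PCA-then-K-means workflow described above, assuming synthetic run-time metrics and a plain scikit-learn KMeans in place of the paper's modified K-means; the metric names and cluster count are illustrative assumptions.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

# Hypothetical run-time resource metrics per software job: CPU, memory, I/O, network, duration.
rng = np.random.default_rng(0)
metrics = rng.lognormal(mean=1.0, sigma=0.5, size=(300, 5))

X = StandardScaler().fit_transform(metrics)
pca = PCA(n_components=2).fit(X)
scores = pca.transform(X)
print("explained variance ratio:", pca.explained_variance_ratio_.round(3))

# Standard k-means as a stand-in for the paper's modified k-means.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(scores)

# Interpret each cluster by its mean profile in the standardized metric space.
for k in range(3):
    print(f"cluster {k}: mean standardized metrics =", X[labels == k].mean(axis=0).round(2))
```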
Dimensionality reduction techniques play an important role in data mining. Kernel entropy component analysis (KECA) is a newly developed method for data transformation and dimensionality reduction. This paper conducted a comparative study of KECA with five other dimensionality reduction methods: principal component analysis (PCA), kernel PCA (KPCA), locally linear embedding (LLE), Laplacian eigenmaps (LAE), and diffusion maps (DM). Three quality assessment criteria, the local continuity meta-criterion (LCMC), the trustworthiness and continuity measure (T&C), and the mean relative rank error (MRRE), are applied as direct performance indexes to assess those dimensionality reduction methods. Moreover, clustering accuracy is used as an indirect performance index to evaluate the quality of the representative data obtained by those methods. The comparisons are performed on six datasets, and the results are analyzed by the Friedman test with the corresponding post-hoc tests. The results indicate that KECA shows excellent performance in both the quality assessment criteria and the clustering accuracy assessment.
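Quality criteria of this kind can be approximated with scikit-learn's trustworthiness score. The sketch below compares the four of the six methods that have scikit-learn implementations (KECA and diffusion maps are omitted), using the digits dataset purely as a placeholder for the paper's six datasets.

```python
from sklearn.datasets import load_digits
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA, KernelPCA
from sklearn.manifold import LocallyLinearEmbedding, SpectralEmbedding, trustworthiness

X, _ = load_digits(return_X_y=True)
X = StandardScaler().fit_transform(X)

# KECA and diffusion maps have no scikit-learn implementation, so only four methods appear here.
methods = {
    "PCA": PCA(n_components=2),
    "KPCA": KernelPCA(n_components=2, kernel="rbf", gamma=1e-3),
    "LLE": LocallyLinearEmbedding(n_components=2, n_neighbors=10),
    "LAE": SpectralEmbedding(n_components=2, n_neighbors=10),  # Laplacian eigenmaps
}
for name, model in methods.items():
    Y = model.fit_transform(X)
    # Trustworthiness close to 1 means local neighborhoods are preserved in the embedding.
    print(f"{name}: trustworthiness = {trustworthiness(X, Y, n_neighbors=10):.3f}")
```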
The precision of the kernel independent component analysis (KICA) algorithm depends on the type and parameter values of the kernel function. Therefore, it is of great significance to study how to choose KICA's kernel parameters in order to improve its feature dimension reduction results. In this paper, a fitness function was first established using the idea of the Fisher discriminant function. Then the global optimum of the fitness function was searched by the particle swarm optimization (PSO) algorithm, and a multi-state information dimension reduction algorithm based on PSO-KICA was established. Finally, the ability of this algorithm to enhance the precision of feature dimension reduction was validated.
Based on an improved multi-objective particle swarm optimization (MOPSO) algorithm with principal component analysis (PCA) methodology, an efficient high-dimension multi-objective optimization method is proposed, which aims to improve the convergence of the Pareto front in multi-objective optimization design. The mathematical efficiency, the physical reasonableness, and the reliability of PCA in dealing with redundant objectives are verified by the typical DTLZ5 test function and by multi-objective correlation analysis of a supercritical airfoil, and the proposed method is integrated into an aircraft multi-disciplinary design (AMDEsign) platform, which contains aerodynamics, stealth, and structure weight analysis and optimization modules. The proposed method is then used for the multi-point integrated aerodynamic optimization of a wide-body passenger aircraft, in which the redundant objectives identified by PCA are transformed into optimization constraints, and several design methods are compared. The design results illustrate that the strategy used in this paper is sufficient and that the multi-point design requirements of the passenger aircraft are reached. The visualization level of the non-dominated Pareto set is improved by effectively reducing the dimension without losing the primary features of the problem.
An improved face recognition method is proposed based on principal component analysis (PCA) combined with a genetic algorithm (GA), named genetic-based principal component analysis (GPCA). Initially the eigenspace is created from the eigenvalues and eigenvectors. From this space, the eigenfaces are constructed, and the most relevant eigenfaces are selected using GPCA. With these eigenfaces, the input images are classified based on Euclidean distance. The proposed method was tested on the ORL (Olivetti Research Labs) face database. Experimental results on this database demonstrate the effectiveness of the proposed method: it yields fewer misclassifications for face recognition than previous methods.
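A minimal eigenface baseline in the spirit of the description above: scikit-learn's Olivetti faces (the ORL database) are projected onto the top principal components and classified by Euclidean nearest neighbour. The GA-based selection of the most relevant eigenfaces is omitted, and the component count is an arbitrary assumption.

```python
from sklearn.datasets import fetch_olivetti_faces
from sklearn.model_selection import train_test_split
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier

# The Olivetti faces shipped with scikit-learn are the ORL database mentioned above.
faces = fetch_olivetti_faces()
X_train, X_test, y_train, y_test = train_test_split(
    faces.data, faces.target, test_size=0.25, stratify=faces.target, random_state=0)

# Plain top-k eigenfaces; the GA-based selection of the most relevant eigenfaces is omitted.
pca = PCA(n_components=50, whiten=True).fit(X_train)
Z_train, Z_test = pca.transform(X_train), pca.transform(X_test)

# 1-nearest-neighbour with Euclidean distance, matching the classifier described above.
clf = KNeighborsClassifier(n_neighbors=1, metric="euclidean").fit(Z_train, y_train)
print("recognition accuracy:", clf.score(Z_test, y_test))
```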
This paper presents two novel algorithms for feature extraction: Subpattern Complete Two Dimensional Linear Discriminant Principal Component Analysis (SpC2DLDPCA) and Subpattern Complete Two Dimensional Locality Preserving Principal Component Analysis (SpC2DLPPCA). Compared with their non-subpattern versions and with Subpattern Complete Two Dimensional Principal Component Analysis (SpC2DPCA), the modified SpC2DLDPCA and SpC2DLPPCA algorithms benefit greatly in the following four points: (1) SpC2DLDPCA and SpC2DLPPCA avoid the heavy computational cost that large matrices incur when computing their eigenvalues and eigenvectors. (2) SpC2DLDPCA and SpC2DLPPCA can extract local information for recognition. (3) The idea of subblocks is introduced into Two Dimensional Principal Component Analysis (2DPCA) and Two Dimensional Linear Discriminant Analysis (2DLDA); SpC2DLDPCA combines discriminant analysis with a compression technique with low energy loss. (4) The same idea is also introduced into 2DPCA and Two Dimensional Locality Preserving Projections (2DLPP), so SpC2DLPPCA can preserve the local neighbor graph structure and yield compact feature expressions. Finally, experiments on the CASIA(B) gait database show that SpC2DLDPCA and SpC2DLPPCA achieve higher recognition accuracies than their non-subpattern versions and SpC2DPCA.
The accurate extraction and classification of leather defects is an important guarantee for automation and quality evaluation in the leather industry. Aiming at the problem of classifying leather defect data, a hierarchical classification of defects is proposed. Firstly, samples are collected using the minimum-rectangle method, and defects are extracted by image processing. According to their geometric representation, the defects are divided into dot, line, and surface types for rough classification. By analyzing the geometric, gray-level, and texture data extracted from the defects, the dominant characteristics can be acquired. For each type of defect, different representative characteristics are chosen to reduce the dimension of the data; clustering on these characteristics converges effectively, the defects are extracted accurately, their characteristics are digitized, and a database is finally established. The results show that this method can achieve more than 90% accuracy and greatly improves the accuracy of classification.
This study investigates the use of a decision tree classification model, combined with Principal Component Analysis (PCA), to distinguish between Assam and Bhutan ethnic groups based on specific anthropometric features, including age, height, tail length, hair length, bang length, reach, and earlobe type. The dataset was reduced using PCA, which identified height, reach, and age as the key features contributing to variance. However, while PCA effectively reduced dimensionality, it faced challenges in clearly distinguishing between the two ethnic groups, a limitation noted in previous research. In contrast, the decision tree model performed significantly better, establishing clear decision boundaries and achieving high classification accuracy. The decision tree consistently selected height and reach as the most important classifiers, a finding supported by existing studies on ethnic differences in Northeast India. The results highlight the strengths of combining PCA for dimensionality reduction with decision tree models for classification tasks. While PCA alone was insufficient for optimal class separation, its integration with decision trees improved both the model's accuracy and interpretability. Future research could explore other machine learning models to enhance classification and examine a broader set of anthropometric features for more comprehensive ethnic group classification.
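A small sketch of the PCA-plus-decision-tree pipeline, assuming a synthetic two-group table in place of the Assam/Bhutan anthropometric data; the feature count, tree depth, and number of retained components are illustrative assumptions.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

# Hypothetical anthropometric table: columns stand in for age, height, reach, etc.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 1.0, size=(100, 7)),
               rng.normal(0.7, 1.0, size=(100, 7))])   # two loosely separated groups
y = np.array([0] * 100 + [1] * 100)

# PCA for dimensionality reduction followed by a shallow decision tree for classification.
model = make_pipeline(StandardScaler(), PCA(n_components=3),
                      DecisionTreeClassifier(max_depth=3, random_state=0))
print("CV accuracy:", cross_val_score(model, X, y, cv=5).mean().round(3))
```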
In this paper, a low-dimensional multiple-input and multiple-output (MIMO) model predictive control (MPC) configuration is presented for spatially-distributed systems (SDSs) whose partial differential equation (PDE) models are unknown. First, dimension reduction with principal component analysis (PCA) is used to transform the high-dimensional spatio-temporal data into a low-dimensional time domain. The MPC strategy is then built on online-corrected low-dimensional models, where the state of the system at a previous time is used to correct the output of the low-dimensional models. Sufficient conditions for closed-loop stability are presented and proven. Simulations demonstrate the accuracy and efficiency of the proposed methodologies.
An automated method to optimize the definition of the progress variables in flamelet-based dimension reduction is proposed, and the performance of these optimized progress variables in coupling the flamelets and the flow solver is presented. In the proposed method, the progress variables are defined according to the first two principal components (PCs) from principal component analysis (PCA) or kernel-density-weighted PCA (KEDPCA) of a set of flamelets. These flamelets can then be mapped to the new progress variables instead of the mixture fraction/conventional progress variables, and a new chemistry look-up table is constructed. A priori validation of the optimized progress variables and the new chemistry table is carried out on a CH4/N2/air lift-off flame. The reconstruction of the lift-off flame shows that the optimized progress variables perform better than the conventional ones, especially in the high temperature region. The coefficients of determination (R² statistics) show that KEDPCA performs slightly better than PCA except for some minor species; the main advantage of KEDPCA is that it is less sensitive to the database. Meanwhile, the criteria for the optimization are proposed and discussed, and the constraint that the progress variables should evolve monotonically from fresh gas to burnt gas is analyzed in detail.
Kernel factor analysis (KFA) with varimax rotation was proposed, using a Mercer kernel function that maps the data in the original space to a high-dimensional feature space, and was compared with kernel principal component analysis (KPCA). The results show that the best error rate in handwritten digit recognition obtained by kernel factor analysis with varimax (4.2%) was superior to that of KPCA (4.4%). KFA with varimax therefore recognizes handwritten digits more accurately.
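Since no kernel factor analysis implementation ships with common Python libraries, the sketch below substitutes scikit-learn's linear FactorAnalysis with varimax rotation and compares it against KernelPCA on the digits data; it only mirrors the shape of the comparison above and does not reproduce the paper's kernelized KFA.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import FactorAnalysis, KernelPCA
from sklearn.linear_model import LogisticRegression

X, y = load_digits(return_X_y=True)
X = StandardScaler().fit_transform(X)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Linear factor analysis with varimax rotation stands in for the kernelized KFA of the paper;
# KernelPCA provides the KPCA baseline.  A logistic classifier measures the error rate.
for name, reducer in [("FA+varimax", FactorAnalysis(n_components=30, rotation="varimax")),
                      ("KPCA", KernelPCA(n_components=30, kernel="rbf", gamma=1e-3))]:
    Z_tr = reducer.fit_transform(X_tr)
    Z_te = reducer.transform(X_te)
    err = 1.0 - LogisticRegression(max_iter=2000).fit(Z_tr, y_tr).score(Z_te, y_te)
    print(f"{name}: error rate = {err:.3f}")
```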
The support vector classifier (SVC) has clear advantages for small-sample learning problems with high dimensions, especially better generalization ability. However, there is some redundancy among the high dimensions of the original samples, and the main features of the samples should be picked out first to improve the performance of the SVC. A principal component analysis (PCA) is employed to reduce the feature dimensions of the original samples and pre-select the main features efficiently, and an SVC is constructed in the selected feature space to improve the learning speed and identification rate of the SVC. Furthermore, a heuristic genetic algorithm-based automatic model selection is proposed to determine the hyperparameters of the SVC and evaluate the performance of the learning machines. Experiments performed on the Heart and Adult benchmark data sets demonstrate that the proposed PCA-based SVC not only reduces the test time drastically but also improves the identification rates effectively.
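A hedged sketch of a PCA-based SVC with automatic model selection, replacing the heuristic genetic algorithm by a plain grid search and the Heart/Adult data by a bundled scikit-learn benchmark; the parameter grid is an assumption.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV, train_test_split

# A public benchmark replaces the Heart/Adult data; grid search replaces the heuristic GA
# used in the paper for hyperparameter selection.
X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

pipe = Pipeline([("scale", StandardScaler()), ("pca", PCA()), ("svc", SVC(kernel="rbf"))])
grid = {"pca__n_components": [5, 10, 15],
        "svc__C": [1, 10, 100],
        "svc__gamma": ["scale", 0.01, 0.001]}
search = GridSearchCV(pipe, grid, cv=5).fit(X_tr, y_tr)
print("best params:", search.best_params_)
print("test identification rate:", search.score(X_te, y_te))
```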
Principal component analysis (PCA) is ubiquitous in statistics and machine learning. It is frequently used as an intermediate procedure in various regression and classification problems to reduce the dimensionality of datasets. However, as datasets become extremely large, direct application of PCA may not be feasible, since loading and storing massive datasets may exceed the computational ability of common machines. To address this problem, subsampling is usually performed, in which a small proportion of the data is used as a surrogate of the entire dataset. This paper proposes an A-optimal subsampling algorithm to decrease the computational cost of PCA for super-large datasets. To be more specific, we establish the consistency and asymptotic normality of the eigenvectors of the subsampled covariance matrix. Subsequently, we derive the optimal subsampling probabilities for PCA based on the A-optimality criterion. We validate the theoretical results by conducting extensive simulation studies. Moreover, the proposed subsampling algorithm for PCA is embedded into a classification procedure for handwriting data to assess its effectiveness in real-world applications.
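The sketch below illustrates the generic subsampled-PCA idea on synthetic data: a small weighted subsample is drawn, an inverse-probability-weighted covariance estimate is eigendecomposed, and the recovered subspace is compared with full-data PCA. The row-norm-proportional probabilities are only an illustrative stand-in for the paper's A-optimal probabilities.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
n, d, r = 200_000, 20, 3
X = rng.normal(size=(n, d)) @ rng.normal(size=(d, d))   # correlated synthetic data
Xc = X - X.mean(axis=0)

# Full-data PCA (the expensive reference computation).
V_full = PCA(n_components=r).fit(X).components_

# Subsampled PCA: draw a small weighted subsample and eigendecompose an inverse-probability
# weighted covariance estimate.  Row-norm-proportional probabilities are an illustrative
# stand-in for the A-optimal probabilities derived in the paper.
m = 2_000
p = np.linalg.norm(Xc, axis=1) ** 2
p /= p.sum()
idx = rng.choice(n, size=m, replace=True, p=p)
w = 1.0 / (n * p[idx])                                   # inverse-probability weights
cov = (Xc[idx] * w[:, None]).T @ Xc[idx] / m             # unbiased covariance estimate
eigval, eigvec = np.linalg.eigh(cov)
V_sub = eigvec[:, ::-1][:, :r].T                         # top-r eigenvectors, as rows

# Compare subspaces via principal angles (cosines near 1 mean close agreement).
cosines = np.linalg.svd(V_full @ V_sub.T, compute_uv=False)
print("subspace cosines:", cosines.round(4))
```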
To compensate for the quality defects of traditional control methods and meet the growing accuracy requirements for strip crown, an optimized model based on the support vector machine (SVM) is first put forward to enhance product quality in hot strip rolling. Meanwhile, to enrich the data information and ensure data quality, experimental data were collected from a hot-rolling plant to set up the prediction models, and the prediction performance of the models was evaluated by calculating multiple indicators. Furthermore, the traditional SVM model and combined prediction models using the particle swarm optimization (PSO) algorithm and a principal component analysis combined with cuckoo search (PCA-CS) optimization strategy are presented for comparison, and the prediction performance of the three models is discussed. Finally, the experimental results revealed that the PCA-CS-SVM model has the highest prediction accuracy and the fastest convergence speed: the root mean squared error (RMSE) of the PCA-CS-SVM model is 2.04 μm, and 98.15% of the predictions have an absolute error of less than 4.5 μm. These results prove that the PCA-CS-SVM model not only satisfies the precision requirement but also provides guidance for the actual production of hot strip rolling.
When an electronic nose is used to identify different varieties of distilled liquors, the pattern recognition algorithm is usually chosen on the basis of experience, which lacks a guiding principle. In this research, different brands of distilled spirits were identified using pattern recognition algorithms (principal component analysis and artificial neural networks), and the recognition rates of the different algorithms were compared. The recognition rate of the back propagation neural network (BPNN) is the highest. Owing to its slow convergence speed, however, the BPNN easily gets trapped in a local minimum. A chaotic BPNN was therefore tried to overcome this disadvantage; its convergence speed is 75.5 times faster than that of the BPNN.
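A toy version of the electronic-nose experiment, assuming synthetic 12-channel sensor readings for four liquor brands; a standard scikit-learn MLP (trained by back-propagation) stands in for the BPNN, and the chaotic variant is not reproduced.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPClassifier

# Hypothetical electronic-nose readings: 12 gas-sensor channels, 4 liquor brands.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=c, scale=0.5, size=(60, 12)) for c in range(4)])
y = np.repeat(np.arange(4), 60)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, stratify=y, random_state=0)

scaler = StandardScaler().fit(X_tr)
Z_tr, Z_te = scaler.transform(X_tr), scaler.transform(X_te)

# PCA scores for inspection; a standard back-propagation MLP stands in for the (chaotic) BPNN.
print("variance captured by 2 PCs:",
      PCA(n_components=2).fit(Z_tr).explained_variance_ratio_.sum().round(3))
bpnn = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0).fit(Z_tr, y_tr)
print("BPNN recognition rate:", bpnn.score(Z_te, y_te).round(3))
```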
The eigenface method based on principal component analysis (PCA) has been the standard and popular method in face recognition. This paper presents a PCA-memetic algorithm (PCA-MA) approach for feature selection. PCA has been extended by MAs, where the former is used for feature extraction/dimensionality reduction and the latter is exploited for feature selection. Simulations were performed over the ORL and YaleB face databases using the Euclidean norm as the classifier. It was found that, as far as the recognition rate is concerned, PCA-MA completely outperforms the eigenface method. We compared the performance of PCA extended with a genetic algorithm (PCA-GA) with our proposed PCA-MA method; the results also clearly established the supremacy of the PCA-MA method over the PCA-GA method. We further extended linear discriminant analysis (LDA) and kernel principal component analysis (KPCA) approaches with the MA and observed significant improvement in recognition rate with fewer features. This paper also compares the performance of the PCA-MA, LDA-MA, and KPCA-MA approaches.
Multivariate statistical process monitoring methods are often used in chemical process fault diagnosis. In this article, (I) the cycle temporal algorithm (CTA) is combined with dynamic kernel principal component analysis (DKPCA) and multiway dynamic kernel principal component analysis (MDKPCA) to form fault detection algorithms for continuous and batch processes, respectively. In addition, (II) a fault variable identification model based on the reconstruction-based contribution (RBC) model is proposed, which paves the way for determining the cause of the fault. The proposed fault diagnosis model was applied to the Tennessee Eastman (TE) process and a penicillin fermentation process and compared with other fault diagnosis methods. The results show that the proposed method has better detection performance than the other methods. Finally, the RBC model is used to accurately locate the root cause of the fault and determine the fault path.
The convergence of algorithms used for principal component analysis is analyzed. The algorithms are proved to converge to the eigenvectors and eigenvalues of a matrix A which is the expectation of the observed random samples. The conditions required here are considerably weaker than those used in previous work.
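Oja's learning rule is a representative member of the class of stochastic PCA algorithms that such convergence analyses address; the sketch below shows the weight vector converging to the principal eigenvector of A = E[xx^T] for a synthetic diagonal A. The step-size schedule and iteration count are arbitrary choices, not taken from the paper.

```python
import numpy as np

# Oja's rule as a representative stochastic PCA algorithm: the weight vector converges
# to the principal eigenvector of A = E[x x^T].
rng = np.random.default_rng(0)
A_sqrt = np.diag([3.0, 1.0, 0.5])              # so A = diag(9, 1, 0.25)
w = rng.normal(size=3)
w /= np.linalg.norm(w)

for t in range(1, 20001):
    x = A_sqrt @ rng.normal(size=3)            # random sample with covariance A
    eta = 1.0 / (100 + t)                      # decreasing step size
    y = w @ x
    w += eta * y * (x - y * w)                 # Oja's learning rule
print("learned direction:", np.round(w, 3))    # approaches ±e1, the top eigenvector of A
```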
To address the increasing model complexity caused by numerous input variables and large correlations under variable load conditions, a dynamic modeling method combining a kernel extreme learning machine (KELM) and principal component analysis (PCA) was proposed and applied to the prediction of the nitrogen oxide (NOx) concentration at the outlet of a selective catalytic reduction (SCR) denitrification system. First, PCA is applied to extract feature information from the input data, and the current and previous sequence values of the extracted information are used as inputs to the KELM model to reflect the dynamic characteristics of the NOx concentration at the SCR outlet. Then, the model takes the historical NOx concentration at the SCR outlet as an additional input to improve its accuracy. Finally, an optimization algorithm is used to determine the optimal parameters of the model. Compared with the Gaussian process regression, long short-term memory, and convolutional neural network models, the prediction errors are reduced by approximately 78.4%, 67.6%, and 59.3%, respectively. The results indicate that the proposed dynamic model structure is reliable and can accurately predict NOx concentrations at the outlet of the SCR system.
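A rough sketch of the dynamic PCA-plus-kernel-model structure described above, using synthetic SCR-like data; scikit-learn's KernelRidge stands in for the KELM (both solve a regularized kernel least-squares problem), and the lag length, kernel parameters, and data generator are assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler
from sklearn.kernel_ridge import KernelRidge
from sklearn.metrics import mean_squared_error

# Hypothetical SCR operating data: synthetic inputs and a lag-dependent NOx outlet signal.
rng = np.random.default_rng(0)
T, d = 2000, 10
U = rng.normal(size=(T, d))
nox = np.convolve(U @ rng.normal(size=d), np.ones(5) / 5, mode="same") + 0.1 * rng.normal(size=T)

# Step 1: PCA feature extraction of the raw inputs.
Z = PCA(n_components=3).fit_transform(StandardScaler().fit_transform(U))

# Step 2: build a dynamic regressor from current/previous PCA scores and past NOx values.
lag = 2
rows = [np.hstack([Z[t - lag:t + 1].ravel(), nox[t - lag:t]]) for t in range(lag, T)]
X, y = np.array(rows), nox[lag:]
split = int(0.8 * len(y))

# Kernel ridge regression stands in for the KELM (both are regularized kernel least squares).
model = KernelRidge(kernel="rbf", alpha=0.1, gamma=0.1).fit(X[:split], y[:split])
rmse = mean_squared_error(y[split:], model.predict(X[split:])) ** 0.5
print("test RMSE:", round(rmse, 4))
```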