Abstract: This paper studies the problem of tensor principal component analysis (PCA). Tensor PCA is usually treated as a low-rank matrix completion problem solved by matrix factorization, with the nuclear norm serving as a convex approximation of the rank operator under mild conditions. However, most nuclear norm minimization approaches rely on SVD operations. For an m × n matrix, the time complexity of the SVD is O(mn²), which is prohibitive for large-scale problems. This paper proposes an efficient and scalable algorithm for tensor PCA, called the Linearized Alternating Direction Method with Vectorized technique for Tensor Principal Component Analysis (LADMVTPCA). Unlike traditional matrix factorization methods, LADMVTPCA uses the vectorized technique to formulate the tensor as an outer product of vectors, which greatly improves computational efficiency compared with matrix factorization. In the experiments, synthetic tensor data of different orders are used to evaluate LADMVTPCA empirically. The results show that LADMVTPCA outperforms the matrix-factorization-based method.
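To make the phrase "outer product of vectors" concrete, the following is a minimal NumPy sketch of a rank-one tensor built from three vectors. It is only an illustration of the representation, not the LADMVTPCA algorithm; the helper name `rank_one_tensor` and the example vectors are assumptions introduced here.

```python
import numpy as np

def rank_one_tensor(vectors):
    """Outer product of 1-D arrays: T[i, j, k, ...] = a[i] * b[j] * c[k] * ..."""
    T = np.asarray(vectors[0], dtype=float)
    for v in vectors[1:]:
        # Each outer product appends one more mode (axis) to the tensor.
        T = np.multiply.outer(T, np.asarray(v, dtype=float))
    return T

# Hypothetical example vectors of lengths 4, 5 and 6.
a = np.arange(1.0, 5.0)
b = np.arange(1.0, 6.0)
c = np.arange(1.0, 7.0)

T = rank_one_tensor([a, b, c])      # third-order tensor of shape (4, 5, 6)
unfolding = T.reshape(a.size, -1)   # mode-1 unfolding, shape (4, 30)

# The factor vectors hold 4 + 5 + 6 numbers; the full tensor holds 4 * 5 * 6.
n_factor_entries = a.size + b.size + c.size
n_tensor_entries = T.size
```

Storing and updating the factor vectors instead of the full tensor is what makes a vector-based formulation attractive: the number of stored values grows with the sum of the mode sizes rather than their product.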
Funding: Supported by the Zhejiang Provincial Natural Science Foundation of China (LY19F030003), the Key Research and Development Project of Zhejiang Province (2021C04030), the National Natural Science Foundation of China (62003306), and the Educational Commission Research Program of Zhejiang Province (Y202044842).
Abstract: In practical process industries, a variety of online and offline sensors and measuring instruments are used for process control and monitoring, so measurements from different sources are collected at different sampling rates. To build a complete process monitoring strategy, all of these multi-rate measurements should be considered in data-based modeling and monitoring. In this paper, a novel kernel multi-rate probabilistic principal component analysis (K-MPPCA) model is proposed to extract the nonlinear correlations among measurements with different sampling rates. The model parameters are calibrated using the kernel trick and the expectation-maximization (EM) algorithm. Corresponding fault detection methods based on the nonlinear features are also developed. Finally, a simulated nonlinear case and an actual pre-decarburization unit in an ammonia synthesis process are tested to demonstrate the efficiency of the proposed method.
Abstract: Principal Component Analysis (PCA) is one of the most important feature extraction methods, and Kernel Principal Component Analysis (KPCA) is a nonlinear extension of PCA based on kernel methods. In the real world, an input sample may not belong fully to one class and may partially belong to other classes. Based on the theory of fuzzy sets, this paper presents Fuzzy Principal Component Analysis (FPCA) and its nonlinear extension, Kernel-based Fuzzy Principal Component Analysis (KFPCA). The experimental results indicate that the proposed algorithms perform well.
Funding: Supported by the National Natural Science Foundation of China (51275348).
Abstract: Recovering the low-rank structure of a data matrix corrupted by sparse errors is the problem addressed by principal component pursuit (PCP). This paper studies the higher-order generalization of matrix recovery, named higher-order principal component pursuit (HOPCP), which is critical in multi-way data analysis. Unlike the convexification (nuclear norm) of the matrix rank function, the tensorial nuclear norm is still an open problem. Because existing preliminary work on tensor completion provides a viable way to obtain a low-complexity estimate of a tensor, the paper focuses on tensors of low multi-linear rank and adopts the corresponding convex relaxation to formulate a convex optimization model of HOPCP. The paper further proposes two algorithms for HOPCP based on an alternating minimization scheme: the augmented Lagrangian alternating direction method (ALADM) and its truncated higher-order singular value decomposition version (ALADM-THOSVD). The former obtains a high-accuracy solution, while the latter handles computationally intractable problems more efficiently. Experimental results on both synthetic data and real magnetic resonance imaging data show the applicability of the algorithms to high-dimensional tensor data processing.
Abstract: To investigate the degree of eutrophication of Yuqiao Reservoir, a hybrid method combining principal component regression (PCR) and an artificial neural network (ANN) was adopted to predict the chlorophyll-a concentration of the reservoir's outflow. The data were obtained from two sampling sites: site 1 in the reservoir and site 2 near the dam. Seven water variables, namely chlorophyll-a concentration of site 2 at time t and that of both sites 10 days before t, total phosphorus (TP), total nitrogen (TN),...
Funding: The Doctoral Program of Higher Education of China (No. 20120032110034).
Abstract: Robust principal component analysis (PCA) is widely used in many applications, such as image processing, data mining and bioinformatics. Existing methods for solving robust PCA are mostly based on nuclear norm minimization. Those methods minimize all the singular values simultaneously, so the rank cannot be well approximated in practice. We extend the idea of truncated nuclear norm regularization (TNNR) to robust PCA and consider truncated nuclear norm minimization (TNNM) instead of nuclear norm minimization (NNM). This method minimizes only the smallest N-r singular values to preserve the low-rank components, where N is the number of singular values and r is the matrix rank. Moreover, we propose an effective way to determine r via the shrinkage operator. We then develop an effective iterative algorithm based on the alternating direction method to solve this optimization problem. Experimental results demonstrate the efficiency and accuracy of the TNNM method. In addition, the method is much more robust in terms of the rank of the reconstructed matrix and the sparsity of the error.
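The two quantities named in this abstract can be sketched in a few lines of NumPy. The truncated nuclear norm follows the definition given above (the sum of the smallest N-r singular values). The `shrink` function is the standard elementwise soft-thresholding operator, which is assumed here to be what "shrinkage operator" refers to; the abstract does not specify how it is used to determine r, so that step is not shown. Function names and test data are illustrative.

```python
import numpy as np

def truncated_nuclear_norm(X, r):
    """Sum of the singular values of X, excluding the r largest ones."""
    s = np.linalg.svd(X, compute_uv=False)  # returned in descending order
    return float(s[r:].sum())

def shrink(X, tau):
    """Elementwise soft-thresholding: move each entry toward zero by tau."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

# Hypothetical rank-2 matrix: its truncated nuclear norm with r = 2 is ~0,
# whereas the ordinary nuclear norm (r = 0) is strictly positive.
rng = np.random.default_rng(0)
L = rng.standard_normal((20, 2)) @ rng.standard_normal((2, 15))

tnn_r2 = truncated_nuclear_norm(L, 2)
nuclear = truncated_nuclear_norm(L, 0)
shrunk = shrink(np.array([-3.0, 0.5, 2.0]), 1.0)
```

The example illustrates the motivation stated in the abstract: a penalty on only the trailing singular values leaves the leading r components unpenalized, while the full nuclear norm penalizes them as well.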
Abstract: After many years of research, seismologists in China have proposed about 80 earthquake prediction factors that reflect precursor information of earthquakes. How to condense the information carried by these 80 factors, and how to choose the main factors in order to predict earthquakes precisely, has become a topic in seismology. The principal component-discrimination model consists of principal component analysis, correlation analysis, the weighted method of principal factor coefficients, and Mahalanobis distance discrimination analysis. The model combines maximization of the information in the earthquake prediction factors with the weighted method of principal factor coefficients and correlation analysis to choose the prediction variables, and then applies Mahalanobis distance discrimination to establish the earthquake prediction discrimination model. The model was applied to earthquake data from the Northern China area and obtained good prediction results.
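As an illustration of the final step mentioned in this abstract, here is a minimal sketch of Mahalanobis distance discrimination: a sample is assigned to the group whose mean is closest in Mahalanobis distance. The use of a pooled covariance matrix, the function names, and the synthetic two-group data are all assumptions made for the example; the abstract does not give the details of the model actually used.

```python
import numpy as np

def mahalanobis_sq(x, mean, cov_inv):
    """Squared Mahalanobis distance between x and a group mean."""
    d = np.asarray(x, dtype=float) - mean
    return float(d @ cov_inv @ d)

def classify(x, groups):
    """Assign x to the nearest group; groups maps a label to an (n_i, p) array."""
    means = {k: g.mean(axis=0) for k, g in groups.items()}
    n_total = sum(len(g) for g in groups.values())
    # Pooled within-group covariance (assumes the groups share one covariance).
    pooled = sum((len(g) - 1) * np.cov(g, rowvar=False) for g in groups.values())
    pooled = pooled / (n_total - len(groups))
    cov_inv = np.linalg.inv(pooled)
    return min(means, key=lambda k: mahalanobis_sq(x, means[k], cov_inv))

# Hypothetical training data: two well-separated groups in two dimensions.
rng = np.random.default_rng(1)
groups = {
    "a": rng.standard_normal((50, 2)),
    "b": rng.standard_normal((50, 2)) + np.array([5.0, 5.0]),
}

label_near_b = classify([4.8, 5.1], groups)
label_near_a = classify([0.1, -0.2], groups)
```

In a pipeline like the one described, the inputs to such a classifier would be the selected prediction variables or principal component scores rather than raw factors.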
Funding: Funded by the 973 Program of the Ministry of National Defense of China (Grant No. 613237).
Abstract: This paper proposes a design optimization method for the multi-objective orbit design of earth observation satellites, in which orbit performance indices with different units, such as total coverage time, frequency of coverage, average time per coverage and maximum coverage gap, must be optimized simultaneously. An index normalization method first converts the performance indices into dimensionless variables in the range [0, 1]. On this basis, a design optimization method based on principal component analysis and cluster analysis is proposed, consisting of index normalization, principal component analysis, multiple-level cluster analysis and a weighted evaluation method. The orbit optimization results for earth observation satellites show that the optimal orbit can be obtained with the proposed method. Principal component analysis reduces the total number of indices that are not independent of one another, which saves computing time. Similarly, multiple-level cluster analysis with parallel computing can save computing time.
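The first two steps described in this abstract, normalizing indices to [0, 1] and then using PCA to find how many independent directions they span, can be sketched as follows. Min-max scaling is assumed as the normalization (the abstract only states the target range), and the function names and synthetic indices are illustrative rather than taken from the paper.

```python
import numpy as np

def min_max_normalize(X):
    """Scale each column (index) of X to the range [0, 1]."""
    lo = X.min(axis=0)
    hi = X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero for constant columns
    return (X - lo) / span

def explained_variance_ratio(X):
    """Fraction of total variance carried by each principal component."""
    Xc = X - X.mean(axis=0)
    s = np.linalg.svd(Xc, compute_uv=False)
    var = s ** 2
    return var / var.sum()

# Hypothetical performance indices with different units; columns 2 and 3 are
# affine functions of columns 0 and 1, so only two directions are independent.
rng = np.random.default_rng(0)
base = rng.uniform(size=(30, 2))
X = np.column_stack([
    100.0 * base[:, 0],
    0.5 * base[:, 1] + 3.0,
    200.0 * base[:, 0] + 7.0,
    10.0 * base[:, 1],
])

Xn = min_max_normalize(X)
ratios = explained_variance_ratio(Xn)
```

Here the first two entries of `ratios` account for essentially all of the variance, which is the sense in which PCA can reduce the number of non-independent indices before the clustering and evaluation steps.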
Abstract: With the rapid growth of the international banking industry, bank failures can lead to severe economic losses and social impacts. Although existing measures to address such failures are well developed, timely prediction can significantly mitigate these effects. This study analyzes key indicators influencing bank failure through data analysis and correlation analysis, then develops a neural-network-based risk prediction model to estimate failure probabilities. First, we extracted 64 indicators from the dataset, identified the most relevant indicators using the entropy weight method, and established a bank efficiency evaluation formula to determine the failure threshold. Next, we applied principal component analysis (PCA) to reduce dimensionality and derive a comprehensive scoring formula. Based on these findings, we constructed a machine learning model in MATLAB to predict bank failures. Finally, the model was used to predict the failure probabilities of all banks and to identify 20 representative existing and failed banks. The developed models effectively predict bank failure risks and demonstrate strong applicability across different scenarios.
Funding: Supported by the State Key Laboratory of Nuclear Physics and Technology, Peking University (Grant No. NPT2023KFY02), the China Postdoctoral Science Foundation (Grant No. 2021M700256), the National Key R&D Program of China (Grant No. 2018YFA0404400), the National Natural Science Foundation of China (Grant Nos. 11935003, 11975031, 12141501, and 12070131001), and the High-performance Computing Platform of Peking University.
Abstract: Principal component analysis (PCA) is employed for the first time to extract the principal components (PCs) present in nuclear mass models. The effects from different nuclear mass models are reintegrated and reorganized in the extracted PCs. These PCs are recombined to build new mass models, which achieve better accuracy than the original theoretical mass models. This comparison indicates that, with the PCA approach, the effects contained in different mass models can be combined to improve nuclear mass predictions.
Funding: Supported by the National Natural Science Foundation of China (NSFC) (51477111) and the National Key Research and Development Program of China (2016YFB0901104).
Abstract: With the integration of renewable energy and the use of non-linear loads in power systems, the power quality problem is attracting increasing attention from researchers. In China, standards are set for individual power quality indexes. However, when power quality is evaluated in practice, individual indexes cannot directly reflect the overall level of power quality. In this paper, a comprehensive analysis of various indexes is conducted to obtain a unified parameter that describes the characteristics of power quality from an overall perspective. First, weight values of the power quality indexes are calculated by combining subjective and objective weights. Then, based on the principal components of the projection method, projection values of the boundary data and of the data to be evaluated are obtained. Finally, using these projection values, a grade range for the power quality data is located. A practical case study is presented to show the validity of the proposed method for evaluating power quality.