Funding: Climbing Peak Discipline Project of Shanghai Dianji University, China (No. 15DFXK02); Hi-Tech Research and Development Programs of China (No. 2007AA041600).
Abstract: Dimensionality reduction techniques play an important role in data mining. Kernel entropy component analysis (KECA) is a recently developed method for data transformation and dimensionality reduction. This paper presents a comparative study of KECA and five other dimensionality reduction methods: principal component analysis (PCA), kernel PCA (KPCA), locally linear embedding (LLE), Laplacian eigenmaps (LAE), and diffusion maps (DM). Three quality assessment criteria, the local continuity meta-criterion (LCMC), the trustworthiness and continuity measure (T&C), and the mean relative rank error (MRRE), are applied as direct performance indexes to assess these methods. In addition, clustering accuracy is used as an indirect performance index to evaluate the quality of the representative data obtained by these methods. The comparisons are performed on six datasets, and the results are analyzed by the Friedman test with the corresponding post-hoc tests. The results indicate that KECA performs excellently on both the quality assessment criteria and clustering accuracy.
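To make the trustworthiness and continuity (T&C) criterion named in this abstract concrete, the following is a minimal NumPy sketch of the standard rank-based definition. It is not the paper's code: the function names `rank_matrix`, `trustworthiness`, and `continuity`, the neighborhood size `k`, and the random test data are all assumptions chosen for illustration, and the normalization assumes `k < n / 2`.

```python
import numpy as np

def rank_matrix(points):
    """ranks[i, j] = position of point j when all points are sorted by
    Euclidean distance from point i (point i itself gets rank 0)."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, -1.0)          # force each point to rank first for itself
    order = np.argsort(dist, axis=1)
    n = len(points)
    ranks = np.empty_like(order)
    ranks[np.arange(n)[:, None], order] = np.arange(n)[None, :]
    return ranks

def trustworthiness(high, low, k=5):
    """Penalize points that are k-nearest neighbors in the embedding `low`
    but not in the original space `high` (valid for k < n / 2)."""
    n = len(high)
    r_high = rank_matrix(high)
    r_low = rank_matrix(low)
    intruders = (r_low >= 1) & (r_low <= k) & (r_high > k)
    penalty = (r_high[intruders] - k).sum()
    return 1.0 - 2.0 * penalty / (n * k * (2 * n - 3 * k - 1))

def continuity(high, low, k=5):
    """Continuity is trustworthiness with the roles of the two spaces swapped."""
    return trustworthiness(low, high, k)

# Hypothetical data: 40 points in 3-D, "embedded" by keeping one coordinate.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))
Y = X[:, :1]
print(trustworthiness(X, Y), continuity(X, Y))
```

Both scores equal 1 when every neighborhood is preserved, so an embedding that is identical to the input is a convenient sanity check.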
Funding: National Natural Science Foundation of China (Nos. 12402142, 11832013 and 11572134); Natural Science Foundation of Hubei Province (No. 2024AFB235); Hubei Provincial Department of Education Science and Technology Research Project (No. Q20221714); Opening Foundation of Hubei Key Laboratory of Digital Textile Equipment (Nos. DTL2023019 and DTL2022012).
Abstract: Owing to their global search capabilities and gradient-free operation, metaheuristic algorithms are applied to a wide range of optimization problems. However, their computational demands become prohibitive for high-dimensional optimization problems. To address this challenge, this study introduces cooperative metaheuristics that integrate dynamic dimension reduction (DR). Building upon particle swarm optimization (PSO) and differential evolution (DE), the cooperative methods C-PSO and C-DE are developed. In the proposed methods, a modified principal component analysis (PCA) is used to reduce the dimension of the design variables, thereby decreasing computational cost. The dynamic DR strategy re-executes the modified PCA after a fixed number of iterations, so that the important dimensions are identified dynamically. Compared with the static strategy, the dynamic DR strategy identifies the important dimensions more precisely and thereby accelerates convergence toward optimal solutions. Furthermore, the influence of the cumulative contribution rate threshold on optimization problems of different dimensions is investigated. The standard metaheuristic algorithms (PSO, DE) and the cooperative metaheuristics (C-PSO, C-DE) are tested on 15 benchmark functions and two engineering design problems (a speed reducer and a composite pressure vessel). Comparative results demonstrate that the cooperative methods significantly outperform the standard methods in both solution accuracy and computational efficiency, reducing computational cost by at least 40%. The cooperative metaheuristics can be used effectively for both unconstrained and constrained high-dimensional optimization problems.
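The role of the cumulative contribution rate threshold mentioned in this abstract can be illustrated with a short NumPy sketch. It uses ordinary PCA, not the paper's modified PCA (whose details are not given here), and the function name `pca_reduce`, the threshold value, and the synthetic "population" are assumptions made for illustration only.

```python
import numpy as np

def pca_reduce(samples, threshold=0.9):
    """Project `samples` (rows = candidate solutions) onto the fewest
    principal components whose cumulative contribution rate reaches
    `threshold`. Plain PCA; the paper's modification is not reproduced."""
    mean = samples.mean(axis=0)
    centered = samples - mean
    # Eigen-decomposition of the sample covariance matrix.
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]              # sort by decreasing variance
    eigvals = np.clip(eigvals[order], 0.0, None)   # remove tiny negative round-off
    eigvecs = eigvecs[:, order]
    ratio = np.cumsum(eigvals) / eigvals.sum()     # cumulative contribution rate
    k = min(int(np.searchsorted(ratio, threshold)) + 1, len(eigvals))
    basis = eigvecs[:, :k]
    return centered @ basis, basis, mean, k

# Hypothetical population: 200 candidates in 10-D that really vary in about 2 directions.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
population = latent @ rng.normal(size=(2, 10)) + 0.01 * rng.normal(size=(200, 10))
reduced, basis, mean, k = pca_reduce(population, threshold=0.95)
print(k, reduced.shape)
```

In a dynamic scheme of the kind the abstract describes, a call like this would be repeated on the current population every fixed number of iterations, and a reduced candidate `z` would be mapped back to the design space as `mean + z @ basis.T`.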
Funding: National Natural Science Foundation of China (No. 11402288).
Abstract: An efficient high-dimensional multi-objective optimization method is proposed that combines an improved multi-objective particle swarm optimization (MOPSO) algorithm with principal component analysis (PCA). Its purpose is to improve the convergence of the Pareto front in multi-objective optimization design. The mathematical efficiency, physical reasonableness, and reliability of PCA in dealing with redundant objectives are verified on the typical DTLZ5 test function and on a multi-objective correlation analysis of a supercritical airfoil. The proposed method is integrated into the aircraft multi-disciplinary design (AMDEsign) platform, which contains aerodynamics, stealth, and structure weight analysis and optimization modules. The method is then applied to the multi-point integrated aerodynamic optimization of a wide-body passenger aircraft, in which the redundant objectives identified by PCA are transformed into optimization constraints, and several design methods are compared. The design results illustrate that the strategy used in this paper is sufficient and that the multi-point design requirements of the passenger aircraft are met. The visualization of the non-dominated Pareto set is improved by effectively reducing the dimension without losing the primary features of the problem.
Funding: National Science Foundation of China (Grant Nos. 61201370, 61100103); Independent Innovation Foundation of Shandong University (Grant No. 2012DX07).
Abstract: This paper presents two novel algorithms for feature extraction: Subpattern Complete Two Dimensional Linear Discriminant Principal Component Analysis (SpC2DLDPCA) and Subpattern Complete Two Dimensional Locality Preserving Principal Component Analysis (SpC2DLPPCA). Compared with their non-subpattern versions and with Subpattern Complete Two Dimensional Principal Component Analysis (SpC2DPCA), SpC2DLDPCA and SpC2DLPPCA offer four benefits: (1) They avoid the long computation time that larger matrices require for computing eigenvalues and eigenvectors. (2) They can extract local information for recognition. (3) The subblock idea is introduced into Two Dimensional Principal Component Analysis (2DPCA) and Two Dimensional Linear Discriminant Analysis (2DLDA), so SpC2DLDPCA combines discriminant analysis with a compression technique that has low energy loss. (4) The subblock idea is also introduced into 2DPCA and Two Dimensional Locality Preserving Projections (2DLPP), so SpC2DLPPCA can preserve the local neighbor graph structure and yield compact feature expressions. Finally, experiments on the CASIA(B) gait database show that SpC2DLDPCA and SpC2DLPPCA achieve higher recognition accuracy than their non-subpattern versions and SpC2DPCA.
Abstract: This study investigates the use of a decision tree classification model, combined with Principal Component Analysis (PCA), to distinguish between Assam and Bhutan ethnic groups based on specific anthropometric features, including age, height, tail length, hair length, bang length, reach, and earlobe type. The dataset was reduced using PCA, which identified height, reach, and age as key features contributing to variance. However, while PCA effectively reduced dimensionality, it faced challenges in clearly distinguishing between the two ethnic groups, a limitation noted in previous research. In contrast, the decision tree model performed significantly better, establishing clear decision boundaries and achieving high classification accuracy. The decision tree consistently selected height and reach as the most important classifiers, a finding supported by existing studies on ethnic differences in Northeast India. The results highlight the strengths of combining PCA for dimensionality reduction with decision tree models for classification tasks. While PCA alone was insufficient for optimal class separation, its integration with decision trees improved both the model’s accuracy and interpretability. Future research could explore other machine learning models to enhance classification and examine a broader set of anthropometric features for more comprehensive ethnic group classification.
Funding: National Natural Science Foundation of China (Nos. 50936005, 51576182, and 11172296).
Abstract: An automated method to optimize the definition of the progress variables in flamelet-based dimension reduction is proposed, and the performance of these optimized progress variables in coupling the flamelets with the flow solver is presented. In the proposed method, the progress variables are defined according to the first two principal components (PCs) from the principal component analysis (PCA) or kernel-density-weighted PCA (KEDPCA) of a set of flamelets. The flamelets can then be mapped to these new progress variables instead of the mixture fraction and conventional progress variables, and a new chemistry look-up table is constructed. An a priori validation of the optimized progress variables and the new chemistry table is carried out for a CH4/N2/air lift-off flame. The reconstruction of the lift-off flame shows that the optimized progress variables perform better than the conventional ones, especially in the high-temperature region. The coefficients of determination (R² statistics) show that KEDPCA performs slightly better than PCA except for some minor species. The main advantage of KEDPCA is that it is less sensitive to the database. The criteria for the optimization are also proposed and discussed, and the constraint that the progress variables should evolve monotonically from fresh gas to burnt gas is analyzed in detail.
Funding: National High Technology Research and Development Program of China (863 Program) (No. 2009AA04Z162); National Natural Science Foundation of China (Nos. 60825302, 60934007, 61074061); Program of Shanghai Subject Chief Scientist; "Shu Guang" project supported by Shanghai Municipal Education Commission and Shanghai Education Development Foundation; Key Project of Shanghai Science and Technology Commission, China (No. 10JC1403400).
Abstract: In this paper, a low-dimensional multiple-input and multiple-output (MIMO) model predictive control (MPC) configuration is presented for spatially distributed systems (SDSs) whose governing partial differential equations (PDEs) are unknown. First, dimension reduction with principal component analysis (PCA) is used to transform the high-dimensional spatio-temporal data into a low-dimensional time domain. The MPC strategy is then built on low-dimensional models with online correction, in which the state of the system at a previous time is used to correct the output of the low-dimensional models. Sufficient conditions for closed-loop stability are presented and proven. Simulations demonstrate the accuracy and efficiency of the proposed methodologies.
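The first step described in this abstract, turning spatio-temporal data into a few time-domain signals with PCA, can be sketched as follows. This is only an illustration under stated assumptions: the function name `reduce_snapshots`, the number of retained modes, and the synthetic two-mode field are hypothetical, and the low-dimensional model identification and MPC design from the paper are not reproduced.

```python
import numpy as np

def reduce_snapshots(snapshots, n_modes):
    """PCA of a snapshot matrix of shape (n_space, n_time).

    Returns spatial modes (n_space, n_modes), temporal coefficients
    (n_modes, n_time), and the time-averaged field (n_space, 1)."""
    mean = snapshots.mean(axis=1, keepdims=True)
    centered = snapshots - mean
    # Left singular vectors of the centered data are the principal spatial modes.
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    modes = u[:, :n_modes]
    coeffs = modes.T @ centered      # low-dimensional time-domain signals
    return modes, coeffs, mean

# Hypothetical field built from two spatial shapes with different time behavior.
x = np.linspace(0.0, 1.0, 50)
t = np.linspace(0.0, 2.0, 30)
field = (np.outer(np.sin(np.pi * x), np.cos(t))
         + 0.5 * np.outer(np.sin(2 * np.pi * x), np.sin(3 * t)))

modes, coeffs, mean = reduce_snapshots(field, n_modes=2)
reconstruction = mean + modes @ coeffs
print(coeffs.shape, np.abs(reconstruction - field).max())
```

A controller would then work with the rows of `coeffs` (here two signals instead of 50 spatial measurements) and map predictions back to the spatial domain with `mean + modes @ coeffs`.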
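Abstract: Dimensionality reduction is important for data visualization and preprocessing. Principal component analysis (PCA), one of the most widely used unsupervised dimensionality reduction algorithms, is sensitive to noise and outliers in practical applications. To address this problem, researchers have proposed a variety of robust PCA algorithms that reduce the influence of outliers by reducing the reconstruction error over all samples. However, these algorithms ignore the intrinsic local structure of the data, so essential structural information is lost. This impairs the accurate identification and removal of noise and outliers and, in turn, the performance of subsequent algorithms. This paper therefore proposes the Robust Principal Component Analysis Based on Soft Mean Filtering (RPCA-SMF) algorithm. RPCA-SMF adopts the idea of soft mean filtering and proceeds in two steps: it handles noise before model learning and also introduces a noise-handling mechanism after model learning. Specifically, RPCA-SMF first draws on mean filtering, soft-weighting each sample by comparing the deviations of the sample and of its local neighbors from the local mean, and thereby identifies noise. It then uses the "discriminative knowledge" about noise obtained in the first step to process the noisy information. Because mean filtering effectively preserves the overall contour information of the data, for samples identified as noise RPCA-SMF emphasizes retaining their low-frequency overall contour information rather than the high-frequency noise information. This effectively preserves the useful information in the data and improves the retention of the data's overall structural features, giving the algorithm strong robustness and good generalization.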