Owing to their global search capabilities and gradient-free operation, metaheuristic algorithms are widely applied to optimization problems. However, their computational demands become prohibitive when tackling high-dimensional optimization challenges. To address these challenges, this study introduces cooperative metaheuristics integrating dynamic dimension reduction (DR). Building upon particle swarm optimization (PSO) and differential evolution (DE), the proposed cooperative methods C-PSO and C-DE are developed. In the proposed methods, a modified principal component analysis (PCA) is used to reduce the dimension of the design variables, thereby decreasing computational costs. The dynamic DR strategy re-executes the modified PCA periodically, after a fixed number of iterations, so that the important dimensions are identified dynamically. Compared with a static strategy, the dynamic DR strategy identifies the important dimensions more precisely, enabling accelerated convergence toward optimal solutions. Furthermore, the influence of cumulative contribution rate thresholds on optimization problems of different dimensions is investigated. The standard metaheuristics (PSO, DE) and the cooperative metaheuristics (C-PSO, C-DE) are examined on 15 benchmark functions and two engineering design problems (a speed reducer and a composite pressure vessel). Comparative results demonstrate that the cooperative methods significantly outperform the standard methods in both solution accuracy and computational efficiency, reducing computational cost by at least 40%. The cooperative metaheuristics can be effectively applied to both high-dimensional unconstrained and constrained optimization problems.
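The abstract does not spell out the modified PCA, but the cumulative-contribution-rate selection it tunes can be illustrated with standard PCA on the swarm's current positions. A minimal numpy sketch under that assumption; the `select_dimensions` helper is hypothetical:

```python
import numpy as np

def select_dimensions(population, threshold=0.95):
    """Rank principal directions of a population by variance and keep the
    leading ones whose cumulative contribution rate reaches the threshold.
    `population` is an (n_individuals, n_dims) array."""
    centered = population - population.mean(axis=0)
    cov = np.cov(centered, rowvar=False)          # sample covariance
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]             # largest variance first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    ratio = np.cumsum(eigvals) / eigvals.sum()    # cumulative contribution rate
    k = int(np.searchsorted(ratio, threshold)) + 1
    return eigvecs[:, :k]                         # projection basis

# Re-run periodically (e.g. every fixed number of iterations) to track the swarm:
rng = np.random.default_rng(0)
swarm = rng.normal(size=(60, 30))
basis = select_dimensions(swarm, threshold=0.95)
reduced = (swarm - swarm.mean(axis=0)) @ basis    # low-dimensional search space
print(basis.shape, reduced.shape)
```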
In the past decade, financial institutions have invested significant effort in the development of accurate analytical credit scoring models. The evidence suggests that even small improvements in the accuracy of existing credit scoring models may optimize profits while effectively managing risk exposure. Despite continuing efforts, the majority of existing credit scoring models still include judgment-based assumptions that are sometimes supported by significant findings of previous studies but are not validated using the institution's internal data. We argue that current studies on the development of credit scoring models have largely ignored recent developments in statistical methods for sufficient dimension reduction. To contribute to the field of financial innovation, this study proposes a Dimension Reduction Assisted Credit Scoring (DRA-CS) method via distance covariance-based sufficient dimension reduction (DCOV-SDR) solved with a Majorization-Minimization (MM) algorithm. First, in the presence of a large number of variables, the DRA-CS method achieves greater dimension reduction and better prediction accuracy than other dimension reduction methods. Second, when the DRA-CS method is employed with logistic regression, it outperforms existing methods based on different variable selection techniques. This study argues that the DRA-CS method should be used by financial institutions as a financial innovation tool to analyze high-dimensional customer datasets and improve the accuracy of existing credit scoring methods.
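The DCOV-SDR machinery and its MM solver are beyond a short snippet, but the sample distance covariance statistic that underlies them is compact. A sketch of the Szekely-style double-centering computation, on synthetic data rather than credit data:

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

def _dcenter(v):
    """Double-centered pairwise distance matrix of a sample."""
    d = squareform(pdist(v.reshape(len(v), -1)))
    return d - d.mean(axis=0) - d.mean(axis=1)[:, None] + d.mean()

def distance_covariance(x, y):
    """Sample distance covariance: average product of the two
    double-centered distance matrices (V-statistic version)."""
    A, B = _dcenter(x), _dcenter(y)
    return np.sqrt(max(np.mean(A * B), 0.0))

rng = np.random.default_rng(1)
x = rng.normal(size=200)
y = x**2 + 0.1 * rng.normal(size=200)                  # nonlinear dependence
print(distance_covariance(x, y))                        # clearly positive
print(distance_covariance(x, rng.normal(size=200)))     # near zero
```

Unlike Pearson correlation, this statistic is zero (in population) only under independence, which is what makes it attractive for sufficient dimension reduction.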
In aerodynamic optimization, global optimization methods such as genetic algorithms are preferred in many cases because of their ability to reach the global optimum. However, for complex problems that require a large number of design variables, the computational cost becomes prohibitive, so new global optimization strategies are required. To address this need, a data dimensionality reduction method is combined with global optimization methods, forming a new global optimization system that aims to improve the efficiency of conventional global optimization. The new optimization system applies Proper Orthogonal Decomposition (POD) to reduce the dimensionality of the design space while maintaining the generality of the original design space. In addition, an acceleration approach for sample calculation in surrogate modeling is applied to reduce the computational time while providing sufficient accuracy. Optimizations of the transonic airfoil RAE2822 and the transonic wing ONERA M6 are performed to demonstrate the effectiveness of the proposed system. In the two cases, the number of design variables is reduced from 20 to 10 and from 42 to 20, respectively. The new design optimization system converges faster, reaching a better design in one third of the total time of traditional optimization, thus significantly reducing the overall optimization time and improving the efficiency of conventional global design optimization.
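As a rough illustration of the POD step only (not the paper's actual implementation), the modes can be extracted from the SVD of a centered snapshot matrix; the snapshot data below is a random stand-in for sampled geometry perturbations:

```python
import numpy as np

# Snapshot matrix: each column is one sampled design,
# e.g. 42 original design variables observed over 100 samples.
rng = np.random.default_rng(0)
snapshots = rng.normal(size=(42, 100))
mean = snapshots.mean(axis=1, keepdims=True)

# POD modes are the left singular vectors of the centered snapshots
U, s, _ = np.linalg.svd(snapshots - mean, full_matrices=False)
energy = np.cumsum(s**2) / np.sum(s**2)
k = int(np.searchsorted(energy, 0.99)) + 1    # modes capturing 99% energy
modes = U[:, :k]

# Any design is now parameterized by k << 42 modal coefficients:
coeffs = rng.normal(size=k)
design = mean.ravel() + modes @ coeffs
print(k, design.shape)
```

The global optimizer then searches over the k modal coefficients instead of the full variable set, which is how the 42-to-20 reduction in the wing case would be realized.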
Big data refers to vast amounts of structured and unstructured data that must be dealt with on a regular basis. Dimensionality reduction is the process of converting a huge data set into one with fewer dimensions so that the same information can be expressed compactly. These techniques are frequently used to improve classification or regression performance in machine learning. To achieve dimensionality reduction for huge data sets, this paper offers a hybrid particle swarm optimization-rough set (PSO-RS) method and a Mayfly algorithm-rough set (MA-RS) method. In particular, a novel hybrid strategy based on the Mayfly algorithm (MA) and rough sets (RS) is proposed. The performance of the hybrid MA-RS algorithm is evaluated on six data sets from the literature. The simulation results and comparison with common reduction methods demonstrate the proposed MA-RS algorithm's capacity to handle a wide range of data sets. Finally, the rough set approach, as well as the hybrid optimization techniques PSO-RS and MA-RS, are applied to the massive data problem. According to the experimental results and statistical tests, the hybrid MA-RS method beats other classic dimensionality reduction techniques.
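The swarm side of such hybrids is typically a binary metaheuristic over feature masks. A minimal binary PSO sketch; the `fitness` function here is a toy stand-in, not the paper's rough-set dependency degree:

```python
import numpy as np

rng = np.random.default_rng(0)
n_feat, n_particles, iters = 20, 15, 50

def fitness(mask):
    """Toy stand-in for a rough-set dependency score: rewards the
    hypothetical 'relevant' features 0-4 and penalizes subset size."""
    return mask[:5].sum() - 0.1 * mask.sum()

# Binary PSO: positions are 0/1 masks, velocities pass through a sigmoid
pos = rng.integers(0, 2, size=(n_particles, n_feat))
vel = rng.normal(scale=0.1, size=(n_particles, n_feat))
pbest, pbest_val = pos.copy(), np.array([fitness(p) for p in pos])
gbest = pbest[pbest_val.argmax()].copy()

for _ in range(iters):
    r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
    vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
    pos = (rng.random(pos.shape) < 1 / (1 + np.exp(-vel))).astype(int)
    vals = np.array([fitness(p) for p in pos])
    improved = vals > pbest_val
    pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
    gbest = pbest[pbest_val.argmax()].copy()

print("selected features:", np.flatnonzero(gbest))
```

The Mayfly variant replaces the velocity update with the MA's male/female attraction dynamics, but the mask encoding and fitness evaluation are the same.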
Purpose: This study sought to review the characteristics, strengths, weaknesses, variants, application areas and data types of the various dimension reduction (DR) techniques. Methodology: The databases most commonly used to search for papers were ScienceDirect, Scopus, Google Scholar, IEEE Xplore and Mendeley. An integrative review was used for the study, in which 341 papers were reviewed. Results: The linear techniques considered were Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), Singular Value Decomposition (SVD), Latent Semantic Analysis (LSA), Locality Preserving Projections (LPP), Independent Component Analysis (ICA) and Projection Pursuit (PP). The non-linear techniques, developed for applications with complex non-linear structures, were Kernel Principal Component Analysis (KPCA), Multi-dimensional Scaling (MDS), Isomap, Locally Linear Embedding (LLE), Self-Organizing Map (SOM), Learning Vector Quantization (LVQ), t-distributed Stochastic Neighbor Embedding (t-SNE) and Uniform Manifold Approximation and Projection (UMAP). DR techniques can further be categorized into supervised, unsupervised and, more recently, semi-supervised learning methods. The supervised versions are LDA and LVQ; all the other techniques are unsupervised. Supervised variants of PCA, LPP, KPCA and MDS have been developed, supervised and semi-supervised variants of PP and t-SNE have been developed, and a semi-supervised version of LDA has been developed. Conclusion: The various application areas, strengths, weaknesses and variants of the DR techniques were explored, as were the different data types to which they have been applied.
Advanced engineering systems, like aircraft, are defined by tens or even hundreds of design variables. Building an accurate surrogate model for use in such high-dimensional optimization problems is a difficult task owing to the curse of dimensionality. This paper presents a new algorithm to reduce the size of a design space to a smaller region of interest, allowing a more accurate surrogate model to be generated. The framework requires a set of models of different physical or numerical fidelities. The low-fidelity (LF) model provides a physics-based approximation of the high-fidelity (HF) model at a fraction of the computational cost. It is also instrumental in identifying the small region of interest in the design space that encloses the high-fidelity optimum. A surrogate model is then constructed to match the low-fidelity model to the high-fidelity model in the identified region of interest. The optimization process is managed by an update strategy to prevent convergence to false optima. The algorithm is applied to mathematical problems and to a two-dimensional aerodynamic shape optimization problem in a variable-fidelity context. Results obtained are in excellent agreement with high-fidelity results, even with lower-fidelity flow solvers, while showing up to 39% time savings.
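A compressed sketch of the idea, with toy analytic functions standing in for the HF and LF solvers and a simple linear correction in place of the paper's surrogate; everything below is illustrative, not the published algorithm:

```python
import numpy as np
from scipy.optimize import minimize

def hf(x):   # hypothetical "high-fidelity" model: expensive, exact
    return (x[0] - 1.0)**2 + (x[1] - 2.0)**2 + 0.3 * np.sin(5 * x[0])

def lf(x):   # hypothetical "low-fidelity" model: cheap, biased
    return (x[0] - 1.2)**2 + (x[1] - 1.8)**2

# Step 1: the LF optimum locates a small region of interest (ROI)
lf_opt = minimize(lf, x0=np.zeros(2)).x
half = 0.5
lo, hi = lf_opt - half, lf_opt + half

# Step 2: fit an additive correction d(x) = hf(x) - lf(x) from a few
# HF evaluations inside the ROI (linear least squares)
rng = np.random.default_rng(0)
X = rng.uniform(lo, hi, size=(8, 2))
d = np.array([hf(x) - lf(x) for x in X])
A = np.hstack([np.ones((len(X), 1)), X])
coef, *_ = np.linalg.lstsq(A, d, rcond=None)

# Step 3: optimize the corrected surrogate only inside the ROI
corrected = lambda x: lf(x) + coef[0] + coef[1:] @ x
res = minimize(corrected, lf_opt, bounds=list(zip(lo, hi)))
print("corrected optimum:", res.x, "HF value:", hf(res.x))
```

In the full framework, steps 2-3 repeat with a trust-region-style update of the ROI to guard against convergence to false optima.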
To effectively identify and remove abnormal data from measured wind turbine data, an anomaly identification algorithm based on manifold learning is proposed, built on an analysis of the high-dimensional characteristics of measured wind turbine data. First, a k-nearest-neighbor mutual information algorithm is used to select the wind turbine feature variables. Next, an optimized t-distributed stochastic neighbor embedding (t-SNE) algorithm, in which the inter-sample distance metric is replaced by a weighted sum of the Euclidean metric and a local principal component analysis (LPCA) dissimilarity, is used to extract the intrinsic low-dimensional features hidden in the high-dimensional manifold data, so that data with different distribution characteristics are clearly separated in a two-dimensional visualization space. Finally, the density-based spatial clustering of applications with noise (DBSCAN) algorithm is used to cluster the data in the two-dimensional space. The results show that, compared with the principal component analysis (PCA) algorithm, the locally linear embedding (LLE) algorithm and the original t-SNE algorithm, the proposed method can visually separate and cluster data from various complex operating conditions and can identify and remove abnormal data.
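A rough sklearn sketch of the embed-then-cluster pipeline, using plain Euclidean t-SNE rather than the paper's LPCA-weighted metric, on synthetic stand-in data:

```python
import numpy as np
from sklearn.manifold import TSNE
from sklearn.cluster import DBSCAN
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for turbine measurements: normal operation plus anomalies
rng = np.random.default_rng(0)
normal = rng.normal(0, 1, size=(400, 8))
anomalies = rng.normal(4, 0.5, size=(20, 8))
X = StandardScaler().fit_transform(np.vstack([normal, anomalies]))

# 2-D embedding (standard t-SNE; the paper swaps in its own distance metric)
emb = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)

# DBSCAN in the embedded space: label -1 marks noise/abnormal points
labels = DBSCAN(eps=3.0, min_samples=10).fit_predict(emb)
print("flagged as abnormal:", np.sum(labels == -1))
```

The `eps` and `min_samples` values are illustrative and would be tuned per data set; points DBSCAN labels as noise are the candidates for removal.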
This study presents an autoencoder-embedded optimization (AEO) algorithm which involves a bi-population cooperative strategy for medium-scale expensive problems (MEPs). A huge search space can be compressed to an informative low-dimensional space by using an autoencoder as a dimension reduction tool. The search operation conducted in this low-dimensional space helps the population converge quickly towards the optima. To strike a balance between exploration and exploitation during optimization, two phases of a tailored teaching-learning-based optimization (TTLBO) are adopted to coevolve solutions in a distributed fashion, wherein one is assisted by an autoencoder and the other undergoes a regular evolutionary process. In addition, a dynamic size adjustment scheme according to problem dimension and evolutionary progress is proposed to promote information exchange between these two phases and accelerate convergence. The proposed algorithm is validated on benchmark functions with dimensions varying from 50 to 200. As indicated by our experiments, TTLBO is suitable for dealing with medium-scale problems and is thus incorporated into the AEO framework as a base optimizer. Compared with state-of-the-art algorithms for MEPs, AEO shows extraordinarily high efficiency on these challenging problems, thus opening new directions for various evolutionary algorithms under AEO to tackle MEPs and greatly advancing the field of medium-scale computationally expensive optimization.
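The exact architecture and training schedule of AEO's autoencoder are not given in the abstract; a generic PyTorch sketch of the compress-search-decode idea, with assumed layer sizes and random stand-in training data:

```python
import torch
import torch.nn as nn

dim, latent = 100, 10   # compress a 100-D search space to 10-D

encoder = nn.Sequential(nn.Linear(dim, 32), nn.Tanh(), nn.Linear(32, latent))
decoder = nn.Sequential(nn.Linear(latent, 32), nn.Tanh(), nn.Linear(32, dim))
opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()),
                       lr=1e-3)
loss_fn = nn.MSELoss()

# Train on the archive of already-evaluated solutions (random stand-in here)
archive = torch.randn(256, dim)
for _ in range(200):
    opt.zero_grad()
    recon = decoder(encoder(archive))
    loss = loss_fn(recon, archive)
    loss.backward()
    opt.step()

# New candidates can be generated in the 10-D latent space and decoded
# back to the original 100-D space for (expensive) evaluation:
z = torch.randn(5, latent)
candidates = decoder(z).detach()
print(candidates.shape)
```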
This paper focuses on the unsupervised detection of the Higgs boson particle using the most informative features and variables of the "Higgs machine learning challenge 2014" data set. The unsupervised detection proceeds in four steps: (1) selection of the most informative features from the data; (2) determination of the number of clusters based on the elbow criterion (the experimental results showed that the optimal number of clusters for grouping the data in an unsupervised manner is 2); (3) proposal of a new approach for hybridizing hard and fuzzy clustering tuned with Ant Lion Optimization (ALO); (4) comparison with existing metaheuristic optimizations such as the Genetic Algorithm (GA) and Particle Swarm Optimization (PSO). Through a multi-angle analysis based on cluster validation indices, the confusion matrix, efficiency and purity rates, average cost variation, computational time and Sammon mapping visualization, the results highlight the effectiveness of the improved Gustafson-Kessel algorithm optimized with ALO (ALOGK) and validate the proposed approach. Although the paper gives a complete clustering analysis, its novel contributions concern only Steps (1) and (3) above. The first contribution lies in the method used for Step (1) to select the most informative features and variables: the t-statistic technique is used to rank them, a feature mapping is then applied using a Self-Organizing Map (SOM) to identify the level of correlation between them, and Particle Swarm Optimization (PSO), a metaheuristic optimization technique, is used to reduce the data set dimension. The second contribution concerns the third step, where each of the clustering algorithms K-means (KM), Global K-means (GlobalKM), Partitioning Around Medoids (PAM), Fuzzy C-means (FCM), Gustafson-Kessel (GK) and Gath-Geva (GG) is optimized and tuned with ALO.
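The t-statistic ranking in Step (1) is straightforward to sketch; here synthetic two-class data stands in for the signal/background labels of the challenge set:

```python
import numpy as np
from scipy.stats import ttest_ind

# Rank features by the absolute two-sample t-statistic between classes
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 30))
y = rng.integers(0, 2, size=1000)
X[y == 1, :5] += 1.0                       # make 5 features informative

t_stat, _ = ttest_ind(X[y == 1], X[y == 0], axis=0)
ranking = np.argsort(-np.abs(t_stat))      # most informative first
print("top features:", ranking[:5])
```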
The compressive sensing (CS) theory allows signals to be acquired at sampling rates much lower than that required by the sampling theorem. Because the theory is based on the assumption that the locations of the sparse values are unknown, it faces many constraints in practical applications. In fact, in many cases such as image processing, the locations of the sparse values are knowable, and CS can degrade to a linear process. In order to take full advantage of the visual information in images, this paper proposes the concept of a dimensionality reduction transform matrix and selects sparse values by constructing an accuracy control matrix. On this basis, a degradation algorithm is designed in which the signal can be obtained from as many measurements as there are sparse values and reconstructed through a linear process. In comparison with similar methods, the degradation algorithm is effective in reducing the number of sensors and improving operational efficiency. The algorithm is also used to carry out the CS process with the same amount of data as Joint Photographic Experts Group (JPEG) compression and achieves the same display quality.
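The degradation to a linear process is easy to demonstrate: with the k sparse locations known, k measurements determine the signal exactly. A numpy sketch with a hypothetical Gaussian measurement matrix (the paper's transform and accuracy control matrices are not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 256, 12                    # signal length, number of sparse values
support = rng.choice(n, size=k, replace=False)    # known locations
x = np.zeros(n)
x[support] = rng.normal(size=k)

# With k known locations, k measurements suffice: y = Phi @ x reduces
# to a k x k linear system on the columns of Phi at the support.
Phi = rng.normal(size=(k, n))     # measurement matrix, one row per sensor
y = Phi @ x

coeffs = np.linalg.solve(Phi[:, support], y)      # plain linear recovery
x_hat = np.zeros(n)
x_hat[support] = coeffs
print("max reconstruction error:", np.max(np.abs(x_hat - x)))
```

Contrast this with standard CS, where unknown support forces roughly O(k log(n/k)) measurements and a nonlinear solver such as basis pursuit.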
In dealing with high-dimensional data, such as global climate models, facial data analysis and human gene distributions, the problem of dimensionality reduction is often encountered: finding the low-dimensional structure hidden in high-dimensional data. Nonlinear dimensionality reduction facilitates the discovery of the intrinsic structure and relevance of the data and can make high-dimensional data visible in low dimensions. The isometric mapping algorithm (Isomap) is an important algorithm for nonlinear dimensionality reduction, which originates from the traditional dimensionality reduction algorithm MDS. The MDS algorithm preserves the distances between samples in the original space in the lower-dimensional space, where the distance used is the Euclidean distance. The Isomap algorithm discards the Euclidean distance and instead computes the shortest paths between samples with the Floyd algorithm to approximate the geodesic distance along the manifold surface. Compared with previous nonlinear dimensionality reduction algorithms, the Isomap algorithm can effectively compute a globally optimal solution and ensures that the data manifold converges to the real structure asymptotically.
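Isomap's three steps (neighborhood graph, Floyd shortest paths, classical MDS) can be written out directly. A compact sketch, assuming the k-NN graph is connected (otherwise some geodesic distances come back infinite):

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph
from scipy.sparse.csgraph import floyd_warshall

def isomap(X, n_neighbors=8, n_components=2):
    # 1. k-NN graph with Euclidean edge weights
    graph = kneighbors_graph(X, n_neighbors, mode='distance')
    # 2. geodesic distances = all-pairs shortest paths (Floyd algorithm)
    D = floyd_warshall(graph, directed=False)
    # 3. classical MDS on the geodesic distance matrix
    n = len(X)
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D**2) @ J                    # double centering
    eigvals, eigvecs = np.linalg.eigh(B)
    idx = np.argsort(eigvals)[::-1][:n_components]
    return eigvecs[:, idx] * np.sqrt(eigvals[idx])

# Swiss-roll-like toy data
rng = np.random.default_rng(0)
t = 3 * np.pi * (1 + 2 * rng.random(300)) / 2
X = np.column_stack([t * np.cos(t), 10 * rng.random(300), t * np.sin(t)])
print(isomap(X).shape)    # (300, 2)
```

Replacing step 2's geodesic matrix with plain Euclidean distances recovers classical MDS, which makes the relationship between the two algorithms explicit.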
The general corrosion and local corrosion of Q235 steel were tested with an acoustic emission (AE) detection system in a 6% FeCl3·6H2O solution in order to effectively extract the corrosion acoustic emission signal from complex background noise. The short-time fractal dimension and discrete fractional cosine transform methods are combined to reduce noise. The input SNR ranges from 0 to 15 dB while the corrosion acoustic emission signals are contaminated with white noise, color noise and pink noise, respectively. The results show that the output signal-to-noise ratio is improved by up to 8 dB compared with the discrete cosine transform and the discrete fractional cosine transform. This noise reduction method is significant for the identification of corrosion-induced acoustic emission signals and the evaluation of the remaining life of the metal.
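The paper's pipeline combines the short-time fractal dimension with the discrete fractional cosine transform; as a simplified illustration of transform-domain thresholding only, here is a standard DCT hard-thresholding denoiser on a synthetic burst signal:

```python
import numpy as np
from scipy.fft import dct, idct

# Burst-like toy "AE signal" buried in white noise
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 1024)
clean = np.exp(-80 * (t - 0.3)**2) * np.sin(2 * np.pi * 90 * t)
noisy = clean + 0.3 * rng.normal(size=t.size)

# Hard-threshold small DCT coefficients, then invert
coeffs = dct(noisy, norm='ortho')
coeffs[np.abs(coeffs) < 0.5] = 0.0
denoised = idct(coeffs, norm='ortho')

def snr(sig, ref):
    return 10 * np.log10(np.sum(ref**2) / np.sum((sig - ref)**2))

print(f"SNR in: {snr(noisy, clean):.1f} dB, out: {snr(denoised, clean):.1f} dB")
```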
The computational burden in model-based predictive control algorithms is heavy when solving the QP optimization within a limited sampling step, especially for complicated systems of large dimension. A fast algorithm is proposed in this paper to solve this problem, in which real-time values are modulated to bit streams to simplify the multiplication. In addition, the manipulated variables in the prediction horizon are approximately reduced to the current control horizon by a recursive relation, decreasing the dimension of the QP optimization. The simulation results demonstrate the feasibility of this fast algorithm for MIMO systems.
The availability of many variables with predictive power makes their selection in a regression context difficult. This study considers robust and understandable low-dimensional estimators as building blocks and improves overall predictive power by optimally combining these building blocks. Our new algorithm is based on generalized cross-validation and builds a predictive model step by step, from a simple mean to more complex predictive combinations. Empirical applications to annual financial returns and actuarial telematics data show its usefulness in the financial and insurance industries.
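The stepwise combination algorithm itself is not reproduced here; the sketch below only shows the generalized cross-validation criterion it builds on, used in a standard way to pick a ridge penalty:

```python
import numpy as np

def gcv_score(X, y, lam):
    """Generalized cross-validation score for ridge regression:
    GCV = n * RSS / (n - trace(H))^2, with hat matrix
    H = X (X'X + lam I)^{-1} X'."""
    n, p = X.shape
    H = X @ np.linalg.solve(X.T @ X + lam * np.eye(p), X.T)
    resid = y - H @ y
    return n * np.sum(resid**2) / (n - np.trace(H))**2

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
y = X[:, 0] - 2 * X[:, 1] + rng.normal(size=100)

lams = np.logspace(-3, 3, 25)
best = lams[np.argmin([gcv_score(X, y, l) for l in lams])]
print("lambda chosen by GCV:", best)
```

GCV approximates leave-one-out error at the cost of a single fit per candidate model, which is what makes a step-by-step build-up of combinations affordable.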
As modern weapons and equipment undergo increasing levels of informatization, intelligence and networking, the topology and traffic characteristics of battlefield data networks built with tactical data links are becoming progressively complex. In this paper, we employ a traffic matrix to model the tactical data link network. We propose a method that utilizes the Maximum Variance Unfolding (MVU) algorithm to conduct nonlinear dimensionality reduction analysis on high-dimensional open network traffic matrix datasets. This approach introduces novel ideas and methods for future applications, including traffic prediction and anomaly analysis in real battlefield network environments.
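MVU itself is a semidefinite program: maximize the total variance of an embedding Gram matrix while preserving local distances. A minimal cvxpy sketch on a small synthetic stand-in (the SDP scales poorly, and an SDP-capable solver such as SCS must be installed):

```python
import numpy as np
import cvxpy as cp
from sklearn.neighbors import kneighbors_graph

# Tiny synthetic "traffic matrix" rows as high-dimensional points
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 6))
n = len(X)

# Neighborhood graph whose local distances MVU must preserve
W = kneighbors_graph(X, n_neighbors=4, mode='distance').toarray()

K = cp.Variable((n, n), PSD=True)          # Gram matrix of the embedding
constraints = [cp.sum(K) == 0]             # center the embedding
for i in range(n):
    for j in range(n):
        if W[i, j] > 0:                    # preserve neighbor distances
            constraints.append(K[i, i] + K[j, j] - 2 * K[i, j] == W[i, j]**2)

# Maximize total variance -> "unfold" the manifold
cp.Problem(cp.Maximize(cp.trace(K)), constraints).solve()

# Low-dimensional coordinates from the top eigenvectors of K
eigvals, eigvecs = np.linalg.eigh(K.value)
embedding = eigvecs[:, -2:] * np.sqrt(np.maximum(eigvals[-2:], 0))
print(embedding.shape)    # (20, 2)
```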