Abstract: We propose a novel framework for learning a low-dimensional representation of data based on nonlinear dynamical systems, which we call dynamical dimension reduction (DDR). In the DDR model, each point is evolved via a nonlinear flow towards a lower-dimensional subspace; the projection onto the subspace gives the low-dimensional embedding. Training the model involves identifying the nonlinear flow and the subspace. Following the equation discovery method, we represent the vector field that defines the flow as a linear combination of dictionary elements, where each element is a pre-specified linear or nonlinear candidate function. A regularization term for the average total kinetic energy is also introduced, motivated by optimal transport theory. We prove that the resulting optimization problem is well-posed and establish several properties of the DDR method. We also show how the DDR method can be trained using gradient-based optimization, where the gradients are computed using the adjoint method from optimal control theory. The DDR method is implemented and compared with other dimension reduction methods, including PCA, t-SNE, and UMAP, on synthetic and example data sets.
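To make the forward pass described in this abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a hypothetical dictionary of linear and cubic terms, a fixed coefficient matrix `W` (which the paper would learn with adjoint-based gradients), a forward Euler integrator, and a subspace given by the columns of `P`. The function names `dictionary` and `ddr_flow` are illustrative only.

```python
import numpy as np

def dictionary(Z):
    """Hypothetical candidate functions: linear and cubic terms of each coordinate."""
    return np.concatenate([Z, Z**3], axis=1)

def ddr_flow(X, W, P, T=1.0, steps=100):
    """Evolve points along dz/dt = dictionary(z) @ W, then project onto span(P).

    Returns the flowed points, the embedding, and the time-integrated
    average kinetic energy (the quantity the abstract regularizes).
    """
    dt = T / steps
    Z = X.astype(float).copy()
    kinetic = 0.0
    for _ in range(steps):
        V = dictionary(Z) @ W                      # vector field at current points
        kinetic += dt * np.mean(np.sum(V**2, axis=1))
        Z = Z + dt * V                             # forward Euler step (assumed integrator)
    return Z, Z @ P, kinetic

# Illustrative setup: 2-D points, a flow that contracts the second coordinate.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 2))
W = np.zeros((4, 2))
W[1, 1] = -5.0                                     # dz2/dt = -5 * z2
P = np.array([[1.0], [0.0]])                       # target subspace: first axis
Z, embedding, kinetic = ddr_flow(X, W, P)
```

In this toy configuration the flow leaves the first coordinate untouched and drives the second towards zero, so the embedding is simply the first coordinate. A trained model would instead optimize `W` and `P` jointly.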
Abstract: Dimension reduction is defined as the process of projecting high-dimensional data onto a much lower-dimensional space. Dimension reduction methods are variously applied in regression, classification, feature analysis, and visualization. In this paper, we review in detail the latest versions of the methods that have been extensively developed in the past decade.
Funding: Supported by the China Postdoctoral Science Foundation (No. 2024M764171); the Postdoctoral Research Start-up Funds, China (No. AUGA5710027424); the National Natural Science Foundation of China (No. U2341237); and the Development and Construction Funds for the School of Mechatronics Engineering of HIT, China (No. CBQQ8880103624).
Abstract: Gas turbine rotors are complex dynamic systems with high-dimensional, discrete, and multi-source nonlinear coupling characteristics. Solving for their dynamic characteristics consumes significant resources and time. It is therefore necessary to design a low-dimensional model that reflects the dynamic characteristics of the high-dimensional system well. To build such a model, this study developed a dimensionality reduction method that considers the global order energy distribution by modifying proper orthogonal decomposition theory. First, a sensitivity analysis of the key dimensionality reduction parameters with respect to the energy distribution was conducted. Then, a high-dimensional rotor-bearing system with nonlinear stiffness and oil film force was reduced, and the accuracy and reusability of the low-dimensional model under different operating conditions were examined. Finally, the response results of a multi-disk rotor-bearing test bench were reduced using the proposed method, and the spectrum results were compared experimentally. Numerical and experimental results demonstrate that, during the dimensionality reduction process, the solution period of the dynamic response results has the most significant influence on the accuracy of energy preservation. The transient signal in the transformation matrix mainly affects the high-order energy distribution of the rotor system. The larger the proportion of steady-state signals, the more the energy accumulates towards lower orders. The low-dimensional rotor model reflects the frequency response characteristics of the original high-dimensional system with an accuracy of up to 98%. The proposed dimensionality reduction method exhibits significant application potential in the dynamic analysis of high-dimensional systems coupled with strong nonlinearities under variable operating conditions.
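The energy-based truncation that this abstract builds on can be illustrated with standard proper orthogonal decomposition. The sketch below is plain snapshot POD via the SVD, not the modified method proposed in the paper. The 0.99 energy threshold, the synthetic 22-DOF response, and the function name `pod_reduce` are assumptions chosen for illustration.

```python
import numpy as np

def pod_reduce(snapshots, energy=0.99):
    """Standard POD: keep the fewest modes whose cumulative energy reaches `energy`.

    snapshots: array of shape (n_dof, n_time), one column per time sample.
    """
    U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
    frac = np.cumsum(s**2) / np.sum(s**2)          # cumulative energy per mode order
    r = min(int(np.searchsorted(frac, energy)) + 1, len(s))
    basis = U[:, :r]                               # transformation (mode) matrix
    reduced = basis.T @ snapshots                  # low-dimensional coordinates
    return basis, reduced, frac

# Synthetic response: two harmonic components spread over 22 DOFs, plus small noise.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 200)
shapes = rng.standard_normal((22, 2))
signals = np.vstack([np.sin(2 * np.pi * 5 * t), 0.5 * np.cos(2 * np.pi * 12 * t)])
X = shapes @ signals + 1e-3 * rng.standard_normal((22, 200))

basis, reduced, frac = pod_reduce(X)
rel_error = np.linalg.norm(X - basis @ reduced) / np.linalg.norm(X)
```

Because the retained energy fraction bounds the squared relative reconstruction error, inspecting `frac` shows how quickly energy accumulates in the low-order modes, which is the quantity the abstract relates to the share of steady-state signal.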
Abstract: Principal component analysis and generalized low rank approximation of matrices are two different dimensionality reduction methods. Both algorithms are applied to the L1-CSVM model based on the augmented Lagrangian method to explore how the running time and accuracy of the model vary in the reduced-dimension space. The results show that the improved algorithm can greatly reduce the running time and improve the accuracy of the algorithm.
Abstract: Data with large dimensions bring various problems to the application of data envelopment analysis (DEA). In this study, we focus on a "big data" problem related to the considerably large dimensions of the input-output data. The four most widely used approaches to guide dimension reduction in DEA are compared via Monte Carlo simulation: principal component analysis (PCA-DEA), which is based on the idea of aggregating inputs and outputs, and efficiency contribution measurement (ECM), average efficiency measure (AEC), and regression-based detection (RB), which are based on the idea of variable selection. We compare the performance of these methods under different scenarios and with a brand-new comparison benchmark for the simulation test. In addition, we discuss the effect of initial variable selection in RB for the first time. Based on the results, we offer more reliable guidelines on how to choose an appropriate method.
Abstract: The development of artificial intelligence (AI) technologies creates a great opportunity for advancing railway monitoring. This paper proposes a comprehensive method for railway utility pole detection. The framework consists of two parts: point cloud preprocessing and railway utility pole detection. By using mobile LiDAR (laser radar) acquisition devices to obtain point cloud data, the method overcomes the challenges that affect 2D images and videos: dynamic environment adaptability, reliance on lighting conditions, sensitivity to weather and environmental conditions, and visual occlusion. Owing to factors such as the acquisition equipment and environmental conditions, the point cloud data contain a significant amount of noise, which affects subsequent detection tasks. We designed a Dual-Region Adaptive Point Cloud Preprocessing method, which divides the railway point cloud data into track and non-track regions. The track region undergoes projection dimensionality reduction; the projected results are unique and are subsequently subjected to 2D density clustering, greatly reducing the computation volume. The non-track region undergoes PCA-based dimensionality reduction and clustering to achieve preprocessing of large-scale point cloud scenes. Finally, the preprocessed results are used for training, achieving higher accuracy in utility pole detection and data communication. Experimental results show that the proposed preprocessing method not only improves efficiency but also enhances detection accuracy.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 51490662) and the National Science Fund for Distinguished Young Scholars (Grant No. 51725502).
Abstract: In this paper, a class of electromagnetic field frequency domain reliability problems is first defined. Frequency domain reliability refers to the probability that an electromagnetic performance indicator meets the intended requirements within a specific frequency band, considering the uncertainty of structural parameters and frequency-variant electromagnetic parameters. A frequency domain reliability analysis method based on the univariate dimension reduction method is then proposed, which provides an effective calculation tool for electromagnetic frequency domain reliability. In electromagnetic problems, performance indicators usually vary with frequency. The method first discretizes the frequency-variant performance indicator function into a series of functions at discrete frequency points, and then transforms the frequency domain reliability problem into a series system reliability problem over these functions. Second, the univariate dimension reduction method is introduced to solve for the probability distribution functions and correlation coefficients of the discrete frequency point functions in the system. Finally, from these results, the series system reliability is solved to obtain the frequency domain reliability, and the cumulative distribution function of the performance indicator can also be obtained. Monte Carlo simulation is adopted to demonstrate the validity of the frequency domain reliability analysis method. Three examples are investigated to demonstrate the accuracy and efficiency of the proposed method.
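The univariate dimension reduction step named in this abstract approximates a multivariate function by a sum of one-dimensional functions, g(X) ≈ Σ_i g(μ_1, …, X_i, …, μ_n) − (n − 1) g(μ), so that statistical moments need only one-dimensional integrals. The sketch below estimates just the mean of a performance function in this way. It assumes independent normal inputs and Gauss-Hermite quadrature, and the function `udr_mean` and the example function `g_example` are hypothetical; it does not reproduce the paper's reliability analysis.

```python
import numpy as np

def udr_mean(g, mu, sigma, n_nodes=5):
    """Estimate E[g(X)] by univariate dimension reduction.

    Assumes X_i ~ Normal(mu_i, sigma_i), mutually independent.
    """
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    # Nodes and weights for the weight function exp(-z^2 / 2).
    z, w = np.polynomial.hermite_e.hermegauss(n_nodes)
    w = w / np.sqrt(2.0 * np.pi)                   # normalize to a standard normal density
    n = mu.size
    total = -(n - 1) * g(mu)                       # correction term at the mean point
    for i in range(n):
        for zk, wk in zip(z, w):
            x = mu.copy()
            x[i] = mu[i] + sigma[i] * zk           # vary one input, hold the others at their means
            total += wk * g(x)
    return total

# Illustrative additive function, for which the decomposition is exact:
# E[x0^2 + 3*x1] = mu0^2 + sigma0^2 + 3*mu1.
def g_example(x):
    return x[0] ** 2 + 3.0 * x[1]

approx = udr_mean(g_example, mu=[1.0, 2.0], sigma=[0.5, 0.3])
exact = 1.0**2 + 0.5**2 + 3.0 * 2.0
```

The cost grows linearly with the number of inputs (`n * n_nodes` function evaluations plus one), which is what makes the approach attractive compared with Monte Carlo sampling; for functions with strong interaction terms the approximation is no longer exact.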
Funding: Sponsored by the National Basic Research Program of China (Grant No. 2015CB057400).
Abstract: The transient proper orthogonal decomposition (TPOD) method is used to study the dynamic behaviors of reduced rotor-bearing models, and the fault-free model is compared with models with a looseness fault. A rotor model with 22 degrees of freedom (DOFs) supported by bearings is established. Pedestal looseness at one end and at both ends of the liquid-film bearings is studied by analyzing the time history and frequency-spectrum curves. The effects of the initial displacement and velocity values on the frequency components of the original systems and on the dimension reduction efficiency are discussed. Moreover, the effects of varying the initial conditions on the efficiency of the TPOD method are studied. The reduced models can provide guidance, from the perspectives of theory and numerical simplification, for discussing the characteristics of the pedestal looseness fault.
Funding: Supported by the National Natural Science Foundation of China (60772066).
Abstract: This paper addresses the feature-selection problem in training AdaBoost classifiers. A working feature subset is generated by a novel feature subset selection method based on partial least squares (PLS) regression, and training and selection in boosting are then carried out on this feature subset. The experiments show that the proposed PLS-based feature-selection method outperforms the current feature ranking method and the random sampling method.
Funding: Supported by the Research Grants Council of Hong Kong under Grant No. 17301214; HKU CERG Grants; the Fundamental Research Funds for the Central Universities; the Research Funds of Renmin University of China; the Hung Hing Ying Physical Research Grant; and the Natural Science Foundation of China under Grant No. 11271144.
Abstract: Driven by the challenge of integrating large amounts of experimental data, classification has emerged as one of the major and popular tools in computational biology and bioinformatics research. Machine learning methods, especially kernel methods with Support Vector Machines (SVMs), are very popular and effective tools. From the perspective of the kernel matrix, a technique called Eigen-matrix translation has been introduced for protein data classification. The Eigen-matrix translation strategy has many useful properties that deserve more exploration. This paper investigates the major role of Eigen-matrix translation in classification. The authors propose that its importance lies in the dimension reduction of predictor attributes within the data set, which is very important when the dimension of the features is huge. The authors show by numerical experiments on real biological data sets that the proposed framework is crucial and effective in improving classification accuracy. This can therefore serve as a novel perspective for future research on dimension reduction problems.