Funding: This paper was supported by the National Natural Science Foundation of China (61772286, 61802208, and 61876089), China Postdoctoral Science Foundation Grant 2019M651923, and the Natural Science Foundation of Jiangsu Province of China (BK0191381).
Abstract: Cross-project defect prediction (CPDP) aims to predict defects in a target project by using a prediction model built on source projects. The main problem in CPDP is the large distribution gap between the source project and the target project, which prevents the prediction model from performing well. Most existing methods overlook the class discrimination of the learned features, so finding a model that transfers effectively from the source project to the target project remains challenging. In this paper, we propose an unsupervised domain adaptation approach for CPDP based on discriminative subspace learning (DSL). DSL treats the data from the two projects as coming from two domains and maps the data into a common feature space. It employs cross-domain alignment with discriminative information from the different projects to reduce the distribution difference between projects while incorporating class-discriminative information. Specifically, DSL first uses subspace-learning-based domain adaptation to reduce the distribution gap between projects. Then, it makes full use of the class label information of the source project and transfers the discrimination ability of the source project to the target project in the common space. Comprehensive experiments on five projects verify that DSL can build an effective prediction model and improves performance over related competing methods by at least 7.10% and 11.08% in terms of G-measure and AUC.
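As a rough illustration of the first step described in this abstract (mapping source and target data into a common subspace), the sketch below uses classic PCA-based subspace alignment. This is a swapped-in, generic technique, not the DSL method itself: the abstract does not give DSL's objective, and the discriminative (label-based) term is omitted. The function names `pca_basis` and `subspace_align` and the random toy data are assumptions made for the example.

```python
import numpy as np

def pca_basis(X, d):
    """Top-d principal directions of X (rows are samples), shape (features, d)."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the right singular vectors, i.e. the principal directions.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:d].T

def subspace_align(Xs, Xt, d):
    """Project source and target data into a shared d-dimensional space.

    Generic subspace alignment (an assumption, not the paper's DSL):
    the source basis is rotated toward the target basis by M = Ps^T Pt.
    """
    Ps = pca_basis(Xs, d)
    Pt = pca_basis(Xt, d)
    M = Ps.T @ Pt                           # alignment matrix, shape (d, d)
    Zs = (Xs - Xs.mean(axis=0)) @ Ps @ M    # aligned source features
    Zt = (Xt - Xt.mean(axis=0)) @ Pt        # target features in its own basis
    return Zs, Zt

# Toy data standing in for source-project and target-project metrics.
rng = np.random.default_rng(0)
Xs = rng.normal(size=(60, 8))
Xt = rng.normal(loc=0.5, scale=2.0, size=(40, 8))
Zs, Zt = subspace_align(Xs, Xt, d=3)
print(Zs.shape, Zt.shape)
```

A classifier trained on `Zs` with the source labels could then be applied to `Zt`; how DSL injects class-discriminative information into the shared space is not specified in the abstract and is left out here.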
Funding: Supported by the National Natural Science Foundation of China (Grant No. 61922087) and the Huxiang Young Talents Program of Hunan Province (2021RC3070).
Abstract: Unsupervised transfer subspace learning is one of the challenging and important topics in domain adaptation; it aims to classify unlabeled target data by using source domain information. Traditional transfer subspace learning methods often impose low-rank constraints, i.e., the trace norm, to preserve the structural information of data from different domains. However, the trace norm is only a convex surrogate for the ideal low-rank constraint, and its solutions may deviate seriously from the original optimum. In addition, traditional methods directly use the strict labels of the source domain, which makes it difficult to deal with label noise. To solve these problems, we propose a novel nonconvex and discriminative transfer subspace learning method, named NDTSL, that incorporates the Schatten norm and a soft label matrix. Specifically, the Schatten norm is imposed to approximate the low-rank constraint and obtain a better low-rank representation. Then, we design and adopt a soft label matrix in the source domain to learn a more flexible classifier and enhance the discriminative ability on target data. Besides, because the Schatten norm is nonconvex, we design an efficient alternative algorithm, IALM, to solve it. Finally, experimental results on several public transfer tasks demonstrate the effectiveness of NDTSL compared with several state-of-the-art methods.
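To make the contrast between the trace norm and a nonconvex Schatten-type penalty concrete, the sketch below compares both against the true rank on a small matrix. It assumes the common Schatten-p form (sum of singular values raised to a power p with 0 < p < 1); the abstract does not state which p NDTSL uses, and `schatten_p` is a hypothetical helper name.

```python
import numpy as np

def schatten_p(X, p):
    """Sum of singular values raised to the power p.

    p = 1 gives the trace (nuclear) norm; 0 < p < 1 gives a nonconvex
    penalty that tracks the rank more closely. The choice of p is an
    assumption for illustration.
    """
    s = np.linalg.svd(X, compute_uv=False)
    return float(np.sum(s ** p))

# A rank-2 matrix with one large and one small nonzero singular value.
X = np.diag([4.0, 1.0, 0.0])
rank = int(np.linalg.matrix_rank(X))
trace_norm = schatten_p(X, 1.0)
schatten_half = schatten_p(X, 0.5)
print(rank, trace_norm, schatten_half)
```

On this example the trace norm is about 5 while the p = 0.5 penalty is about 3, closer to the rank of 2, which is the sense in which the nonconvex penalty is a tighter low-rank surrogate.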
Funding: This work was supported by the National Natural Science Foundation of China (Grant Nos. 51505364 and 51335006), the National Key Basic Research Program of China (Grant No. 2015CB057400), and the Program for Changjiang Scholars. The authors thank NREL for supporting this work and providing the vibration data used for the validation of the JSL-MFD technique.
Abstract: The gearbox of a wind turbine (WT) has the dominant failure rate and the highest downtime loss among all WT subsystems. Thus, gearbox health assessment for maintenance cost reduction is of paramount importance. The concurrence of multiple faults in gearbox components is common because one fault can induce another. This problem should be considered before planning to replace the components of the WT gearbox. Therefore, the key fault patterns should be reliably identified from noisy observation data to develop an effective maintenance strategy. However, most existing studies on multiple fault diagnosis divide the fault information inappropriately in order to satisfy rigorous decomposition principles or statistical assumptions, such as the smooth envelope principle of ensemble empirical mode decomposition and the mutual independence assumption of independent component analysis. Thus, this paper presents a joint subspace learning-based multiple fault detection (JSL-MFD) technique that adaptively constructs different subspaces for different fault patterns. Its main advantage is its capability to learn multiple fault subspaces directly from the observation signal itself. It can also sparsely concentrate the feature information into a few dominant subspace coefficients. Furthermore, it can eliminate noise by simply performing coefficient shrinkage operations. Consequently, multiple fault patterns are reliably identified by utilizing the maximum fault information criterion. The superiority of JSL-MFD in multiple fault separation and detection is comprehensively investigated and verified by the analysis of a data set from a 750 kW WT gearbox. Results show that JSL-MFD is superior to a state-of-the-art technique in detecting hidden fault patterns and enhancing detection accuracy.
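The "coefficient shrinkage" step mentioned in this abstract can be illustrated with soft thresholding, a standard shrinkage rule for sparse coefficients. Whether JSL-MFD uses exactly this rule is not stated, so treat the choice of rule, the threshold `tau`, and the helper name `soft_threshold` as assumptions.

```python
import math

def soft_threshold(coeffs, tau):
    """Shrink each coefficient toward zero by tau.

    Coefficients with magnitude <= tau are treated as noise and set to 0;
    larger ones keep their sign and lose tau in magnitude.
    """
    shrunk = []
    for c in coeffs:
        magnitude = max(abs(c) - tau, 0.0)
        shrunk.append(0.0 if magnitude == 0.0 else math.copysign(magnitude, c))
    return shrunk

# Two dominant coefficients (fault features) among small noisy ones.
coeffs = [5.0, -0.3, 0.2, -4.0, 0.1]
print(soft_threshold(coeffs, tau=0.5))  # [4.5, 0.0, 0.0, -3.5, 0.0]
```

Because the feature information is concentrated in a few dominant coefficients, a single threshold removes the small noisy entries while leaving the dominant ones nearly intact.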
Funding: Supported by the National Natural Science Foundation of China (No. 61806006) and the Priority Academic Program Development of Jiangsu Higher Education Institutions.
Abstract: Existing multi-view subspace clustering algorithms based on tensor singular value decomposition (t-SVD) predominantly use the tensor nuclear norm to explore the intra-view correlation between views of the same samples, while neglecting the correlation among the samples within different views. Moreover, the tensor nuclear norm, as a convex approximation of the tensor rank function, is not fully examined; treating different singular values equally may result in a suboptimal tensor representation. A hypergraph regularized multi-view subspace clustering algorithm with dual tensor log-determinant (HRMSC-DTL) was proposed. The algorithm used subspace learning in each view to learn a specific set of affinity matrices, and introduced a non-convex tensor log-determinant function to replace the tensor nuclear norm and better capture global low-rankness. It also introduced hyper-Laplacian regularization to preserve the local geometric structure embedded in the high-dimensional space. Furthermore, it rotated the original tensor and incorporated a dual tensor mechanism to fully exploit the intra-view correlation of the original tensor and the inter-view correlation of the rotated tensor. An alternating direction method of multipliers (ADMM) was also designed to solve the non-convex optimization model. Experimental evaluations on seven widely used datasets, along with comparisons to several state-of-the-art algorithms, demonstrated the superiority and effectiveness of the HRMSC-DTL algorithm in terms of clustering performance.
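The point about treating singular values equally can be seen with a small numeric comparison. The sketch below assumes one common log-determinant surrogate, the sum of log(1 + σ²) over singular values, applied to a plain list of singular values; the paper's tensor version (defined through t-SVD) and its exact form are not given in the abstract, and both function names are hypothetical.

```python
import math

def nuclear_penalty(singular_values):
    """Nuclear-norm penalty: every singular value is penalized linearly."""
    return sum(singular_values)

def logdet_penalty(singular_values):
    """Assumed log-det surrogate: sum of log(1 + sigma^2).

    Large singular values, which usually carry the main structure,
    are penalized far less than under the nuclear norm.
    """
    return sum(math.log1p(s * s) for s in singular_values)

before = [10.0, 1.0, 0.1]
after = [20.0, 1.0, 0.1]   # only the dominant singular value doubles

nuclear_increase = nuclear_penalty(after) - nuclear_penalty(before)
logdet_increase = logdet_penalty(after) - logdet_penalty(before)
print(nuclear_increase, logdet_increase)
```

Doubling the dominant singular value raises the nuclear penalty by 10 but the log-det penalty by only about 1.4, so the surrogate shrinks informative components less aggressively.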
Abstract: With the rapid growth of high-dimensional data such as microarray gene expression data and web blogs from the internet, researchers need to develop new clustering techniques to address the critical problem created by irrelevant dimensions. The properties of Nonnegative Matrix Factorization (NMF) as a clustering method have been studied by relating its formulation to other methods such as K-means clustering. In this paper, by introducing clustering indicator constraints on NMF and incorporating manifold regularization to preserve geometric structures, we propose a novel manifold regularized NMF method that can simultaneously learn a subspace and perform clustering. As a result, our method can directly assign cluster labels to data points. Extensive experimental results show that our method outperforms other related methods.
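A minimal sketch of manifold (graph) regularized NMF is given below, using the well-known multiplicative updates for the objective ||X - U Vᵀ||² + λ·Tr(Vᵀ L V) with L = D - W. This is the generic graph-regularized formulation, not the paper's method: the clustering indicator constraints mentioned in the abstract are omitted and cluster labels are read off with an argmax instead. The toy data, the 3-nearest-neighbour graph, and all names are assumptions.

```python
import numpy as np

def graph_regularized_nmf(X, W, k, lam=1.0, n_iter=200, eps=1e-10, seed=0):
    """Factor X (features x samples) as U @ V.T with a graph penalty on V.

    W is a symmetric nonnegative affinity matrix over samples.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    U = rng.random((m, k)) + eps
    V = rng.random((n, k)) + eps
    D = np.diag(W.sum(axis=1))
    for _ in range(n_iter):
        # Multiplicative updates keep U and V nonnegative.
        U *= (X @ V) / (U @ (V.T @ V) + eps)
        V *= (X.T @ U + lam * (W @ V)) / (V @ (U.T @ U) + lam * (D @ V) + eps)
    return U, V

def objective(X, W, U, V, lam):
    """Reconstruction error plus graph smoothness of V."""
    L = np.diag(W.sum(axis=1)) - W
    return np.linalg.norm(X - U @ V.T) ** 2 + lam * np.trace(V.T @ L @ V)

# Toy data: two groups of 10 samples with different nonnegative patterns.
rng = np.random.default_rng(1)
centers = np.array([[5.0, 5.0, 0.1, 0.1], [0.1, 0.1, 5.0, 5.0]])
X = np.vstack([c + 0.1 * rng.random((10, 4)) for c in centers]).T  # (4, 20)

# Symmetric 3-nearest-neighbour graph over the samples.
dist = np.linalg.norm(X.T[:, None, :] - X.T[None, :, :], axis=2)
np.fill_diagonal(dist, np.inf)
nn = np.argsort(dist, axis=1)[:, :3]
W = np.zeros((20, 20))
W[np.repeat(np.arange(20), 3), nn.ravel()] = 1.0
W = np.maximum(W, W.T)

U, V = graph_regularized_nmf(X, W, k=2)
labels = V.argmax(axis=1)   # crude cluster assignment from the coefficients
print(labels)
```

The rows of `V` act as low-dimensional, nonnegative sample representations; with indicator constraints, as the abstract describes, the label assignment would come out of the factorization directly rather than from a post hoc argmax.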
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 62272392, U1811262, 61802313), the Key Research and Development Program of China (2020AAA0108500), the Key Research and Development Program of Shaanxi Province (2023-YBGY-405), the Fundamental Research Funds for the Central Universities (D5000230088), and the Higher Research Funding on International Talent Cultivation at NPU (GJGZZD202202).
Abstract: Learning-outcome prediction (LOP) is a long-standing and critical problem in educational routes. Many studies have contributed to developing effective models, but they often suffer from data shortage and poor generalization across institutions because of privacy-protection issues. To this end, this study proposes a distributed grade prediction model, dubbed FecMap, that exploits the federated learning (FL) framework, which keeps the private data of local clients in place and communicates with others through a global generalized model. FecMap considers local subspace learning (LSL), which explicitly learns the local features against the global features, and multi-layer privacy protection (MPP), which hierarchically protects the private features, including model-shareable features and features that may not be shared, to achieve high-performance client-specific classifiers for LOP at each institution. FecMap is trained iteratively with all datasets distributed across clients: each client trains a local neural network composed of a global part, a local part, and a classification head, and the server averages the global parts from the clients. To evaluate the FecMap model, we collected three higher-education datasets of student academic records from engineering majors. Experimental results show that FecMap benefits from the proposed LSL and MPP and achieves steady performance on the LOP task compared with state-of-the-art models. This study makes a fresh attempt at using federated learning in a learning-analytics task, potentially paving the way to personalized education with privacy protection.
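The server-side step described in this abstract, averaging only the global parts while each client keeps its local part, can be sketched as follows. The dictionary layout, the key names `"global"` and `"local"`, and the flat weight lists are assumptions for illustration; local training, the classification head, and the privacy-protection layers are omitted.

```python
def average_global_parts(clients):
    """Replace each client's 'global' weights with their element-wise mean.

    'local' weights never leave the client and are left untouched.
    """
    n = len(clients)
    dim = len(clients[0]["global"])
    mean = [sum(c["global"][i] for c in clients) / n for i in range(dim)]
    for c in clients:
        c["global"] = list(mean)   # each client gets its own copy
    return mean

# Three institutions after one round of (omitted) local training.
clients = [
    {"global": [1.0, 2.0], "local": [0.5]},
    {"global": [3.0, 4.0], "local": [-0.5]},
    {"global": [5.0, 6.0], "local": [2.0]},
]
mean = average_global_parts(clients)
print(mean)                            # [3.0, 4.0]
print([c["local"] for c in clients])   # [[0.5], [-0.5], [2.0]]
```

In a full training loop this function would be called once per communication round, between rounds of local training on each client's private data.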
Funding: Co-supported by the National Key Scientific Instrument and Equipment Development Project (No. 2012YQ140032).
Abstract: Because it is difficult to extract invariant features for recognition when aircraft appear in various poses and training samples are scarce, a novel algorithm called Weighted Marginal Fisher Analysis with Spatially Smooth (WMFA-SS) is proposed for extracting invariant features in aircraft recognition. Within the Graph Embedding (GE) framework, the heat kernel function is first introduced to characterize interclass separability when choosing the weights of the penalty graph. Furthermore, a Laplacian penalty is applied to constrain the coefficients to be spatially smooth. The Laplacian penalty incorporates the prior information that neighboring pixels are correlated. Besides, using a Laplacian penalty also avoids the singularity of the Laplacian matrix of the intrinsic graph. Once compact representations of the images are obtained, they can be treated as invariant features and used in classification to recognize different patterns of aircraft. Real aircraft recognition experiments show the superiority of the proposed WMFA-SS in comparison to other GE algorithms and the current aircraft recognition algorithm; the accuracy rate of the proposed method is 90.00% for dataset BH-AIR1.0 and 99.25% for dataset BH-AIR2.0.
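The heat-kernel weighting of the penalty graph can be sketched as below. The weight form exp(-||x - y||² / t) is the usual heat kernel; connecting every pair of samples from different classes is a simplification (marginal Fisher analysis normally restricts the penalty graph to nearby between-class pairs), and the function names and the parameter `t` are assumptions.

```python
import math

def heat_kernel_weight(x, y, t):
    """exp(-||x - y||^2 / t): close to 1 for nearby points, near 0 for distant ones."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / t)

def penalty_graph(samples, labels, t=1.0):
    """Weight matrix linking samples of different classes (simplified penalty graph)."""
    n = len(samples)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if labels[i] != labels[j]:
                W[i][j] = heat_kernel_weight(samples[i], samples[j], t)
    return W

samples = [[0.0, 0.0], [0.0, 1.0], [1.0, 1.0], [5.0, 5.0]]
labels = [0, 0, 1, 1]
W = penalty_graph(samples, labels)
for row in W:
    print([round(w, 4) for w in row])
```

Between-class pairs that lie close together receive the largest weights, so the embedding is pushed hardest to separate exactly the pairs that are easiest to confuse.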
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 62161160336, 42030111, and 62101045) and the China Postdoctoral Science Foundation Funded Project (Grant No. 2021M690385).
Abstract: Although many advanced classification models have recently been developed for the land cover mapping task, reliance on a single remote sensing (RS) data source, such as only hyperspectral (HS) or only multispectral (MS) data, prevents classification accuracy from improving further and leads to a performance bottleneck. For this reason, we develop a novel superpixel-based subspace learning model, called Supace, that jointly learns multimodal feature representations from HS and MS superpixels for more accurate land cover classification (LCC) results. Supace learns a common subspace across multimodal RS data, in which the diverse and complementary information from different modalities can be better combined, enhancing the discriminative ability of the learned features. To better capture the semantic information of objects in the feature learning process, superpixels, rather than individual pixels, are regarded as the unit of study in Supace. Extensive experiments have been conducted on two popular hyperspectral and multispectral datasets, demonstrating the superiority of the proposed Supace in the land cover classification task compared with several well-known baselines for multimodal remote sensing image feature learning.
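Working with superpixels instead of pixels typically starts by pooling pixel features within each segment. The sketch below averages per-pixel spectra inside each superpixel for two modalities that share one segmentation. How Supace actually forms its superpixel features is not described in the abstract, so mean pooling, the shared segmentation, the toy values, and the name `superpixel_means` are all assumptions.

```python
def superpixel_means(features, segments):
    """Average per-pixel feature vectors within each superpixel label.

    features: list of per-pixel vectors (image flattened to a pixel list).
    segments: superpixel label for each pixel, same length as features.
    """
    sums = {}
    counts = {}
    for feat, seg in zip(features, segments):
        if seg not in sums:
            sums[seg] = [0.0] * len(feat)
            counts[seg] = 0
        for i, value in enumerate(feat):
            sums[seg][i] += value
        counts[seg] += 1
    return {seg: [s / counts[seg] for s in total] for seg, total in sums.items()}

# Four pixels, two superpixels; HS has 3 bands and MS has 2 bands here.
segments = [0, 0, 1, 1]
hs_pixels = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0], [10.0, 10.0, 10.0], [20.0, 30.0, 40.0]]
ms_pixels = [[1.0, 2.0], [3.0, 4.0], [10.0, 20.0], [30.0, 40.0]]

hs_superpixels = superpixel_means(hs_pixels, segments)
ms_superpixels = superpixel_means(ms_pixels, segments)
print(hs_superpixels)
print(ms_superpixels)   # {0: [2.0, 3.0], 1: [20.0, 30.0]}
```

The two resulting dictionaries give one HS vector and one MS vector per superpixel, which is the kind of paired multimodal input a common-subspace model would then project into a shared space.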
Funding: Supported by the National Key Research and Development Program (Nos. 2018YFA0701700 and 2018YFA0701701) and the Scientific Research Foundation for Advanced Talents (No. jit-b-202045).
Abstract: Lie group machine learning is recognized as the theoretical basis of brain intelligence, brain learning, higher machine learning, and higher artificial intelligence. Sample sets of Lie group matrices are widely available in practical applications. Lie group learning is a vibrant field of increasing importance and extraordinary potential and thus needs to be developed further. This study aims to provide a comprehensive survey of recent advances in Lie group machine learning. We introduce Lie group machine learning techniques in three major categories: supervised Lie group machine learning, semisupervised Lie group machine learning, and unsupervised Lie group machine learning. In addition, we introduce the special application of Lie group machine learning in image processing. This work covers the following techniques: Lie group machine learning model, Lie group subspace orbit generation learning, symplectic group learning, quantum group learning, Lie group fiber bundle learning, Lie group cover learning, Lie group deep structure learning, Lie group semisupervised learning, Lie group kernel learning, tensor learning, frame bundle connection learning, spectral estimation learning, Finsler geometric learning, homology boundary learning, category representation learning, and neuromorphic synergy learning. Overall, this survey aims to provide an insightful overview of state-of-the-art developments in the field of Lie group machine learning. It will enable researchers to comprehensively understand the state of the field, identify the most appropriate tools for particular applications, and identify directions for future research.