Funding: Supported by the National Natural Science Foundation of China (51105052, 61173163) and the Liaoning Provincial Natural Science Foundation of China (201102037).
Abstract: Driven by real applications such as text categorization and image classification, multi-label learning has become an active research topic in recent years, and much attention has been paid to multi-label classification algorithms. Because the high dimensionality of multi-label datasets may cause the curse of dimensionality and hamper the classification process, a dimensionality reduction algorithm named multi-label kernel discriminant analysis (MLKDA) is proposed to reduce the dimensionality of multi-label datasets. Using the kernel trick, MLKDA processes the multiple labels as a whole and achieves nonlinear dimensionality reduction with an idea similar to that of linear discriminant analysis (LDA). For classifying multi-label data, the extreme learning machine (ELM) is an efficient algorithm that still delivers good accuracy. Combined with ELM, MLKDA performs well in multi-label learning experiments on several datasets. Experiments on both static data and data streams show that MLKDA outperforms multi-label dimensionality reduction via dependence maximization (MDDM) and multi-label linear discriminant analysis (MLDA) when the datasets are balanced and the correlation between labels is stronger, and that ELM is also a good choice for multi-label classification.
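The ELM idea mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the textbook form of ELM (a random, untrained hidden layer followed by output weights solved in closed form) and is not the authors' implementation; the names `train_elm` and `predict_elm`, the hidden-layer size, the ridge term `reg`, the 0.5 threshold, and the toy data are all hypothetical choices made for illustration.

```python
import numpy as np

def train_elm(X, Y, n_hidden=50, reg=1e-3, seed=0):
    """Fit a basic ELM: random hidden layer, least-squares output weights."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))  # random input weights, never trained
    b = rng.normal(size=n_hidden)                # random hidden biases, never trained
    H = np.tanh(X @ W + b)                       # hidden-layer activations
    # Ridge-regularized least squares for the output weights (assumed variant).
    beta = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ Y)
    return W, b, beta

def predict_elm(model, X, threshold=0.5):
    """Score every label, then threshold each score independently."""
    W, b, beta = model
    scores = np.tanh(X @ W + b) @ beta
    return (scores >= threshold).astype(int)

# Toy multi-label data (illustrative only): 200 samples, 5 features, 2 labels.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
Y = np.column_stack([X[:, 0] > 0, X[:, 1] + X[:, 2] > 0]).astype(int)

model = train_elm(X, Y)
pred = predict_elm(model, X)
train_accuracy = (pred == Y).mean()  # per-label accuracy on the training set
```

Because only the output weights are solved for, training reduces to one linear solve, which is the usual reason ELM is described as efficient. In a pipeline like the one the abstract describes, `X` would be the features after dimensionality reduction rather than the raw inputs.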
Funding: Supported by the National Natural Science Foundation of China (61472305), the Science Research Program, Xi’an, China (2017073CG/RC036CXDKD003), and the Aeronautical Science Foundation of China (20151981009).
Abstract: Multi-label classification problems arise frequently in text categorization and many related applications. Like conventional categorization problems, multi-label categorization tasks suffer from the curse of high dimensionality. Existing multi-label dimensionality reduction methods have two main limitations. First, latent nonlinear structures in the input space are not utilized. Second, the label information is not fully exploited. This paper proposes a new method, multi-label local discriminative embedding (MLDE), which exploits latent structures to minimize intraclass distances and maximize interclass distances on the basis of label correlations. The latent structures are extracted by constructing two sets of adjacency graphs that capture nonlinear information. Non-symmetric label correlations, which are what real applications exhibit, are adopted. The problem is formulated as a global objective function, and a linear mapping is obtained so that out-of-sample data can be handled. Empirical studies on 11 Yahoo sub-tasks, Enron, and Bibtex validate the superiority of MLDE over state-of-the-art multi-label dimensionality reduction methods.
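To make the notion of a non-symmetric label correlation concrete, here is a small sketch. The abstract does not say how MLDE defines its correlations, so the conditional co-occurrence used here (the fraction of samples carrying label i that also carry label j) is only one plausible choice; the function name `conditional_label_correlation` and the toy label matrix are hypothetical.

```python
import numpy as np

def conditional_label_correlation(Y):
    """Return C with C[i, j] = P(label j present | label i present)."""
    Y = np.asarray(Y, dtype=float)
    co = Y.T @ Y                 # co[i, j] = number of samples with both labels
    counts = np.diag(co)         # counts[i] = number of samples with label i
    safe = np.where(counts > 0, counts, 1.0)  # avoid division by zero
    return co / safe[:, None]

# Toy label matrix (illustrative only): 4 samples, 3 labels.
Y = np.array([[1, 1, 0],
              [1, 0, 0],
              [1, 1, 1],
              [0, 0, 1]])
C = conditional_label_correlation(Y)
# Label 1 always co-occurs with label 0, but label 0 appears without label 1,
# so C[1, 0] and C[0, 1] differ.
```

A symmetric measure would assign the same value to both directions and lose the information that one label implies the other more strongly than the reverse.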
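Abstract: In multi-label learning, the subjectivity and instability of manual annotation often cause labels to be missing, so that a complete label space cannot be formed and the training of supervised learning algorithms is misled. Label correlations can partly compensate for the adverse effect of missing labels on classification performance, but missing labels also make the estimate of label correlations inaccurate. To address this problem, a method for multi-label learning with incomplete labels via augmented label correlation matrix (ML-ALC) is proposed. First, a low-dimensional manifold of the data is constructed by Laplacian mapping. Then, the original label correlation matrix is computed from the label vectors. Next, a correction matrix is constructed to augment the original label correlation matrix, and the original feature space and the label space are mapped to the low-dimensional manifold through a regression coefficient matrix and the augmented label correlation matrix, respectively. Finally, the optimized regression coefficient matrix and augmented label correlation matrix are obtained by iterative learning and applied to multi-label classification. Experimental results show that ML-ALC achieves better classification performance than other multi-label classification methods designed for missing labels.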
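The claim that missing labels distort the estimated label correlations can be checked on a toy example. The sketch below assumes cosine similarity between label vectors as the correlation measure; the abstract does not specify the measure ML-ALC uses or the form of its correction matrix, so neither is reproduced here. The function name `cosine_label_correlation` and the label matrices are hypothetical.

```python
import numpy as np

def cosine_label_correlation(Y):
    """Cosine similarity between the columns (label vectors) of Y."""
    Y = np.asarray(Y, dtype=float)
    norms = np.linalg.norm(Y, axis=0)
    norms[norms == 0] = 1.0      # labels that never occur get zero correlation
    Yn = Y / norms
    return Yn.T @ Yn

# Complete annotations (illustrative only): labels 0 and 1 always co-occur.
Y_full = np.array([[1, 1, 0],
                   [1, 1, 0],
                   [1, 1, 1],
                   [0, 0, 1]])

# Simulate missing labels: label 1 is dropped from the first two samples.
Y_missing = Y_full.copy()
Y_missing[0, 1] = 0
Y_missing[1, 1] = 0

C_full = cosine_label_correlation(Y_full)
C_missing = cosine_label_correlation(Y_missing)
# The correlation between labels 0 and 1 is underestimated once labels go missing.
```

This underestimation is the kind of bias an augmentation or correction step would aim to repair before the correlation matrix is used for classification.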