Funding: Supported by the National Natural Science Foundation of China (No. 62003281), the Natural Science Foundation of Chongqing, China (No. cstc2021jcyjmsxmX1169), and the Science and Technology Research Program of Chongqing Municipal Education Commission (No. KJQN202200207).
Abstract: Non-negative Matrix Factorization (NMF) has proven to be an effective tool for machine learning. Non-negative Matrix Tri-Factorization (NMTF) is a generalization of NMF that incorporates a third non-negative factor matrix and has shown impressive clustering performance by imposing simultaneous orthogonality constraints on both the sample and feature spaces. However, the performance of NMTF degrades dramatically when the data are contaminated with noise and outliers. Furthermore, high-order geometric information is rarely considered. In this paper, a Robust NMTF with Dual Hyper-graph regularization (RDHNMTF) is introduced. First, to enhance the robustness of NMTF, the l_{2,1}-norm is used to measure the reconstruction error. Second, a dual hyper-graph is constructed to uncover the higher-order intrinsic information within the sample and feature spaces for clustering. An alternating iterative algorithm is then devised, and its convergence is analyzed. The computational complexity of RDHNMTF is also compared with that of the competing algorithms. The effectiveness of RDHNMTF is verified by benchmarking against ten state-of-the-art algorithms on seven datasets corrupted with four types of noise.
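To illustrate why an l_{2,1}-norm reconstruction error is less sensitive to outliers than a squared Frobenius error, the following is a minimal sketch, not the paper's implementation. It assumes the residual matrix stores one sample per column and that the l_{2,1}-norm is the sum of the Euclidean norms of those columns (conventions differ between papers); the function names and toy residuals are hypothetical.

```python
import math

def l21_norm(E):
    """Sum of the Euclidean norms of the columns of E (E is a list of rows).

    Assumption: each column of E is the residual of one sample.
    """
    return sum(math.sqrt(sum(v * v for v in col)) for col in zip(*E))

def squared_frobenius(E):
    """Sum of all squared entries of E."""
    return sum(v * v for row in E for v in row)

# Hypothetical residual matrix: 2 features x 2 samples.
residual = [[3.0, 0.0],
            [4.0, 0.0]]
# Same residual with the first sample's error scaled by 10 (an outlier).
outlier = [[30.0, 0.0],
           [40.0, 0.0]]

print(l21_norm(residual), squared_frobenius(residual))  # 5.0 25.0
print(l21_norm(outlier), squared_frobenius(outlier))    # 50.0 2500.0
```

In this toy case the outlier's contribution grows tenfold under the l_{2,1}-norm (5 to 50) but a hundredfold under the squared Frobenius error (25 to 2500), which is the intuition behind using it as a robust loss.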
Funding: Supported in part by the National Natural Science Foundation of China (No. 61701190), the Youth Science Foundation of Jilin Province of China (No. 20180520021JH), the National Key Research and Development Plan of China (No. 2017YFA0604500), the Key Scientific and Technological Research and Development Plan of Jilin Province of China (No. 20180201103GX), the China Postdoctoral Science Foundation (No. 2018M631873), the Project of Jilin Province Development and Reform Commission (No. 2019FGWTZC001), and the Key Technology Innovation Cooperation Project of Government and University for the Whole Industry Demonstration (No. SXGJSF2017-4).
Abstract: Heterogeneous Information Networks (HINs) contain multiple types of nodes and edges; therefore, they can preserve both semantic and structural information. Cluster analysis performed directly on an HIN has clear advantages over first transforming it into a homogeneous information network, and can improve the clustering results for different types of nodes. In our study, we applied Nonnegative Matrix Tri-Factorization (NMTF) to the cluster analysis of multiple metapaths in an HIN. Unlike the parameter estimation methods for probability distributions used in previous studies, NMTF can obtain several dependent latent variables simultaneously, and each latent variable in NMTF is associated with the cluster of the corresponding node in the HIN. The method is suited to co-clustering that leverages multiple metapaths in an HIN, because NMTF is used to perform multiple nonnegative matrix factorizations simultaneously. Experimental results on a real dataset demonstrate the validity and correctness of our method, and its clustering results are better than those of existing similar clustering algorithms.
Funding: Supported by the National Natural Science Foundation of China under Grant No. 61170242, the National High Technology Research and Development 863 Program of China under Grant No. 2012AA012802, and the Fundamental Research Funds for the Central Universities of China under Grant No. HEUCF100605.
Abstract: Traditional anomaly detection on microblogging platforms mostly focuses on individual anomalous users or messages. Because anomalous users employ increasingly sophisticated techniques, such detection performs poorly. In this paper, we propose an innovative anomaly detection framework based on a bipartite graph and co-clustering. A bipartite graph between users and messages is built to model the homogeneous and heterogeneous interactions. The proposed co-clustering algorithm, based on nonnegative matrix tri-factorization, can detect anomalous users and messages simultaneously. The homogeneous relations modeled by the bipartite graph are used as constraints to improve the accuracy of the co-clustering algorithm. Experimental results show that the proposed scheme can detect individual and group anomalies with high accuracy on a Sina Weibo dataset.
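As a rough illustration of co-clustering with nonnegative matrix tri-factorization, the sketch below factorizes a toy user-by-message matrix X as F S G^T using standard multiplicative updates for the plain squared-error objective. It is not the paper's algorithm: it omits the homogeneous-relation constraints and any orthogonality constraints, and the function names, toy data, and the argmax rule for reading off clusters are assumptions made for the example.

```python
import random

def matmul(A, B):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def mult_update(M, numer, denom, eps=1e-9):
    """Elementwise M * numer / denom; keeps entries non-negative."""
    return [[m * n / (d + eps) for m, n, d in zip(rm, rn, rd)]
            for rm, rn, rd in zip(M, numer, denom)]

def rand_matrix(rows, cols, rng):
    return [[rng.random() + 0.1 for _ in range(cols)] for _ in range(rows)]

def sq_error(X, F, S, G):
    """Squared reconstruction error ||X - F S G^T||^2."""
    R = matmul(matmul(F, S), transpose(G))
    return sum((x - r) ** 2 for rx, rr in zip(X, R) for x, r in zip(rx, rr))

def nmtf(X, k, l, iters=200, seed=0):
    """Factorize X (n x m) into F (n x k), S (k x l), G (m x l), all non-negative."""
    rng = random.Random(seed)
    n, m = len(X), len(X[0])
    F, S, G = rand_matrix(n, k, rng), rand_matrix(k, l, rng), rand_matrix(m, l, rng)
    Xt = transpose(X)
    for _ in range(iters):
        GSt = matmul(G, transpose(S))  # m x k
        F = mult_update(F, matmul(X, GSt),
                        matmul(F, matmul(transpose(GSt), GSt)))
        FS = matmul(F, S)              # n x l
        G = mult_update(G, matmul(Xt, FS),
                        matmul(G, matmul(transpose(FS), FS)))
        Ft, Gt = transpose(F), transpose(G)
        S = mult_update(S, matmul(matmul(Ft, X), G),
                        matmul(matmul(matmul(Ft, F), S), matmul(Gt, G)))
    return F, S, G

def argmax_labels(M):
    """Heuristic cluster label per row: index of the largest entry."""
    return [max(range(len(row)), key=row.__getitem__) for row in M]

# Toy bipartite adjacency: 5 users (rows) x 4 messages (columns), two blocks.
X = [[1, 1, 0, 0],
     [1, 1, 0, 0],
     [0, 0, 1, 1],
     [0, 0, 1, 1],
     [0, 0, 1, 1]]

F, S, G = nmtf(X, k=2, l=2)
print("user clusters:   ", argmax_labels(F))
print("message clusters:", argmax_labels(G))
print("reconstruction error:", sq_error(X, F, S, G))
```

The rows of F give soft memberships of users and the rows of G give soft memberships of messages, so both sides of the bipartite graph are clustered from a single factorization. Results depend on the random initialization, so the seed is fixed here for repeatability.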