Working memory plays an important role in human cognition. This study investigated how working memory was encoded by the power of multichannel local field potentials (LFPs) based on sparse non-negative matrix factorization (SNMF). SNMF was used to extract features from LFPs recorded from the prefrontal cortex of four Sprague-Dawley rats during a memory task in a Y-maze, with 10 trials for each rat. The power-increased LFP components were then selected as working-memory-related features and the other components were removed. After that, the inverse operation of SNMF was used to study the encoding of working memory in the time-frequency domain. We demonstrated that theta and gamma power increased significantly during the working memory task. The results suggested that postsynaptic activity was simulated well by the sparse activity model. The theta and gamma bands were meaningful for encoding working memory.
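The pipeline in the abstract above (factorize a power matrix, keep some components, invert) can be sketched as follows. This is a minimal illustration, not the authors' SNMF: the L1 penalty on the activations, the multiplicative updates, and the names `sparse_nmf`, `reconstruct`, `l1` and `rank` are assumptions made for the example, and the input is assumed to be a non-negative frequency-by-time power matrix.

```python
import numpy as np

def sparse_nmf(V, rank, l1=0.1, n_iter=200, seed=0, eps=1e-9):
    """Factor V ~= W @ H with W, H >= 0 and an (assumed) L1 penalty on H."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        # Multiplicative update for H; the l1 term in the denominator
        # shrinks the activations toward zero.
        H *= (W.T @ V) / (W.T @ W @ H + l1 + eps)
        # Multiplicative update for the basis W.
        W *= (V @ H.T) / (W @ H @ H.T + eps)
        # Rescale W to unit-norm columns and move the scale into H,
        # which leaves the product W @ H unchanged.
        norms = np.linalg.norm(W, axis=0) + eps
        W /= norms
        H *= norms[:, None]
    return W, H

def reconstruct(W, H, keep):
    """Inverse step: rebuild the power matrix from the selected components only."""
    keep = np.asarray(keep)
    return W[:, keep] @ H[keep, :]
```

Which components count as "power-increased" is a selection rule the abstract does not spell out, so `keep` is left to the caller here.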
CircRNAs, widely found throughout the human body, play a crucial role in regulating various biological processes and are closely linked to complex human diseases. Investigating potential associations between circRNAs and diseases can enhance our understanding of diseases and provide new strategies and tools for early diagnosis, treatment, and disease prevention. However, existing models have limitations in accurately capturing similarities, handling the sparse and noisy attributes of association networks, and fully leveraging bioinformatic aspects from multiple viewpoints. To address these issues, this study introduces a new non-negative matrix factorization-based framework called NMFMSN. First, we incorporate circRNA sequence data and disease semantic information to compute circRNA and disease similarity, respectively. Given the sparse known associations between circRNAs and diseases, we reconstruct the network to complete more associations by imputing missing links based on neighboring circRNA and disease interactions. Finally, we integrate these two similarity networks into a non-negative matrix factorization framework to identify potential circRNA-disease associations. Under 5-fold cross-validation and leave-one-out cross-validation, the AUC values for NMFMSN reach 0.9712 and 0.9768, respectively, outperforming the currently most advanced models. Case studies on lung cancer and hepatocellular carcinoma show that NMFMSN is a good way to predict new associations between circRNAs and diseases.
Aiming at the problems of bispectral analysis when applied to machinery fault diagnosis, a machinery fault feature extraction method based on sparseness-controlled non-negative tensor factorization (SNTF) is proposed. First, a non-negative tensor factorization (NTF) algorithm is improved by imposing sparseness constraints on it. Secondly, the bispectral images of mechanical signals are obtained and stacked to form a third-order tensor. Thirdly, the improved algorithm is used to extract features, which are represented by a series of basis images from this tensor. Finally, coefficients indicating these basis images' weights in constituting original bispectral images are calculated for fault classification. Experiments on fault diagnosis of gearboxes show that the extracted features can not only reveal some nonlinear characteristics of the system, but also have intuitive meanings with regard to fault characteristic frequencies. These features provide great convenience for the interpretation of the relationships between machinery faults and corresponding bispectra.
A novel framework is proposed to obtain physiologically meaningful features for Alzheimer's disease (AD) classification based on sparse functional connectivity and non-negative matrix factorization. Specifically, the non-negative adaptive sparse representation (NASR) method is applied to compute the sparse functional connectivity among brain regions based on functional magnetic resonance imaging (fMRI) data for feature extraction. Afterwards, the sparse non-negative matrix factorization (sNMF) method is adopted for dimensionality reduction to obtain low-dimensional features with straightforward physical meaning. The experimental results show that the proposed framework outperforms the competing frameworks in terms of classification accuracy, sensitivity and specificity. Furthermore, three sub-networks, including the default mode network, the basal ganglia-thalamus-limbic network and the temporal-insular network, are found to have notable differences between the AD patients and the healthy subjects. The proposed framework can effectively identify AD patients and has potential for extending the understanding of the pathological changes of AD.
Due to the non-stationary characteristics of vibration signals acquired from faulty rolling element bearings, time-frequency analysis is often applied to describe the local information of these unstable signals. However, it is difficult to classify the high-dimensional feature matrix directly because its dimensions are too large for many classifiers. This paper combines the concepts of time-frequency distribution (TFD) and non-negative matrix factorization (NMF), and proposes a novel TFD matrix factorization method to enhance the representation and identification of bearing faults. In this method, the TFD of a vibration signal is first computed with the short-time Fourier transform (STFT) to describe the localized faults. Then, the supervised NMF mapping is adopted to extract the fault features from the TFD. Meanwhile, the fault samples can be clustered and recognized automatically by using the clustering property of NMF. The proposed method takes advantage of the parts-based representation and the adaptive clustering of NMF. The localized fault features of interest can be extracted as well. To evaluate the performance of the proposed method, experiments on nine kinds of bearing faults were performed on a test bench. The proposed method can effectively identify the fault severity and different fault types. Moreover, in comparison with the artificial neural network (ANN), NMF yields 99.3% mean accuracy, which is much superior to ANN. This research presents a simple and practical solution for the fault diagnosis problem of rolling element bearings in high-dimensional feature space.
This paper proposes a graph regularized L_p smooth non-negative matrix factorization (GSNMF) method by incorporating graph regularization and an L_p smoothing constraint, which considers the intrinsic geometric information of a data set and produces smooth and stable solutions. The main contributions are as follows: first, graph regularization is added into NMF to discover the hidden semantics and simultaneously respect the intrinsic geometric structure information of a data set. Second, the L_p smoothing constraint is incorporated into NMF to combine the merits of isotropic (L_2-norm) and anisotropic (L_1-norm) diffusion smoothing, and produces a smooth and more accurate solution to the optimization problem. Finally, the update rules and proof of convergence of GSNMF are given. Experiments on several data sets show that the proposed method outperforms related state-of-the-art methods.
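To make the graph regularization idea in the abstract above concrete, here is a minimal sketch of plain graph-regularized NMF with the widely used multiplicative updates. It does not include the L_p smoothing term and is not the GSNMF update rule from the paper; the binary k-nearest-neighbour graph, the weight `lam`, and the function names are assumptions chosen for illustration. Samples are the columns of `X`.

```python
import numpy as np

def knn_affinity(X, k=3):
    """Symmetric 0/1 k-nearest-neighbour affinity over the columns (samples) of X."""
    n = X.shape[1]
    sq = np.sum(X ** 2, axis=0)
    d2 = sq[:, None] + sq[None, :] - 2 * X.T @ X   # squared Euclidean distances
    np.fill_diagonal(d2, np.inf)                   # a sample is not its own neighbour
    A = np.zeros((n, n))
    idx = np.argsort(d2, axis=1)[:, :k]
    A[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    return np.maximum(A, A.T)                      # symmetrize

def gnmf(X, rank, lam=1.0, k=3, n_iter=200, seed=0, eps=1e-9):
    """Approximate X ~= U @ V.T with U, V >= 0 and a graph penalty lam * tr(V.T L V)."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    U = rng.random((m, rank)) + eps
    V = rng.random((n, rank)) + eps
    A = knn_affinity(X, k)
    D = np.diag(A.sum(axis=1))                     # degree matrix; Laplacian is L = D - A
    for _ in range(n_iter):
        U *= (X @ V) / (U @ (V.T @ V) + eps)
        V *= (X.T @ U + lam * A @ V) / (V @ (U.T @ U) + lam * D @ V + eps)
    return U, V
```

Setting `lam=0` reduces the updates to ordinary NMF, which is a convenient way to see what the graph term changes.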
The constrained weighted non-negative matrix factorization (CW-NMF) hybrid receptor model was applied to study the influence of steelmaking activities on PM2.5 (particulate matter with equivalent aerodynamic diameter less than 2.5 μm) composition in Dunkerque, Northern France. Semi-diurnal PM2.5 samples were collected using a high-volume sampler in winter 2010 and spring 2011 and were analyzed for trace metals, water-soluble ions, and total carbon using inductively coupled plasma–atomic emission spectrometry (ICP-AES), ICP-mass spectrometry (ICP-MS), ion chromatography and a micro elemental carbon analyzer. The elemental composition shows that NO₃⁻, SO₄²⁻, NH₄⁺ and total carbon are the main PM2.5 constituents. Trace metal data were interpreted using concentration roses, and the influences of both the integrated steelworks (ISW) and the electric steel plant were evidenced. The distinction between the two sources is made possible by the use of Zn/Fe and Zn/Mn diagnostic ratios. Moreover, the Rb/Cr, Pb/Cr and Cu/Cd combination ratios are proposed to distinguish the ISW sintering stack from the ISW fugitive emissions. The a priori knowledge of the influencing sources was introduced in the CW-NMF to guide the calculation. Eleven source profiles with various contributions were identified: 8 are characteristic of coastal urban background site profiles and 3 are related to the steelmaking activities. Among them, secondary nitrates, secondary sulfates and combustion profiles give the highest contributions and account for 93% of the PM2.5 concentration. The steelwork facilities contribute about 2% of the total PM2.5 concentration and appear to be the main source of Cr, Cu, Fe, Mn and Zn.
This paper presents a novel medical image registration algorithm named total variation constrained graph regularization for non-negative matrix factorization (TV-GNMF). The method utilizes non-negative matrix factorization with a total variation constraint and graph regularization. The main contributions of our work are the following. First, total variation is incorporated into NMF to control the diffusion speed. The purpose is to denoise in smooth regions and preserve features or details of the data in edge regions by using a diffusion coefficient based on gradient information. Second, we add graph regularization into NMF to reveal intrinsic geometry and structure information of features to enhance the discrimination power. Third, the multiplicative update rules and proof of convergence of the TV-GNMF algorithm are given. Experiments conducted on datasets show that the proposed TV-GNMF method outperforms other state-of-the-art algorithms.
Nonnegative matrix factorization (NMF) is a method for obtaining parts-based features of information and forming typical profiles. However, the basis vectors that NMF produces are not orthogonal, so the parts-based features are usually redundant. In this paper, we propose two different approaches based on localized non-negative matrix factorization (LNMF) to obtain the typical user session profiles and typical semantic profiles of junk mails. LNMF yields basis vectors that are as orthogonal as possible, so it can produce accurate profiles. The experiments show that the approach based on LNMF can obtain better profiles than the approach based on NMF. Key words: localized non-negative matrix factorization; profile; log mining; mail filtering. CLC number: TP 391. Foundation item: Supported by the National Natural Science Foundation of China (60373066, 60303024), National Grand Fundamental Research 973 Program of China (2002CB312000), National Research Foundation for the Doctoral Program of Higher Education of China (20020286004). Biography: Jiang Ji-xiang (1980-), male, Master candidate, research direction: data mining, knowledge representation on the Web.
Object-based audio coding is the main technique of audio scene coding. It can effectively reconstruct each object trajectory and also provides sufficient flexibility for personalized audio scene reconstruction, so it has received increasing attention. However, existing object-based techniques have poor sound quality because of the low frequency-domain resolution of their parameters. In order to achieve high-quality audio object coding, we propose a new coding framework that introduces the non-negative matrix factorization (NMF) method. We extract object parameters with high resolution to improve sound quality, and apply NMF to parameter coding to reduce the high bitrate caused by the high resolution. The experimental results show that the proposed framework can improve the coding quality by 25%, so it provides a more flexible and higher-quality solution for encoding audio scenes.
We present a novel approach to solve the problem of single channel source separation (SCSS) based on a filterbank technique and sparse non-negative matrix two-dimensional deconvolution (SNMF2D). The proposed approach does not require training information of the sources and is therefore well suited to practical SCSS. The major problem of most existing SCSS algorithms lies in their inability to resolve the mixing ambiguity in the single channel observation. Our proposed approach tackles this difficult problem by using a filterbank which decomposes the mixed signal into the sub-band domain. This makes the mixture more separable in the sub-band domain. By incorporating the SNMF2D algorithm, the spectral-temporal structure of the sources can be obtained more accurately. Real-time tests have been conducted and show that the proposed method gives high-quality source separation performance.
Non-negative matrix factorization (NMF) is a technique for dimensionality reduction by placing non-negativity constraints on the matrix. Based on the PARAFAC model, NMF was extended for three-dimensional data decomposition. The three-dimensional non-negative matrix factorization (NMF3) algorithm, which is concise and easy to implement, is given in this paper. The NMF3 algorithm implementation is based on elements rather than on vectors. Unlike traditional algorithms, it can decompose a data array directly without unfolding. It has been applied to the decomposition of a simulated data array and obtained reasonable results. This shows that NMF3 could be introduced for curve resolution in chemometrics.
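As an illustration of a non-negative PARAFAC decomposition that works on the three-way array directly, the sketch below applies generic multiplicative updates through `numpy.einsum`, so no unfolded matrix is ever built. It is a standard textbook-style scheme written for this listing, not the NMF3 algorithm of the paper; the names `nn_parafac` and `rebuild` and the choice of rank are assumptions.

```python
import numpy as np

def nn_parafac(X, rank, n_iter=300, seed=0, eps=1e-9):
    """Fit X[i, j, k] ~= sum_r A[i, r] * B[j, r] * C[k, r] with A, B, C >= 0."""
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    A = rng.random((I, rank)) + eps
    B = rng.random((J, rank)) + eps
    C = rng.random((K, rank)) + eps
    for _ in range(n_iter):
        # Each factor is updated in turn; einsum contracts the array with the
        # other two factors, and the Hadamard product of Gram matrices gives
        # the matching denominator.
        A *= np.einsum('ijk,jr,kr->ir', X, B, C) / (A @ ((B.T @ B) * (C.T @ C)) + eps)
        B *= np.einsum('ijk,ir,kr->jr', X, A, C) / (B @ ((A.T @ A) * (C.T @ C)) + eps)
        C *= np.einsum('ijk,ir,jr->kr', X, A, B) / (C @ ((A.T @ A) * (B.T @ B)) + eps)
    return A, B, C

def rebuild(A, B, C):
    """Reassemble the three-way array from its factor matrices."""
    return np.einsum('ir,jr,kr->ijk', A, B, C)
```

In a curve-resolution setting the columns of the three factors would play the role of, for example, concentration, spectral and time profiles, but that mapping depends on the data and is not fixed by the code.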
Background: Establishing an appropriate prognostic model for PCa is essential for its effective treatment. Glycolysis is a vital energy-harvesting mechanism for tumors. Developing a prognostic model for PCa based on glycolysis-related genes is novel and has great potential. Methods: First, gene expression and clinical data of PCa patients were downloaded from The Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO), and glycolysis-related genes were obtained from the Molecular Signatures Database (MSigDB). Gene enrichment analysis was performed to verify that glycolysis functions were enriched in the genes we obtained, which were used in nonnegative matrix factorization (NMF) to identify clusters. The correlation between clusters and clinical features was discussed, and the differentially expressed genes (DEGs) between the two clusters were investigated. Based on the DEGs, we investigated the biological differences between clusters, including immune cell infiltration, mutation, tumor immune dysfunction and exclusion, immune function, and checkpoint genes. To establish the prognostic model, the genes were filtered based on univariable Cox regression, LASSO, and multivariable Cox regression. Kaplan–Meier analysis and receiver operating characteristic analysis validated the prognostic value of the model. A nomogram of the risk score calculated by the prognostic model and clinical characteristics was constructed to quantitatively estimate the survival probability for PCa patients in the clinical setting. Result: The genes obtained from MSigDB were enriched in glycolysis functions. Two clusters were identified by NMF analysis based on 272 glycolysis-related genes, and a prognostic model based on DEGs between the two clusters was finally established. The prognostic model consisted of LAMPS, SPRN, ATOH1, TANC1, ETV1, TDRD1, KLK14, MESP2, POSTN, CRIP2, NAT1, AKR7A3, PODXL, CARTPT, and PCDHGB2. The all-sample, training, and test cohorts from TCGA and the external validation cohort from GEO showed significant differences between the high-risk and low-risk groups. The area under the ROC curve showed great performance of this prognostic model. Conclusion: A prognostic model based on glycolysis-related genes was established, with great performance and potential significance for clinical application.
Data volumes are enormous today because of the extensive use of the World Wide Web, social media and intelligent systems. This data can be very important and useful if it is harnessed carefully and correctly. Useful information can be extracted from this massive data using the data mining process. The information extracted can be used to make vital decisions in various industries. Clustering is a very popular data mining method which divides the data points into different groups such that all similar data points form a part of the same group. Clustering methods are of various types. Many parameters and indexes exist for the evaluation and comparison of these methods. In this paper, we have compared the partitioning-based methods K-Means, Fuzzy C-Means (FCM), Partitioning Around Medoids (PAM) and Clustering Large Applications (CLARA) on secure perturbed data. The method that performs better for analyzing data perturbed using Extended NMF has been identified on the basis of the values of various indexes such as the Dunn Index, Silhouette Index, Xie-Beni Index and Davies-Bouldin Index.
The use of online discussion forums can effectively engage students in their studies. As the number of messages posted on a forum increases, it becomes more difficult for instructors to read and respond to them promptly. In this paper, we apply non-negative matrix factorization and visualization to clustering message data, in order to provide a summary view of messages that discloses their deep semantic relationships. In particular, NMF is able to find the underlying issues hidden in the messages about which most of the students are concerned. Visualization is employed to estimate the initial number of clusters, showing the relation communities. Experiments and comparisons on a real dataset are reported to demonstrate the effectiveness of the approaches.
Rank determination is one of the most significant issues in non-negative matrix factorization (NMF) research. However, the rank determination problem has not received as much emphasis as the sparseness regularization problem. Usually, the rank of the base matrix needs to be assumed. In this paper, we propose an unsupervised multi-level non-negative matrix factorization model to extract the hidden data structure and seek the rank of the base matrix. From a machine learning point of view, the learning result depends on its prior knowledge. In our unsupervised multi-level model, we construct a three-level data structure for the non-negative matrix factorization algorithm. Such a construction can supply more prior knowledge to the algorithm and obtain a better approximation of the real data structure. The final base selection is achieved through L2-norm optimization. We run our experiments on binary datasets. The results demonstrate that our approach is able to retrieve the hidden structure of the data and thus determine the correct rank of the base matrix.
The proportionate recursive least squares (PRLS) algorithm has shown faster convergence and better performance than both proportionate updating (PU) mechanism based least mean squares (LMS) algorithms and RLS algorithms with a sparse regularization term. In this paper, we propose a variable forgetting factor (VFF) PRLS algorithm with a sparse penalty, e.g., the l1-norm, for sparse identification. To reduce the computational complexity of the proposed algorithm, a fast implementation method based on the dichotomous coordinate descent (DCD) algorithm is also derived. Simulation results indicate superior performance of the proposed algorithm.
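For orientation, the baseline that the abstract above builds on can be sketched as ordinary exponentially weighted RLS with a fixed forgetting factor. The sketch deliberately omits the proportionate update, the variable forgetting factor, the sparse penalty and the DCD speed-up described in the paper; `rls_identify`, `lam` and `delta` are illustrative names and defaults.

```python
import numpy as np

def rls_identify(X, d, lam=0.99, delta=100.0):
    """Exponentially weighted RLS with a fixed forgetting factor lam.

    X holds one regressor vector per row, d the desired outputs.
    Returns the final weight estimate.
    """
    n_taps = X.shape[1]
    w = np.zeros(n_taps)
    P = delta * np.eye(n_taps)            # initial inverse-correlation estimate
    for x, target in zip(X, d):
        Px = P @ x
        k = Px / (lam + x @ Px)           # gain vector
        e = target - w @ x                # a priori estimation error
        w = w + k * e
        P = (P - np.outer(k, Px)) / lam   # P is symmetric, so x.T @ P equals Px.T
    return w
```

A forgetting factor closer to 1 averages over a longer window; a variable forgetting factor, as in the paper, would adjust `lam` inside the loop.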
LDL-factorization is an efficient way of solving Ax = b for a large symmetric positive definite sparse matrix A. This paper presents a new method that further improves the efficiency of LDL-factorization. It is based on the theory of elimination trees for the factorization factor. It breaks the computations involved in LDL-factorization down into two stages: 1) the pattern of nonzero entries of the factor is predicted, and 2) the numerical values of the nonzero entries of the factor are computed. The factor is stored using the form of an elimination tree so as to reduce memory usage and avoid unnecessary numerical operations. The calculation results for some typical numerical examples demonstrate that this method provides a significantly higher calculation efficiency for the one-to-one marketing optimization algorithm.
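The numerical stage of an LDL-factorization and the solve that follows can be sketched on a small dense matrix. This is only the textbook dense algorithm, assuming a symmetric positive definite input; the symbolic stage (predicting the nonzero pattern) and the elimination-tree storage described in the abstract are not shown, and the function names are illustrative.

```python
import numpy as np

def ldl_factor(A):
    """Return unit lower-triangular L and diagonal d with A = L @ diag(d) @ L.T."""
    n = A.shape[0]
    L = np.eye(n)
    d = np.zeros(n)
    for j in range(n):
        # Diagonal entry: remove the contribution of earlier columns.
        d[j] = A[j, j] - np.sum(L[j, :j] ** 2 * d[:j])
        for i in range(j + 1, n):
            L[i, j] = (A[i, j] - np.sum(L[i, :j] * L[j, :j] * d[:j])) / d[j]
    return L, d

def ldl_solve(L, d, b):
    """Solve A x = b given the factors from ldl_factor."""
    n = len(b)
    y = np.zeros(n)
    for i in range(n):                    # forward substitution: L y = b
        y[i] = b[i] - L[i, :i] @ y[:i]
    z = y / d                             # diagonal scaling: D z = y
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):        # back substitution: L.T x = z
        x[i] = z[i] - L[i + 1:, i] @ x[i + 1:]
    return x
```

In a sparse implementation the inner sums would run only over the predicted nonzero positions, which is where the two-stage split pays off.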
Topic modeling stands as a well-explored and foundational challenge in the text mining domain. Traditional topic schemes, based on word co-occurrences, aim to expose the latent semantic structure embedded in a document corpus. Nevertheless, the inherent brevity of short texts introduces data sparsity, hindering the effectiveness of conventional topic models and yielding suboptimal outcomes for such text. Typically, short texts encompass a restricted number of topics, necessitating a grasp of relevant background knowledge for a comprehensive understanding of semantic content. Motivated by these observations, this research introduces a novel Deep Autoencoder Graph Regularized Non-negative Matrix Factorization algorithm (DAGR-NMF) to uncover significant and meaningful topics within short document contents. The three main phases of the proposed work are preprocessing, feature extraction and topic modeling. Initially, the data are preprocessed using natural language preprocessing tasks such as stop word removal, stemming and lemmatizing. Then, feature extraction is performed using hybrid Absolute Deviation Factors-Class Term Frequency (ADF-CTF) to capture the most relevant information from the text. Finally, the topic modeling task is executed using the proposed DAGR-NMF approach. Experimental findings demonstrate that the introduced DAGR-NMF model outperforms all other techniques by achieving NMI values of 0.852, 0.857, 0.793, and 0.831 on the Associated Press, political blog, 20NewsGroups, and News Category datasets, respectively.
Latent factor (LF) models are highly effective in extracting useful knowledge from High-Dimensional and Sparse (HiDS) matrices, which are commonly seen in various industrial applications. An LF model usually adopts iterative optimizers, which may consume many iterations to reach a local optimum, resulting in considerable time cost. Hence, determining how to accelerate the training process for LF models has become a significant issue. To address this, this work proposes a randomized latent factor (RLF) model. It incorporates the principle of randomized learning techniques from neural networks into the LF analysis of HiDS matrices, thereby greatly alleviating the computational burden. It also extends a standard learning process for randomized neural networks in the context of LF analysis to make the resulting model represent an HiDS matrix correctly. Experimental results on three HiDS matrices from industrial applications demonstrate that, compared with state-of-the-art LF models, RLF is able to achieve significantly higher computational efficiency and comparable prediction accuracy for missing data. It provides an important alternative approach to LF analysis of HiDS matrices, which is especially desired for industrial applications demanding highly efficient models.
Funding: supported by the National Natural Science Foundation of China (61074131 and 91132722) and the Doctoral Fund of the Ministry of Education of China (21101202110007).
Abstract: Working memory plays an important role in human cognition. This study investigated how working memory was encoded by the power of multichannel local field potentials (LFPs) based on sparse non-negative matrix factorization (SNMF). SNMF was used to extract features from LFPs recorded from the prefrontal cortex of four Sprague-Dawley rats during a memory task in a Y-maze, with 10 trials for each rat. The power-increased LFP components were then selected as working-memory-related features and the other components were removed. After that, the inverse operation of SNMF was used to study the encoding of working memory in the time-frequency domain. We demonstrated that theta and gamma power increased significantly during the working memory task. The results suggested that postsynaptic activity was simulated well by the sparse activity model. The theta and gamma bands were meaningful for encoding working memory.
Funding: the Gansu Province Industrial Support Plan (No. 2023CYZC-25), the Natural Science Foundation of Gansu Province (No. 23JRRA770), and the National Natural Science Foundation of China (No. 62162040).
Abstract: CircRNAs, widely found throughout the human body, play a crucial role in regulating various biological processes and are closely linked to complex human diseases. Investigating potential associations between circRNAs and diseases can enhance our understanding of diseases and provide new strategies and tools for early diagnosis, treatment, and disease prevention. However, existing models have limitations in accurately capturing similarities, handling the sparse and noisy attributes of association networks, and fully leveraging bioinformatic aspects from multiple viewpoints. To address these issues, this study introduces a new non-negative matrix factorization-based framework called NMFMSN. First, we incorporate circRNA sequence data and disease semantic information to compute circRNA and disease similarity, respectively. Given the sparse known associations between circRNAs and diseases, we reconstruct the network to complete more associations by imputing missing links based on neighboring circRNA and disease interactions. Finally, we integrate these two similarity networks into a non-negative matrix factorization framework to identify potential circRNA-disease associations. Under 5-fold cross-validation and leave-one-out cross-validation, the AUC values for NMFMSN reach 0.9712 and 0.9768, respectively, outperforming the currently most advanced models. Case studies on lung cancer and hepatocellular carcinoma show that NMFMSN is a good way to predict new associations between circRNAs and diseases.
Funding: the National Natural Science Foundation of China (No. 50875048), the Natural Science Foundation of Jiangsu Province (No. BK2007115), and the National High Technology Research and Development Program of China (863 Program) (No. 2007AA04Z421).
Abstract: Aiming at the problems of bispectral analysis when applied to machinery fault diagnosis, a machinery fault feature extraction method based on sparseness-controlled non-negative tensor factorization (SNTF) is proposed. First, a non-negative tensor factorization (NTF) algorithm is improved by imposing sparseness constraints on it. Secondly, the bispectral images of mechanical signals are obtained and stacked to form a third-order tensor. Thirdly, the improved algorithm is used to extract features, which are represented by a series of basis images from this tensor. Finally, coefficients indicating these basis images' weights in constituting original bispectral images are calculated for fault classification. Experiments on fault diagnosis of gearboxes show that the extracted features can not only reveal some nonlinear characteristics of the system, but also have intuitive meanings with regard to fault characteristic frequencies. These features provide great convenience for the interpretation of the relationships between machinery faults and corresponding bispectra.
Funding: the Foundation of Hygiene and Health of Jiangsu Province (No. H2018042), the National Natural Science Foundation of China (No. 61773114), and the Key Research and Development Plan (Industry Foresight and Common Key Technology) of Jiangsu Province (No. BE2017007-3).
Abstract: A novel framework is proposed to obtain physiologically meaningful features for Alzheimer's disease (AD) classification based on sparse functional connectivity and non-negative matrix factorization. Specifically, the non-negative adaptive sparse representation (NASR) method is applied to compute the sparse functional connectivity among brain regions based on functional magnetic resonance imaging (fMRI) data for feature extraction. Afterwards, the sparse non-negative matrix factorization (sNMF) method is adopted for dimensionality reduction to obtain low-dimensional features with straightforward physical meaning. The experimental results show that the proposed framework outperforms the competing frameworks in terms of classification accuracy, sensitivity and specificity. Furthermore, three sub-networks, including the default mode network, the basal ganglia-thalamus-limbic network and the temporal-insular network, are found to have notable differences between the AD patients and the healthy subjects. The proposed framework can effectively identify AD patients and has potential for extending the understanding of the pathological changes of AD.
Funding: Supported by Shaanxi Provincial Overall Innovation Project of Science and Technology, China (Grant No. 2013KTCQ01-06).
Abstract: Because vibration signals acquired from faulty rolling element bearings are non-stationary, time-frequency analysis is often applied to describe the local information of these unstable signals. However, the resulting high-dimensional feature matrix is difficult to classify directly, because its dimensionality is too large for many classifiers. This paper combines the concepts of time-frequency distribution (TFD) and non-negative matrix factorization (NMF), and proposes a novel TFD matrix factorization method to enhance the representation and identification of bearing faults. In this method, the TFD of a vibration signal is first computed with the short-time Fourier transform (STFT) to describe the localized faults. Then, a supervised NMF mapping is adopted to extract the fault features from the TFD. Meanwhile, the fault samples can be clustered and recognized automatically by using the clustering property of NMF. The proposed method takes advantage of the parts-based representation and adaptive clustering of NMF, and the localized fault features of interest can be extracted as well. To evaluate the performance of the proposed method, experiments on nine kinds of bearing faults were performed on a test bench. The proposed method can effectively identify both the fault severity and the different fault types. Moreover, NMF yields a mean accuracy of 99.3%, which is much superior to that of the artificial neural network (ANN) used for comparison. This research presents a simple and practical solution to the fault diagnosis problem of rolling element bearings in a high-dimensional feature space.
Funding: Supported by the National Natural Science Foundation of China (61702251, 61363049, 11571011); the State Scholarship Fund of China Scholarship Council (CSC) (201708360040); the Natural Science Foundation of Jiangxi Province (20161BAB212033); the Natural Science Basic Research Plan in Shaanxi Province of China (2018JM6030); the Doctor Scientific Research Starting Foundation of Northwest University (338050050); the Youth Academic Talent Support Program of Northwest University.
Abstract: This paper proposes a graph regularized L_p smooth non-negative matrix factorization (GSNMF) method that incorporates graph regularization and an L_p smoothing constraint. The method considers the intrinsic geometric information of a data set and produces smooth and stable solutions. The main contributions are as follows. First, graph regularization is added to NMF to discover the hidden semantics while respecting the intrinsic geometric structure of a data set. Second, the L_p smoothing constraint is incorporated into NMF to combine the merits of isotropic (L_2-norm) and anisotropic (L_1-norm) diffusion smoothing, which produces a smooth and more accurate solution to the optimization problem. Finally, the update rules and a proof of convergence of GSNMF are given. Experiments on several data sets show that the proposed method outperforms related state-of-the-art methods.
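As a reading aid, the kind of objective this abstract describes can be sketched in LaTeX. This is only an assumed general form: the symbols X (data matrix), U and V (non-negative factors), L (graph Laplacian of a nearest-neighbour graph), and the weights \lambda and \mu are illustrative names, and which factor carries the L_p penalty and how it is scaled are not stated in the abstract.

```latex
% Assumed general form of a graph-regularized NMF objective with an L_p smoothing term.
% X: data matrix; U, V: non-negative factors; L = D - W: graph Laplacian (all names hypothetical).
\min_{U \ge 0,\; V \ge 0}\;
  \underbrace{\lVert X - U V^{\top} \rVert_F^{2}}_{\text{reconstruction}}
  \;+\; \lambda\, \underbrace{\operatorname{Tr}\!\left(V^{\top} L V\right)}_{\text{graph regularization}}
  \;+\; \mu\, \underbrace{\lVert U \rVert_p^{p}}_{\text{$L_p$ smoothing (assumed placement)}}
```

With \mu = 0 this reduces to the standard graph-regularized NMF objective, and with \lambda = \mu = 0 to plain NMF under the Frobenius norm.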
Funding: Financially supported by the Nord-Pas-de-Calais Region Council, the Ministry of Higher Education and Research, and the European Regional Development Funds. Adib Kfoury acknowledges the "Pole Metropolitain Cote d'Opale" (PMCO) for its PhD financial support.
Abstract: The constrained weighted non-negative matrix factorization (CW-NMF) hybrid receptor model was applied to study the influence of steelmaking activities on the composition of PM2.5 (particulate matter with an equivalent aerodynamic diameter of less than 2.5 μm) in Dunkerque, Northern France. Semi-diurnal PM2.5 samples were collected using a high-volume sampler in winter 2010 and spring 2011 and were analyzed for trace metals, water-soluble ions, and total carbon using inductively coupled plasma–atomic emission spectrometry (ICP-AES), ICP-mass spectrometry (ICP-MS), ion chromatography and a micro elemental carbon analyzer. The elemental composition shows that NO₃⁻, SO₄²⁻, NH₄⁺ and total carbon are the main PM2.5 constituents. Trace metal data were interpreted using concentration roses, and the influences of both the integrated steelworks (ISW) and the electric steel plant were evidenced. The two sources can be distinguished by means of the Zn/Fe and Zn/Mn diagnostic ratios. Moreover, the Rb/Cr, Pb/Cr and Cu/Cd ratios are proposed in combination to distinguish the ISW sintering stack from the ISW fugitive emissions. The a priori knowledge on the influencing sources was introduced into the CW-NMF to guide the calculation. Eleven source profiles with various contributions were identified: 8 are characteristic of a coastal urban background site and 3 are related to the steelmaking activities. Among them, the secondary nitrates, secondary sulfates and combustion profiles give the highest contributions and account for 93% of the PM2.5 concentration. The steelwork facilities contribute about 2% of the total PM2.5 concentration and appear to be the main source of Cr, Cu, Fe, Mn and Zn.
Funding: Supported by the National Natural Science Foundation of China (61702251, 41971424, 61701191, U1605254); the Natural Science Basic Research Plan in Shaanxi Province of China (2018JM6030); the Key Technical Project of Fujian Province (2017H6015); the Science and Technology Project of Xiamen (3502Z20183032); the Doctor Scientific Research Starting Foundation of Northwest University (338050050); the Youth Academic Talent Support Program of Northwest University (360051900151); the Natural Sciences and Engineering Research Council of Canada.
Abstract: This paper presents a novel medical image registration algorithm named total variation constrained graph regularization for non-negative matrix factorization (TV-GNMF). The method applies non-negative matrix factorization with a total variation constraint and graph regularization. The main contributions of our work are the following. First, total variation is incorporated into NMF to control the diffusion speed. The purpose is to denoise smooth regions while preserving features or details of the data in edge regions, by using a diffusion coefficient based on gradient information. Second, we add graph regularization to NMF to reveal the intrinsic geometry and structure information of the features and so enhance the discrimination power. Third, the multiplicative update rules and a proof of convergence of the TV-GNMF algorithm are given. Experiments conducted on datasets show that the proposed TV-GNMF method outperforms other state-of-the-art algorithms.
Abstract: Non-negative matrix factorization (NMF) is a method for obtaining parts-based features of information and forming typical profiles. However, the basis vectors that NMF produces are not orthogonal, so the parts-based features are usually redundant. In this paper, we propose two different approaches based on localized non-negative matrix factorization (LNMF) to obtain the typical user session profiles and the typical semantic profiles of junk mails. LNMF makes the basis vectors as orthogonal as possible, so it can produce accurate profiles. The experiments show that the approach based on LNMF obtains better profiles than the approach based on NMF. Key words: localized non-negative matrix factorization; profile; log mining; mail filtering. CLC number: TP 391. Foundation item: Supported by the National Natural Science Foundation of China (60373066, 60303024), the National Grand Fundamental Research 973 Program of China (2002CB312000), and the National Research Foundation for the Doctoral Program of Higher Education of China (20020286004). Biography: Jiang Ji-xiang (1980-), male, Master candidate, research direction: data mining, knowledge representation on the Web.
Funding: Supported by the National High Technology Research and Development Program of China (863 Program) (No. 2015AA016306); the National Nature Science Foundation of China (No. 61231015); the National Nature Science Foundation of China (No. 61671335).
Abstract: Object-based audio coding is the main technique for audio scene coding. It can effectively reconstruct each object trajectory and also provides sufficient flexibility for personalized audio scene reconstruction, so it has attracted increasing attention. However, existing object-based techniques deliver poor sound quality because their parameters have low frequency-domain resolution. To achieve high-quality audio object coding, we propose a new coding framework that introduces the non-negative matrix factorization (NMF) method. We extract object parameters at high resolution to improve sound quality, and apply NMF to parameter coding to reduce the high bitrate that high resolution causes. Experimental results show that the proposed framework improves the coding quality by 25%, providing a more flexible and higher-quality way to encode audio scenes.
Abstract: We present a novel approach to the problem of single channel source separation (SCSS) based on a filterbank technique and sparse non-negative matrix two-dimensional deconvolution (SNMF2D). The proposed approach does not require training information about the sources and is therefore well suited to practical SCSS. The major problem of most existing SCSS algorithms lies in their inability to resolve the mixing ambiguity in the single channel observation. Our proposed approach tackles this difficult problem by using a filterbank that decomposes the mixed signal into the sub-band domain, where the mixture becomes more separable. By incorporating the SNMF2D algorithm, the spectral-temporal structure of the sources can be obtained more accurately. A real-time test has been conducted, and it shows that the proposed method gives high-quality source separation performance.
Abstract: Non-negative matrix factorization (NMF) is a technique for dimensionality reduction that places non-negativity constraints on the matrix. Based on the PARAFAC model, NMF was extended to three-dimensional data decomposition. The three-dimensional non-negative matrix factorization (NMF3) algorithm given in this paper is concise and easy to implement. Its implementation is based on elements rather than on vectors. Unlike traditional algorithms, it can decompose a data array directly without unfolding. It has been applied to the decomposition of a simulated data array and obtained reasonable results, which shows that NMF3 could be introduced for curve resolution in chemometrics.
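For orientation, the three-way PARAFAC model that this abstract builds on can be written element by element. The update rule shown is the standard multiplicative update for squared error under non-negativity and is given only as an assumed illustration; the abstract does not state the actual NMF3 update rules, and the symbols a, b, c and R are illustrative names.

```latex
% Three-way PARAFAC model with non-negativity constraints (R components, symbols assumed)
x_{ijk} \;\approx\; \hat{x}_{ijk} \;=\; \sum_{r=1}^{R} a_{ir}\, b_{jr}\, c_{kr},
\qquad a_{ir},\; b_{jr},\; c_{kr} \;\ge\; 0 .

% Element-wise multiplicative update for one factor under squared error
% (an assumed form, not necessarily the one used by NMF3);
% the updates for b_{jr} and c_{kr} follow by symmetry.
a_{ir} \;\leftarrow\; a_{ir}\,
  \frac{\sum_{j,k} x_{ijk}\, b_{jr}\, c_{kr}}
       {\sum_{j,k} \hat{x}_{ijk}\, b_{jr}\, c_{kr}}
```

Because every quantity is indexed by single elements of the array, updates of this kind operate on the three-way array directly, with no need to unfold it into a matrix first.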
Funding: Supported by the Public Health Research Project in Futian District, Shenzhen (Grant Nos. FTWS2020026, FTWS2021073).
Abstract: Background: Establishing an appropriate prognostic model for prostate cancer (PCa) is essential for its effective treatment. Glycolysis is a vital energy-harvesting mechanism for tumors. Developing a prognostic model for PCa based on glycolysis-related genes is novel and has great potential. Methods: First, gene expression and clinical data of PCa patients were downloaded from The Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO), and glycolysis-related genes were obtained from the Molecular Signatures Database (MSigDB). Gene enrichment analysis was performed to verify that glycolysis functions were enriched in the genes we obtained, which were then used in non-negative matrix factorization (NMF) to identify clusters. The correlation between clusters and clinical features was discussed, and the differentially expressed genes (DEGs) between the two clusters were investigated. Based on the DEGs, we investigated the biological differences between clusters, including immune cell infiltration, mutation, tumor immune dysfunction and exclusion, immune function, and checkpoint genes. To establish the prognostic model, the genes were filtered using univariable Cox regression, LASSO, and multivariable Cox regression. Kaplan–Meier analysis and receiver operating characteristic (ROC) analysis validated the prognostic value of the model. A nomogram combining the risk score calculated by the prognostic model with clinical characteristics was constructed to quantitatively estimate the survival probability of PCa patients in the clinical setting. Result: The genes obtained from MSigDB were enriched in glycolysis functions. Two clusters were identified by NMF analysis based on 272 glycolysis-related genes, and a prognostic model based on the DEGs between the two clusters was finally established. The prognostic model consisted of LAMPS, SPRN, ATOH1, TANC1, ETV1, TDRD1, KLK14, MESP2, POSTN, CRIP2, NAT1, AKR7A3, PODXL, CARTPT, and PCDHGB2. The all-sample, training, and test cohorts from TCGA and the external validation cohort from GEO showed significant differences between the high-risk and low-risk groups. The area under the ROC curve showed great performance of this prognostic model. Conclusion: A prognostic model based on glycolysis-related genes was established, with great performance and potential significance for clinical application.
Abstract: Data volumes today are enormous because of the extensive use of the World Wide Web, social media and intelligent systems. This data can be very important and useful if it is harnessed carefully and correctly. Useful information can be extracted from this massive data using the data mining process, and the extracted information can be used to make vital decisions in various industries. Clustering is a very popular data mining method that divides the data points into different groups such that all similar data points belong to the same group. Clustering methods are of various types, and many parameters and indexes exist for evaluating and comparing them. In this paper, we compare the partitioning-based methods K-Means, Fuzzy C-Means (FCM), Partitioning Around Medoids (PAM) and Clustering Large Applications (CLARA) on secure perturbed data. We identify the method that performs best for analyzing data perturbed using Extended NMF, on the basis of the values of various indexes such as the Dunn Index, Silhouette Index, Xie-Beni Index and Davies-Bouldin Index.
Abstract: The use of an online discussion forum can effectively engage students in their studies. As the number of messages posted on the forum increases, it becomes more difficult for instructors to read and respond to them promptly. In this paper, we apply non-negative matrix factorization and visualization to cluster message data, in order to provide a summary view of messages that discloses their deep semantic relationships. In particular, NMF is able to find the underlying issues hidden in the messages that concern most of the students. Visualization is employed to estimate the initial number of clusters, showing the relation communities. Experiments and comparisons on a real dataset are reported to demonstrate the effectiveness of the approaches.
Abstract: Rank determination is one of the most significant issues in non-negative matrix factorization (NMF) research. However, it has not received as much emphasis as the sparseness regularization problem. Usually, the rank of the base matrix needs to be assumed. In this paper, we propose an unsupervised multi-level non-negative matrix factorization model to extract the hidden data structure and seek the rank of the base matrix. From a machine learning point of view, the learning result depends on prior knowledge. In our unsupervised multi-level model, we construct a three-level data structure for the non-negative matrix factorization algorithm. Such a construction can supply more prior knowledge to the algorithm and obtain a better approximation of the real data structure. The final selection of bases is achieved through L2-norm optimization. We run our experiments on binary datasets. The results demonstrate that our approach is able to retrieve the hidden structure of the data and thus determine the correct rank of the base matrix.
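To make concrete why the rank usually has to be assumed, here is a minimal pure-Python sketch of plain NMF with the standard multiplicative updates for squared error, in which `rank` is an argument the caller must supply. This is not the multi-level model from the abstract; the function names, the toy matrix `V`, and the default parameters are all illustrative assumptions.

```python
import random

def matmul(a, b):
    """Dense matrix product of two lists-of-lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def frob_error(v, w, h):
    """Squared Frobenius norm of V - W H."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))

def nmf(v, rank, iters=200, seed=0, eps=1e-9):
    """Plain NMF, V ~ W H, with multiplicative updates (illustrative sketch)."""
    rng = random.Random(seed)
    m, n = len(v), len(v[0])
    # Strictly positive random start so the multiplicative updates can move.
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(m)]
    h = [[rng.random() + 0.1 for _ in range(n)] for _ in range(rank)]
    errors = [frob_error(v, w, h)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(n)]
             for a in range(rank)]
        # W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][a] * num[i][a] / (den[i][a] + eps) for a in range(rank)]
             for i in range(m)]
        errors.append(frob_error(v, w, h))
    return w, h, errors

# Toy non-negative matrix (made-up values); the rank is chosen by hand.
V = [[1.0, 0.0, 2.0, 1.0, 0.0],
     [2.0, 0.0, 4.0, 2.0, 0.0],
     [0.0, 3.0, 0.0, 1.0, 3.0],
     [1.0, 3.0, 2.0, 2.0, 3.0]]
W, H, errors = nmf(V, rank=2)
print("initial error:", errors[0], "final error:", errors[-1])
```

Calling `nmf(V, rank=r)` for several values of `r` and comparing the final errors is one naive way to choose a rank, and it illustrates the manual guesswork that a rank-determination method aims to replace.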
Funding: Supported by the National Key Research and Development Program of China (2020YFB0505803) and the National Key Research and Development Program of China (2016YFB0501700).
Abstract: The proportionate recursive least squares (PRLS) algorithm has shown faster convergence and better performance than both least mean squares (LMS) algorithms based on the proportionate updating (PU) mechanism and RLS algorithms with a sparse regularization term. In this paper, we propose a variable forgetting factor (VFF) PRLS algorithm with a sparse penalty, e.g., the l_1-norm, for sparse identification. To reduce the computational complexity of the proposed algorithm, a fast implementation method based on the dichotomous coordinate descent (DCD) algorithm is also derived. Simulation results indicate the superior performance of the proposed algorithm.
Funding: This work was supported in part by the National Natural Science Foundation of PRC (No. 60425310) and the Teaching and Research Award Program for Outstanding Young Teachers in Higher Education Institutions of MOE, PRC.
Abstract: LDL-factorization is an efficient way of solving Ax = b for a large symmetric positive definite sparse matrix A. This paper presents a new method that further improves the efficiency of LDL-factorization. It is based on the theory of elimination trees for the factor. It breaks the computations involved in LDL-factorization down into two stages: 1) the pattern of nonzero entries of the factor is predicted, and 2) the numerical values of the nonzero entries of the factor are computed. The factor is stored in the form of an elimination tree so as to reduce memory usage and avoid unnecessary numerical operations. The calculation results for some typical numerical examples demonstrate that this method provides significantly higher calculation efficiency for the one-to-one marketing optimization algorithm.
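As background for the abstract above, the following minimal sketch shows how an LDL^T factorization is used to solve Ax = b. It is a dense, textbook version in pure Python: it performs only the numeric stage and does not implement the symbolic pattern prediction or the elimination-tree storage that the paper describes. The function names and the 3x3 example matrix are illustrative assumptions.

```python
def ldl_decompose(a):
    """Dense LDL^T factorization of a symmetric positive definite matrix.

    Returns (l, d): l is unit lower triangular (list of lists) and d holds the
    diagonal of D, so that A = L * diag(d) * L^T.
    """
    n = len(a)
    l = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    d = [0.0] * n
    for j in range(n):
        d[j] = a[j][j] - sum(l[j][k] ** 2 * d[k] for k in range(j))
        for i in range(j + 1, n):
            s = sum(l[i][k] * l[j][k] * d[k] for k in range(j))
            l[i][j] = (a[i][j] - s) / d[j]
    return l, d

def ldl_solve(l, d, b):
    """Solve A x = b given A = L D L^T, by three cheap sweeps."""
    n = len(b)
    # Forward substitution: L y = b
    y = [0.0] * n
    for i in range(n):
        y[i] = b[i] - sum(l[i][k] * y[k] for k in range(i))
    # Diagonal scaling: D z = y
    z = [y[i] / d[i] for i in range(n)]
    # Backward substitution: L^T x = z
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = z[i] - sum(l[k][i] * x[k] for k in range(i + 1, n))
    return x

# Small symmetric positive definite example with a zero at position (2, 0).
A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]
L, D = ldl_decompose(A)
x = ldl_solve(L, D, b)
print("L =", L)
print("D =", D)
print("x =", x)
```

In this tridiagonal example the zero at position (2, 0) of A stays zero in L, so no fill-in occurs. A sparse implementation would predict such zeros in a symbolic stage and skip the corresponding arithmetic, which is the kind of saving the two-stage scheme targets.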
Abstract: Topic modeling is a well-explored and foundational challenge in the text mining domain. Traditional topic schemes based on word co-occurrences aim to expose the latent semantic structure embedded in a document corpus. Nevertheless, the inherent brevity of short texts introduces data sparsity, which hinders the effectiveness of conventional topic models and yields suboptimal outcomes for such text. Typically, short texts cover a restricted number of topics, so relevant background knowledge is needed for a comprehensive understanding of their semantic content. Motivated by these observations, this research introduces a novel Deep Autoencoder Graph Regularized Non-negative Matrix Factorization algorithm (DAGR-NMF) to uncover significant and meaningful topics within short document contents. The three main phases of the proposed work are preprocessing, feature extraction and topic modeling. Initially, the data are preprocessed using natural language preprocessing tasks such as stop word removal, stemming and lemmatizing. Then, feature extraction is performed using the hybrid Absolute Deviation Factors-Class Term Frequency (ADF-CTF) to capture the most relevant information from the text. Finally, the topic modeling task is executed using the proposed DAGR-NMF approach. Experimental findings demonstrate that the introduced DAGR-NMF model outperforms all other techniques, achieving NMI values of 0.852, 0.857, 0.793, and 0.831 on the Associated Press, political blog, 20NewsGroups, and News category datasets, respectively.
Funding: Supported in part by the National Natural Science Foundation of China (61772493, 91646114), in part by the Chongqing research program of technology innovation and application (cstc2017rgzn-zdyfX0020), and in part by the Pioneer Hundred Talents Program of Chinese Academy of Sciences.
Abstract: Latent factor (LF) models are highly effective in extracting useful knowledge from High-Dimensional and Sparse (HiDS) matrices, which are commonly seen in various industrial applications. An LF model usually adopts iterative optimizers, which may take many iterations to reach a local optimum, resulting in considerable time cost. Hence, determining how to accelerate the training process of LF models has become a significant issue. To address this, this work proposes a randomized latent factor (RLF) model. It incorporates the principle of randomized learning techniques from neural networks into the LF analysis of HiDS matrices, thereby greatly alleviating the computational burden. It also extends a standard learning process for randomized neural networks to the context of LF analysis, so that the resulting model represents an HiDS matrix correctly. Experimental results on three HiDS matrices from industrial applications demonstrate that, compared with state-of-the-art LF models, RLF achieves significantly higher computational efficiency and comparable prediction accuracy for missing data. It provides an important alternative approach to LF analysis of HiDS matrices, which is especially desirable for industrial applications demanding highly efficient models.