2 articles found
1. Semantic Knowledge Acquisition from Blogs with Tag-Topic Model (cited by: 3)
Authors: He Tingting, Li Fang. China Communications (SCIE, CSCD), 2012, No. 3, pp. 38-48 (11 pages).
This paper focuses on semantic knowledge acquisition from blogs with the proposed tag-topic model. The model extends the Latent Dirichlet Allocation (LDA) model by adding a tag layer between the document and the topic. Each document is represented by a mixture of tags; each tag is associated with a multinomial distribution over topics, and each topic is associated with a multinomial distribution over words. After parameter estimation, the tags are used to describe the underlying topics, so the latent semantic knowledge within the topics can be represented explicitly. The tags are treated as concepts, and the top-N words from the top topics are selected as related words of the concepts. PMI-IR is then employed to compute the relatedness between each tag-word pair, and noisy words with low correlation are removed to improve the quality of the semantic knowledge. Experimental results show that the proposed method can effectively capture semantic knowledge, especially polysemes and synonyms.
Keywords: semantic knowledge acquisition; topic model; tag
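The filtering step described in the abstract above (score each tag-word pair, then drop weakly related words) can be illustrated with a minimal sketch. It assumes PMI is estimated from co-occurrence counts; PMI-IR obtains such counts from search-engine hits, which this sketch replaces with a small hypothetical table. The function names, the toy counts, and the threshold of 0 are assumptions made for illustration and are not taken from the paper.

```python
import math


def pmi(joint_hits, hits_a, hits_b, total_docs):
    """Pointwise mutual information (base 2) estimated from raw counts."""
    if joint_hits == 0 or hits_a == 0 or hits_b == 0:
        return float("-inf")
    return math.log2(joint_hits * total_docs / (hits_a * hits_b))


def filter_related_words(tag, candidates, hits, joint_hits, total_docs, threshold=0.0):
    """Keep candidate words whose PMI with the tag exceeds the threshold."""
    kept = []
    for word in candidates:
        score = pmi(joint_hits.get((tag, word), 0), hits[tag], hits[word], total_docs)
        if score > threshold:
            kept.append((word, score))
    # Most strongly related words first.
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Hypothetical counts standing in for search-engine hit counts.
TOTAL_DOCS = 1000
HITS = {"apple": 100, "iphone": 50, "fruit": 80, "weather": 200}
JOINT_HITS = {
    ("apple", "iphone"): 40,
    ("apple", "fruit"): 30,
    ("apple", "weather"): 10,
}

related = filter_related_words(
    "apple", ["iphone", "fruit", "weather"], HITS, JOINT_HITS, TOTAL_DOCS
)
print(related)
```

With these toy counts, "weather" co-occurs with the tag less often than chance would predict, so its PMI is negative and it is filtered out, while "iphone" and "fruit" are kept as two distinct senses of the same tag.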
2. Online belief propagation algorithm for probabilistic latent semantic analysis (cited by: 2)
Authors: Yun YE, Shengrong GONG, Chunping LIU, Jia ZENG, Ning JIA, Yi ZHANG. Frontiers of Computer Science (SCIE, EI, CSCD), 2013, No. 4, pp. 526-535 (10 pages).
Probabilistic latent semantic analysis (PLSA) is a topic model for text documents that has been widely used in text mining, computer vision, computational biology, and other fields. For batch PLSA inference algorithms, the required memory grows linearly with the data size, which makes massive data streams very difficult to handle. To process big data streams, we propose an online belief propagation (OBP) algorithm based on an improved factor graph representation for PLSA. The factor graph of PLSA facilitates the classic belief propagation (BP) algorithm. Furthermore, OBP splits the data stream into a set of small segments and uses the parameters estimated from previous segments to calculate the gradient descent of the current segment. Because OBP removes each segment from memory after processing, it is memory-efficient for big data streams. We examine the performance of OBP on four document data sets and demonstrate that OBP is competitive with online expectation maximization (OEM) for PLSA in both speed and accuracy, and can also give a more accurate topic evolution. Experiments on massive data streams from Baidu further confirm the effectiveness of the OBP algorithm.
Keywords: probabilistic latent semantic analysis; topic models; expectation maximization; belief propagation
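The segment-by-segment pattern described in the abstract above (split the stream, reuse the parameters from earlier segments, discard each segment after processing) can be sketched as follows. This is not the authors' OBP algorithm: the inner update is a plain EM-style sweep for PLSA rather than belief propagation on a factor graph, and every function name, the random initialization, and the simple count accumulation are assumptions chosen for illustration. Documents are assumed to be lists of integer word ids.

```python
import random


def segments(stream, size):
    """Yield consecutive chunks of `size` documents from an iterable stream."""
    batch = []
    for doc in stream:
        batch.append(doc)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def normalize(row):
    total = sum(row)
    return [value / total for value in row]


def responsibilities(theta, phi, word):
    """P(topic | doc, word), proportional to P(topic | doc) * P(word | topic)."""
    return normalize([theta[k] * phi[k][word] for k in range(len(theta))])


def process_segment(segment, topic_word, inner_iters=5):
    """Return expected topic-word counts for one segment, starting from
    the topic-word counts carried over from earlier segments."""
    n_topics, vocab = len(topic_word), len(topic_word[0])
    phi = [normalize(row) for row in topic_word]  # P(word | topic)
    stats = [[0.0] * vocab for _ in range(n_topics)]
    for doc in segment:
        if not doc:
            continue  # an empty document carries no information
        theta = [1.0 / n_topics] * n_topics  # P(topic | doc)
        for _ in range(inner_iters):
            counts = [0.0] * n_topics
            for word in doc:
                for k, r in enumerate(responsibilities(theta, phi, word)):
                    counts[k] += r
            theta = normalize(counts)
        for word in doc:  # final E-step with the fitted theta
            for k, r in enumerate(responsibilities(theta, phi, word)):
                stats[k][word] += r
    return stats


def online_plsa(stream, n_topics, vocab, segment_size, seed=0):
    rng = random.Random(seed)
    # Small random positive counts break the symmetry between topics.
    topic_word = [[1.0 + rng.random() for _ in range(vocab)] for _ in range(n_topics)]
    for segment in segments(stream, segment_size):
        stats = process_segment(segment, topic_word)
        for k in range(n_topics):
            for w in range(vocab):
                topic_word[k][w] += stats[k][w]
        # The segment is not stored; only `topic_word` persists.
    return [normalize(row) for row in topic_word]


docs = [[0, 1, 0, 1], [2, 3, 2, 3], [0, 0, 1], [3, 2, 2]]
phi = online_plsa(iter(docs), n_topics=2, vocab=4, segment_size=2)
```

Because only the topic-word counts survive between segments, memory use depends on the number of topics and the vocabulary size rather than on the length of the stream, which is the property the abstract attributes to segment-wise processing.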