
New Event Detection Based on LDA and Correlation of Topic Terms (cited by: 4)
Abstract: Topic detection and tracking (TDT) is now widely used. New event detection, one of the research tasks within TDT, supplies prior knowledge for tracking how a topic subsequently develops, and so has significant theoretical research value in the field. An LDA topic model cannot identify new events automatically: its number of topics must be set by hand or through repeated experiments, which makes detection inefficient. This paper proposes a new event detection algorithm based on LDA and the correlation between topic terms. The algorithm also takes the time of each report into account to determine a reasonable number of topics and thereby detect new events. Experiments show that, compared with the traditional LDA algorithm and the Gibbs LDA algorithm, the method has some advantages and improves sensitivity to new events.
Author: 黄颖 (Huang Ying)
Source: Computer and Modernization (《计算机与现代化》), 2012, No. 1, pp. 6-9, 13 (5 pages)
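Funding: Science and Technology Project of the Jiangxi Provincial Department of Education (GJJ11216); institutional research project of Gannan Normal University (赣南师范学院) (10KYZ05)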
Keywords: latent Dirichlet allocation (LDA); topic detection; new event detection; correlation of topic terms