Research on Hotness Prediction in Sina Microblog Based on Forward Level Analysis (cited by: 7)
Abstract: As a new medium for spreading information, microblogs have surpassed traditional mainstream media in both influence and propagation speed. Predicting microblog hotness therefore matters for public opinion monitoring, government publicity, corporate marketing, and hot-topic recommendation. By analyzing the hierarchical structure of microblog reposts and combining repost count, repost depth, and repost breadth, this paper defines a new method for calculating a hotness index. Microblog hotness is divided into five levels, and for microblogs with more than 100 reposts the model predicts the probability that their hotness reaches a given level. Using a supervised machine learning algorithm, static features and then dynamic features are extracted from the training samples to train the hotness prediction model. The training samples come from Sina Microblog and were collected with the authors' self-developed Big Data open crawler platform. Experiments with 10-fold cross-validation show that, compared with a hotness prediction model that uses only static features, adding dynamic microblog features effectively improves prediction performance, with the average F1 value reaching 76.9%.
Source: Computer Engineering (《计算机工程》), indexed in CAS, CSCD, and the Peking University Core Journals list, 2015, No. 7, pp. 31-35 (5 pages)
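Funding: National "863" Program of China, project "Mass Information Consumption Service Platform and Application Demonstration Based on Media Big Data" (SS2014AA012305)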
Keywords: microblog; crawler; static feature; dynamic feature; hotness index; multi-classification problem
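
The abstract names three structural measures of a repost cascade (repost count, repost depth, and repost breadth) but this record does not give their formulas. As a minimal sketch, assuming depth means the deepest repost level and breadth means the largest number of reposts on any single level, the three measures could be computed from parent links as follows. The function name `repost_tree_metrics` and the `parent_of` input format are hypothetical, and this is not the paper's hotness index itself.

```python
from collections import Counter


def repost_tree_metrics(parent_of):
    """Compute repost count, depth, and breadth for one microblog.

    parent_of maps each repost id to the id of the post it reposted.
    The original post is the only id that never appears as a key.
    Assumes the links form a tree (no cycles).
    """
    level_cache = {}

    def level(node):
        # Walk up toward the root until reaching it or a cached node.
        path = []
        while node in parent_of and node not in level_cache:
            path.append(node)
            node = parent_of[node]
        base = level_cache.get(node, 0)  # the root post sits at level 0
        # Assign levels on the way back down, nearest-to-root first.
        for n in reversed(path):
            base += 1
            level_cache[n] = base
        return level_cache[path[0]] if path else base

    per_level = Counter(level(r) for r in parent_of)
    return {
        "volume": len(parent_of),                        # total reposts
        "depth": max(per_level, default=0),              # deepest level (assumed definition)
        "breadth": max(per_level.values(), default=0),   # widest level (assumed definition)
    }


# Illustrative cascade: "a" and "b" repost the root, "c" and "e" repost "a", "d" reposts "c".
example = {"a": "root", "b": "root", "c": "a", "e": "a", "d": "c"}
metrics = repost_tree_metrics(example)
```

For the illustrative cascade above, the levels are a=1, b=1, c=2, e=2, d=3, so the sketch reports 5 reposts, a depth of 3, and a breadth of 2. How the paper weights these quantities into a single hotness index is not stated in the abstract.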
