Identifying Social Network Rumor Spreaders with Multi-Modal Data (original title: 多模态数据驱动的社交网络谣言传播者识别方法研究)

Cited by: 1
Abstract: [Objective] This paper aims to identify rumor spreaders among social network users by leveraging multi-modal data features. [Methods] Because online rumor propagation is multi-modal and the user samples are imbalanced, we first oversample the raw data. We then deeply fuse traditional features, such as user attributes and microblog posts, with multi-modal information features extracted from user-generated content. On top of the XGBoost model, we build an identification framework for social network rumor spreaders that can broadly integrate social network user features, and we embed SHAP values at the model's output layer to improve interpretability. [Results] The XGBoost model achieves the best overall performance on the sample-balanced dataset, with recall improving by 12.3 percentage points. The identification method that incorporates multi-modal information features reaches an accuracy of 0.912, which is 2.5 percentage points higher than the control group. [Limitations] The multi-modal information features cover only the text and image modalities; future work could extend the study to audio and video. [Conclusions] The identification method trained on multi-modal data with an oversampling algorithm can effectively identify social network rumor spreaders.
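The first two steps of the abstract, balancing the user samples and fusing traditional features with multi-modal ones, can be sketched as follows. This is a minimal illustration only. The abstract does not name the oversampling algorithm or the fusion scheme, so plain random oversampling and simple concatenation are assumed here as stand-ins, and all function names and toy values are hypothetical.

```python
import random
from collections import Counter


def fuse(traditional, multimodal):
    """Concatenate each user's traditional and multi-modal feature vectors.

    Assumption: simple concatenation; the paper's actual fusion may differ.
    """
    return [t + m for t, m in zip(traditional, multimodal)]


def random_oversample(features, labels, seed=0):
    """Duplicate minority-class rows at random until all classes are equal in size.

    Assumption: random oversampling stands in for the unnamed algorithm.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_x, out_y = [list(row) for row in features], list(labels)
    for cls, n in counts.items():
        pool = [row for row, y in zip(features, labels) if y == cls]
        for _ in range(target - n):
            out_x.append(list(rng.choice(pool)))
            out_y.append(cls)
    return out_x, out_y


def recall(y_true, y_pred, positive=1):
    """Fraction of true positives that the classifier recovered."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical toy data: six users, one of them a rumor spreader (label 1).
traditional = [[120, 3], [80, 1], [4000, 25], [95, 2], [60, 0], [150, 4]]
multimodal = [[0.1, 0.2], [0.0, 0.3], [0.9, 0.8], [0.2, 0.1], [0.1, 0.1], [0.3, 0.2]]
labels = [0, 0, 1, 0, 0, 0]

X = fuse(traditional, multimodal)
X_bal, y_bal = random_oversample(X, labels)
```

In a fuller pipeline, the balanced `X_bal` and `y_bal` would be passed to a gradient-boosted classifier such as `xgboost.XGBClassifier`, with `shap.TreeExplainer` as one way to obtain SHAP values for the fitted model. Oversampling is normally applied to the training split only, so that duplicated rows do not leak into evaluation.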
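Authors: Pan Hongpeng (潘宏鹏); Liu Zhongyi (刘忠轶) (School of Management, People's Public Security University of China, Beijing 100038, China)
Source: Data Analysis and Knowledge Discovery (《数据分析与知识发现》), PKU Core Journal, 2025, No. 2, pp. 59-70 (12 pages)
Funding: This work is one of the research outcomes of the Fundamental Research Funds of the People's Public Security University of China (Grant No. 2022JKF02004) and the Key Project of the Beijing Social Science Foundation (Grant No. 22GLA011).
Keywords: Multi-Modal Characteristics; Over-Sampling; Social Network Rumor Spreader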
