
Personality Detection with Psycholinguistic Feature Driven by Confident Learning
Cited by: 1
Abstract: With the widespread use of the Internet, a growing number of users share personal details and emotional content publicly on social platforms. These online texts often capture genuine expression in varied contexts and reflect users' internal psychological traits and personality tendencies. In recent years, research on personality detection from social media text has made significant progress. However, most researchers use public datasets directly without preprocessing, and these datasets inevitably contain noise because of how they were collected. In addition, most work relies too heavily on text semantic features extracted by pre-trained models and pays insufficient attention to psycholinguistic features. To address these issues, this study proposes a novel method for personality detection. First, a plug-and-play data cleaning module based on confident learning removes noisy data and improves dataset quality. Second, multi-level psycholinguistic features are extracted to compensate for the limitations of relying on text semantic features alone. Finally, a dynamic deep graph convolutional network refines the feature representation and completes the personality detection task. The method is evaluated on the public Kaggle MBTI dataset; compared with current advanced methods, it improves accuracy by 5.48% and F1-score by 4.22%.
Authors: WANG Chun-Dong (王春东), YANG Yu-Han (杨宇涵), LIN Hao (林浩), HUANG Si-Yuan (黄思源)
Affiliations: School of Computer Science and Engineering, Tianjin University of Technology, Tianjin 300384, China; National Engineering Laboratory for Computer Virus Prevention and Control Technology, Tianjin University of Technology, Tianjin 300384, China; School of Artificial Intelligence, Hebei University of Technology, Tianjin 300401, China
Source: Computer Systems & Applications (《计算机系统应用》), 2025, No. 7, pp. 48-58 (11 pages)
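Funding: National Key Research and Development Program of China (2023YFB2703900); Major Special Project of the Tianjin Science and Technology Commission (15ZXDSGX00030).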
Keywords: personality detection; social media; data cleaning; Myers-Briggs type indicator (MBTI)
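
The data cleaning step named in the abstract can be illustrated with a minimal sketch of the general confident learning idea. This is not the authors' module: the abstract does not give their pruning rule, so the function name `find_label_issues`, the two-class setup, and the toy probabilities below are all assumptions made for illustration. The sketch assumes `pred_probs` holds out-of-sample (for example, cross-validated) class probabilities from any classifier, and it flags an example when the model is confident in a class other than the given label.

```python
def find_label_issues(labels, pred_probs):
    """Flag examples whose given label disagrees with a confidently predicted class.

    Hypothetical helper for illustration; not taken from the paper.
    labels:     list of integer class labels (possibly noisy)
    pred_probs: list of per-class probability lists, one per example
    """
    n_classes = len(pred_probs[0])

    # Per-class threshold t_j: mean predicted probability of class j
    # over the examples whose given label is j.
    thresholds = []
    for j in range(n_classes):
        scores = [p[j] for p, y in zip(pred_probs, labels) if y == j]
        if not scores:
            raise ValueError(f"class {j} has no labeled examples")
        thresholds.append(sum(scores) / len(scores))

    flags = []
    for p, y in zip(pred_probs, labels):
        # Classes whose probability reaches that class's threshold.
        confident = [j for j in range(n_classes) if p[j] >= thresholds[j]]
        if not confident:
            # No confident class: keep the example, nothing contradicts its label.
            flags.append(False)
            continue
        best = max(confident, key=lambda j: p[j])
        flags.append(best != y)
    return flags


# Toy data (assumed): one binary trait, six users, third label likely wrong.
labels = [0, 0, 0, 1, 1, 1]
pred_probs = [
    [0.9, 0.1],
    [0.8, 0.2],
    [0.1, 0.9],
    [0.2, 0.8],
    [0.1, 0.9],
    [0.3, 0.7],
]
issues = find_label_issues(labels, pred_probs)
clean_indices = [i for i, bad in enumerate(issues) if not bad]
```

Training would then continue on the examples in `clean_indices` only. Using per-class thresholds rather than a single global cutoff keeps a class that the model predicts with systematically lower confidence from being flagged wholesale.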