Abstract
To predict the personality of social network users, we propose a method that combines information gain with semantic features to refine users' text information and applies a multi-label classification algorithm for comprehensive prediction. First, word-level features of the text, including sentiment words, part of speech, and tense, are extracted based on information gain, followed by feature selection and weighting. For semantic features, the text content is mapped to ontology concepts and semantic relatedness is computed. Then, based on the combined influence of the word-based and semantic features, a multi-label classification algorithm performs the personality prediction. The method thus handles text information from different perspectives and takes full account of the correlations among class labels. Experimental results verify the effectiveness of the proposed method.
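As an illustration of the information-gain step mentioned in the abstract, the following is a minimal sketch of scoring terms by information gain against a single binary trait label. It is not the authors' implementation: the function names (`entropy`, `information_gain`), the toy posts, and the presence/absence treatment of each term are assumptions made for this example, and the paper's weighting scheme and multi-label stage are not reproduced here.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(docs, labels, term):
    """IG(term) = H(labels) - H(labels | term present or absent)."""
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    conditional = (
        (len(with_term) / n) * entropy(with_term)
        + (len(without_term) / n) * entropy(without_term)
    )
    return entropy(labels) - conditional


# Hypothetical toy data: each post is a set of tokens, each label is one
# binary personality trait (1 = trait present, 0 = trait absent).
docs = [
    {"happy", "party"},
    {"happy", "friends"},
    {"alone", "book"},
    {"book", "quiet"},
]
labels = [1, 1, 0, 0]

# Rank the vocabulary by information gain, highest first.
vocabulary = {t for d in docs for t in d}
ranked = sorted(
    vocabulary,
    key=lambda t: information_gain(docs, labels, t),
    reverse=True,
)
```

In this toy setting, terms that split the posts cleanly by label (here "happy" and "book") receive the highest scores; in a multi-label setting one plausible variant would compute such scores per trait label, which is again an assumption rather than a detail stated in the abstract.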
Source
Journal of Jilin University (Science Edition) (《吉林大学学报(理学版)》)
CAS
CSCD
PKU Core Journal (北大核心)
2016, No. 3, pp. 561-568 (8 pages)
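Funding
National Natural Science Foundation of China (Grant No. 61472049)
Key Science and Technology Research Project of Jilin Province (Grant No. 20130206051GX)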
Keywords
social network
personality prediction
social computing
multi-label classification