Enriching short text representation in microblog for clustering (cited by: 14)
Authors: Jiliang TANG, Xufei WANG, Huiji GAO, Xia HU, Huan LIU. Frontiers of Computer Science (SCIE, EI, CSCD), 2012, Issue 1, pp. 88-101 (14 pages)
Social media websites allow users to exchange short texts such as tweets via microblogs and user status in friendship networks. Their limited length, pervasive abbreviations, and coined acronyms and words exacerbate the problems of synonymy and polysemy, and bring about new challenges to data mining applications such as text clustering and classification. To address these issues, we dissect some potential causes and devise an efficient approach that enriches data representation by employing machine translation to increase the number of features from different languages. Then we propose a novel framework which performs multi-language knowledge integration and feature reduction simultaneously through matrix factorization techniques. The proposed approach is evaluated extensively in terms of effectiveness on two social media datasets from Facebook and Twitter. With its significant performance improvement, we further investigate potential factors that contribute to the improved performance.
Keywords: short texts; text representation; multi-language knowledge; matrix factorization; social media
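
As a rough illustration of the idea summarized in the abstract, the sketch below concatenates term counts from the original texts with term counts from translated versions, then reduces the combined features with a plain nonnegative matrix factorization. It is not the authors' framework: the abstract does not specify the factorization, so standard multiplicative-update NMF is swapped in here, and the matrices `X_src` and `X_mt`, the helper `nmf`, and the choice `k=2` are all hypothetical. No translation service is called; the translated counts are hard-coded toy values.

```python
import numpy as np

def nmf(X, k, iters=200, seed=0, eps=1e-9):
    """Factor a nonnegative matrix X (docs x terms) as W @ H using
    multiplicative updates for the squared Frobenius error."""
    rng = np.random.default_rng(seed)
    W = rng.random((X.shape[0], k)) + eps   # docs x k latent features
    H = rng.random((k, X.shape[1])) + eps   # k x terms
    for _ in range(iters):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy term counts for 4 short texts in the original language (assumed data).
X_src = np.array([[1, 1, 0, 0],
                  [1, 0, 0, 0],
                  [0, 0, 1, 1],
                  [0, 0, 0, 1]], dtype=float)

# Hypothetical term counts for machine-translated versions of the same texts.
X_mt = np.array([[1, 1, 0, 0],
                 [1, 1, 0, 0],
                 [0, 0, 1, 1],
                 [0, 0, 1, 1]], dtype=float)

# Enrich the representation by placing both feature sets side by side,
# then reduce the 8 combined features to k = 2 latent ones.
X = np.hstack([X_src, X_mt])
W, H = nmf(X, k=2)

# Assign each text to its strongest latent feature as a crude cluster label.
labels = W.argmax(axis=1)
print(labels)
```

Here each row of `W` is the reduced representation of one short text, and taking the largest entry per row is only one simple way to turn it into a cluster assignment; any standard clustering method could be applied to `W` instead.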