Abstract
To reveal users' access patterns, traditional clustering-based methods for building user profiles are studied, the perspective of semantic transaction analysis is introduced, and a method for building user profiles based on a latent semantic model is proposed. First, a user session-pageview matrix is constructed, and the singular value decomposition (SVD) algorithm from semantic analysis is used to project its vector space into a latent semantic vector space. An extended K-means clustering algorithm is then applied to the latent semantic vector space to generate user session clusters. Next, the pageview mean vector is computed to construct user profiles expressed as weighted pageview collections. Finally, the weighted average visit percentage (WAVP) is used to evaluate the constructed user profiles, and the results demonstrate the validity and efficiency of the proposed method.
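The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the toy session-pageview matrix, the page names, the number of latent dimensions `k`, and the `min_weight` threshold are all assumptions made for the example. The abstract does not specify the paper's extended K-means, so plain K-means with a deterministic farthest-point initialization is used instead, and the WAVP evaluation step is not shown.

```python
import numpy as np

def lsa_project(sessions, k):
    """Project session rows onto the top-k latent dimensions via truncated SVD."""
    u, s, _ = np.linalg.svd(sessions, full_matrices=False)
    return u[:, :k] * s[:k]

def kmeans(points, n_clusters, n_iter=50):
    """Plain K-means (a stand-in for the paper's extended variant)."""
    # Deterministic farthest-point initialization, starting from the first point.
    centers = [points[0]]
    for _ in range(1, n_clusters):
        d = np.min([np.linalg.norm(points - c, axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(d)])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(dists, axis=1)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(n_clusters)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

def build_profiles(sessions, labels, pages, min_weight=0.5):
    """One profile per cluster: pages weighted by their mean value in the cluster."""
    profiles = []
    for j in np.unique(labels):
        mean_vec = sessions[labels == j].mean(axis=0)
        # min_weight is an assumed cutoff for dropping weakly visited pages.
        profiles.append({p: float(w) for p, w in zip(pages, mean_vec) if w >= min_weight})
    return profiles

# Hypothetical toy data: rows are user sessions, columns are pageviews (1 = visited).
pages = ["news", "sports", "scores", "shop", "cart"]
sessions = np.array([
    [1, 1, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0],
], dtype=float)

latent = lsa_project(sessions, k=2)      # sessions in the latent semantic space
labels = kmeans(latent, n_clusters=2)    # session clusters
profiles = build_profiles(sessions, labels, pages)
print(labels, profiles)
```

Clustering is done on the latent coordinates, while the profile weights are computed from the original pageview vectors, so each profile remains readable as a weighted set of pages.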
Source
《计算机工程与设计》
CSCD
Peking University Core Journal (北大核心)
2010, Issue 20, pp. 4497-4499, 4523 (4 pages)
Computer Engineering and Design
Funding
Jiangsu Province Higher Education Science Foundation Project (09KJB52003)
Keywords
latent semantic analysis
user profile
clustering
user transaction information
semantic vector space
singular value decomposition