Abstract
This paper first poses the problem of web spam pages in link analysis and describes it formally. It then presents, from two perspectives, ideas for selecting the seed page set. Next, it proposes a spam page detection algorithm built on improvements to the existing PageRank algorithm, together with several performance metrics that characterize the efficiency of the detection algorithm. Finally, it briefly describes a scheme for combating web spam pages based on a trust index.
Source
《微计算机信息》
Peking University Core Journal
2006, Issue 03X, pp. 4-6 (3 pages)
Control & Automation
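Funding
National Natural Science Foundation of China (No. 60273072)
National High Technology Research and Development Program of China (863 Program) (No. 2002AA423450)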