Abstract
After briefly reviewing the shortcomings of the data sources used in current Web mining, this paper analyzes the structural similarity between XML documents and Web mining algorithms, proposes X-DIM, a data source model that uses XML technology to collect user access data at the application service layer, and analyzes its advantages. The model overcomes a series of data preprocessing problems that arise in earlier approaches based on Web access logs. It offers complete data, high accuracy, and ease of use by mining algorithms, and has high application value.
Source
《计算机工程与科学》
CSCD
2007, Issue 2, pp. 36-39 (4 pages)
Computer Engineering & Science