Abstract
Many websites today are built with HTML, which makes efficient and accurate data mining very difficult; the emergence of XML has made Web-based data mining easier. Building on a study of Web data mining techniques, this paper uses XML data extraction to map semi-structured data to structured data and builds Web_mining, a mining system model with basic mining functions that targets multiple kinds of Web data. Finally, Agent technology is introduced into data mining, and an Agent-based architecture is proposed for mining large volumes of data stored in a distributed fashion. Web-based data mining techniques are then studied and discussed in depth.
Source
《计算机工程与设计》
CSCD
Peking University Core Journal (北大核心)
2007, Issue 2, pp. 272-274, 277 (4 pages)
Computer Engineering and Design