
Organizing Structured Deep Web Sources by Query Schemas
Abstract: In recent years, the Web has been rapidly "deepened" by online databases. On this deep Web, numerous sources are structured and provide schema-rich data; their schemas define the object domain and its query capabilities. The structured deep Web therefore poses challenges for large-scale information integration. Clustering is an important approach in data mining. This paper studies how to organize structured Web sources by their query schemas using the hierarchical agglomerative clustering algorithm, refined with pre-clustering and post-classification techniques. Experiments show that, by clustering query schemas, hierarchical agglomerative clustering can accurately organize sources into object domains.
Source: Modern Computer (《现代计算机》), 2006, No. 9, pp. 19-21, 62 (4 pages)
Keywords: Data Integration; Deep Web; Hierarchical Agglomerative Clustering
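
The approach named in the abstract, hierarchical agglomerative clustering over query schemas, can be illustrated with a minimal sketch. The abstract does not state the paper's similarity measure or linkage rule, so this sketch assumes Jaccard distance between attribute sets and average linkage; the sample schemas, the function names, and the `max_distance` threshold are hypothetical. The pre-clustering and post-classification refinements are not shown.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Return 1 - |a & b| / |a | b| for two attribute sets (assumed measure)."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def average_linkage(c1, c2, schemas):
    """Mean pairwise distance between the members of two clusters."""
    total = sum(jaccard_distance(schemas[i], schemas[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def agglomerative_cluster(schemas, max_distance=0.7):
    """Repeatedly merge the two closest clusters.

    Stops when the closest pair is farther apart than max_distance
    (a hypothetical threshold) or when only one cluster remains.
    """
    clusters = [[i] for i in range(len(schemas))]  # start with singletons
    while len(clusters) > 1:
        (x, y), dist = min(
            (
                ((x, y), average_linkage(clusters[x], clusters[y], schemas))
                for x, y in combinations(range(len(clusters)), 2)
            ),
            key=lambda item: item[1],
        )
        if dist > max_distance:
            break
        clusters[x] = clusters[x] + clusters[y]  # x < y, so deleting y is safe
        del clusters[y]
    return clusters


# Hypothetical query interfaces, each reduced to its set of attribute labels.
schemas = [
    {"title", "author", "isbn", "publisher"},         # book source
    {"title", "author", "keyword"},                   # book source
    {"from", "to", "departure date", "passengers"},   # airfare source
    {"from", "to", "departure date", "return date"},  # airfare source
]
clusters = agglomerative_cluster(schemas)
print(clusters)
```

With these sample schemas, the two book interfaces and the two airfare interfaces share attributes within their own domain and none across domains, so each domain ends up in its own cluster.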

