Using Histograms to Estimate the Selectivity of XPath Expressions with Value Predicates
Abstract: Selectivity estimation for path expressions is the basis of XML query optimization and an active research topic. A path expression may contain multiple branches with value predicates, and some values and nodes in XML data are highly correlated. Previous selectivity estimation methods rarely take this correlation into account; they assume instead that the selectivity of attribute values on different nodes and structures is independent and uniform, and these assumptions are the root cause of estimation error. This paper proposes a novel value-position histogram that captures the distribution of, and the correlation between, structures and values in XML data. It also defines six operations on these value histograms and on the traditional histograms that capture the positional distribution of nodes. Based on these operations, the selectivity of any node (or branch) in a path expression can be estimated. Experimental results indicate that the method requires no independence assumption and remains accurate when the structure and values of the data are non-uniformly distributed or correlated.
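To make the abstract's central claim concrete, the following is a minimal sketch of why an independence assumption can misestimate selectivity when values correlate with structural position, and how a joint histogram over (position, value) buckets avoids that error. It is not the paper's value-position histogram or any of its six operations, which this record does not specify; the toy data, the bucket width, and every function name are hypothetical.

```python
from collections import Counter

# Hypothetical toy data: one (position_bucket, price) pair per <item> node.
# Prices are strongly correlated with position: cheap items sit in bucket 0,
# expensive items in bucket 1.
nodes = [(0, 10), (0, 12), (0, 15), (0, 11),
         (1, 200), (1, 220), (1, 180), (1, 210)]

def value_bucket(price, width=100):
    """Map a price to a fixed-width value bucket (assumed width of 100)."""
    return price // width

def independent_estimate(nodes, pos, vb):
    """Estimate the match count as N * P(position) * P(value bucket)."""
    n = len(nodes)
    p_pos = sum(1 for p, _ in nodes if p == pos) / n
    p_val = sum(1 for _, v in nodes if value_bucket(v) == vb) / n
    return n * p_pos * p_val

def joint_histogram(nodes):
    """Count nodes per (position bucket, value bucket) cell."""
    return Counter((p, value_bucket(v)) for p, v in nodes)

def joint_estimate(hist, pos, vb):
    """Read the match count directly from the joint histogram cell."""
    return hist[(pos, vb)]

hist = joint_histogram(nodes)
true_count = sum(1 for p, v in nodes if p == 1 and value_bucket(v) == 2)

print("true count:          ", true_count)
print("independent estimate:", independent_estimate(nodes, 1, 2))
print("joint estimate:      ", joint_estimate(hist, 1, 2))
```

On this toy data the independence-based estimate is half the true count, while the joint histogram cell is exact because the query aligns with bucket boundaries. A real synopsis would trade some of that accuracy for space by using coarser buckets.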
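Source: Journal of Computer Research and Development (《计算机研究与发展》), indexed in EI, CSCD, and the Peking University Core Journals list; 2006, No. 2, pp. 288-294 (7 pages)
Funding: National Natural Science Foundation of China (60073014, 60273018); National High Technology Research and Development Program of China ("863" Program) (2002AA116030); Key Science and Technology Project of the Ministry of Education (03044); Hebei University Doctoral Research Start-up Fund (Y2005050)
Keywords: XML; query optimization; selectivity; histogram; predicate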
References (11)

  • [1] J. McHugh, J. Widom. Query optimization for XML. In: Proc. 25th VLDB Conf. San Francisco: Morgan Kaufmann, 1999. 315-326.
  • [2] A. Aboulnaga, R. A. Alameldeen, J. Naughton. Estimating the selectivity of XML path expressions for internet scale applications. In: Proc. 27th VLDB Conf. San Francisco: Morgan Kaufmann, 2001. 591-600.
  • [3] Z. Chen, V. H. Jagadish, F. Korn, et al. Counting twig matches in a tree. In: Proc. 17th ICDE Conf. Los Alamitos, CA: IEEE Computer Society Press, 2001. 595-604.
  • [4] N. Polyzotis, M. Garofalakis. Statistical synopses for graph-structured XML databases. In: Proc. 2002 ACM SIGMOD Conf. New York: ACM Press, 358-369.
  • [5] J. Freire, R. Jayant, M. Ramanath, et al. StatiX: Making XML count. In: Proc. 2002 ACM SIGMOD Conf. New York: ACM Press, 2002. 181-191.
  • [6] Y. Wu, J. Patel, H. Jagadish. Estimating answer sizes for XML queries. In: Proc. 8th EDBT Conf. Berlin: Springer, 2002. 590-608.
  • [7] W. Wang, H. Jiang, H. Lu, et al. Containment join size estimation: Models and methods. In: Proc. 2003 ACM SIGMOD Conf. New York: ACM Press, 2003. 145-156.
  • [8] N. Polyzotis, M. Garofalakis. Structure and value synopses for XML data graphs. In: Proc. 28th VLDB Conf. San Francisco, CA: Morgan Kaufmann, 2002.
  • [9] N. Polyzotis, M. Garofalakis, Y. Ioannidis. Selectivity estimation for XML twigs. In: Proc. 20th ICDE Conf. Los Alamitos, CA: IEEE Computer Society Press, 2004.
  • [10] CWI. XMark. http://monetdb.cwi.nl/xml, 2003-01.