Funding: Project supported by the National Natural Science Foundation of China (Nos. 60603044 and 60803003), the Program for the Changjiang Scholars and Innovative Research Team in University (No. IRT0652), and the Key Technology Projects of Zhejiang Province, China (No. 2006c11108)
Abstract: Finding all occurrences of a twig pattern is a core operation of extensible markup language (XML) query processing. Holistic twig join algorithms, which avoid a large number of intermediate results, represent the state of the art. However, ordered XML twig join is rarely mentioned in the literature, and previous algorithms developed to solve the problem of ordered twig pattern (OTP) matching perform poorly. In this paper, we first propose a novel children linked stacks encoding scheme to compactly represent partial ordered twig join results. Based on this encoding scheme and extended Dewey, we design a novel holistic OTP matching algorithm, called OTJFast, which needs to access only the labels of the leaf query nodes. Furthermore, we propose a new algorithm, named OTJFaster, which incorporates three effective optimization rules to avoid unnecessary computations. OTJFaster works well with available indices (such as the B+-tree), skipping useless elements. Thus, not only is disk access greatly reduced, but many unnecessary computations are also avoided. Finally, our extensive experiments over both real and synthetic datasets indicate that our algorithms are superior to previous approaches.
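To illustrate why label-only access can suffice for structural checks, here is a minimal sketch using plain Dewey labels. It is not OTJFast: extended Dewey, as we understand it, additionally encodes element names so that a root-to-node path can be derived from a label, and that part is omitted here. The `Label` type, the function names, and the sibling-order reading in `ordered_before` are all assumptions made for illustration.

```python
from typing import Tuple

# Hypothetical representation: a plain Dewey label is the tuple of child
# positions on the path from the root, e.g. (1, 2, 3).
Label = Tuple[int, ...]


def is_ancestor(a: Label, b: Label) -> bool:
    """True if the node labeled `a` is a proper ancestor of the node labeled `b`."""
    return len(a) < len(b) and b[:len(a)] == a


def is_parent(a: Label, b: Label) -> bool:
    """True if the node labeled `a` is the parent of the node labeled `b`."""
    return len(a) + 1 == len(b) and b[:len(a)] == a


def ordered_before(a: Label, b: Label) -> bool:
    """True if `a` comes before `b` in document order and is not its ancestor.

    Tuple comparison is lexicographic, which matches preorder document order
    for Dewey labels. Excluding ancestors is one possible reading of the
    left-to-right constraint between branches of an ordered twig.
    """
    return a < b and not is_ancestor(a, b)
```

For example, `is_ancestor((1,), (1, 2, 3))` holds because `(1,)` is a proper prefix of `(1, 2, 3)`, while `ordered_before((1,), (1, 2))` does not, since the first node contains the second rather than preceding it.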
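Abstract: To address the multi-table join optimization problem in database query optimization, where the task is to find a join order that makes the query execution plan optimal, this work proposes SmartEncoder, an embedding representation method for query statements. By optimizing the embedding representation of the multi-table joins in a query statement, the method obtains richer information about the joins. Multi-table join order selection is modeled as a deep reinforcement learning problem: joins are selected according to the probability distribution over actions, and the model learns from past experience to generate better query execution plans. Experimental results on the Join Order Benchmark dataset show that SmartEncoder effectively improves query efficiency.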
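Selecting a join according to a probability distribution over actions can be sketched as follows. This is only an illustration of that single step, assuming the distribution is a softmax over per-join scores; the scores, the table names, and the functions `softmax` and `sample_join` are hypothetical and are not taken from SmartEncoder.

```python
import math
import random
from typing import Dict, Tuple

# Hypothetical action type: a candidate join between two tables.
Join = Tuple[str, str]


def softmax(scores: Dict[Join, float]) -> Dict[Join, float]:
    """Turn arbitrary real-valued scores into a probability distribution."""
    m = max(scores.values())  # subtract the max for numerical stability
    exps = {join: math.exp(s - m) for join, s in scores.items()}
    total = sum(exps.values())
    return {join: e / total for join, e in exps.items()}


def sample_join(scores: Dict[Join, float], rng: random.Random) -> Join:
    """Sample the next join in proportion to its softmax probability."""
    probs = softmax(scores)
    joins = list(probs)
    weights = [probs[j] for j in joins]
    return rng.choices(joins, weights=weights, k=1)[0]


# Assumed scores, as a learned policy might output them for one state.
candidate_scores = {("A", "B"): 2.0, ("A", "C"): 0.5, ("B", "C"): -1.0}
next_join = sample_join(candidate_scores, random.Random(0))
```

Sampling rather than always taking the highest-scoring join keeps some exploration during training, which is the usual motivation for a stochastic policy.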
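Abstract: Join order selection is a highly challenging research direction in query optimization and is critical for a database management system to achieve good query performance. However, both traditional optimization methods and existing intelligent optimization methods have shortcomings, such as excessive planning time, a tendency to produce poor-quality join plans, encodings that ignore structural features, and a reliance on cardinality and cost estimation that keeps join plans from reflecting actual execution time. To address these problems, a new join optimizer based on asynchronous Dueling DQN (Deep Q-network) and a plan latency prediction network is proposed: ADP-Join (Asynchronous Dueling DQN and Plan Latency Prediction Network for Join Order Selection). ADP-Join integrates a new encoding method that can distinguish join plans with different structures. ADP-Join designs a plan latency prediction network, PLN (Plan Latency Prediction Network), to improve the reward mechanism of existing reinforcement-learning-based optimizers. In addition, an asynchronous update mechanism is proposed to improve the Dueling DQN model, raising training performance and reducing training time. Extensive experimental results show that ADP-Join outperforms existing intelligent optimizers on the TPC-H and JOB real datasets.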
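For readers unfamiliar with Dueling DQN, the sketch below shows the standard dueling aggregation step, in which a state value and per-action advantages are combined as Q(s, a) = V(s) + A(s, a) - mean(A). It covers only that step and does not reproduce ADP-Join's asynchronous update mechanism, its plan encoding, or PLN; the function name and the example numbers are assumptions.

```python
from typing import List


def dueling_q_values(state_value: float, advantages: List[float]) -> List[float]:
    """Combine V(s) and A(s, a) into Q(s, a) using the mean-subtracted form.

    Subtracting the mean advantage makes the split between value and
    advantage identifiable: the advantages are centered around zero.
    """
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + a - mean_advantage for a in advantages]


# Assumed outputs of the two network heads for one state with three
# candidate joins (actions).
q_values = dueling_q_values(1.0, [1.0, 2.0, 3.0])
best_action = max(range(len(q_values)), key=q_values.__getitem__)
```

In a join order setting, each action would correspond to one candidate join, and the greedy choice is the action with the largest Q-value.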