The author investigates the query optimization problem for parallel relational databases. A multi - weighted tree based query optimization method is proposed. The method consists of a multi - weighted tree based paral...The author investigates the query optimization problem for parallel relational databases. A multi - weighted tree based query optimization method is proposed. The method consists of a multi - weighted tree based parallel query plan model, a cost model for parallel qury plans and a query optimizer. The parallel query plan model is the first one to model all basic relational operations, all three types of parallelism of query execution, processor and memory allocation to operations, memory allocation to the buffers between operations in pipelines and data redistribution among processors. The cost model takes the waiting time of the operations in pipelining execution into consideration and is computable in a bottom - up fashion. The query optimizer addresses the query optimization problem in the context of Select - Project - Join queries that are widely used in commercial DBMSs. Several heuristics determining the processor allocation to operations are derived and used in the query optimizer. The query optimizer is aware of memory resources in order to generate good - quality plans. It includes the heuristics for determining the memory allocation to operations and buffers between operations in pipelines so that the memory resourse is fully exploit. In addition, multiple algorithms for implementing join operations are consided in the query optimizer. The query optimizer can make an optimal choice of join algorithm for each join operation in a query. The proposed query optimization method has been used in a prototype parallel database management system designed and implemented by the author.展开更多
Recently, attention has been focused on spatial query language which is used to query spatial databases. A design of spatial query language has been presented in this paper by extending the standard relational databas...Recently, attention has been focused on spatial query language which is used to query spatial databases. A design of spatial query language has been presented in this paper by extending the standard relational database query language SQL. It recognizes the significantly different requirements of spatial data handling and overcomes the inherent problems of the application of conventional database query languages. This design is based on an extended spatial data model, including the spatial data types and the spatial operators on them. The processing and optimization of spatial queries have also been discussed in this design. In the end, an implementation of this design is given in a spatial query subsystem.展开更多
Dynamic programming(DP) is an effective query optimization approach to select an appropriate join order for relational database management system(RDBMS) in multi-table joins. This method was extended and made availabl...Dynamic programming(DP) is an effective query optimization approach to select an appropriate join order for relational database management system(RDBMS) in multi-table joins. This method was extended and made available in distributed DBMS(D-DBMS). The structure of this optimal solution was firstly characterized according to the distributing status of tables and data, and then the recurrence relations between a problem and its sub-problems were recursively defined. DP in D-DBMS has the same time-complexity with that in centralized DBMS, while it has the capability to solve a much more sophisticated optimal problem of multi-table join in D-DBMS. The effectiveness of this optimal strategy has been proved by experiments.展开更多
A systematic, efficient compilation method for query evaluation of DeductiveDatabases (DeDB) is proposed in this paper. In order to eliminate redundancyand to minimize the potentially relevant facts, which are two key...A systematic, efficient compilation method for query evaluation of DeductiveDatabases (DeDB) is proposed in this paper. In order to eliminate redundancyand to minimize the potentially relevant facts, which are two key issues to theefficiency of a DeDB, the compilation process is decomposed into two phases.The first is the pre-compilation phase, which is responsible for the minimiza-tion of the potentially relevant facts. The second, which we refer to as thegeneral compilation phase, is responsible for the elimination of redundancy.The rule/goal graph devised by J. D. Ullman is appropriately extended andused as a uniform formalism. Two general algorithms corresponding to the twophases respectively are described intuitively and formally展开更多
Fundamentally, semantic grid database is about bringing globally distributed databases together in order to coordinate resource sharing and problem solving in which information is given well-defined meaning, and DartG...Fundamentally, semantic grid database is about bringing globally distributed databases together in order to coordinate resource sharing and problem solving in which information is given well-defined meaning, and DartGrid II is the implemented database gird system whose goal is to provide a semantic solution for integrating database resources on the Web. Although many algorithms have been proposed for optimizing query-processing in order to minimize costs and/or response time, associated with obtaining the answer to query in a distributed database system, database grid query optimization problem is fundamentally different from traditional distributed query optimization. These differences are shown to be the consequences of autonomy and heterogeneity of database nodes in database grid. Therefore, more challenges have arisen for query optimization in database grid than traditional distributed database. Following this observation, the design of a query optimizer in DartGrid II is presented, and a heuristic, dynamic and parallel query optimization approach to processing query in database grid is proposed. A set of semantic tools supporting relational database integration and semantic-based information browsing has also been implemented to realize the above vision.展开更多
The query optimizer uses cost-based optimization to create an execution plan with the least cost,which also consumes the least amount of resources.The challenge of query optimization for relational database systems is...The query optimizer uses cost-based optimization to create an execution plan with the least cost,which also consumes the least amount of resources.The challenge of query optimization for relational database systems is a combinatorial optimization problem,which renders exhaustive search impossible as query sizes rise.Increases in CPU performance have surpassed main memory,and disk access speeds in recent decades,allowing data compression to be used—strategies for improving database performance systems.For performance enhancement,compression and query optimization are the two most factors.Compression reduces the volume of data,whereas query optimization minimizes execution time.Compressing the database reduces memory requirement,data takes less time to load into memory,fewer buffer missing occur,and the size of intermediate results is more diminutive.This paper performed query optimization on the graph database in a cloud dew environment by considering,which requires less time to execute a query.The factors compression and query optimization improve the performance of the databases.This research compares the performance of MySQL and Neo4j databases in terms of memory usage and execution time running on cloud dew servers.展开更多
As the popularity of XML (extensible Markup Language) keeps growing rapidly,the management of XML compliant structured-document databases has become a very interesting andcompelling research area. Query optimization f...As the popularity of XML (extensible Markup Language) keeps growing rapidly,the management of XML compliant structured-document databases has become a very interesting andcompelling research area. Query optimization for XML structured-documents stands out as one of themost challenging research issues in this area because of the much enlarged optimization (search)space, which is a consequence of the intrinsic complexity of the underlying data model of XML data.We therefore propose to apply deterministic transformations on query expressions to mostaggressively prune the search space and fast achieve a sufficiently improved alternative (if not theoptimal) for each incoming query expression. This idea is not just exciting but practicallyattainable. This paper first provides an overview of our optimization strategy, and then focuses onthe key implementation issues of our rule-based transformation system for XML query optimization ina database environment. The performance results we obtained from experimentation show that ourapproach is a valid and effective one.展开更多
Through the mapping from UMQL ( unified multimedia query language) conditional expressions to UMQA (unified multimedia query algebra) query operations, a translation algorithm from a UMQL query to a UMQA query pla...Through the mapping from UMQL ( unified multimedia query language) conditional expressions to UMQA (unified multimedia query algebra) query operations, a translation algorithm from a UMQL query to a UMQA query plan is put forward, which can generate an equivalent UMQA internal query plan for any UMQL query. Then, to improve the execution costs of UMQA query plans effectively, equivalent UMQA translation formulae and general optimization strategies are studied, and an optimization algorithm for UMQA internal query plans is presented. This algorithm uses equivalent UMQA translation formulae to optimize query plans, and makes the optimized query plans accord with the optimization strategies as much as possible. Finally, the logic implementation methods of UMQA plans, i.e., logic implementation methods of UMQA operators, are discussed to obtain useful target data from a muifirnedia database. All of these algorithms are implemented in a UMQL prototype system. Application results show that these query processing techniques are feasible and applicable.展开更多
We developed a parallel object relational DBMS named PORLES. It uses BSP model as its parallel computing model, and monoid calculus as its basis of data model. In this paper, we introduce its data model, parallel que...We developed a parallel object relational DBMS named PORLES. It uses BSP model as its parallel computing model, and monoid calculus as its basis of data model. In this paper, we introduce its data model, parallel query optimization, transaction processing system and parallel access method in detail.展开更多
在目前基于深度强化学习的数据库索引推荐中,当负载变化时,由于实际负载与训练负载差距较大,模型的推荐效果会显著下降。针对现有基于深度强化学习的索引推荐算法在负载增量变化下自适应性和模型泛化性不足的问题,提出了一个基于多智能...在目前基于深度强化学习的数据库索引推荐中,当负载变化时,由于实际负载与训练负载差距较大,模型的推荐效果会显著下降。针对现有基于深度强化学习的索引推荐算法在负载增量变化下自适应性和模型泛化性不足的问题,提出了一个基于多智能体迁移强化学习的索引推荐算法MARLIA(multi-agent reinforcement learning index advisor)。该算法结合了迁移学习的思想,使用多智能体进行模型训练。在负载变化更新导致模型推荐效果下降时,该算法可以利用策略蒸馏的方式将旧索引推荐策略传递给新索引推荐智能体,提高了模型的泛化性和对动态负载的支持。在TPC-H数据集上的实验结果表明,该算法的负载代价提升率与基线算法相比稳定在7%以内,在负载为120条时缓存命中率为76.3%。该研究表明,MARLIA算法在负载变化时具有强大的自适应性和模型泛化能力。展开更多
基金Supported by the National Natural Science Foundation of China National (9846-004) '863' High -Technique Program of China (8
文摘The author investigates the query optimization problem for parallel relational databases. A multi - weighted tree based query optimization method is proposed. The method consists of a multi - weighted tree based parallel query plan model, a cost model for parallel qury plans and a query optimizer. The parallel query plan model is the first one to model all basic relational operations, all three types of parallelism of query execution, processor and memory allocation to operations, memory allocation to the buffers between operations in pipelines and data redistribution among processors. The cost model takes the waiting time of the operations in pipelining execution into consideration and is computable in a bottom - up fashion. The query optimizer addresses the query optimization problem in the context of Select - Project - Join queries that are widely used in commercial DBMSs. Several heuristics determining the processor allocation to operations are derived and used in the query optimizer. The query optimizer is aware of memory resources in order to generate good - quality plans. It includes the heuristics for determining the memory allocation to operations and buffers between operations in pipelines so that the memory resourse is fully exploit. In addition, multiple algorithms for implementing join operations are consided in the query optimizer. The query optimizer can make an optimal choice of join algorithm for each join operation in a query. The proposed query optimization method has been used in a prototype parallel database management system designed and implemented by the author.
基金This work is supported by the National High Technology Research and Development Program ofChina(2 0 0 2 AA135 2 30 ) and the Major Project of National Natural Science Foundation of Beijing(4 0 110 0 2 ) .
文摘Recently, attention has been focused on spatial query language which is used to query spatial databases. A design of spatial query language has been presented in this paper by extending the standard relational database query language SQL. It recognizes the significantly different requirements of spatial data handling and overcomes the inherent problems of the application of conventional database query languages. This design is based on an extended spatial data model, including the spatial data types and the spatial operators on them. The processing and optimization of spatial queries have also been discussed in this design. In the end, an implementation of this design is given in a spatial query subsystem.
文摘Dynamic programming(DP) is an effective query optimization approach to select an appropriate join order for relational database management system(RDBMS) in multi-table joins. This method was extended and made available in distributed DBMS(D-DBMS). The structure of this optimal solution was firstly characterized according to the distributing status of tables and data, and then the recurrence relations between a problem and its sub-problems were recursively defined. DP in D-DBMS has the same time-complexity with that in centralized DBMS, while it has the capability to solve a much more sophisticated optimal problem of multi-table join in D-DBMS. The effectiveness of this optimal strategy has been proved by experiments.
文摘A systematic, efficient compilation method for query evaluation of DeductiveDatabases (DeDB) is proposed in this paper. In order to eliminate redundancyand to minimize the potentially relevant facts, which are two key issues to theefficiency of a DeDB, the compilation process is decomposed into two phases.The first is the pre-compilation phase, which is responsible for the minimiza-tion of the potentially relevant facts. The second, which we refer to as thegeneral compilation phase, is responsible for the elimination of redundancy.The rule/goal graph devised by J. D. Ullman is appropriately extended andused as a uniform formalism. Two general algorithms corresponding to the twophases respectively are described intuitively and formally
文摘Fundamentally, semantic grid database is about bringing globally distributed databases together in order to coordinate resource sharing and problem solving in which information is given well-defined meaning, and DartGrid II is the implemented database gird system whose goal is to provide a semantic solution for integrating database resources on the Web. Although many algorithms have been proposed for optimizing query-processing in order to minimize costs and/or response time, associated with obtaining the answer to query in a distributed database system, database grid query optimization problem is fundamentally different from traditional distributed query optimization. These differences are shown to be the consequences of autonomy and heterogeneity of database nodes in database grid. Therefore, more challenges have arisen for query optimization in database grid than traditional distributed database. Following this observation, the design of a query optimizer in DartGrid II is presented, and a heuristic, dynamic and parallel query optimization approach to processing query in database grid is proposed. A set of semantic tools supporting relational database integration and semantic-based information browsing has also been implemented to realize the above vision.
文摘The query optimizer uses cost-based optimization to create an execution plan with the least cost,which also consumes the least amount of resources.The challenge of query optimization for relational database systems is a combinatorial optimization problem,which renders exhaustive search impossible as query sizes rise.Increases in CPU performance have surpassed main memory,and disk access speeds in recent decades,allowing data compression to be used—strategies for improving database performance systems.For performance enhancement,compression and query optimization are the two most factors.Compression reduces the volume of data,whereas query optimization minimizes execution time.Compressing the database reduces memory requirement,data takes less time to load into memory,fewer buffer missing occur,and the size of intermediate results is more diminutive.This paper performed query optimization on the graph database in a cloud dew environment by considering,which requires less time to execute a query.The factors compression and query optimization improve the performance of the databases.This research compares the performance of MySQL and Neo4j databases in terms of memory usage and execution time running on cloud dew servers.
文摘As the popularity of XML (extensible Markup Language) keeps growing rapidly,the management of XML compliant structured-document databases has become a very interesting andcompelling research area. Query optimization for XML structured-documents stands out as one of themost challenging research issues in this area because of the much enlarged optimization (search)space, which is a consequence of the intrinsic complexity of the underlying data model of XML data.We therefore propose to apply deterministic transformations on query expressions to mostaggressively prune the search space and fast achieve a sufficiently improved alternative (if not theoptimal) for each incoming query expression. This idea is not just exciting but practicallyattainable. This paper first provides an overview of our optimization strategy, and then focuses onthe key implementation issues of our rule-based transformation system for XML query optimization ina database environment. The performance results we obtained from experimentation show that ourapproach is a valid and effective one.
基金The National High Technology Research and Development Program of China(863 Program) (No.2006AA01Z430)
文摘Through the mapping from UMQL ( unified multimedia query language) conditional expressions to UMQA (unified multimedia query algebra) query operations, a translation algorithm from a UMQL query to a UMQA query plan is put forward, which can generate an equivalent UMQA internal query plan for any UMQL query. Then, to improve the execution costs of UMQA query plans effectively, equivalent UMQA translation formulae and general optimization strategies are studied, and an optimization algorithm for UMQA internal query plans is presented. This algorithm uses equivalent UMQA translation formulae to optimize query plans, and makes the optimized query plans accord with the optimization strategies as much as possible. Finally, the logic implementation methods of UMQA plans, i.e., logic implementation methods of UMQA operators, are discussed to obtain useful target data from a muifirnedia database. All of these algorithms are implemented in a UMQL prototype system. Application results show that these query processing techniques are feasible and applicable.
文摘We developed a parallel object relational DBMS named PORLES. It uses BSP model as its parallel computing model, and monoid calculus as its basis of data model. In this paper, we introduce its data model, parallel query optimization, transaction processing system and parallel access method in detail.
文摘在目前基于深度强化学习的数据库索引推荐中,当负载变化时,由于实际负载与训练负载差距较大,模型的推荐效果会显著下降。针对现有基于深度强化学习的索引推荐算法在负载增量变化下自适应性和模型泛化性不足的问题,提出了一个基于多智能体迁移强化学习的索引推荐算法MARLIA(multi-agent reinforcement learning index advisor)。该算法结合了迁移学习的思想,使用多智能体进行模型训练。在负载变化更新导致模型推荐效果下降时,该算法可以利用策略蒸馏的方式将旧索引推荐策略传递给新索引推荐智能体,提高了模型的泛化性和对动态负载的支持。在TPC-H数据集上的实验结果表明,该算法的负载代价提升率与基线算法相比稳定在7%以内,在负载为120条时缓存命中率为76.3%。该研究表明,MARLIA算法在负载变化时具有强大的自适应性和模型泛化能力。