Abstract: Formulating relational database queries on small devices such as PDAs and mobile phones is harder than on conventional devices such as personal computers and laptops. Because these devices have restricted size and resources, most existing work limits users to queries predetermined by the developers, which limits how well such devices can support robust queries. This paper therefore proposes a database query language based on the universal relation and targeted at small devices. The language lets users formulate relational database queries with minimal query terms. The formulation and structure of the language are described, and usability test results are presented to support its effectiveness.
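The idea of querying with minimal terms can be illustrated with a short sketch. It is not the language proposed in the paper: the two-table schema, the function names, and the naive rule used here (greedily pick tables that hold the requested attributes and combine them with `NATURAL JOIN`) are all assumptions made for illustration. The point is only that, under a universal relation view, the user names attributes and the system fills in the tables and joins.

```python
# Hypothetical schema (an assumption): table name -> attribute names.
SCHEMA = {
    "employee": {"emp_id", "emp_name", "dept_id"},
    "department": {"dept_id", "dept_name"},
}


def tables_for(attributes, schema=SCHEMA):
    """Greedily pick tables until every requested attribute is covered."""
    remaining = set(attributes)
    chosen = []
    while remaining:
        # Table covering the most uncovered attributes; sorted() keeps ties deterministic.
        best = max(sorted(schema), key=lambda t: len(schema[t] & remaining))
        if not schema[best] & remaining:
            raise ValueError(f"unknown attributes: {sorted(remaining)}")
        chosen.append(best)
        remaining -= schema[best]
    return chosen


def to_sql(select, where=None, schema=SCHEMA):
    """Build (sql, params) from attribute names only; no table names are supplied."""
    where = where or {}
    needed = list(select) + [a for a in where if a not in select]
    tables = tables_for(needed, schema)
    sql = f"SELECT {', '.join(select)} FROM {' NATURAL JOIN '.join(tables)}"
    if where:
        # Values are passed as parameters rather than spliced into the SQL text.
        sql += " WHERE " + " AND ".join(f"{attr} = ?" for attr in where)
    return sql, list(where.values())


sql, params = to_sql(["emp_name"], {"dept_name": "Sales"})
print(sql)     # SELECT emp_name FROM department NATURAL JOIN employee WHERE dept_name = ?
print(params)  # ['Sales']
```

The sketch ignores the hard part of universal relation interfaces, namely choosing among several possible join paths; a greedy cover is enough only for a schema this small.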
Funding: Supported by the Sichuan Science and Technology Program (2024YFHZ0161).
Abstract: Cardinality estimation is crucial for query optimization, but traditional methods struggle with complex queries. We propose LW-CQ, a lightweight machine learning-based algorithm that improves cardinality estimation for complex queries by enhancing the LW-XGB method. LW-CQ introduces four feature-level improvements and extends support to disjunctive queries and LIKE predicates. Experimental results show that LW-CQ achieves competitive accuracy while significantly reducing training and inference time, making it a promising solution for real-world database applications.
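One standard way to reuse a conjunctive-query estimator for a disjunction is the inclusion-exclusion identity |A OR B| = |A| + |B| - |A AND B|. The abstract does not say whether LW-CQ handles disjunctions this way, so the following is only an illustrative sketch: `exact_conjunctive_count` is a hypothetical stand-in for a learned estimator, the table is made up, and range predicates are modelled as closed intervals per column.

```python
from typing import Callable, Dict, Tuple

# A conjunctive range query: column -> closed interval [lo, hi].
Box = Dict[str, Tuple[float, float]]

# Tiny hypothetical table used in place of real data.
ROWS = [
    {"age": 25, "salary": 30},
    {"age": 35, "salary": 50},
    {"age": 45, "salary": 70},
    {"age": 55, "salary": 90},
]


def exact_conjunctive_count(box: Box, rows=ROWS) -> float:
    """Stand-in for a learned estimator: counts rows satisfying every range."""
    return float(sum(
        all(lo <= row[col] <= hi for col, (lo, hi) in box.items())
        for row in rows
    ))


def intersect(a: Box, b: Box) -> Box:
    """Conjunction of two boxes: intersect the intervals column by column."""
    out = dict(a)
    for col, (lo, hi) in b.items():
        if col in out:
            out[col] = (max(out[col][0], lo), min(out[col][1], hi))
        else:
            out[col] = (lo, hi)
    return out


def disjunctive_count(a: Box, b: Box,
                      estimate: Callable[[Box], float] = exact_conjunctive_count) -> float:
    """Cardinality of (a OR b) by inclusion-exclusion over conjunctive estimates."""
    return estimate(a) + estimate(b) - estimate(intersect(a, b))


young = {"age": (20, 40)}
mid_salary = {"salary": (50, 80)}
print(disjunctive_count(young, mid_salary))  # 3.0
```

With a learned estimator in place of the exact counter, the three estimation errors can compound, which is one reason a method might prefer to encode disjunctions directly in the features instead.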
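Abstract: Queries are the primary operation of a database system, and query performance directly determines application response time and user experience. When multiple queries run concurrently, they contend for or share database system resources, producing query interaction (QI), which is the main factor affecting query performance. Measuring QI accurately is key to choosing a suitable execution plan for a query and improving its performance. QI changes dynamically as the operations in the queries execute, but existing measurement methods consider only resource usage at the moment a new query joins and ignore how resource usage changes during execution, so their measurements are inaccurate. To address this, the paper proposes a query-mix time-series heterogeneous graph that describes how QI within a query mix changes over time; a Time-Aware Multi-edge Type Weight Calculation model (TA-MTWC) that computes the edge weights between operation nodes in the heterogeneous graph at any point during execution, capturing the dynamics of QI over time; and a Query-mix Time-series Heterogeneous Graph Classification model (QTHGC) that uses a long short-term memory (LSTM) neural network to learn the temporal relationships among graph representations at multiple time points and select execution plans for concurrent queries. Experiments on PostgreSQL show that the average accuracy of QTHGC is 51.2% higher than that of the query optimizer and 2.87% higher than that of QHGC, the most recent existing model.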
Funding: The National High Technology Research and Development Program of China (863 Program) (No. 2006AA01Z430).
Abstract: Through the mapping from UMQL (unified multimedia query language) conditional expressions to UMQA (unified multimedia query algebra) query operations, a translation algorithm from a UMQL query to a UMQA query plan is put forward, which can generate an equivalent UMQA internal query plan for any UMQL query. Then, to reduce the execution costs of UMQA query plans effectively, equivalent UMQA translation formulae and general optimization strategies are studied, and an optimization algorithm for UMQA internal query plans is presented. This algorithm uses the equivalent UMQA translation formulae to optimize query plans and makes the optimized plans accord with the optimization strategies as far as possible. Finally, the logical implementation methods of UMQA plans, i.e., of UMQA operators, are discussed for obtaining useful target data from a multimedia database. All of these algorithms are implemented in a UMQL prototype system. Application results show that these query processing techniques are feasible and applicable.
Abstract: An approximate approach to querying between heterogeneous ontology-based information systems based on an association matrix is proposed. First, the association matrix is defined to describe relations between concepts in two ontologies. Then, a method of rewriting queries based on the association matrix is presented to solve the ontology heterogeneity problem. It rewrites queries in one ontology into approximate queries in another ontology based on the subsumption relations between concepts. The method also represents queries as vectors and combines these vectors with the association matrix, so that the results take the disjoint relations between concepts into account. It can obtain better approximations than the methods currently in use, which do not consider disjoint relations. The method can be carried out automatically by machines. It is simple to implement and is expected to run quite fast.
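The abstract does not define the association matrix or the vector computation, so the following is a minimal sketch under an assumed encoding, not the paper's method. Here the entry for a pair of concepts is 1 when the target concept is subsumed by the source concept, -1 when the two are disjoint, and 0 when nothing is known; a conjunctive query is a 0/1 vector over the source concepts. The concept names are hypothetical.

```python
# Hypothetical concepts in the source ontology (rows) and target ontology (columns).
A_CONCEPTS = ["Vehicle", "Electric"]
B_CONCEPTS = ["Car", "ElectricCar", "Bicycle", "DieselTruck"]

# Assumed encoding: M[i][j] = 1 if B_j is subsumed by A_i,
# -1 if B_j is disjoint from A_i, 0 if the relation is unknown.
M = [
    [1, 1, 1, 1],    # Vehicle
    [0, 1, -1, -1],  # Electric
]


def rewrite(q, m=M, targets=B_CONCEPTS):
    """Rewrite a conjunctive query q (0/1 vector over A_CONCEPTS).

    Returns (certain, possible): target concepts subsumed by every requested
    source concept, and target concepts that are merely not ruled out.
    """
    required = sum(q)
    certain, possible = [], []
    for j, name in enumerate(targets):
        # Disjointness with any requested concept rules the target out.
        if any(q[i] and m[i][j] == -1 for i in range(len(q))):
            continue
        # Dot product of the query vector with column j of the matrix.
        score = sum(q[i] * m[i][j] for i in range(len(q)))
        (certain if score == required else possible).append(name)
    return certain, possible


# Query: Vehicle AND Electric
print(rewrite([1, 1]))  # (['ElectricCar'], ['Car'])
```

A method that used only the subsumption entries could not tell `Car` apart from `Bicycle` in this example; the -1 entries are what allow `Bicycle` and `DieselTruck` to be excluded outright.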
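Abstract: By lowering the technical barrier for non-expert users interacting with relational databases, Text2SQL has become an important tool for data analysis and database management. The introduction of large language models (LLMs), represented by GPT, has further improved the performance of Text2SQL systems. However, because spatial data involves complex geometric relationships, diverse query types, and a need for high-precision semantic understanding, existing Text2SQL techniques are difficult to apply directly to spatial databases. To address these problems and lower the barrier for ordinary users interacting with spatial databases, a natural language query (NLQ) conversion method for spatial databases is proposed. The method has two core stages: (1) natural language understanding and (2) executable language generation. Stage (1) uses an entity information extraction algorithm to extract the key query entities and builds a spatial data query corpus based on a large language model to determine the query type. Stage (2) selects a structured language model (SLM) according to the query type and then maps the entities into it to obtain the final executable language for the spatial database. Experimental results on multiple real-world datasets show that the method efficiently converts users' natural language queries into executable spatial database language.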
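Stage (2) can be pictured as slot filling. The sketch below is an illustration under stated assumptions, not the paper's implementation: the templates stand in for the structured language models, the table names, the `geom` and `name` columns, and the entity dictionary are hypothetical, and stage (1) is assumed to have already produced the query type and entities. `ST_DWithin` and `ST_Contains` are real PostGIS functions; the `%s` placeholders follow the parameter style used by psycopg.

```python
# Hypothetical templates, one per query type, standing in for the SLMs.
TEMPLATES = {
    "range": (
        "SELECT name FROM {target} WHERE ST_DWithin("
        "geom, (SELECT geom FROM {anchor_table} WHERE name = %s), %s)"
    ),
    "containment": (
        "SELECT name FROM {target} WHERE ST_Contains("
        "(SELECT geom FROM {anchor_table} WHERE name = %s), geom)"
    ),
}

# Identifiers cannot be bound as parameters, so they are checked against a whitelist.
ALLOWED_TABLES = {"hospitals", "parks", "districts", "schools"}


def generate(query_type, entities):
    """Map extracted entities into the template for the given query type."""
    template = TEMPLATES[query_type]
    for key in ("target", "anchor_table"):
        if entities[key] not in ALLOWED_TABLES:
            raise ValueError(f"unexpected table: {entities[key]!r}")
    sql = template.format(target=entities["target"],
                          anchor_table=entities["anchor_table"])
    params = [entities["anchor_name"]]
    if query_type == "range":
        # Distance units depend on the column type and SRID of the geometries.
        params.append(entities["distance"])
    return sql, params


# "Which hospitals are within 500 of Central Park?" after stage (1), by assumption:
entities = {"target": "hospitals", "anchor_table": "parks",
            "anchor_name": "Central Park", "distance": 500}
sql, params = generate("range", entities)
print(sql)
print(params)
```

Keeping values as bound parameters and restricting table names to a whitelist means that text extracted from the user's question is never concatenated into the SQL string as code.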