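Abstract: Large language models (LLMs) have become powerful tools for advancing the Text-to-SQL task. Studies have found that LLM-based models perform markedly differently from fine-tuned models under different evaluation metrics. This article therefore analyzes the shortcomings of test-suite execution accuracy (EXE) and exact set matching accuracy (ESM) when evaluating LLM-based Text-to-SQL models, and proposes an improved metric, EESM (Enhanced Exact Set Matching). Experimental results show that EXE and ESM exhibit false positive and false negative rates of up to 13.2% and 10.8%, respectively, whereas the false positive and false negative rates of EESM are only 0.2% and 1.8%, indicating that EESM provides a more accurate evaluation.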
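As a rough illustration of how execution-based and surface-based metrics can each misjudge a prediction, the sketch below runs toy queries against an in-memory SQLite database. The table, the queries, and both helper functions are assumptions made for this example; `naive_string_match` is a deliberately crude stand-in and is not the ESM or EESM metric described in the abstract.

```python
import sqlite3


def execution_match(conn, pred_sql, gold_sql):
    """Execution-style check: same result rows on this one database."""
    pred = sorted(conn.execute(pred_sql).fetchall())
    gold = sorted(conn.execute(gold_sql).fetchall())
    return pred == gold


def naive_string_match(pred_sql, gold_sql):
    """Toy surface check: case- and whitespace-insensitive string equality."""
    def norm(sql):
        return " ".join(sql.lower().split())
    return norm(pred_sql) == norm(gold_sql)


# Hypothetical two-row table, small enough that distinct queries can coincide.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE singer (name TEXT, age INTEGER)")
conn.executemany("INSERT INTO singer VALUES (?, ?)", [("Ann", 30), ("Bob", 45)])

gold = "SELECT name FROM singer WHERE age > 40"
# Different meaning, same rows on this tiny database: an execution false positive.
wrong = "SELECT name FROM singer WHERE age >= 45"
# Same meaning, different surface form: a surface-matching false negative.
equiv = "SELECT name FROM singer WHERE NOT age <= 40"

false_positive = execution_match(conn, wrong, gold)
equiv_by_execution = execution_match(conn, equiv, gold)
equiv_by_surface = naive_string_match(equiv, gold)
```

Here the semantically different query passes the execution check only because no singer in the toy data is aged 41 to 44, while the equivalent query fails the surface check purely because of how it is written.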
Funding: Fundamental Research Funds for the Central Universities, China (No. 2232023D-19).
Abstract: Text-to-SQL is the task of translating a natural language query into a structured query language. Existing text-to-SQL approaches focus on improving the model’s architecture while ignoring the relationship between queries and table schemas and the differences in difficulty between examples in the dataset. To tackle these challenges, this paper proposes a two-stage curriculum learning framework for text-to-SQL (TSCL-SQL). To exploit the relationship between the queries and the table schemas, a schema identification pre-training task is proposed in which the model chooses the correct table schema for a specific query from a set of candidates. To leverage the differences in difficulty between examples, curriculum learning is applied to the text-to-SQL task, accompanied by an automatic curriculum learning solution that includes a difficulty scorer and a training scheduler. Experiments show that the proposed framework is effective.
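The pairing of a difficulty scorer with a training scheduler can be sketched as follows. This is a minimal illustration under stated assumptions, not the TSCL-SQL implementation: the keyword-counting scorer, the linear pacing rule, and the example queries are all hypothetical.

```python
def difficulty(sql):
    """Hypothetical scorer: count structural keywords as a proxy for hardness."""
    keywords = ("join", "group by", "order by", "having", "union", "select")
    text = sql.lower()
    return sum(text.count(keyword) for keyword in keywords)


def curriculum(examples, num_epochs):
    """Hypothetical scheduler: yield (epoch, subset) with the easiest examples
    first, growing linearly until the last epoch sees the full dataset."""
    ranked = sorted(examples, key=lambda ex: difficulty(ex["sql"]))
    for epoch in range(1, num_epochs + 1):
        cutoff = max(1, len(ranked) * epoch // num_epochs)
        yield epoch, ranked[:cutoff]


# Hypothetical training examples, deliberately listed out of difficulty order.
data = [
    {"sql": "SELECT name FROM singer WHERE id IN "
            "(SELECT singer_id FROM concert GROUP BY singer_id HAVING COUNT(*) > 1)"},
    {"sql": "SELECT name FROM singer"},
    {"sql": "SELECT s.name FROM singer s JOIN concert c ON s.id = c.singer_id "
            "GROUP BY s.name"},
    {"sql": "SELECT name FROM singer ORDER BY age"},
]

stages = list(curriculum(data, num_epochs=2))
```

With two epochs, the first stage contains only the two lowest-scoring queries and the second stage contains all four, so the model would see nested and grouped queries only after the simpler ones.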
Funding: Supported by the Guangdong Power Grid Company (Grant Number: GDKJXM20231024), the National Natural Science Foundation of China (Grant Numbers: 72331009, 72171206 and 92270105), and the Shenzhen Key Laboratory of Crowd Intelligence Empowered Low-Carbon Energy Network (Grant Number: ZDSYS20220606100601002).
Abstract: The increasing complexity of modern power systems, driven by factors such as the large-scale integration of renewable energy and the proliferation of distributed generation, has placed unprecedented demands on power dispatching operations. Ensuring grid stability and safety in this new environment requires real-time monitoring and swift, data-driven decision-making. Consequently, efficient and accurate data querying capabilities have become paramount. This study introduces Intelli-Dispatch-SQL, a novel agent-based Text-to-SQL framework that leverages a large language model (LLM) to enhance the accuracy and reliability of generated SQL queries in the context of power dispatching. By integrating intent recognition and SQL validation modules, Intelli-Dispatch-SQL ensures that generated queries are not only syntactically correct but also semantically aligned with user intent and executable within the operational context. Through comprehensive experiments, including ablation studies and cross-model evaluations, we demonstrate that Intelli-Dispatch-SQL significantly outperforms existing Text-to-SQL models, achieving substantial improvements in both Exact Match (EM) and Execution Accuracy (EX). Notably, the incorporation of intent recognition and SQL validation modules is shown to be critical for performance enhancement. The framework’s effectiveness was further validated across various LLMs, confirming its robustness and applicability across diverse scenarios. Intelli-Dispatch-SQL offers a high-performance and generalizable solution for Text-to-SQL in power dispatching, paving the way for more efficient and intelligent power system management.
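One simple way to check that a generated query is syntactically correct and executable against a schema is to ask the database to compile it without running it. The sketch below does this with SQLite's `EXPLAIN`; it is an assumption-laden illustration rather than the validation module of Intelli-Dispatch-SQL, and the `substation` table and its columns are hypothetical.

```python
import sqlite3


def validate_sql(conn, sql):
    """Return (ok, message). EXPLAIN makes SQLite compile the statement
    against the current schema without executing the query itself."""
    try:
        conn.execute("EXPLAIN " + sql)
        return True, "ok"
    except sqlite3.Error as exc:
        return False, str(exc)


# Hypothetical schema standing in for an operational database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE substation (name TEXT, voltage REAL)")

good = validate_sql(conn, "SELECT name FROM substation WHERE voltage > 110")
bad_column = validate_sql(conn, "SELECT voltag FROM substation")
bad_syntax = validate_sql(conn, "SELEC name FROM substation")
```

A check of this kind catches misspelled columns and malformed statements, but it says nothing about whether the query matches the user's intent, which is why the abstract treats intent recognition as a separate module.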
Abstract: Text-to-SQL aims at translating textual questions into the corresponding SQL queries. Aggregate tables are widely created for high-frequency queries. Although text-to-SQL has emerged as an important task, recent studies have paid little attention to the task over aggregate tables. The growing number of aggregate tables brings two challenges: (1) the mapping between natural language questions and relational databases suffers from more ambiguity; (2) modern models usually adopt a self-attention mechanism to encode the database schema and the question, and this mechanism has quadratic time complexity, which makes inference more time-consuming as the input sequence length grows. In this paper, we introduce a novel approach named WAGG for text-to-SQL over aggregate tables. To effectively select among ambiguous items, we propose a relation selection mechanism for relation computing. To deal with high computation costs, we introduce a dynamic pruning strategy to discard unrelated items, which are common for aggregate tables. We also construct a new large-scale dataset, SpiderwAGG, extended from the Spider dataset for validation, on which extensive experiments show the effectiveness and efficiency of our proposed method, with a 4% increase in accuracy and a 15% decrease in inference time relative to the strong baseline RAT-SQL.
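The cost argument behind pruning can be made concrete with a small sketch. Self-attention scores every pair of input tokens, so the work grows with the square of the input length, and dropping schema items shortens the input. The word-overlap ranking below is a hypothetical stand-in chosen for brevity; it is not the dynamic pruning strategy of WAGG, and the question and column names are invented.

```python
def attention_pairs(num_tokens):
    """Number of pairwise scores a full self-attention layer computes."""
    return num_tokens * num_tokens


def prune_schema(question, schema_items, keep):
    """Hypothetical pruning: keep the `keep` items that share the most
    words with the question, preserving input order among ties."""
    question_words = set(question.lower().split())

    def overlap(item):
        item_words = set(item.lower().replace("_", " ").split())
        return len(question_words & item_words)

    ranked = sorted(schema_items, key=overlap, reverse=True)
    return ranked[:keep]


question = "total sales amount per store"
columns = [
    "store_name",
    "sales_amount",
    "employee_birthday",
    "warehouse_zip",
    "total_sales_per_store",
]
kept = prune_schema(question, columns, keep=3)
```

Doubling the input length quadruples `attention_pairs`, which is why discarding unrelated columns before encoding can reduce inference time noticeably.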
Abstract: This study presents a comparative analysis of a complex SQL benchmark, TPC-DS, with two existing text-to-SQL benchmarks, BIRD and Spider. Our findings reveal that TPC-DS queries exhibit a significantly higher level of structural complexity than those of the other two benchmarks. This underscores the need for more intricate benchmarks to simulate realistic scenarios effectively. To facilitate this comparison, we devised several measures of structural complexity and applied them across all three benchmarks. The results of this study can guide future research in the development of more sophisticated text-to-SQL benchmarks. We used 11 distinct large language models (LLMs) to generate SQL queries based on the query descriptions provided by the TPC-DS benchmark. The prompt engineering process incorporated both the query description as outlined in the TPC-DS specification and the database schema of TPC-DS. Our findings indicate that current state-of-the-art generative AI models fall short in generating accurate decision-making queries. We compared the generated queries with the TPC-DS gold standard queries using a series of fuzzy structure matching techniques based on query features. The results demonstrated that the accuracy of the generated queries is insufficient for practical real-world application.
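The abstract does not list the complexity measures that were used, so the sketch below only illustrates what feature-based structural measures might look like. The three features chosen here (join count, subquery count, and parenthesis nesting depth) and the sample queries are assumptions for illustration, not the study's definitions.

```python
import re


def structural_features(sql):
    """Hypothetical structural measures computed from the raw query text."""
    text = sql.lower()
    depth = 0
    max_depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
            max_depth = max(max_depth, depth)
        elif ch == ")":
            depth -= 1
    return {
        "joins": len(re.findall(r"\bjoin\b", text)),
        # Every SELECT beyond the first is treated as a nested or set query.
        "subqueries": max(0, len(re.findall(r"\bselect\b", text)) - 1),
        "max_paren_depth": max_depth,
    }


simple = structural_features("SELECT name FROM singer")
nested = structural_features(
    "SELECT a.x FROM a JOIN b ON a.id = b.id "
    "WHERE a.y > (SELECT AVG(y) FROM a)"
)
```

Feature dictionaries of this kind could be compared between a generated query and a gold query to give a fuzzy, structure-level notion of similarity, though text-based counting like this would be misled by keywords inside string literals.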