Abstract
With the growing demand for information retrieval and knowledge acquisition, intelligent question-answering systems are widely applied across many vertical domains. However, research on specialized question-answering systems for the meteorological field is still lacking, which severely limits the efficient use of meteorological information and the service efficiency of meteorological systems. To address this gap, this paper proposes a retrieval-augmented generation (RAG) based question-answering implementation scheme for meteorological databases. The scheme designs a multi-channel query routing method (multi-channel retrieval router, McRR) over relational databases (SQL) and document-oriented data (NoSQL). To adapt the databases to large-model queries and to strengthen the model's understanding of the queried tables, the paper further proposes an instruction query conversion method and a database table summarization method (termed DNSUM), which together improve the model's semantic understanding of the databases. By integrating key modules such as question understanding, a re-ranker, and response generation, it constructs an end-to-end intelligent question-answering engine that retrieves relevant knowledge from multiple data sources and generates answers. Experimental results on the constructed meteorological question-answering dataset show that the engine effectively understands user questions and generates accurate answers, with good retrieval and response capabilities. This work provides a question-answering solution for the meteorological field and offers an implementation reference for applying question-answering technology in vertical domains.
Authors
JIANG Shuangwu (江双五); ZHANG Jiawei (张嘉玮); HUA Liansheng (华连生); YANG Jinglin (杨菁林)
Affiliations: Anhui Meteorological Information Center, Hefei 230031, China; Beihang University, Beijing 100191, China; National Computer Network Emergency Response Technical Team/Coordination Center of China, Beijing 100029, China
Source
Computer Engineering and Applications (《计算机工程与应用》)
PKU Core Journal (北大核心)
2025, Issue 5, pp. 113-121 (9 pages)
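Funding
National Key Research and Development Program of China (2022YFC3321002)
Science and Technology Project of the National Archives Administration of China (2022-X-060)
Special Project for Archives Construction of the China Meteorological Administration (YBSZX2024007)
Innovation Team Building Program of the Anhui Meteorological Bureau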
Keywords
structured query (database query)
database question-answering
large language models (LLM)
retrieval-augmented generation (RAG)
meteorological Q&A