Abstract
Large language models (LLMs) show strong capabilities in natural language processing, but they still suffer from problems such as hallucination and a lack of domain-specific knowledge. Retrieval-augmented generation (RAG) draws on large-scale external knowledge bases to strengthen a model's semantic understanding and generation capabilities. This effectively alleviates some of the problems LLMs face and offers an effective solution for natural language processing tasks such as open-domain question answering, text summarization, and dialogue systems. This paper comprehensively reviews the key technical advances in RAG, covering the retriever, the generator, and the optimization possibilities for each component. It also summarizes existing RAG evaluation methods and examines the limitations of current RAG evaluation. Finally, it discusses possible future research directions for RAG.
Authors
WU Xuan (吴璇); FU Tao (付涛)
Yunnan University of Finance and Economics, Kunming 650221, China
Source
Computer Engineering and Applications (《计算机工程与应用》)
Indexed in: Peking University Core Journals (北大核心)
2025, Issue 20, pp. 19-35 (17 pages)
Funding
Yunnan University of Finance and Economics Graduate Innovation Fund, 2025 (2025YUFEYC152).