Funding: Supported by the Sichuan Science and Technology Program (2021YFQ0003, 2023YFSY0026, 2023YFH0004).
Abstract: In the field of natural language processing (NLP), various pre-trained language models have emerged in recent years, with question answering systems gaining significant attention. However, as algorithms, data, and computing power advance, models have grown ever larger, with ever more parameters. Consequently, model training has become more costly and less efficient. To enhance the efficiency and accuracy of training while reducing model size, this paper proposes PAL-BERT, a first-order pruning model based on the ALBERT model and designed around the characteristics of question-answering (QA) systems and language models. First, a first-order network pruning method based on ALBERT is designed, forming the PAL-BERT model. Then, a parameter-optimization strategy for PAL-BERT is formulated, and the Mish function is used as the activation function in place of ReLU to improve performance. Finally, comparison experiments with the traditional deep learning models TextCNN and BiLSTM confirm that PAL-BERT is a pruning-based model-compression method that can significantly reduce training time and improve training efficiency. Compared with traditional models, PAL-BERT significantly improves performance on NLP tasks.
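The abstract names first-order pruning and the Mish activation but does not spell out the pruning criterion. A common first-order choice is a Taylor-style saliency, |w · ∂L/∂w|; the sketch below is illustrative only, not the paper's implementation, and shows that criterion alongside Mish:

```python
import math

def mish(x: float) -> float:
    """Mish activation: x * tanh(softplus(x))."""
    return x * math.tanh(math.log1p(math.exp(x)))

def first_order_scores(weights, grads):
    """First-order (Taylor-style) importance: |w * dL/dw| per weight."""
    return [abs(w * g) for w, g in zip(weights, grads)]

def prune_mask(weights, grads, keep_ratio=0.5):
    """Keep the top `keep_ratio` fraction of weights by first-order score."""
    scores = first_order_scores(weights, grads)
    k = max(1, int(len(weights) * keep_ratio))
    threshold = sorted(scores, reverse=True)[k - 1]
    return [1 if s >= threshold else 0 for s in scores]
```

A real implementation would score entire attention heads or layers rather than individual scalars, but the ranking-then-thresholding shape is the same.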
Funding: Supported by the Sichuan Science and Technology Program (2023YFSY0026, 2023YFH0004).
Abstract: Recent advancements in natural language processing have given rise to numerous pre-trained language models for question-answering systems. However, with the constant evolution of algorithms, data, and computing power, the increasing size and complexity of these models have led to higher training costs and reduced efficiency. This study aims to minimize the inference time of such models while maintaining computational performance. It proposes DPAL-BERT, a distillation model for PAL-BERT that employs knowledge distillation, using the PAL-BERT model as the teacher to train two student models: DPAL-BERT-Bi and DPAL-BERT-C. The dataset is augmented through techniques such as masking, replacement, and n-gram sampling to optimize knowledge transfer. Experimental results show that the distilled models greatly outperform models trained from scratch. Although the distilled models exhibit a slight decrease in performance compared to PAL-BERT, they reduce inference time to just 0.25% of the original, demonstrating the effectiveness of the proposed approach in balancing model performance and efficiency.
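Knowledge distillation of the kind described (PAL-BERT as teacher, DPAL-BERT variants as students) typically trains the student against temperature-softened teacher outputs. A minimal sketch of the standard distillation loss (Hinton-style KL divergence; the paper's exact loss is not given in the abstract):

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-softened softmax over a list of logits."""
    exps = [math.exp(l / temperature) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions,
    scaled by T^2 so gradients stay comparable across temperatures."""
    p = softmax(teacher_logits, temperature)  # soft targets from the teacher
    q = softmax(student_logits, temperature)  # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    return temperature ** 2 * kl
```

In practice this term is combined with the ordinary cross-entropy on hard labels, weighted by a mixing coefficient.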
Abstract: In the context of power generation companies, vast amounts of specialized data and expert knowledge have been accumulated. However, challenges such as data silos and fragmented knowledge hinder the effective utilization of this information. This study proposes a novel framework for intelligent Question-and-Answer (Q&A) systems based on Retrieval-Augmented Generation (RAG) to address these issues. The system efficiently acquires domain-specific knowledge by leveraging external databases, including Relational Databases (RDBs) and graph databases, without additional fine-tuning of Large Language Models (LLMs). Crucially, the framework integrates a Dynamic Knowledge Base Updating Mechanism (DKBUM) and a Weighted Context-Aware Similarity (WCAS) method to enhance retrieval accuracy and mitigate inherent limitations of LLMs, such as hallucinations and lack of specialization. The proposed DKBUM dynamically adjusts knowledge weights within the database, ensuring that the most recent and relevant information is utilized, while WCAS refines the alignment between queries and knowledge items through enhanced context understanding. Experimental validation demonstrates that the system can generate timely, accurate, and context-sensitive responses, making it a robust solution for managing complex business logic in specialized industries.
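The abstract does not define how WCAS weighs context. As a rough illustration of the general idea of weighting similarity by per-dimension context importance, a generic weighted cosine similarity might look like:

```python
import math

def weighted_cosine(query, item, weights):
    """Cosine similarity with per-dimension weights (e.g. learned
    context-importance scores). WCAS itself is not specified in the
    abstract; this is only a generic weighted-similarity sketch."""
    num = sum(w * q * d for w, q, d in zip(weights, query, item))
    qn = math.sqrt(sum(w * q * q for w, q in zip(weights, query)))
    dn = math.sqrt(sum(w * d * d for w, d in zip(weights, item)))
    return num / (qn * dn) if qn and dn else 0.0
```

Setting a weight to zero ignores that dimension entirely; raising it makes agreement on that dimension dominate the score.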
Funding: Microsoft Research Asia Internet Services in Academic Research Fund (No. FY07-RES-OPP-116) and the Science and Technology Development Program of Tianjin (No. 06YFGZGX05900).
Abstract: To improve question answering (QA) performance on real-world web data sets, a new set of question classes and a general answer re-ranking model are defined. Using a pre-defined dictionary and grammatical analysis, the question classifier feeds both semantic and grammatical information into information retrieval and machine learning methods as training features, including the question word, the main verb of the question, the dependency structure, the position of the main auxiliary verb, the main noun of the question, and the top hypernym of the main noun. The QA query results are then re-ranked using question-class information. Experiments show that questions in real-world web data sets are accurately classified by the classifier, and the re-ranked QA results are noticeably improved. This demonstrates that, with both semantic and grammatical information, applications such as QA built on real-world web data sets can be improved, yielding better performance.
Funding: Supported by the National Research Foundation of Korea (NRF) grant funded by the Korea Government (MSIT) (No. 2020R1G1A1100493).
Abstract: Recently, pre-trained language representation models such as bidirectional encoder representations from transformers (BERT) have performed well in commonsense question answering (CSQA). However, these models do not directly use explicit information from knowledge sources existing outside the model. To address this, additional methods such as the knowledge-aware graph network (KagNet) and the multi-hop graph relation network (MHGRN) have been proposed. In this study, we propose using the recent pre-trained language model A Lite BERT (ALBERT) with a knowledge-graph information-extraction technique. We also propose applying a novel method, schema graph expansion, to recent language models. We then analyze the effect of applying knowledge-graph-based knowledge-extraction techniques to recent pre-trained language models and confirm that schema graph expansion is effective to some extent. Furthermore, we show that our proposed model achieves better performance than the existing KagNet and MHGRN models on the CommonsenseQA dataset.
Funding: Supported by the National Natural Science Foundation of China (No. 62272100), the Consulting Project of the Chinese Academy of Engineering (No. 2023-XY-09), and the Fundamental Research Funds for the Central Universities.
Abstract: Previous works employ large language models (LLMs) such as GPT-3 for knowledge-based visual question answering (VQA). We argue that the inferential capacity of LLMs can be enhanced through knowledge injection. Although methods that use knowledge graphs to enhance LLMs have been explored in various tasks, they have limitations, such as failing to retrieve the required knowledge. In this paper, we introduce a novel framework for knowledge-based VQA titled "Prompting Large Language Models with Knowledge-Injection" (PLLMKI). We use a vanilla VQA model to inspire the LLM and further enhance the LLM with knowledge injection. Unlike earlier approaches, we adopt the LLM itself for knowledge enhancement instead of relying on knowledge graphs. Furthermore, we leverage open LLMs, incurring no additional cost. Compared with existing baselines, our approach improves accuracy by over 1.3 and 1.7 points on two knowledge-based VQA datasets, OK-VQA and A-OKVQA, respectively.
Funding: Sponsored by the National Natural Science Foundation of China (Grant No. 60305009) and the Ph.D. Degree Teacher Foundation of North China Electric Power University (Grant No. H0585).
Abstract: An automatic question answering system (QAS) is a kind of high-powered, Internet-based software system. Its key technologies are those of natural language understanding, including the construction of a knowledge base and corpus, word segmentation and POS tagging of text, and grammatical and semantic analysis of sentences. This paper mainly discusses the semantic-network-based representation of knowledge in a QAS, a stochastic syntax-parsing model named LSF for knowledge in a QAS, and the structure and composition of a QAS. The LSF model's parameters were trained, demonstrating their feasibility. Furthermore, through a limited-domain QAS that we developed for banks, these technologies were shown to be effective and generalizable.
Abstract: Large language models (LLMs) such as ChatGPT have shown great potential across many tasks. However, LLMs still suffer from problems such as hallucination and forgetting of long-tail knowledge. To address these issues, existing methods combine external knowledge such as knowledge graphs to significantly enhance LLM generation, improving the accuracy and completeness of answers. However, these methods face problems such as complex knowledge graph construction, semantic loss, and one-way knowledge flow. We therefore propose a bidirectional enhancement framework that not only uses knowledge graphs to improve LLM generation, but also uses LLM inference results to supplement the knowledge graph, forming a bidirectional flow of knowledge and, ultimately, a positive feedback loop between the knowledge graph and the LLM that continuously improves system performance. In addition, by designing an Enhanced Knowledge Graph (EKG), we defer the relation-extraction task to the retrieval stage, reducing the cost of knowledge graph construction, and use vector retrieval to mitigate semantic loss. Based on this framework, we build the bidirectional enhancement system BEKO (Bidirectional Enhancement with a Knowledge Ocean), which achieves clear performance gains over traditional methods in relation-reasoning applications, validating the feasibility and effectiveness of the bidirectional enhancement framework. The BEKO system has been deployed on the public website ko.zhonghuapu.com.
Abstract: Large language models (LLMs) have strong natural language understanding and complex problem-solving abilities. This paper builds an LLM-based mineral question-answering system for efficient access to mineral knowledge. The system first collects mineral data from Internet resources and, after cleaning, structures the data into mineral documents and question-answer pairs. The mineral documents are converted and indexed into a mineral knowledge base used for retrieval-augmented generation, while the question-answer pairs are used to fine-tune the LLM. When using the knowledge base for retrieval-augmented generation, a two-stage retrieval scheme of recall followed by re-ranking is adopted to obtain better generation results. Fine-tuning uses the mainstream Low-Rank Adaptation (LoRA) method, achieving performance comparable to full-parameter fine-tuning with far fewer trainable parameters and saving computational resources. Experimental results show that the retrieval-augmented LLM-based mineral question-answering system can retrieve mineral knowledge quickly and with high accuracy.
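The recall-then-rerank retrieval pattern mentioned here can be illustrated with a toy two-stage ranker. The scoring functions below are placeholders for the embedding-based recall and finer re-ranking models such a system would actually use:

```python
def recall_then_rerank(query_terms, docs, recall_k=3, final_k=1):
    """Two-stage retrieval: a cheap recall pass (term overlap) narrows
    the corpus, then a finer re-ranking pass (here: overlap weighted by
    term position) orders the survivors. Both scorers are illustrative."""
    def recall_score(doc):
        # Stage 1: coarse score = number of shared terms.
        return len(set(query_terms) & set(doc.split()))

    candidates = sorted(docs, key=recall_score, reverse=True)[:recall_k]

    def rerank_score(doc):
        # Stage 2: finer score = earlier matches count for more.
        words = doc.split()
        return sum(1.0 / (words.index(t) + 1) for t in query_terms if t in words)

    return sorted(candidates, key=rerank_score, reverse=True)[:final_k]
```

The design point is that the expensive scorer only ever sees `recall_k` candidates, not the whole corpus, which keeps latency bounded as the knowledge base grows.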
Abstract: As global climate change intensifies, corporate carbon-emission analysis has become a focus of international attention. To address the lagging knowledge updates of general-purpose large language models (LLMs), the lack of domain expertise and accuracy of retrieval-augmented generation architectures on complex questions, and the high hallucination rate of LLM outputs, this work builds a proprietary knowledge base and develops an LLM-based system for corporate carbon-emission analysis and knowledge question answering. A diversified index-module construction method is proposed to build a high-quality knowledge and regulation retrieval dataset. For knowledge question-answering tasks in the carbon-emission report (policy) domain, a self-prompting retrieval-augmented generation architecture is proposed that integrates intent recognition, an improved structured chain of thought, hybrid retrieval, high-quality prompt engineering, and a Text2SQL system, supporting multi-dimensional analysis of corporate sustainability reports and providing an efficient and accurate question-answering solution for corporate carbon-emission reports (policies). Multi-level chunking, document indexing, and hallucination detection ensure that results are accurate and verifiable and reduce the hallucination rate of the LLM components. Comparative experiments show that, with the modules working together, the proposed approach performs well across metrics in retrieval-augmented generation experiments and offers clear advantages for key-information extraction and report evaluation of corporate carbon-emission reports, especially in long-text processing.
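One building block of the multi-level chunking mechanism mentioned above is sliding-window chunking with overlap, so that facts straddling a chunk boundary still appear whole in at least one chunk. A minimal sketch (chunk size and overlap are illustrative, not the system's actual parameters):

```python
def chunk_text(text, size=200, overlap=50):
    """Split `text` into fixed-size chunks with `overlap` characters of
    shared context between consecutive chunks."""
    chunks, step = [], size - overlap
    for start in range(0, max(len(text) - overlap, 1), step):
        chunks.append(text[start:start + size])
    return chunks
```

A multi-level variant would run this at several granularities (e.g. paragraph-sized and section-sized windows) and index each level separately, letting retrieval choose the granularity that best fits the query.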