Objective To develop QingNangTCM, a specialized large language model (LLM) tailored for expert-level traditional Chinese medicine (TCM) question-answering and clinical reasoning, addressing the scarcity of domain-specific corpora and specialized alignment. Methods We constructed QnTCM_Dataset, a corpus of 100,000 entries, by integrating data from ShenNong_TCM_Dataset and SymMap v2.0 and synthesizing additional samples via retrieval-augmented generation (RAG) and persona-driven generation. The dataset comprehensively covers diagnostic inquiries, prescriptions, and herbal knowledge. Using P-Tuning v2, we fine-tuned the GLM-4-9B-Chat backbone to develop QingNangTCM. A multidimensional evaluation framework, assessing accuracy, coverage, consistency, safety, professionalism, and fluency, was established using metrics such as bilingual evaluation understudy (BLEU), recall-oriented understudy for gisting evaluation (ROUGE), metric for evaluation of translation with explicit ordering (METEOR), and LLM-as-a-Judge with expert review. Qualitative analysis was conducted across four simulated clinical scenarios: symptom analysis, disease treatment, herb inquiry, and failure cases. Baseline models included GLM-4-9B-Chat, DeepSeek-V2, HuatuoGPT-II (7B), and GLM-4-9B-Chat (freeze-tuning). Results QingNangTCM achieved the highest scores in BLEU-1/2/3/4 (0.425/0.298/0.137/0.064), ROUGE-1/2 (0.368/0.157), and METEOR (0.218), demonstrating a balanced and superior normalized performance profile of 0.900 across the dimensions of accuracy, coverage, and consistency. Although its ROUGE-L score (0.299) was lower than that of HuatuoGPT-II (7B) (0.351), it significantly outperformed domain-specific models in expert-validated win rates for professionalism (86%) and safety (73%). Qualitative analysis confirmed that the model strictly adheres to the "symptom-syndrome-pathogenesis-treatment" reasoning chain, though occasional misclassifications and hallucinations persisted when dealing with rare medicinal materials and uncommon syndromes.
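The evaluation above rests on n-gram overlap metrics such as BLEU. As a minimal illustration of the idea (a simplified single-sentence sketch of BLEU's modified n-gram precision, not the exact evaluation pipeline used in the study), the clipped-count precision at the core of BLEU-n can be computed as follows; the example sentences are invented:

```python
from collections import Counter

def ngram_precision(reference: list[str], hypothesis: list[str], n: int) -> float:
    """Modified n-gram precision: each hypothesis n-gram count is clipped
    by its count in the reference (the core quantity behind BLEU-n)."""
    ref_ngrams = Counter(tuple(reference[i:i + n]) for i in range(len(reference) - n + 1))
    hyp_ngrams = Counter(tuple(hypothesis[i:i + n]) for i in range(len(hypothesis) - n + 1))
    if not hyp_ngrams:
        return 0.0
    clipped = sum(min(c, ref_ngrams[g]) for g, c in hyp_ngrams.items())
    return clipped / sum(hyp_ngrams.values())

ref = "clear heat and drain dampness".split()
hyp = "clear heat and resolve dampness".split()
print(ngram_precision(ref, hyp, 1))  # 4 of 5 unigrams match -> 0.8
print(ngram_precision(ref, hyp, 2))  # 2 of 4 bigrams match  -> 0.5
```

Full BLEU additionally combines the precisions for n = 1..4 geometrically and applies a brevity penalty, which is why BLEU-1 through BLEU-4 are reported separately above.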
In the era of big data, Internet information is growing at an astonishing rate, and users demand ever greater accuracy and efficiency in information retrieval. With the continuous advancement of enterprise informatization and the modernization of equipment management, effectively extracting, managing, and utilizing massive volumes of enterprise equipment information is of great significance for enhancing the application value of enterprise equipment knowledge and improving the efficiency of enterprise resource utilization. This study proposes a system that integrates the natural language processing capabilities of large language models to intelligently understand user queries and provide precise equipment information. Fine-tuning the large language model with P-Tuning v2 significantly enhanced its ability to recognize and extract keywords in the enterprise equipment domain. In addition, an enterprise equipment knowledge graph serves as a local knowledge base, supplying industry-specific knowledge so that the model can learn relevant information as context for each question. On this basis, prompt engineering was designed to guide the model toward more accurate responses, and the results were evaluated. Experimental results show that, compared with using a large language model directly, the knowledge-graph-enhanced large language model achieves higher response accuracy in the enterprise equipment domain, providing strong support for the construction of enterprise equipment question-answering systems.
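The retrieve-then-prompt pattern described in this abstract can be sketched in a few lines. Everything below is illustrative: the triples, entity names, and prompt wording are invented assumptions, not the system's actual knowledge graph or prompt template.

```python
# Hypothetical mini knowledge graph of (head, relation, tail) triples.
KG = [
    ("CNC lathe X100", "manufacturer", "Acme Machine Co."),
    ("CNC lathe X100", "maintenance_interval", "every 500 operating hours"),
    ("CNC lathe X100", "spindle_speed", "up to 4000 rpm"),
    ("Press brake P20", "manufacturer", "Delta Tools"),
]

def retrieve_context(query: str, kg: list[tuple[str, str, str]]) -> list[str]:
    """Keep triples whose head entity is mentioned in the user query."""
    return [f"{h} | {r} | {t}" for h, r, t in kg if h.lower() in query.lower()]

def build_prompt(query: str) -> str:
    """Prepend the retrieved KG facts as grounding context for the LLM."""
    facts = "\n".join(retrieve_context(query, KG)) or "(no matching facts)"
    return (
        "Answer using only the equipment facts below.\n"
        f"Facts:\n{facts}\n"
        f"Question: {query}\n"
    )

print(build_prompt("How often should the CNC lathe X100 be serviced?"))
```

A production system would replace the substring match with the P-Tuning v2 keyword extractor described above and query a real graph store, but the structure (retrieve facts, inject them as context, then ask the question) is the same.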
Funding Hebei Province Higher Education Scientific Research Project (QN2025367); Zhangjiakou City 2022 Municipal Science and Technology Plan Self-raised Fund Project (221105D); Hebei Province Education Science "14th Five-Year Plan" Project (2404224).
Abstract Existing Chinese named entity recognition algorithms do not fully account for the data characteristics of the recognition task: Chinese sample data suffer from class imbalance, the training data are noisy, and the distributions of the data generated by the model differ considerably between runs. To address these problems, an improved Chinese named entity recognition model built on a BERT-BiLSTM-CRF (Bidirectional Encoder Representations from Transformers-Bidirectional Long Short-Term Memory-Conditional Random Field) baseline is proposed. First, the P-Tuning v2 technique is combined with the BERT-BiLSTM-CRF model to extract data features precisely; then three loss functions, namely Focal Loss, Label Smoothing, and KL Loss (Kullback-Leibler divergence loss), are used as regularization terms in the loss computation. Experimental results show that the improved model achieves F1 scores of 71.13%, 96.31%, and 95.90% on the Weibo, Resume, and MSRA (Microsoft Research Asia) datasets, respectively, verifying that the proposed algorithm performs better and that, across different downstream tasks, it is easy to combine and extend with other neural networks.
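Among the three regularizers above, Focal Loss targets the class-imbalance problem by down-weighting well-classified examples. A minimal scalar sketch of the standard formulation FL(p_t) = -alpha * (1 - p_t)^gamma * log(p_t) follows; the probability values and hyperparameters are illustrative defaults, not those used in the paper:

```python
import math

def focal_loss(p_t: float, alpha: float = 0.25, gamma: float = 2.0) -> float:
    """Focal loss for one prediction, where p_t is the model's probability
    for the true class. The (1 - p_t)**gamma factor shrinks the loss of
    easy, confidently correct examples so training focuses on hard ones."""
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)

# A confident correct prediction (p_t = 0.9) contributes far less loss
# than an uncertain one (p_t = 0.5), unlike plain cross-entropy where
# the gap is much smaller.
print(focal_loss(0.9) < focal_loss(0.5))  # True
```

With gamma = 0 and alpha = 1 the expression reduces to ordinary cross-entropy, which makes the down-weighting effect of gamma > 0 easy to see.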