Abstract
In recent years, the rapid development of large language models (LLMs) has significantly advanced the integration of artificial intelligence into vertical domains. However, deploying them in industrial scenarios still faces two challenges: high computational overhead and insufficient adaptation to domain knowledge. This paper proposes a framework, based on knowledge distillation and chain-of-thought techniques, for building industry-specific large language models for the GeWu platform. Through supervised fine-tuning, the framework transfers the reasoning ability and domain knowledge of a large model to a smaller student model, yielding a lightweight model that combines efficient reasoning with specialized domain knowledge. Experimental results show that the distilled and fine-tuned Qwen2.5-7B model achieves significantly higher accuracy on the evaluation set.
Authors
朱亮
蒋维
唐俊
李子涵
李朝辉
ZHU Liang; JIANG Wei; TANG Jun; LI Zihan; LI Zhaohui (China Unicom Digital Technology Co., Ltd.)
Source
《江苏通信》
2025, Issue 5, pp. 69-73 (5 pages)
Jiangsu Communications
Keywords
large language model
knowledge distillation
chain-of-thought
supervised fine-tuning