Automatic Extraction and Alignment Method of Technical Specifications from High-Speed Train Tender Documents
Abstract: Tender documents are the core basis for the configuration design of high-speed trains. Extraction of technical specifications from them currently relies mainly on manual work, which is inefficient and prone to omissions. Although document entity and relation extraction has advanced significantly in recent years, automatic extraction of technical specifications still falls short of expectations, because different owners describe the same specifications differently and because train modules are subject to complex structural and interface constraints. To address this, a method for automatically extracting technical specifications from tender documents that integrates knowledge graphs and large language models is proposed. First, an ontology model for product-series configuration design is established, defining the ontology structure and its associated constraints. Next, an automatic extraction framework is designed: a pre-trained large language model extracts the technical specifications from the tender document; the extraction results are pre-aligned using a product-series meta-structure tree and a name dictionary to construct a product technical specification data graph; and a shape constraint definition language is used to define constraints on product modules, inter-module structure, and parameter values, yielding a shape constraint graph that checks the extraction results and provides a basis for correcting them. Finally, an automatic extraction software tool is developed, and the effectiveness of the method is verified on a high-speed train tender document.
Authors: REN Kun-hua (任坤华); ZHANG Yu-ting (张雨婷); WANG Shu-ying (王淑营) (CRRC Academy Co., Ltd., Beijing 100071; School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu 610031)
Source: Manufacturing Automation (《制造业自动化》), 2025, No. 8, pp. 160-169 (10 pages)
Funding: National Key Research and Development Program of China (2022YFC3005200).
Keywords: modular configuration design; knowledge graph; entity relation extraction; pre-trained large language model; graph shape constraints
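Illustrative sketch: The abstract describes pre-aligning extracted specifications with a name dictionary and a meta-structure tree, then checking them against constraints. The Python below is a minimal sketch of that idea only, not the authors' implementation. Every name in it (`NAME_DICT`, `META_TREE`, `VALUE_RANGES`, `Spec`, `pre_align`, `check`) is hypothetical, the parameter names and numeric bounds are invented for illustration, and the plain-Python range check is a simplified stand-in for the shape constraint graph rather than a shape constraint language.

```python
from dataclasses import dataclass

# Hypothetical name dictionary: owner-specific wording -> canonical parameter name.
NAME_DICT = {
    "maximum operating speed": "max_operating_speed",
    "top service speed": "max_operating_speed",
    "axle load": "axle_load",
    "maximum axle load": "axle_load",
}

# Hypothetical meta-structure tree, flattened to: canonical parameter -> owning module.
META_TREE = {
    "max_operating_speed": "Vehicle",
    "axle_load": "Bogie",
}

# Hypothetical value constraints (invented bounds, not taken from the paper).
VALUE_RANGES = {
    "max_operating_speed": (200.0, 400.0),  # km/h
    "axle_load": (10.0, 17.0),              # t
}


@dataclass
class Spec:
    """One specification as it might come out of the extraction step."""
    raw_name: str
    value: float
    unit: str


def pre_align(specs):
    """Map raw names to canonical names and attach the owning module.

    Returns (aligned, unmatched); aligned items are (module, name, value).
    """
    aligned, unmatched = [], []
    for spec in specs:
        canonical = NAME_DICT.get(spec.raw_name.strip().lower())
        if canonical is None:
            unmatched.append(spec)  # left for manual review
        else:
            aligned.append((META_TREE[canonical], canonical, spec.value))
    return aligned, unmatched


def check(aligned):
    """Return the aligned items whose value falls outside the allowed range."""
    violations = []
    for module, name, value in aligned:
        low, high = VALUE_RANGES[name]
        if not low <= value <= high:
            violations.append((module, name, value, (low, high)))
    return violations
```

Under these assumptions, calling `pre_align` on a list of `Spec` objects separates recognized specifications from unrecognized ones, and `check` reports which recognized values fall outside their range, which is the kind of evidence a reviewer could use to correct an extraction result.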