Abstract
To improve the precision and recall of entity recognition on coal mine gas accident datasets, and to address the small scale of the raw data and the shortage of annotated data, this study uses large language models (LLMs) for corpus augmentation and builds a BiLSTM-CRF named entity recognition (NER) model. The effectiveness of the data augmentation method is validated by comparing the deep learning model BiLSTM-CRF with its optimized variants. The results show that the data-augmented BiLSTM-CRF model achieves higher precision and recall on the coal mine gas accident dataset than the original BiLSTM-CRF model. Furthermore, in an application that combines knowledge graphs and LLMs for safety early warning, entity recognition for coal mine gas accidents reaches an accuracy of 91.5% after GPT-4 data augmentation, an improvement of 8.4 percentage points over the 83.1% accuracy of the non-augmented baseline. These findings provide a new data processing method and NER technique for risk prevention and control of coal mine gas accidents, and help improve the accuracy and reliability of coal mine safety early warning and accident prevention.
Authors
蔡春城
刘永
宿国瑞
招晖
崔杰
胡而已
王泽
CAI Chuncheng; LIU Yong; SU Guorui; ZHAO Hui; CUI Jie; HU Eryi; WANG Ze (Shanghai Datun Energy Co., Ltd., Xuzhou, Jiangsu 221600, China; Information Institute of the Ministry of Emergency Management, Beijing 100029, China; Beijing Jingtong Kexin Technology Co., Ltd., Beijing 100102, China)
Source
《中国安全生产科学技术》
Peking University Core Journal (北大核心)
2025, Issue 11, pp. 90-97 (8 pages)
Journal of Safety Science and Technology
Funding
Key Science and Technology Project of China National Coal Group (20221CY001).
Keywords
large language models
coal mine safety
data augmentation
deep learning