Research on Electronic Medical Record Text Data Based on Information Extraction Algorithm and Multi-Task Learning
Abstract Objective: To address the low efficiency of extracting key clinical information from electronic medical record text data and to support precision medical decision-making and disease research, an efficient joint information extraction algorithm was developed to automatically extract entities and relations from medical record text. Methods: Based on the differences in the unstructured features of medical record text, a joint extraction model incorporating multi-task learning was proposed. First, a bidirectional long short-term memory (BiLSTM)-conditional random field (CRF) baseline model (BiLSTM-CRF) was constructed, combining a bidirectional encoder with a CRF to perform entity recognition. Second, a multi-head attention mechanism was introduced to capture long-range dependencies between entities. Finally, a multi-task learning framework was adopted to handle overlapping entity relations, yielding the joint model BERT-BiLSTM-CRF, where BERT denotes bidirectional encoder representations from transformers. Results: The model was trained and validated on a Chinese electronic medical record dataset to evaluate extraction performance in the cerebrovascular disease domain. The entity recognition accuracy of the BERT-BiLSTM-CRF model on the dataset exceeded 80.00% in all cases, and its entity relation extraction error did not exceed 0.2 in any case, outperforming the other algorithm models. On the cerebrovascular disease instance dataset, the BERT-BiLSTM-CRF algorithm reached a recognition accuracy of 91.18% and performed relation recognition on medical text data well. Conclusion: The BERT-BiLSTM-CRF model can effectively overcome the technical bottleneck of overlapping entity relations, provides a new method for in-depth mining of electronic medical records, and offers research ideas for clinical decision-making and disease diagnosis.
Authors CHEN Shuo; ZHOU Quan (Department of Information, No. 988 Hospital of Joint Logistics Support Force, Zhengzhou, Henan 450000, China; Department of Information, No. 990 Hospital of Joint Logistics Support Force, Zhumadian, Henan 463000, China)
Source China Medical Devices (《中国医疗设备》), 2025, No. 12, pp. 93-99 (7 pages)
Keywords electronic medical record; word vector; bidirectional encoder representations from transformers (BERT); bidirectional long short-term memory (BiLSTM) network; attention mechanism; information extraction algorithm; conditional random field (CRF)
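
The abstract names a CRF layer on top of the encoder for entity recognition but gives no implementation details. As a rough illustration only, the sketch below shows Viterbi decoding, the standard way a linear-chain CRF picks the best tag sequence from per-token emission scores and tag-to-tag transition scores. The tag names (`B-DIS`, `I-DIS`, `O`), the scores, and the function name `viterbi_decode` are all hypothetical and not taken from the paper; in a real BiLSTM-CRF the emissions would come from the network and the transitions would be learned.

```python
from typing import Dict, List, Sequence, Tuple


def viterbi_decode(
    emissions: Sequence[Dict[str, float]],
    transitions: Dict[Tuple[str, str], float],
    tags: Sequence[str],
) -> Tuple[List[str], float]:
    """Return the highest-scoring tag sequence and its total score.

    emissions[t][tag] is the score of `tag` at token t (hypothetical values
    standing in for encoder outputs); transitions[(prev, cur)] is the score
    of moving from `prev` to `cur` (missing pairs default to 0.0).
    """
    if not emissions:
        return [], 0.0

    # best[t][tag]: best score of any path that ends in `tag` at token t.
    best = [{tag: emissions[0][tag] for tag in tags}]
    # back[t - 1][tag]: previous tag on that best path.
    back: List[Dict[str, str]] = []

    for t in range(1, len(emissions)):
        scores: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for cur in tags:
            prev = max(
                tags,
                key=lambda p: best[-1][p] + transitions.get((p, cur), 0.0),
            )
            scores[cur] = (
                best[-1][prev]
                + transitions.get((prev, cur), 0.0)
                + emissions[t][cur]
            )
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)

    # Follow the back-pointers from the best final tag.
    last = max(tags, key=lambda tag: best[-1][tag])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[-1][last]


# Hypothetical BIO tags for a disease entity; "O" -> "I-DIS" is penalised
# because an inside tag should not follow an outside tag.
TAGS = ["O", "B-DIS", "I-DIS"]
TRANSITIONS = {("O", "I-DIS"): -100.0}
EMISSIONS = [
    {"O": 1.0, "B-DIS": 0.8, "I-DIS": 0.0},
    {"O": 0.2, "B-DIS": 0.1, "I-DIS": 1.5},
    {"O": 1.0, "B-DIS": 0.3, "I-DIS": 0.2},
]

if __name__ == "__main__":
    print(viterbi_decode(EMISSIONS, TRANSITIONS, TAGS))
```

In this toy input, picking the top-scoring tag per token independently would give the invalid sequence `O, I-DIS, O`; the transition penalty steers decoding to `B-DIS, I-DIS, O` instead, which is the kind of sequence-level constraint a CRF layer contributes.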