Abstract
Objective: To extract diagnosis and treatment events from electronic medical record (EMR) text to support judgments about whether diagnosis and treatment behavior is reasonable. Methods: A method for extracting diagnosis and treatment event entities from EMR text is proposed. Extraction is split into two steps: time expressions are first extracted and normalized with regular expressions, and clinical entities are then extracted with a deep learning model based on BERT-BiLSTM-CRF. Results: The method was validated on 1,379 EMR texts from the evaluation task dataset of the 2019 China Conference on Knowledge Graph and Semantic Computing (CCKS 2019). The regular-expression-based time expression extraction achieved an overall recognition rate of 92.76%. For diagnosis and treatment event entity recognition, the BERT-BiLSTM-CRF model achieved a precision of 83.66%, a recall of 87.66%, and an F1 score of 85.61%. Conclusion: The experiments show that the method achieves good precision and recall for entity extraction and can support the judgment of reasonable diagnosis and treatment behavior based on EMR text.
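The abstract does not give the rule set used for time expressions, so the following is only a minimal Python sketch of what regex-based extraction and normalization of absolute dates could look like. The two patterns, the names `extract_times` and `f1_score`, and the choice of ISO 8601 as the normalized form are all assumptions made for illustration, not the authors' implementation. The second function simply applies the standard F1 definition to the reported precision and recall.

```python
import re
from datetime import date

# Hypothetical patterns for absolute dates in Chinese clinical text.
# The paper's actual rules are not given in the abstract.
_DATE_PATTERNS = [
    # e.g. 2019年3月5日 or 2019年3月5号
    re.compile(r"(?P<y>\d{4})年(?P<m>\d{1,2})月(?P<d>\d{1,2})[日号]"),
    # e.g. 2019-03-05, 2019/3/5, 2019.3.5
    re.compile(r"(?P<y>\d{4})[-/.](?P<m>\d{1,2})[-/.](?P<d>\d{1,2})"),
]


def extract_times(text):
    """Return (start, end, surface, iso_date) tuples sorted by position."""
    found = []
    for pattern in _DATE_PATTERNS:
        for m in pattern.finditer(text):
            try:
                iso = date(int(m["y"]), int(m["m"]), int(m["d"])).isoformat()
            except ValueError:
                continue  # skip impossible dates such as month 13
            found.append((m.start(), m.end(), m.group(0), iso))
    return sorted(found)


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    sample = "患者于2019年3月5日入院,2019-03-12出院。"
    for start, end, surface, iso in extract_times(sample):
        print(start, end, surface, iso)
    print(round(f1_score(83.66, 87.66), 2))
```

Plugging the reported precision (83.66%) and recall (87.66%) into `f1_score` gives 85.61, which matches the reported F1 value. Relative expressions such as "3 days after admission" would need extra rules and a reference date, which this sketch does not attempt.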
Source
China Digital Medicine (《中国数字医学》)
2021, Issue 7, pp. 33-38 (6 pages)
Funding
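Zhejiang Provincial Natural Science Foundation (Grant No. 2018YFB1004900)
Second Round of Outstanding Innovation Teams of Hangzhou Municipal Universities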
Keywords
electronic medical record text
diagnosis and treatment event extraction
named entity recognition
natural language processing