Abstract
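Objective To develop and validate data extraction and patient identification rules for cutaneous lupus erythematosus (CLE), including patients without a specified subtype, and for its two subtypes, discoid lupus erythematosus (DLE) and subacute cutaneous lupus erythematosus (SCLE), so that CLE patients can be identified efficiently from medical insurance databases. Methods Using the national medical insurance database for 2013-2017, data extraction and patient identification rules were constructed. Manual review results served as the gold standard for calculating the sensitivity and specificity of the rules, and the basic characteristics of the identified patients were analyzed. Results Standard expressions were built from standard medical terminology and diagnostic codes and, drawing on clinicians' experience, were supplemented with possible aliases and misspellings to complete the preliminary screening expressions. On this basis, clinicians and data management engineers determined the final fine-screening rules through repeated review. The data extraction and rapid patient identification rules for all three target diseases performed well: sensitivity was 0.985, 1.000, and 0.991, and specificity was 0.997, 0.999, and 0.998 for CLE, DLE, and SCLE, respectively. A total of 34,554 CLE patients were extracted for 2013-2017, including 2,879 DLE patients and 623 SCLE patients; the proportion of females was higher than that of males in every group. Conclusion The CLE data extraction and patient identification rules perform well and provide a reference for developing and optimizing rapid patient identification methods in dermatology research.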
Objective To develop and validate data extraction and patient identification algorithms for cutaneous lupus erythematosus (CLE) and its two subtypes, discoid lupus erythematosus (DLE) and subacute cutaneous lupus erythematosus (SCLE), and to enable high-efficiency patient identification in large-scale electronic health databases. Methods This study used data from the 2013-2017 National Insurance Claims for Epidemiological Research (NICER) to construct data extraction and rapid patient identification algorithms. Manual verification results were used as the gold standard to assess the sensitivity and specificity of the algorithms. Additionally, the basic characteristics of the identified patients were analyzed. Results Initially, standardized expressions were developed based on medical terminology and diagnostic codes. These were further refined with input from clinicians to include potential synonyms and common misspellings, improving the preliminary screening expressions. Through iterative verification by clinicians and data management engineers, a final disease-specific screening algorithm was established. The extraction and identification algorithms for all three target diseases performed well, with sensitivity values of 0.985, 1.000, and 0.991, and specificity values of 0.997, 0.999, and 0.998 for CLE, DLE, and SCLE, respectively. A total of 34,554 CLE cases, including 2,879 DLE cases and 623 SCLE cases, were identified between 2013 and 2017, with a higher proportion of females than males in all three groups. Conclusion This study developed and validated an identification algorithm for CLE patients based on medical insurance databases, and the algorithm showed high sensitivity and specificity. It provides a methodological framework and empirical evidence for designing and optimizing big data-driven rapid patient identification algorithms in dermatology research.
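The validation step described in the abstract (comparing rule output against manual review) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the screening pattern, the function names, and the toy records are hypothetical and are not the study's actual expressions or data.

```python
import re

# Hypothetical screening expression: illustrative terms only,
# NOT the expressions used in the study.
SCREEN = re.compile(
    r"cutaneous lupus|discoid lupus|subacute cutaneous lupus",
    re.IGNORECASE,
)


def rule_positive(diagnosis_text):
    """Return True if the diagnosis text matches the screening expression."""
    return bool(SCREEN.search(diagnosis_text))


def sensitivity_specificity(predicted, gold):
    """Compute (sensitivity, specificity) of rule output against a gold standard.

    predicted, gold: equal-length sequences of booleans.
    Returns NaN for a measure whose denominator is zero.
    """
    tp = sum(1 for p, g in zip(predicted, gold) if p and g)
    fn = sum(1 for p, g in zip(predicted, gold) if not p and g)
    tn = sum(1 for p, g in zip(predicted, gold) if not p and not g)
    fp = sum(1 for p, g in zip(predicted, gold) if p and not g)
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


# Toy records (diagnosis text, manual-review label); made up for illustration.
records = [
    ("discoid lupus erythematosus", True),
    ("Subacute cutaneous lupus", True),
    ("lupus, cutaneous type", True),   # alias the toy pattern misses
    ("systemic lupus erythematosus", False),
    ("psoriasis", False),
]

predicted = [rule_positive(text) for text, _ in records]
gold = [label for _, label in records]
sens, spec = sensitivity_specificity(predicted, gold)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

In this toy data the third record is a true case written in a form the pattern does not cover, which lowers sensitivity; this is the kind of gap that supplementing the expressions with aliases and misspellings is meant to close.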
Authors
王予童
孟祥龙
潘雨
尉晨
靳慧
王胜锋
WANG Yutong; MENG Xianglong; PAN Yu; WEI Chen; JIN Hui; WANG Shengfeng (Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing 100191, China; Key Laboratory of Epidemiology of Major Diseases (Peking University), Ministry of Education, Beijing 100191, China; Hospital for Skin Diseases, Institute of Dermatology, Chinese Academy of Medical Sciences and Peking Union Medical College, Nanjing 210042, China; Key Laboratory of Basic and Translational Research for Immune-Mediated Skin Diseases, Chinese Academy of Medical Sciences, Nanjing 210042, China; Shanghai Songsheng Business Consulting Co., Ltd., Shanghai 201913, China; Institute for Artificial Intelligence, Peking University, Beijing 100871, China)
Source
《药物流行病学杂志》
2025, Issue 7, pp. 743-752 (10 pages)
Chinese Journal of Pharmacoepidemiology
Funding
National Natural Science Foundation of China, Special Program (72342015).
Keywords
Cutaneous lupus erythematosus
Medical insurance data
Rapid data extraction rules
Rapid identification algorithm