
Current status and progress of application evaluation systems for large language models in laboratory medicine
Abstract: Large language models (LLMs) are deep learning models built on the Transformer architecture and trained on massive datasets, with capabilities for dialogue, content generation, and reasoning. LLMs empower intelligent laboratory medicine, with diverse application scenarios across the pre-analytical, analytical, and post-analytical phases as well as laboratory management. However, applying LLMs carries risks such as hallucinations and poor interpretability, so their safety and effectiveness urgently need rigorous evaluation. An application evaluation system measures the effectiveness and value of an LLM in real-world scenarios, which makes building a scientific and comprehensive one essential. This article reviews the constituent elements of LLM application evaluation systems, including evaluation dimensions, metrics, scoring schemes, datasets, strategies, and methods, and describes application evaluation cases of LLMs in laboratory medicine. It finds that evaluation datasets consist mainly of public and simulated data, and that challenges remain, including opaque decision-making, a lack of widely accepted standards, and privacy and data security concerns. Future work will focus on building dedicated evaluation frameworks, adopting real-world datasets, strengthening regulatory oversight of applications, and establishing new paradigms of human-machine collaboration. Exploring application evaluation systems for LLMs can provide a theoretical framework and practical reference for the safe, effective, and compliant application of LLMs in laboratory medicine.
Authors: LIU Tao (review author); YANG Dagan (reviser). Affiliations: College of Medical Technology and Information Engineering, Zhejiang Chinese Medical University, Hangzhou, Zhejiang 310053, China; Department of Laboratory Medicine, Yongkang Maternal and Child Care Hospital, Jinhua, Zhejiang 321300, China; Department of Laboratory Medicine, the First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang 310003, China.
Source: Laboratory Medicine and Clinic (《检验医学与临床》), 2025, Issue 24, pp. 3322-3327, 3334 (7 pages in total).
Funding: National Key Research and Development Program project (2022YFC3602302).
Keywords: large language model; laboratory medicine; artificial intelligence; application evaluation system; hallucination