Journal Articles: 4 articles found
1. Benchmark comparative analysis of the ChatGPT-4.0 and DeepSeek-V3 artificial-intelligence language models in answering myopia-related questions
Authors: 姚晶磊, 李露茜, 姜慧君, Sun Chen-Hsin, 任骁方, 肖林. 《中国医学装备》 (China Medical Equipment), 2026, Issue 3, pp. 86-89 (4 pages).
Objective: To compare the performance of the ChatGPT-4.0 and DeepSeek-V3 artificial-intelligence (AI) chatbots in answering myopia-related questions, as a reference for the application of AI chatbots. Methods: From October 2024 to March 2025, the two large language model (LLM) chatbots, ChatGPT-4.0 and DeepSeek-V3, were tested on myopia-related questions at the National University Hospital of Singapore (NUHS) and Beijing Jingmei Group General Hospital, China; expert reviewers compared the accuracy and comprehensiveness of their answers. The question set comprised the 30 myopia-related questions most frequently encountered in ophthalmic practice, covering six topics: pathogenesis, clinical manifestations, diagnosis, treatment, prevention, and prognosis; both chatbots were scored on accuracy and comprehensiveness. Results: On accuracy, 11 of ChatGPT-4.0's answers (36.7%) were rated "good" versus 23 (76.7%) for DeepSeek-V3, a statistically significant difference in proportions (χ² = 9.791, P < 0.05). On comprehensiveness, among the answers rated "good" for accuracy, ChatGPT-4.0 scored 2.44 ± 0.33 versus 2.63 ± 0.17 for DeepSeek-V3, a difference that was not statistically significant (P > 0.05). Conclusion: AI chatbots can effectively assist users with myopia consultations; DeepSeek-V3 answered myopia-related questions more accurately than ChatGPT-4.0.
Keywords: myopia; ChatGPT-4.0 chatbot; DeepSeek-V3 chatbot; large language model (LLM)
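The accuracy comparison above reduces to a 2x2 contingency table: answers rated "good" versus not, per model. A minimal sketch of the Pearson chi-square statistic on the counts reported in the abstract (11/30 vs. 23/30); the function name and layout are illustrative, not taken from the paper:

```python
# Pearson chi-square for a 2x2 contingency table (no continuity correction).
# Counts from the abstract: 11/30 "good" answers for ChatGPT-4.0
# versus 23/30 for DeepSeek-V3.

def chi_square_2x2(table):
    """table = [[a, b], [c, d]]; returns the Pearson chi-square statistic."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat

#            good  not-good
table = [[11, 19],   # ChatGPT-4.0
         [23,  7]]   # DeepSeek-V3
print(round(chi_square_2x2(table), 3))  # 9.774
```

The uncorrected statistic (about 9.77) is close to the reported χ² = 9.791; small differences can come from rounding or from the defaults of the statistical software used in the study.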
2. A comparative study on the English translation of traditional Chinese medicine terminology using AI-generated-content language models (Cited: 3)
Authors: 张小薇, 汤思哲, 桑楠, 王艺超, 刘艾娟. 《世界中西医结合杂志》 (World Journal of Integrated Traditional and Western Medicine), 2025, Issue 6, pp. 1255-1262 (8 pages).
In recent years, the rapid development of AI-generated content (AIGC) language models has brought new vitality to the field of translation. To examine how AIGC language models perform in traditional Chinese medicine (TCM) translation, this article used three large language models, ChatGPT-4, iFlytek Spark V3.5 (星火认知大模型 V3.5), and ERNIE Bot 4.0 (文心一言 4.0), to translate the functions-and-indications (功能主治) terms in the package inserts of Chinese patent medicines, assessing translation quality through comparative text analysis. The study shows that each of the three models has its own strengths in translating TCM terminology, and that AIGC language models hold substantial potential in the field of TCM translation. Based on this assessment of the three mainstream models' translation quality, the article offers targeted suggestions for training TCM foreign-language professionals in the AIGC era, so that such professionals can meet the needs of the times.
Keywords: AIGC large language models; ChatGPT-4; iFlytek Spark V3.5; ERNIE Bot 4.0; TCM terminology; translation
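The study's comparison of model translations was a manual text analysis; purely as an illustration of one automatable first pass (not the paper's method), a character-level similarity score can rank hypothetical candidate renderings of a TCM term against a reference gloss:

```python
# Illustrative sketch only: ranks candidate translations by surface
# similarity to a reference gloss. Real evaluation of TCM terminology,
# as in the study, requires expert judgment of meaning, not string overlap.
from difflib import SequenceMatcher

def similarity(candidate: str, reference: str) -> float:
    """Character-level similarity ratio in [0, 1]."""
    return SequenceMatcher(None, candidate.lower(), reference.lower()).ratio()

# Hypothetical renderings of the TCM term 清热解毒; the reference gloss
# and model labels are made up for this example.
reference = "clears heat and resolves toxin"
candidates = {
    "model_a": "clears heat and resolves toxins",
    "model_b": "heat-clearing and detoxifying",
}
for name, text in sorted(candidates.items(),
                         key=lambda kv: similarity(kv[1], reference),
                         reverse=True):
    print(f"{name}: {similarity(text, reference):.2f}")
```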
3. Exploring the performance of large language models on hepatitis B infection-related questions: A comparative study (Cited: 1)
Authors: Yu Li, Chen-Kai Huang, Yi Hu, Xiao-Dong Zhou, Cong He, Jia-Wei Zhong. World Journal of Gastroenterology (SCIE, CAS), 2025, Issue 3, pp. 103-112 (10 pages).
BACKGROUND: Patients with hepatitis B virus (HBV) infection require chronic, personalized care to improve outcomes. Large language models (LLMs) can potentially provide medical information for patients. AIM: To examine the performance of three LLMs, ChatGPT-3.5, ChatGPT-4.0, and Google Gemini, in answering HBV-related questions. METHODS: The LLMs' responses to HBV-related questions were independently graded by two medical professionals on a four-point accuracy scale, with disagreements resolved by a third reviewer. Each question was run three times on each of the three LLMs. Readability was assessed via the Gunning Fog index and the Flesch-Kincaid grade level. RESULTS: Overall, all three LLM chatbots achieved high average accuracy scores on subjective questions (ChatGPT-3.5: 3.50; ChatGPT-4.0: 3.69; Google Gemini: 3.53, out of a maximum score of 4). On objective questions, ChatGPT-4.0 achieved an 80.8% accuracy rate, compared with 62.9% for ChatGPT-3.5 and 73.1% for Google Gemini. Across the six domains, ChatGPT-4.0 performed better on diagnosis, whereas Google Gemini excelled on clinical manifestations. Notably, in the readability analysis, the mean Gunning Fog index and Flesch-Kincaid grade level scores of all three LLM chatbots were significantly higher than the standard level of eight, far exceeding the reading level of the general population. CONCLUSION: Our results highlight the potential of LLMs, especially ChatGPT-4.0, for delivering responses to HBV-related questions. LLMs may be an adjunctive informational tool for patients and physicians to improve outcomes. Nevertheless, current LLMs should not replace personalized treatment recommendations from physicians in the management of HBV infection.
Keywords: ChatGPT-3.5; ChatGPT-4.0; Google Gemini; hepatitis B infection; accuracy
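The two readability metrics named above are simple formulas over word, sentence, and syllable counts: Gunning Fog = 0.4 × (words per sentence + 100 × share of words with 3+ syllables), and Flesch-Kincaid grade = 0.39 × (words per sentence) + 11.8 × (syllables per word) − 15.59. A minimal sketch with a deliberately naive syllable counter (production tools such as the study likely used rely on dictionaries and better heuristics; the function names here are illustrative):

```python
import re

def count_syllables(word):
    """Very naive: count groups of consecutive vowels (min 1)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def readability(text):
    """Return (Gunning Fog index, Flesch-Kincaid grade level)."""
    words = re.findall(r"[A-Za-z'-]+", text)
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    syllables = sum(count_syllables(w) for w in words)
    complex_words = sum(1 for w in words if count_syllables(w) >= 3)
    wps = len(words) / sentences                      # words per sentence
    fog = 0.4 * (wps + 100 * complex_words / len(words))
    fk = 0.39 * wps + 11.8 * syllables / len(words) - 15.59
    return fog, fk

fog, fk = readability("The cat sat on the mat.")
print(round(fog, 2), round(fk, 2))  # 2.4 -1.45
```

A score of "level eight" on either metric corresponds roughly to an eighth-grade reading level, which is why the chatbots' higher scores indicate text too difficult for the general population.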
4. Assessing the possibility of using large language models in ocular surface diseases (Cited: 1)
Authors: Qian Ling, Zi-Song Xu, Yan-Mei Zeng, Qi Hong, Xian-Zhe Qian, Jin-Yu Hu, Chong-Gang Pei, Hong Wei, Jie Zou, Cheng Chen, Xiao-Yu Wang, Xu Chen, Zhen-Kai Wu, Yi Shao. International Journal of Ophthalmology (English edition), 2025, Issue 1, pp. 1-8 (8 pages).
AIM: To assess the feasibility of using large language models (LLMs) in ocular surface diseases by testing the accuracy of five LLMs, ChatGPT-4, ChatGPT-3.5, Claude 2, PaLM 2, and SenseNova, on specialized questions related to ocular surface diseases. METHODS: A group of experienced ophthalmology professors developed a 100-question single-choice examination on ocular surface diseases to assess the performance of the LLMs and of human participants on ophthalmology specialty exam questions. The exam covers the following topics: keratitis (20 questions); keratoconus, keratomalacia, corneal dystrophy, corneal degeneration, erosive corneal ulcers, and corneal lesions associated with systemic diseases (20 questions); conjunctivitis (20 questions); trachoma, pterygium, and conjunctival tumors (20 questions); and dry eye disease (20 questions). The total score of each LLM was then calculated, and their mean scores, mean correlations, variances, and confidence were compared. RESULTS: GPT-4 exhibited the highest performance among the LLMs. Comparing the LLMs' average scores with those of four human groups (chief physicians, attending physicians, regular trainees, and graduate students) showed that, except for ChatGPT-4, every LLM scored below the graduate-student group, the lowest-scoring human group. Both ChatGPT-4 and PaLM 2 tended to give exact, correct answers, leaving very little chance of an incorrect one. ChatGPT-4 showed higher credibility when answering questions, with a success rate of 59%, but gave a wrong answer 28% of the time. CONCLUSION: The GPT-4 model exhibits excellent performance in both answer relevance and confidence. PaLM 2 shows a positive correlation (up to 0.8) in answer accuracy on the exam. In answer confidence, PaLM 2 is second only to GPT-4 and surpasses Claude 2, SenseNova, and GPT-3.5. Although ocular surface disease is a highly specialized discipline, GPT-4 still exhibits superior performance, suggesting that its potential for application in this field is enormous, perhaps as a valuable resource for medical students and clinicians in the future.
Keywords: ChatGPT-4.0; ChatGPT-3.5; large language models; ocular surface diseases
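The evaluation above boils down to grading each model's answer sheet against a key and correlating per-question correctness across models. A minimal sketch under those assumptions, using toy data; the function names and the five-question example are illustrative and not taken from the paper:

```python
def grade(answers, key):
    """Return per-question correctness (1/0) and the total score."""
    marks = [1 if a == k else 0 for a, k in zip(answers, key)]
    return marks, sum(marks)

def pearson(x, y):
    """Pearson correlation between two equal-length 0/1 mark vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

key     = list("ABCDA")   # toy 5-question answer key
model_a = list("ABCDB")   # 4/5 correct
model_b = list("ABCAB")   # 3/5 correct
marks_a, score_a = grade(model_a, key)
marks_b, score_b = grade(model_b, key)
print(score_a, score_b, round(pearson(marks_a, marks_b), 2))  # 4 3 0.61
```

A high positive correlation (such as the up-to-0.8 figure reported for PaLM 2) means the models tend to succeed and fail on the same questions, i.e. question difficulty drives errors more than model-specific quirks.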