Journal Articles
6 articles found
1. Evaluating ChatGPT-4o for ophthalmic image interpretation: From in-context learning to code-free clinical tool generation
Authors: Joon Yul Choi, Tae Keun Yoo. Informatics and Health, 2025, Issue 2, pp. 158-169 (12 pages)
Background: Large language models (LLMs) such as ChatGPT-4o have demonstrated emerging capabilities in medical reasoning and image interpretation. However, their diagnostic applicability in ophthalmology, particularly across diverse imaging modalities, remains insufficiently characterized. This study evaluates ChatGPT-4o's performance in ophthalmic image interpretation, exemplar-guided reasoning (in-context learning), and code-free diagnostic tool generation using publicly available datasets. Methods: We assessed ChatGPT-4o through three clinically relevant tasks: (1) image interpretation without prior examples, using fundus, external ocular, and facial photographs representing key ophthalmic conditions; (2) in-context learning with example-based prompts to improve classification accuracy; and (3) generation of an interactive HTML-based decision-support tool from a clinical diagnostic algorithm. All evaluations were performed using open-access datasets without model fine-tuning. Results: When interpreting images without reference examples, ChatGPT-4o achieved diagnostic accuracies of 90.3% for diabetic retinopathy, 77.4% for age-related macular degeneration, 100% for conjunctival melanoma, 97.3% for pterygium, and 85.7% for strabismus subtypes. In-context learning consistently improved diagnostic performance across all modalities, with strabismus classification reaching 100% accuracy. Compared to EfficientNetB2, ChatGPT-4o demonstrated comparable or superior performance in several diagnostic tasks. Additionally, the model successfully translated schematic clinical algorithms into functional, browser-based diagnostic tools using natural language prompts alone. Conclusions: ChatGPT-4o demonstrates promise in ophthalmic image interpretation and low-code clinical tool development, particularly when guided by in-context learning. However, these findings are based on a limited diagnostic spectrum and publicly available datasets. Broader clinical validation and head-to-head comparisons with domain-specific models are needed to establish its practical utility in ophthalmology.
Keywords: Large language model; Fundus photography; Strabismus; In-context learning; Ophthalmic diagnosis; Decision-support tool; ChatGPT-4o
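The abstract's second task, example-based prompting for image classification, can be sketched as a few-shot message sequence that interleaves labeled example images with the query image. The label set, message schema, and `build_icl_messages` helper below are illustrative assumptions in the style of a multimodal chat API, not the study's actual prompts.

```python
# Hypothetical label set drawn from the conditions the abstract mentions.
LABELS = ["diabetic retinopathy", "age-related macular degeneration",
          "conjunctival melanoma", "pterygium", "strabismus"]

def build_icl_messages(examples, query_image_url):
    """Interleave labeled example images with the query image so the model
    can imitate the exemplars (in-context learning)."""
    messages = [{"role": "system",
                 "content": "You are an ophthalmology assistant. "
                            f"Answer with one label from: {', '.join(LABELS)}."}]
    for url, label in examples:
        # Each demonstration is an image followed by its ground-truth label.
        messages.append({"role": "user", "content": [
            {"type": "image_url", "image_url": {"url": url}}]})
        messages.append({"role": "assistant", "content": label})
    # The unlabeled query image comes last.
    messages.append({"role": "user", "content": [
        {"type": "image_url", "image_url": {"url": query_image_url}}]})
    return messages

msgs = build_icl_messages(
    [("https://example.org/fundus_dr.png", "diabetic retinopathy")],
    "https://example.org/fundus_query.png")
```

The zero-example condition reported in the Results corresponds to calling the helper with an empty `examples` list.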
2. Dr.ICL: Demonstration-Retrieved In-context Learning (Cited by 1)
Authors: Man Luo, Xin Xu, Zhuyun Dai, Panupong Pasupat, Mehran Kazemi, Chitta Baral, Vaiva Imbrasaite, Vincent Y Zhao. Data Intelligence, 2024, Issue 4, pp. 909-922 (14 pages)
In-context learning (ICL), which teaches a large language model (LLM) to perform a task with few-shot demonstrations rather than adjusting the model parameters, has emerged as a strong paradigm for using LLMs. While early studies primarily used a fixed or random set of demonstrations for all test queries, recent research suggests that retrieving demonstrations semantically similar to the input from a pool of available demonstrations results in better performance. This work expands the applicability of retrieval-based ICL approaches along several dimensions. We extend the success of retrieval-based ICL to instruction-finetuned LLMs as well as Chain-of-Thought (CoT) prompting. While prior work utilized general LLMs such as GPT-3, we find that retrieved demonstrations also enhance instruction-finetuned LLMs. This insight implies that training data, despite being exposed during the fine-tuning phase, can still be effectively used through retrieval and in-context demonstrations at test time, yielding superior outcomes compared to using no demonstrations or selecting them at random. For CoT, when the demonstrations contain reasoning chains, we obtain improvements by retrieving based on such chains. Finally, we train a task-specific demonstration retriever that outperforms off-the-shelf retrievers.
Keywords: Information retrieval; In-context learning; Large language models; Retrieval-augmented generation; Large language model reasoning
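The core loop the abstract describes, retrieving the demonstrations most similar to the test query and prepending them as few-shot examples, can be sketched as follows. A bag-of-words cosine similarity stands in for the paper's trained retriever; the helper names and the toy QA pool are illustrative assumptions.

```python
from collections import Counter
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two strings via bag-of-words counts."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[w] * cb[w] for w in ca)
    na = sqrt(sum(v * v for v in ca.values()))
    nb = sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve_demonstrations(query, pool, k=2):
    """pool: list of (question, answer) pairs; return the k most similar."""
    return sorted(pool, key=lambda qa: cosine(query, qa[0]), reverse=True)[:k]

def build_prompt(query, pool, k=2):
    # Retrieved demonstrations go first, then the unanswered test query.
    demos = retrieve_demonstrations(query, pool, k)
    shots = "\n".join(f"Q: {q}\nA: {a}" for q, a in demos)
    return f"{shots}\nQ: {query}\nA:"

pool = [("What is the capital of France?", "Paris"),
        ("Who wrote Hamlet?", "Shakespeare"),
        ("What is the capital of Japan?", "Tokyo")]
prompt = build_prompt("What is the capital of Italy?", pool, k=2)
```

Swapping the similarity function for a learned retriever, as the paper does, changes only `retrieve_demonstrations`; the prompt assembly stays the same.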
3. Context-Aware Visual Entailment Driven by Specific Instructions
Authors: HAN Yufeng, HAO Kuangrong, TANG Xuesong, WEI Bing. Journal of Donghua University (English Edition), 2025, Issue 2, pp. 177-186 (10 pages)
Visual entailment (VE) is a prototypical task in multimodal visual reasoning, where current methods frequently utilize large language models (LLMs) as the knowledge base to assist in answering questions. These methods rely heavily on the textual modality, which inherently cannot capture the full extent of the information contained within images. We propose a context-aware visual entailment (CAVE) model, which introduces a novel aggregation module designed to extract high-level semantic features from images. This module integrates lower-level semantic image features into high-level visual tokens, formatting them similarly to text tokens so that they can serve as inputs for LLMs. The CAVE model compensates for the loss of image information and integrates it more effectively with textual comprehension. Additionally, the CAVE model incorporates a new input format and training methodology rooted in instruction tuning and in-context learning techniques. The objective of this research is to maximize the inherent logical reasoning capabilities of LLMs. Experimental results on the e-SNLI-VE dataset show that the proposed CAVE model exhibits outstanding performance.
Keywords: Visual entailment (VE); Textual-visual integration; Instruction tuning; In-context learning
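The aggregation idea in the abstract, pooling many low-level patch features into a handful of high-level "visual tokens" shaped like text-token embeddings so an LLM can consume them, can be illustrated with a minimal sketch. The group-mean pooling here is an assumed stand-in for the paper's learned aggregation module, not its actual architecture.

```python
def aggregate_patches(patch_feats, num_tokens):
    """Pool patch-level feature vectors into num_tokens visual tokens.
    patch_feats: list of equal-length lists of floats.
    Each output token is the mean of one contiguous group of patches."""
    dim = len(patch_feats[0])
    group = len(patch_feats) // num_tokens
    tokens = []
    for t in range(num_tokens):
        chunk = patch_feats[t * group:(t + 1) * group]
        # Mean over the group, dimension by dimension.
        tokens.append([sum(v[d] for v in chunk) / len(chunk)
                       for d in range(dim)])
    return tokens

# Toy example: 8 patch vectors of dimension 4 pooled into 2 visual tokens.
patches = [[float(i + d) for d in range(4)] for i in range(8)]
visual_tokens = aggregate_patches(patches, num_tokens=2)
```

In the real model the pooled tokens would pass through a learned projection so their dimensionality matches the LLM's text-token embeddings.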
4. A Novel Optimization Scheme for Named Entity Recognition with Pre-trained Language Models
Authors: Shuanglong Li, Xulong Zhang, Jianzong Wang. Journal of Electronic Research and Application, 2024, Issue 5, pp. 125-133 (9 pages)
Named Entity Recognition (NER) is crucial for extracting structured information from text. While traditional methods rely on rules, Conditional Random Fields (CRFs), or deep learning, the advent of large-scale Pre-trained Language Models (PLMs) offers new possibilities. PLMs excel at contextual learning, potentially simplifying many natural language processing tasks. However, their application to NER remains underexplored. This paper investigates leveraging the GPT-3 PLM for NER without fine-tuning. We propose a novel scheme that utilizes carefully crafted templates and context examples selected based on semantic similarity. Our experimental results demonstrate the feasibility of this approach, suggesting a promising direction for harnessing PLMs in NER.
Keywords: GPT-3; Named Entity Recognition; Sentence-BERT model; In-context example
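Template-based NER prompting of the kind the abstract describes can be sketched by formatting a few demonstrations into a fixed template and appending the test sentence. The template wording, the entity-tagging format, and `build_ner_prompt` are assumptions for illustration, not the authors' exact design; the similarity-based selection of examples (via Sentence-BERT in the paper) is left out.

```python
# Assumed demonstration template; the paper's actual wording may differ.
TEMPLATE = ("Sentence: {sent}\n"
            "Entities: {ents}\n")

def build_ner_prompt(examples, test_sentence):
    """examples: list of (sentence, entity-annotation) pairs, presumed
    already chosen by semantic similarity to test_sentence."""
    demos = "".join(TEMPLATE.format(sent=s, ents=e) for s, e in examples)
    # The model is expected to continue after the final "Entities:".
    return demos + f"Sentence: {test_sentence}\nEntities:"

prompt = build_ner_prompt(
    [("Barack Obama visited Paris.", "Barack Obama [PER]; Paris [LOC]")],
    "Angela Merkel met Macron in Berlin.")
```

Because the prompt ends at "Entities:", the completion returned by the PLM can be parsed directly as the predicted entity list.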
5. Large language models illuminate a progressive pathway to artificial intelligent healthcare assistant (Cited by 5)
Authors: Mingze Yuan, Peng Bao, Jiajia Yuan, Yunhao Shen, Zifan Chen, Yi Xie, Jie Zhao, Quanzheng Li, Yang Chen, Li Zhang, Lin Shen, Bin Dong. Medicine Plus, 2024, Issue 2, pp. 102-124 (23 pages)
With the rapid development of artificial intelligence, large language models (LLMs) have shown promising capabilities in mimicking human-level language comprehension and reasoning. This has sparked significant interest in applying LLMs to enhance various aspects of healthcare, ranging from medical education to clinical decision support. However, medicine involves multifaceted data modalities and nuanced reasoning skills, presenting challenges for integrating LLMs. This review introduces the fundamental applications of general-purpose and specialized LLMs, demonstrating their utility in knowledge retrieval, research support, clinical workflow automation, and diagnostic assistance. Recognizing the inherent multimodality of medicine, the review emphasizes multimodal LLMs and discusses their ability to process diverse data types such as medical imaging and electronic health records to augment diagnostic accuracy. To address LLMs' limitations regarding personalization and complex clinical reasoning, the review further explores the emerging development of LLM-powered autonomous agents for healthcare. Moreover, it summarizes the evaluation methodologies for assessing LLMs' reliability and safety in medical contexts. LLMs have transformative potential in medicine; however, continuous optimization and ethical oversight are needed before these models can be effectively integrated into clinical practice.
Keywords: Large language models; Artificial intelligence; Medicine; Healthcare assistant; Prompt engineering; In-context learning
6. VAGen: waterbody segmentation with prompting for visual in-context learning
Authors: Jiapei Zhao, Nobuyoshi Yabuki, Tomohiro Fukuda. AI in Civil Engineering, 2024, Issue 1, pp. 1-20 (20 pages)
Effective water management and flood prevention are critical challenges for both urban and rural areas, necessitating precise and prompt monitoring of waterbodies. As a fundamental step in the monitoring process, waterbody segmentation involves precisely delineating waterbody boundaries from imagery. Previous research using satellite images often lacks the resolution and contextual detail needed for local-scale analysis. This study addresses these challenges by leveraging common natural images, which are more easily accessible and provide higher resolution and richer context than satellite images. However, segmenting waterbodies from ordinary images faces several obstacles, including variations in lighting, occlusions from objects such as trees and buildings, and reflections on the water surface, all of which can mislead algorithms. The diverse shapes and textures of waterbodies, alongside complex backgrounds, further complicate the task. While large-scale vision models pre-trained on large datasets are typically leveraged for their generalizability across various downstream tasks, their application to waterbody segmentation from ground-level images remains underexplored. Hence, this research proposes the Visual Aquatic Generalist (VAGen) as a countermeasure. VAGen is a lightweight model for waterbody segmentation inspired by visual In-Context Learning (ICL) and Visual Prompting (VP); it refines large visual models by adding learnable perturbations that enhance the quality of prompts in ICL. In experiments, VAGen improved the mean Intersection over Union (mIoU) metric by 22.38% compared to a baseline model without learnable prompts, and surpassed current state-of-the-art (SOTA) task-specific waterbody segmentation models by 6.20%. Performance evaluation and analysis indicate that VAGen substantially reduces the number of trainable parameters and the computational overhead, and that it is feasible to deploy on cost-limited devices, including unmanned aerial vehicles (UAVs) and mobile computing platforms. This study thereby contributes practical solutions for engineering applications in urban flood monitoring, agricultural water resource management, and environmental conservation.
Keywords: Visual in-context learning; Visual prompting; Vision foundation model; Parameter-efficient fine-tuning; Waterbody segmentation; Deep learning
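The visual-prompting mechanism the abstract names, adding a learnable perturbation to the prompt image while the large vision model stays frozen, can be illustrated in miniature. The function name, the flattened-image representation, and the toy values are assumptions; the paper's actual perturbation is optimized by training, which is not shown here.

```python
def apply_visual_prompt(image, perturbation, clip=(0.0, 1.0)):
    """Element-wise add a learnable perturbation to a flattened image,
    clamping the result to the valid pixel range."""
    lo, hi = clip
    return [min(hi, max(lo, p + d)) for p, d in zip(image, perturbation)]

# Toy 4-pixel image; delta holds the only trainable parameters, so the
# frozen vision model itself is never updated.
image = [0.2, 0.9, 0.5, 0.0]
delta = [0.1, 0.2, -0.6, -0.1]
prompted = apply_visual_prompt(image, delta)
```

Because only `delta` is trained, the number of trainable parameters is tied to the prompt size rather than the backbone, which is consistent with the parameter-efficiency claim in the abstract.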