Funding: Crowd Sourcing Labels from Electronic Medical Records to Enable Biomedical Research, Award Number: 1 UH2 CA203708-01.
Abstract: Background: This study addresses the challenge of enhancing Retrieval Augmented Generation (RAG) search engines for electronic medical records (EMR) by learning users' distinct search semantics. The specific aim is to develop a learning-to-rank system that improves the accuracy and relevance of search results to support RAG-based search engines. Methods: Given a prompt or search query, the system first asks the user to label a few randomly selected documents, which contain some keywords, as relevant to the prompt or not. The system then identifies relevant sentences and adjusts word similarities by updating a medical semantic embedding. New documents are ranked by the number of relevant sentences identified by the weighted embedding. Only the top-ranked documents and sentences are provided to a Large Language Model (LLM) to generate answers for further review. Findings: To evaluate our approach, four medical researchers labeled documents based on their relevance to specific diseases. We measured the information retrieval performance of our approach and two baseline methods. Results show that our approach achieved at least a 0.60 Precision-at-10 (P@10) score with only ten positive labels, outperforming the baseline methods. In our pilot study, we demonstrate that the learned semantic preference can transfer to the analysis of unseen datasets, boosting the accuracy of a RAG model in extracting and explaining cancer progression diagnoses from 0.14 to 0.50. Interpretation: This study demonstrates that a customized learning-to-rank method can enhance state-of-the-art natural language models, such as LLMs, by quickly adapting to users' semantics. This approach supports EMR document retrieval and helps RAG models generate clinically meaningful answers to specific questions, underscoring the potential of user-tailored learning-to-rank methods in clinical practice.
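The abstract does not specify the embedding update rule, so the following is a minimal illustrative sketch rather than the authors' implementation: documents are ranked by counting sentences whose weighted-embedding similarity to the query clears a threshold, with a toy feedback rule standing in for the paper's medical-embedding update. All names here (sentence_vec, update_weights, the threshold value) are assumptions.

```python
# Minimal sketch (not the authors' implementation): rank documents by the
# number of sentences a weighted word-embedding similarity deems relevant.
import numpy as np

def sentence_vec(sentence, vectors, weights):
    """Average of word vectors, each scaled by a learned per-word weight."""
    words = [w for w in sentence.lower().split() if w in vectors]
    if not words:
        return None
    return np.mean([weights.get(w, 1.0) * vectors[w] for w in words], axis=0)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def update_weights(weights, labeled_sentences, query_words, lr=0.1):
    """Toy stand-in for the paper's embedding update: upweight query-related
    words seen in sentences the user marked relevant, downweight otherwise."""
    for sentence, is_relevant in labeled_sentences:
        for w in sentence.lower().split():
            if w in query_words:
                weights[w] = weights.get(w, 1.0) + (lr if is_relevant else -lr)
    return weights

def rank_documents(docs, query_vec, vectors, weights, threshold=0.5):
    """Score each document by how many of its sentences clear the similarity
    threshold; return documents sorted best-first for the downstream LLM."""
    scored = []
    for doc in docs:
        hits = sum(
            1 for s in doc["sentences"]
            if (v := sentence_vec(s, vectors, weights)) is not None
            and cosine(v, query_vec) >= threshold
        )
        scored.append((hits, doc))
    return [d for _, d in sorted(scored, key=lambda x: -x[0])]
```

In this reading, only the top-ranked documents and their above-threshold sentences would be passed to the LLM, which keeps the generation context small and user-tuned.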
Funding: Supported by the US Department of Energy Office of Nuclear Energy Distinguished Early Career Program under contract number DE-NE0009468. Additional support is provided by the Texas A&M Institute of Data Science (TAMIDS) Seed Program for AI, Computing, and Data Science.
Abstract: Next-generation nuclear reactor technologies, such as molten salt and fast reactors, present complex analytical challenges that require advanced modeling and simulation tools. Yet traditional workflows for Monte Carlo codes like FLUKA are labor-intensive and error-prone, relying on manual input file generation and postprocessing, which limits scalability and efficiency. In this work, we present AutoFLUKA, a novel framework that leverages domain-knowledge-embedded large language models (LLMs) and AI agents to automate the entire FLUKA simulation workflow, from input file creation to execution management and data analysis. AutoFLUKA also integrates Retrieval-Augmented Generation (RAG) and a user-friendly web-based graphical interface, enabling users to interact with the system in real time. Benchmarked against manual FLUKA simulations, AutoFLUKA demonstrated substantial improvements in resolving FLUKA error-related queries, particularly those arising from input file creation and execution. Traditionally, such issues are addressed through expert support on the FLUKA user forum, often resulting in significant delays; AutoFLUKA reduced the resolution time for these queries from several days to under one minute. Additionally, human-induced simulation errors were mitigated, and high accuracy was achieved in key simulation metrics, such as neutron fluence and microdosimetric quantities, with uncertainties below 0.001% for large sample sizes. The flexibility of AutoFLUKA was demonstrated through successful application to both general and specialized nuclear scenarios, and its design allows for straightforward extension to other simulation platforms. These results highlight AutoFLUKA's potential to transform nuclear engineering analysis by enhancing productivity, reliability, and accessibility through AI-driven automation.
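Neither FLUKA's run scripts nor AutoFLUKA's internals appear in the abstract, so the sketch below is a generic, heavily stubbed agent loop in the spirit described: draft an input file with an LLM, execute it, and feed errors plus retrieved documentation back for repair. The command name run_mc_code and the helpers retrieve_docs and llm_generate are placeholders, not AutoFLUKA's actual API.

```python
# Illustrative sketch only: an LLM-agent loop for automating a Monte Carlo
# workflow, in the spirit of AutoFLUKA. All external names are placeholders.
import subprocess
from pathlib import Path

def retrieve_docs(query: str) -> str:
    """RAG step (stub): fetch relevant manual passages for the query,
    e.g., via vector search over the simulation code's documentation."""
    return ""

def llm_generate(prompt: str) -> str:
    """LLM call (stub): plug in your model client here; should return a
    candidate input deck as plain text."""
    raise NotImplementedError

def run_simulation(input_file: Path) -> subprocess.CompletedProcess:
    # Placeholder command; the real FLUKA invocation and flags differ.
    return subprocess.run(["run_mc_code", str(input_file)],
                          capture_output=True, text=True)

def agent_loop(task: str, max_retries: int = 3) -> Path:
    """Draft an input file, execute it, and let the LLM repair errors
    using retrieved documentation, up to max_retries attempts."""
    context = retrieve_docs(task)
    deck = llm_generate(f"Task: {task}\nDocs: {context}\nWrite the input file.")
    for attempt in range(max_retries):
        path = Path(f"run_{attempt}.inp")
        path.write_text(deck)
        result = run_simulation(path)
        if result.returncode == 0:
            return path  # success; downstream code would parse the output
        # Feed the error back so the LLM can revise the input file.
        deck = llm_generate(
            f"The run failed with:\n{result.stderr}\n"
            f"Docs: {retrieve_docs(result.stderr)}\nReturn a corrected file.")
    raise RuntimeError("simulation did not converge to a valid input")
```

The error-feedback step is what replaces the multi-day forum round trip described in the abstract: the same error text a user would post is instead routed to the LLM together with retrieved documentation.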
Abstract: In-context learning (ICL), which teaches a large language model (LLM) to perform a task with few-shot demonstrations rather than by adjusting the model parameters, has emerged as a strong paradigm for using LLMs. While early studies primarily used a fixed or random set of demonstrations for all test queries, recent research suggests that retrieving demonstrations semantically similar to the input from a pool of available demonstrations results in better performance. This work expands the applicability of retrieval-based ICL approaches along several dimensions. We extend the success of retrieval-based ICL to instruction-finetuned LLMs as well as Chain-of-Thought (CoT) prompting. While prior work utilizes general LLMs, such as GPT-3, we find that retrieved demonstrations also enhance instruction-finetuned LLMs. This insight implies that training data, despite being exposed during the fine-tuning phase, can still be used effectively through retrieval and in-context demonstrations at test time, yielding superior outcomes compared to using no demonstrations or selecting them at random. For CoT, when the demonstrations contain reasoning chains, we obtain improvements by retrieving based on those chains. Finally, we train a task-specific demonstration retriever that outperforms off-the-shelf retrievers.
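As one concrete reading of the retrieval step, the sketch below embeds a demonstration pool and assembles a few-shot prompt from the k nearest neighbors of the test query. It uses sentence-transformers as a stand-in off-the-shelf retriever; the paper's trained task-specific retriever, and for CoT the retrieval over reasoning chains, would slot in the same way.

```python
# Minimal sketch of retrieval-based ICL: embed the demonstration pool, pick
# the k demonstrations nearest to the test query, build a few-shot prompt.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # stand-in retriever

def build_prompt(query, pool, k=4):
    """pool: list of (input_text, answer_or_reasoning_chain) pairs.
    For the CoT variant, embedding the reasoning chains instead of the
    inputs is the change the abstract reports gains from."""
    texts = [inp for inp, _ in pool]
    scores = util.cos_sim(model.encode(query, convert_to_tensor=True),
                          model.encode(texts, convert_to_tensor=True))[0]
    top = scores.topk(min(k, len(texts))).indices.tolist()
    demos = "\n\n".join(f"Q: {pool[i][0]}\nA: {pool[i][1]}" for i in top)
    return f"{demos}\n\nQ: {query}\nA:"
```

The returned string is then sent to the LLM as-is; swapping the random or fixed demonstration set for this retrieved set is the entire intervention being evaluated.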