Funding: Supported by the Young and Middle-aged Research Fund Project of Shenzhen People's Hospital (Grant No. SYHL2024-N0010) and the Shenzhen Basic Research Program (General Program, Grant No. JCYJ20240813104409013).
Abstract: Objective: This study aimed to develop a Nursing Retrieval-Augmented Generation (NurRAG) system based on large language models (LLMs) and to evaluate its accuracy and clinical applicability in nursing question answering. Methods: A multidisciplinary team of nursing experts, artificial intelligence researchers, and information engineers collaboratively designed the NurRAG framework following the principles of retrieval-augmented generation. The system included four functional modules: 1) construction of a nursing knowledge base through document normalization, embedding, and vector indexing; 2) nursing question filtering using a supervised classifier; 3) semantic retrieval and re-ranking for evidence selection; and 4) evidence-conditioned language model generation to produce citation-based nursing answers. The system was securely deployed on hospital intranet servers using Docker containers. Performance was evaluated with 1,000 expert-verified nursing question–answer pairs. Semantic fidelity was assessed using Recall-Oriented Understudy for Gisting Evaluation–Longest Common Subsequence (ROUGE-L), and clinical correctness was measured using accuracy. Results: The NurRAG system achieved significant improvements in both semantic fidelity and answer accuracy compared with conventional large language models. For ChatGLM2-6B, ROUGE-L increased from (30.73±1.48)% to (64.27±0.27)%, and accuracy increased from (49.08±0.92)% to (75.83±0.35)%. For LLaMA2-7B, ROUGE-L increased from (28.76±0.89)% to (60.33±0.21)%, and accuracy increased from (43.27±0.83)% to (73.29±0.33)%. All differences were statistically significant (P<0.001). A quantitative case analysis further demonstrated that NurRAG effectively reduced hallucinated outputs and generated evidence-based, guideline-concordant nursing responses. Conclusion: The NurRAG system integrates domain-specific retrieval with LLM generation to provide accurate, reliable, and traceable evidence-based nursing answers. The findings demonstrate the system's feasibility and potential to improve the accuracy of clinical knowledge access, support evidence-based nursing decision-making, and promote the safe application of artificial intelligence in nursing practice.
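The ROUGE-L metric named in the abstract above scores a generated answer by the longest common subsequence (LCS) it shares with a reference answer. The following is a minimal sketch of that computation, not the authors' evaluation code: the abstract does not state the tokenization or the precision/recall weighting, so whitespace tokenization and a balanced F-score (`beta=1.0`) are assumptions made here for illustration.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference, beta=1.0):
    """ROUGE-L F-score between a candidate and a reference string.

    Assumption: lowercased whitespace tokenization. Text without
    whitespace word boundaries would need a different tokenizer.
    """
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return (1 + beta**2) * precision * recall / (recall + beta**2 * precision)


if __name__ == "__main__":
    score = rouge_l("the cat sat on the mat", "the cat is on the mat")
    print(f"ROUGE-L: {score:.4f}")
```

Because the score depends only on token order, not adjacency, a paraphrased answer that preserves the sequence of key terms can still score well, which is one reason the study pairs it with a separate accuracy measure.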
Abstract: Power generation companies have accumulated vast amounts of specialized data and expert knowledge. However, challenges such as data silos and fragmented knowledge hinder the effective utilization of this information. This study proposes a novel framework for intelligent Question-and-Answer (Q&A) systems based on Retrieval-Augmented Generation (RAG) to address these issues. The system efficiently acquires domain-specific knowledge by leveraging external databases, including Relational Databases (RDBs) and graph databases, without additional fine-tuning of Large Language Models (LLMs). Crucially, the framework integrates a Dynamic Knowledge Base Updating Mechanism (DKBUM) and a Weighted Context-Aware Similarity (WCAS) method to enhance retrieval accuracy and mitigate inherent limitations of LLMs, such as hallucinations and lack of specialization. The proposed DKBUM dynamically adjusts knowledge weights within the database, ensuring that the most recent and relevant information is utilized, while WCAS refines the alignment between queries and knowledge items through enhanced context understanding. Experimental validation demonstrates that the system can generate timely, accurate, and context-sensitive responses, making it a robust solution for managing complex business logic in specialized industries.
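The idea of weighting knowledge items so that recent, relevant entries rank higher can be illustrated with a small sketch. This is not the paper's DKBUM or WCAS formulation, which the abstract does not specify: the exponential half-life decay, the multiplicative combination with cosine similarity, and every name below (`KnowledgeItem`, `recency_weight`, `rank`) are hypothetical choices made only to show the general shape of such a scheme.

```python
import math
from dataclasses import dataclass


@dataclass
class KnowledgeItem:
    # Hypothetical record type; the paper's schema is not given.
    text: str
    embedding: list[float]
    age_days: float


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recency_weight(age_days, half_life_days=30.0):
    """Assumed decay: the weight halves every `half_life_days`."""
    return 0.5 ** (age_days / half_life_days)


def rank(query_embedding, items, half_life_days=30.0):
    """Return (score, item) pairs sorted by similarity times recency weight."""
    scored = [
        (cosine(query_embedding, it.embedding)
         * recency_weight(it.age_days, half_life_days), it)
        for it in items
    ]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    items = [
        KnowledgeItem("old procedure", [1.0, 0.0], age_days=30.0),
        KnowledgeItem("new procedure", [1.0, 0.0], age_days=0.0),
    ]
    for score, item in rank([1.0, 0.0], items):
        print(f"{score:.2f}  {item.text}")
```

With identical embeddings, the newer item outranks the older one purely through its weight, which is the behavior the abstract attributes to dynamic weight adjustment.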
Abstract: This article examines the implementation of a virtual health assistant powered by Retrieval-Augmented Generation (RAG) and GPT-4, aimed at enhancing clinical support through personalized, real-time interactions with patients. The system is hypothesized to improve healthcare accessibility, operational efficiency, and patient outcomes by automating routine tasks and delivering accurate health information. The assistant leverages natural language processing and real-time data retrieval models to respond to patient inquiries, schedule appointments, provide medication reminders, assist with symptom triage, and answer insurance-related questions. By integrating RAG-based virtual care, the system reduces the burden on healthcare specialists and helps mitigate healthcare disparities, particularly in rural areas where traditional care is limited. Although the initial scope of testing did not validate all potential benefits, the results demonstrated high patient satisfaction and strong response accuracy, both critical for systems of this nature. These findings underscore the transformative potential of AI-driven virtual health assistants in enhancing patient engagement, streamlining operational workflows, and improving healthcare accessibility, ultimately contributing to better outcomes and more cost-effective care delivery.
Abstract: The emergence of large natural language models in artificial intelligence has opened new opportunities for deeply empowering industry. Research on the key technologies and applications of a large natural language model for railways is of great significance for promoting and coordinating the development of railway artificial intelligence. This paper proposes application scenarios for a railway large language model according to the application requirements of railway artificial intelligence; designs the overall architecture of the model on the basis of the railway artificial intelligence platform; studies the key technologies of large natural language models; builds a railway industry large model oriented to intelligent question answering; and verifies the model with actual data. Finally, the paper looks ahead to the development and application of railway large language models in railway traffic organization, railway operation safety, and passenger service.
Abstract: In response to the growing mismatch between nursing workforce demand and constrained clinical teaching resources, this study proposes a Generative Multi-Agent Virtual Patient (GMVP) framework for high-fidelity nursing education. Grounded in situated learning, cognitive apprenticeship, and distributed cognition, GMVP employs a triadic agent architecture comprising narrative, physiological, and evaluator agents to reconstruct social interaction, physiological coherence, and formative assessment in virtual clinical environments. A design-based research methodology guides iterative development and classroom deployment aligned with outcome-based education standards. To address hallucination risks in high-stakes medical content, the system integrates retrieval-augmented generation with modular validation and physiological consistency checks. The framework supports scalable case generation, learning analytics, and equitable access to complex clinical training scenarios.
Funding: Funded by the CAMS Fund, Grant No. 2024-ZHCH630-01.
Abstract: With the proliferation of data and the increased complexity of clinical decision-making in the medical field, powerful computational tools are needed to assist physicians in making precise and reliable decisions. While Large Language Models (LLMs) with billions of parameters have achieved notable results in a broad range of biomedical and healthcare applications, issues of reliability and stability still need to be addressed. To this end, we propose MedRad, a framework that combines LLMs, knowledge engineering, Chain of Thought (CoT) reasoning, Retrieval-Augmented Generation (RAG) techniques, and intelligent agents (Agents) to improve the reliability of clinical decision-making. Building on fine-tuned LLMs and existing studies in the biomedical and healthcare domain, we concentrate on how these techniques can be used to achieve highly reliable clinical decision-making in scenarios of varying complexity, such as medical knowledge QA and clinical diagnosis recommendations. Experimental results demonstrate that MedRad can provide high-quality decision paths in these scenarios and has the potential to extend to more biomedical and healthcare scenarios through its loosely coupled design.
Funding: Support of the U.S. Department of Energy's Office of Energy Efficiency and Renewable Energy (EERE) through Battelle Memorial Institute under Contract No. DE-AC05-76RL01830.
Abstract: This paper investigates the transformative potential of Generative AI (Gen-AI) technologies, particularly large language models, within the building industry. By leveraging these advanced AI tools, the study explores their application across key areas such as automated compliance checking and building design assistance. The research highlights how Gen-AI can automate labor-intensive processes, significantly improving efficiency and reducing costs in building practices. The paper first discusses the two widely applied fundamental models (the Transformer and the diffusion model) and summarizes current pathways for accessing Gen-AI models and the most common techniques for customizing them. It then explores applications for text generation, such as compliance checking, control support, data mining, and building simulation input file editing. Additionally, it examines image generation, including direct generation through diffusion models and indirect generation through language model-supported template creation based on existing Computer-Aided Design or other design tools with rendering. The paper concludes with a comprehensive analysis of the current capabilities of Gen-AI in the building industry, outlining future directions for research and development, with the goal of paving the way for smarter, more effective, and responsive design, construction, and operational practices.
Funding: Partially supported by the National Natural Science Foundation of China (NSFC) (Grant Nos. 62425202, U21A20516, 62336003), the Beijing Natural Science Foundation (Z230001), the Fundamental Research Funds for the Central Universities (No. JK2024-03), the Didi Collaborative Research Program, and the State Key Laboratory of Complex & Critical Software Environment (SKLCCSE); supported by the Chow Sang Sang Group Research Fund (No. 9229139).
Abstract: Reasoning has long been regarded as a distinctive hallmark of human cognition, and recent advances in the artificial intelligence community have increasingly focused on reasoning large language models (rLLMs). However, due to strict privacy regulations, domain-specific reasoning knowledge is often distributed across multiple data owners, limiting the ability of rLLMs to fully leverage such valuable resources. In this context, federated learning (FL) has gained increasing attention in both academia and industry as a promising privacy-preserving paradigm for addressing the challenges of data-efficient rLLM training. In this paper, we conduct a comprehensive survey of federated rLLMs and propose a novel taxonomy based on training signals, covering signals derived from raw data, learned representations, and preference feedback. For each category, we highlight emerging trends in how FL is used to enhance the reasoning capabilities of rLLMs, considering model effectiveness, communication cost, and privacy preservation. Finally, we envision future research directions and challenges based on insights from existing studies.
Funding: Financial support provided by the National Natural Science Foundation of China (Grant Nos. 52425101, 52401216, 52471012). Hongbin Zhang also acknowledges funding by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation), Project-ID 405553726, TRR 270.
Abstract: Magnesium alloys, known for their lightweight advantages, are increasingly in demand across a range of applications, from aerospace to the automotive industry. With rising requirements for strength and corrosion resistance, the development of new magnesium alloy systems has become critical. Phase diagrams play a crucial role in guiding magnesium alloy design by providing key insights into phase stability, composition, and temperature ranges, enabling the optimization of alloy properties and processing conditions. However, accessing and interpreting phase diagram data with thermodynamic calculation software can be complex and time-consuming, often requiring intricate calculations and iterative refinement based on thermodynamic models. To address this challenge, we introduce PDGPT, a ChatGPT-based large language model designed to streamline the acquisition of magnesium alloy phase diagram information with high efficiency and accuracy. Enhanced by prompt engineering, supervised fine-tuning, and retrieval-augmented generation, PDGPT leverages the predictive and reasoning capabilities of large language models along with computational phase diagram data. By combining large language models with traditional phase diagram research tools, PDGPT not only improves the accessibility of critical phase diagram information but also sets the stage for future advancements in applying large language models to materials science.
Funding: Support from the Defense Threat Reduction Agency (DTRA) under Grant No. HDTRA12110012 with Dr. Richard Fry as the Program Officer, and partial project support from the Air Force Office of Scientific Research (AFOSR) under Grant No. FA9550-24-1-0017 with Dr. Chiping Li as the Program Officer.
Abstract: This research explores the integration of large language models (LLMs) into scientific data assimilation, focusing on combustion science as a case study. Leveraging foundational models integrated with a Retrieval-Augmented Generation (RAG) framework, the study introduces an approach to process diverse combustion research data, spanning experimental studies, simulations, and literature. The multifaceted nature of combustion research emphasizes the critical role of knowledge processing in navigating and extracting valuable information from a vast and diverse pool of sources. The developed approach minimizes computational and economic expenses while optimizing data privacy and accuracy. It incorporates prompt engineering and offline open-source LLMs, offering user autonomy in selecting base models. The study provides a thorough examination of text segmentation strategies, conducts comparative studies between LLMs, and explores various optimized prompts to demonstrate the effectiveness of the framework. By incorporating an external vector database, the framework outperforms a conventional LLM in generating accurate responses and constructing robust arguments. Additionally, the study investigates optimized prompt templates for efficient extraction of information from scientific literature. Furthermore, we present a targeted scaling study to quantify the algorithmic performance of the framework as the number of prompt tokens increases. The research addresses concerns related to hallucinations and false research articles by introducing a custom workflow developed with a detection algorithm to filter out inaccuracies. Despite identified areas for improvement, the framework consistently delivers accurate domain-specific responses with minimal human oversight. The prompt-agnostic approach introduced holds promise for future improvements. The study underscores the significance of integrating LLMs and knowledge processing techniques in scientific research, providing a foundation for advancements in data assimilation and utilization.
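One common text segmentation strategy for RAG pipelines of the kind described above is fixed-size chunking with overlap, so that a sentence straddling a boundary still appears whole in at least one chunk. The sketch below illustrates that general technique only; the abstract does not say which strategies the study compared, and the function name `chunk_words` and the default sizes are assumptions chosen for illustration.

```python
def chunk_words(text, chunk_size=200, overlap=50):
    """Split text into word chunks of `chunk_size` with `overlap` shared words.

    Assumption: whitespace word splitting; a real pipeline might count
    model tokens or split on sentence boundaries instead.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("require chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once a chunk has reached the end of the text, so no
        # trailing chunk is emitted that only repeats overlap words.
        if start + chunk_size >= len(words):
            break
    return chunks


if __name__ == "__main__":
    for chunk in chunk_words("a b c d e f g h i j", chunk_size=4, overlap=2):
        print(chunk)
```

Larger chunks give the model more context per retrieved passage but lengthen the prompt, which ties into the scaling behavior with prompt token count that the abstract mentions.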