The Amazon Web Services (AWS) CloudTrail auditing service provides detailed records of operational and security events, enabling cloud administrators to monitor user activity and manage compliance. Although signature-based threat detection methods have been enhanced with machine learning and Large Language Models (LLMs), these approaches remain limited in addressing emerging threats. This study evaluates a two-step Retrieval-Augmented Generation (RAG) approach using Gemini 2.5 Pro to enhance threat detection accuracy and contextual relevance. The RAG system integrates external cybersecurity knowledge sources, including the MITRE ATT&CK framework, the AWS Threat Technique Catalogue, and threat reports, to overcome the limitations of static pre-trained LLMs. We constructed an evaluation dataset of 200 unique CloudTrail events (122 malicious, 78 benign) using the Stratus Red Team adversary emulation framework, covering 9 MITRE ATT&CK techniques across 8 tactics. Events were drawn from 1724 total events using stratified sampling. Ground-truth labels were created through systematic expert annotation with 90% inter-annotator agreement. The RAG-enabled model achieved an estimated 78% accuracy, 85% precision, and 79% F1-score, a 70.5% improvement in accuracy and a 76.4% improvement in F1-score over baseline Gemini 2.5 Pro (46% accuracy, 45% F1-score). These performance figures are based on evaluation results on the 200-event dataset. Cost-latency analysis revealed a processing time of 4.1 s and a cost of $0.00376 per event, comparable to commercial SIEM solutions while providing superior MITRE ATT&CK attribution. The findings demonstrate that RAG substantially enhances context-aware threat detection, providing actionable insights for cloud security operations.
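The improvement percentages quoted for the CloudTrail study are relative gains over the baseline, not percentage-point differences. A minimal sketch of that arithmetic follows; the function name `relative_improvement` is illustrative and not taken from the paper. Using the rounded scores from the abstract gives about 69.6% (accuracy) and 75.6% (F1), slightly below the reported 70.5% and 76.4%. The gap presumably comes from the authors working with unrounded scores, which is an assumption here.

```python
def relative_improvement(baseline: float, improved: float) -> float:
    """Relative gain of `improved` over `baseline`, expressed in percent."""
    return (improved - baseline) / baseline * 100.0


# Rounded scores quoted in the abstract (percent); the unrounded values are not given.
acc_gain = relative_improvement(46.0, 78.0)  # baseline accuracy -> RAG accuracy
f1_gain = relative_improvement(45.0, 79.0)   # baseline F1 -> RAG F1

print(f"accuracy gain: {acc_gain:.1f}%  F1 gain: {f1_gain:.1f}%")
```

The same helper can be applied to any pair of scores, as long as both are on the same scale.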
In the context of power generation companies, vast amounts of specialized data and expert knowledge have been accumulated. However, challenges such as data silos and fragmented knowledge hinder the effective utilization of this information. This study proposes a novel framework for intelligent Question-and-Answer (Q&A) systems based on Retrieval-Augmented Generation (RAG) to address these issues. The system efficiently acquires domain-specific knowledge by leveraging external databases, including Relational Databases (RDBs) and graph databases, without additional fine-tuning of Large Language Models (LLMs). Crucially, the framework integrates a Dynamic Knowledge Base Updating Mechanism (DKBUM) and a Weighted Context-Aware Similarity (WCAS) method to enhance retrieval accuracy and mitigate inherent limitations of LLMs, such as hallucinations and lack of specialization. DKBUM dynamically adjusts knowledge weights within the database, ensuring that the most recent and relevant information is used, while WCAS refines the alignment between queries and knowledge items through enhanced context understanding. Experimental validation demonstrates that the system can generate timely, accurate, and context-sensitive responses, making it a robust solution for managing complex business logic in specialized industries.
Objective: This study aimed to develop a Nursing Retrieval-Augmented Generation (NurRAG) system based on large language models (LLMs) and to evaluate its accuracy and clinical applicability in nursing question answering. Methods: A multidisciplinary team consisting of nursing experts, artificial intelligence researchers, and information engineers collaboratively designed the NurRAG framework following the principles of retrieval-augmented generation. The system included four functional modules: 1) construction of a nursing knowledge base through document normalization, embedding, and vector indexing; 2) nursing question filtering using a supervised classifier; 3) semantic retrieval and re-ranking for evidence selection; and 4) evidence-conditioned language model generation to produce citation-based nursing answers. The system was securely deployed on hospital intranet servers using Docker containers. Performance evaluation was conducted with 1,000 expert-verified nursing question–answer pairs. Semantic fidelity was assessed using Recall-Oriented Understudy for Gisting Evaluation, Longest Common Subsequence (ROUGE-L), and clinical correctness was measured using accuracy. Results: The NurRAG system achieved significant improvements in both semantic fidelity and answer accuracy compared with conventional large language models. For ChatGLM2-6B, ROUGE-L increased from (30.73±1.48)% to (64.27±0.27)%, and accuracy increased from (49.08±0.92)% to (75.83±0.35)%. For LLaMA2-7B, ROUGE-L increased from (28.76±0.89)% to (60.33±0.21)%, and accuracy increased from (43.27±0.83)% to (73.29±0.33)%. All differences were statistically significant (P<0.001). A quantitative case analysis further demonstrated that NurRAG effectively reduced hallucinated outputs and generated evidence-based, guideline-concordant nursing responses. Conclusion: The NurRAG system integrates domain-specific retrieval with LLM generation to provide accurate, reliable, and traceable evidence-based nursing answers. The findings demonstrate the system's feasibility and potential to improve the accuracy of clinical knowledge access, support evidence-based nursing decision-making, and promote the safe application of artificial intelligence in nursing practice.
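The NurRAG evaluation reports ROUGE-L, which scores a generated answer by the longest common subsequence (LCS) it shares with a reference answer. The sketch below shows the standard LCS-based F-measure. Whitespace tokenization and the default `beta = 1.0` are assumptions made for illustration; the abstract does not state the tokenizer or weighting the authors used, and the function names are hypothetical.

```python
def lcs_length(a: list, b: list) -> int:
    """Length of the longest common subsequence of two token lists (dynamic programming)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(reference: str, candidate: str, beta: float = 1.0) -> float:
    """ROUGE-L F-measure; tokens are split on whitespace (an assumption, not the paper's setup)."""
    ref, cand = reference.split(), candidate.split()
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    recall = lcs / len(ref)
    precision = lcs / len(cand)
    return (1 + beta**2) * precision * recall / (recall + beta**2 * precision)


score = rouge_l("turn the patient every two hours", "turn patient every four hours")
print(round(score, 3))
```

For text without whitespace word boundaries, the `split()` calls would need to be replaced with an appropriate tokenizer before the scores are meaningful.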
This article examines the implementation of a virtual health assistant powered by Retrieval-Augmented Generation (RAG) and GPT-4, aimed at enhancing clinical support through personalized, real-time interactions with patients. The system is hypothesized to improve healthcare accessibility, operational efficiency, and patient outcomes by automating routine tasks and delivering accurate health information. The assistant leverages natural language processing and real-time data retrieval models to respond to patient inquiries, schedule appointments, provide medication reminders, assist with symptom triage, and answer insurance-related questions. By integrating RAG-based virtual care, the system reduces the burden on healthcare specialists and helps mitigate healthcare disparities, particularly in rural areas where traditional care is limited. Although the initial scope of testing did not validate all potential benefits, the results demonstrated high patient satisfaction and strong response accuracy, both critical for systems of this nature. These findings underscore the transformative potential of AI-driven virtual health assistants in enhancing patient engagement, streamlining operational workflows, and improving healthcare accessibility, ultimately contributing to better outcomes and more cost-effective care delivery.
Large language models (LLMs) excel in various natural language processing tasks and are increasingly applied in specialized fields like medicine. However, their deployment in the medical domain is challenged by limited domain-specific data and the tendency to generate inaccurate information, known as "hallucinations." While domain-specific fine-tuning has improved open-source LLMs, they still underperform compared to proprietary models like ChatGPT and PaLM. To address this gap, retrieval-augmented generation (RAG) techniques have been explored to enhance LLMs by integrating external knowledge bases. Nevertheless, the success of RAG depends on the quality of retrieved documents, and its application within the medical field remains in the early stages. In this paper, we introduce the "Bailicai" framework as an exploratory approach to integrating RAG with LLMs in the medical field. The framework employs fine-tuning to improve the RAG process, where "falsely relevant" and "completely irrelevant" interference documents are intentionally included in the training data. This enables Bailicai to develop the ability to assess the quality of retrieved documents and selectively incorporate them. The framework is organized into four modules: (1) medical knowledge injection, (2) self-knowledge boundary identification, (3) directed acyclic graph task decomposition, and (4) retrieval-augmented generation. Through the synergy of these modules, Bailicai achieves superior performance on multiple medical benchmarks, outperforming existing large models in the medical domain, RAG-based methods, and proprietary models such as GPT-3.5. Furthermore, Bailicai effectively mitigates the hallucination problem common in LLMs applied to medical tasks and enhances the robustness of RAG when dealing with irrelevant or misleading documents, enabling more accurate information retrieval and integration.
The emergence of Medical Large Language Models (Med-LLMs) has significantly transformed healthcare. Med-LLMs serve as transformative tools that enhance clinical practice through applications in decision support, documentation, and diagnostics. This evaluation examines the performance of leading Med-LLMs, including GPT-4Med, Med-PaLM, MEDITRON, PubMedGPT, and MedAlpaca, across diverse medical datasets. It provides graphical comparisons of their effectiveness in distinct healthcare domains. The study introduces a domain-specific categorization system that aligns these models with optimal applications in clinical decision-making, documentation, drug discovery, research, patient interaction, and public health. The paper addresses deployment challenges of Med-LLMs, emphasizing trustworthiness and explainability as essential requirements for healthcare AI. It presents current evaluation techniques that improve model transparency in high-stakes medical contexts and analyzes regulatory frameworks using benchmarking datasets such as MedQA, MedMCQA, PubMedQA, and MIMIC. By identifying ongoing challenges in bias mitigation, reliability, and ethical compliance, this work serves as a resource for selecting appropriate Med-LLMs and outlines future directions in the field. This analysis offers a roadmap for developing Med-LLMs that balance technological innovation with the trust and transparency required for clinical integration, a perspective often overlooked in existing literature.
The emergence of large natural language models in artificial intelligence has opened new opportunities for deeply empowering industry. Research on the key technologies and applications of a railway natural language large model is of great significance for promoting and coordinating the development of railway artificial intelligence. This paper proposes application scenarios for a railway natural language large model based on the application requirements of railway artificial intelligence; designs the overall architecture of the model on top of the railway artificial intelligence platform; studies the key technologies of natural language large models; builds a railway-industry large model oriented to intelligent question answering; and verifies the model with actual data. Finally, it discusses prospects for the development and application of railway natural language large models in railway traffic organization, railway operation safety, and passenger service.
Retrieval-Augmented Generation (RAG) enhances Large Language Models (LLMs) by integrating external knowledge, leading to significant improvements in both factual accuracy and task performance. However, existing dense retrievers face considerable challenges when handling numerical constraints, particularly in queries requiring precise filtering conditions. To systematically explore these issues, we introduce Numerical Constraint Question (NumConQ), a comprehensive multi-domain benchmark dataset that contains more than 6500 queries covering healthcare, finance, education, sports, and movies. Empirical analysis reveals that state-of-the-art dense retrievers achieve only 16.3% accuracy in numerical constraint satisfaction, significantly underperforming relative to their semantic matching capabilities. To address these limitations, we propose Numerical Constraint-aware Retriever (NC-Retriever), which features: (1) a two-phase contrastive learning framework that combines in-batch negative sampling with progressively introduced hard negatives, and (2) a hybrid numerical representation scheme for consistent tokenization. Extensive experiments show that NC-Retriever achieves a relative improvement of 65.84% in recall@10 and a 78.28% increase in precision@10 compared to current state-of-the-art methods. The code and benchmark dataset are available at https://github.com/Tongji-KGLLM/NumConQ.
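The NumConQ results are stated in terms of recall@10 and precision@10. As a reminder of what those metrics measure, here is a minimal sketch using their usual definitions. The function names, document IDs, and toy data are hypothetical, and the benchmark's own evaluation script (see the repository linked in the abstract) may differ in details such as tie handling.

```python
def precision_at_k(ranked: list, relevant: set, k: int = 10) -> float:
    """Fraction of the top-k retrieved items that are relevant."""
    hits = sum(1 for doc in ranked[:k] if doc in relevant)
    return hits / k


def recall_at_k(ranked: list, relevant: set, k: int = 10) -> float:
    """Fraction of all relevant items that appear in the top-k results."""
    if not relevant:
        return 0.0
    hits = sum(1 for doc in ranked[:k] if doc in relevant)
    return hits / len(relevant)


# Toy example: a ranked retrieval list and the set of documents that
# actually satisfy the query's numerical constraint.
ranked = ["d3", "d7", "d1", "d9"]
relevant = {"d1", "d2", "d3"}

print(precision_at_k(ranked, relevant, k=2), recall_at_k(ranked, relevant, k=4))
```

Note that precision@k divides by `k` even when fewer than `k` items are returned, which is one common convention; some evaluations divide by the number of items actually retrieved instead.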
Radiology report generation aims to produce textual reports automatically from input images, a critical process that aids accurate diagnosis and lightens the workload of radiologists. Following recent advances in Large Language Models (LLMs), several Retrieval-Augmented Generation (RAG) based report generation models have been proposed. Despite continuously improving performance, these models often suffer from two main limitations: interference from irrelevant information, and a lack of alignment between the input image and the generated report. In this study, we propose a semantic-feedback-based RAG radiology report generation model, namely RAGSemRad. RAGSemRad comprises two key components: a fine-grained semantic retrieval module and a semantic assessment module. The fine-grained semantic retrieval module is designed to retrieve adequate and relevant prompt information while ignoring irrelevant interference. This is achieved by clustering the data at the semantic level and leveraging the domain knowledge within a large pre-trained visual-language model, thus alleviating hallucination and data bias. Further, the semantic assessment module raises the upper bound of performance by improving the alignment between the input image and the generated report, using supervision signals derived from paired image-label data. Experimental evaluations are conducted on two benchmarks, IU X-Ray and MIMIC-CXR. The results demonstrate that RAGSemRad exhibits competitive performance compared with state-of-the-art methods, showcasing its potential to advance automatic radiology report generation.
Drug-drug interactions (DDIs) can significantly impact drug efficacy and safety, potentially leading to severe adverse effects. Existing works on DDI event prediction have typically relied on labels of specific events for supervision, neglecting the importance of mining textual descriptions. This limits their ability to address two challenges: (1) the lack of observable data for new drugs, hindering meaningful feature extraction; (2) the highly imbalanced event distribution, which causes models to overfit to common categories and struggle with rare interactions. To address these challenges, we propose RADDI, a retrieval-augmented DDI prediction method. This approach improves prediction accuracy and adapts to the dynamic nature of new drug discovery. Specifically, to solve the first challenge, RADDI introduces a collaborative prediction strategy that integrates general knowledge transfer with specialized knowledge retrieval. At a coarse level, this approach uses pretrained language models to generate embeddings for drug descriptions, enabling broad interaction classification. At a finer level, RADDI incorporates retrieval augmentation, using drug pair descriptions as retrieval keys and interaction categories as retrieval targets, thereby enhancing semantic understanding. For the second challenge, we design a class-aware probability distribution strategy to mitigate class imbalance. By leveraging the prior distribution of event categories, RADDI adjusts the retrieval sample weights and normalizes category probabilities, thereby improving the prediction accuracy for rare-class interactions while reducing over-reliance on high-frequency categories. Experiments on benchmark datasets demonstrate that RADDI excels in zero-shot DDI prediction scenarios, effectively balancing generalization to new drugs and maintaining high accuracy across various interaction categories.
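The RADDI abstract describes adjusting retrieval sample weights by the prior distribution of event categories and then normalizing category probabilities. The abstract does not give the formula, so the sketch below is only one plausible reading, not the authors' method: each retrieved neighbor votes for its category with its similarity score divided by that category's prior frequency (an assumed weighting), and the votes are normalized to sum to one. All names and numbers are hypothetical.

```python
from collections import defaultdict


def class_aware_vote(neighbors: list, prior: dict) -> dict:
    """Prior-corrected weighted vote over retrieved neighbors.

    neighbors: list of (category, similarity) pairs from retrieval.
    prior: category -> relative frequency in the training data.
    Dividing by the prior is an illustrative assumption, not RADDI's stated formula.
    """
    scores = defaultdict(float)
    for category, similarity in neighbors:
        scores[category] += similarity / prior[category]
    total = sum(scores.values())
    return {category: s / total for category, s in scores.items()}


# Toy example: two neighbors from a frequent category, one from a rare category.
neighbors = [("common", 0.9), ("common", 0.8), ("rare", 0.7)]
prior = {"common": 0.9, "rare": 0.1}
probs = class_aware_vote(neighbors, prior)
print(probs)
```

In this toy case the rare category ends up with the larger probability even though it has fewer neighbors, which illustrates the intended effect of reducing reliance on high-frequency categories.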
In response to the growing mismatch between nursing workforce demand and constrained clinical teaching resources, this study proposes a Generative Multi-Agent Virtual Patient (GMVP) framework for high-fidelity nursing education. Grounded in situated learning, cognitive apprenticeship, and distributed cognition, GMVP employs a triadic agent architecture comprising narrative, physiological, and evaluator agents to reconstruct social interaction, physiological coherence, and formative assessment in virtual clinical environments. A design-based research methodology guides iterative development and classroom deployment aligned with outcome-based education standards. To address hallucination risks in high-stakes medical content, the system integrates retrieval-augmented generation with modular validation and physiological consistency checks. The framework supports scalable case generation, learning analytics, and equitable access to complex clinical training scenarios.
With the proliferation of data and the increasing complexity of clinical decision-making in the medical field, powerful computational tools are needed to assist physicians in making precise and reliable decisions. While Large Language Models (LLMs) with billions of parameters have achieved notable results across a broad range of biomedical and healthcare applications, issues of reliability and stability still need to be addressed. To this end, we propose MedRad, a framework that combines LLMs, knowledge engineering, Chain of Thought (CoT) reasoning, Retrieval-Augmented Generation (RAG) techniques, and intelligent agents to improve the reliability of clinical decision-making. Building on fine-tuned LLMs and existing studies in the biomedical and healthcare domain, we focus on how these techniques can be used to achieve highly reliable clinical decision-making in scenarios of varying complexity, such as medical knowledge QA and clinical diagnosis recommendations. Experimental results demonstrate that MedRad can provide high-quality decision paths in these scenarios and, through its loosely coupled design, has the potential to extend to more biomedical and healthcare scenarios.
This paper investigates the transformative potential of Generative AI (Gen-AI) technologies, particularly large language models, within the building industry. By leveraging these advanced AI tools, the study explores their application across key areas such as automated compliance checking and building design assistance. The research highlights how Gen-AI can automate labor-intensive processes, significantly improving efficiency and reducing costs in building practices. The paper first discusses the two widely applied fundamental models (the Transformer and the diffusion model) and summarizes current pathways for accessing Gen-AI models and the most common techniques for customizing them. It then explores applications for text generation, such as compliance checking, control support, data mining, and building simulation input file editing. Additionally, it examines image generation, including direct generation through diffusion models and indirect generation through language-model-supported template creation based on existing Computer-Aided Design or other design tools with rendering. The paper concludes with a comprehensive analysis of the current capabilities of Gen-AI in the building industry, outlining future directions for research and development, with the goal of paving the way for smarter, more effective, and more responsive design, construction, and operational practices.
Reasoning has long been regarded as a distinctive hallmark of human cognition, and recent advances in the artificial intelligence community have increasingly focused on reasoning large language models (rLLMs). However, due to strict privacy regulations, domain-specific reasoning knowledge is often distributed across multiple data owners, limiting the ability of rLLMs to fully leverage such valuable resources. In this context, federated learning (FL) has gained increasing attention in both academia and industry as a promising privacy-preserving paradigm for addressing the challenges of data-efficient rLLM training. In this paper, we conduct a comprehensive survey on federated rLLMs and propose a novel taxonomy based on training signals, including training signals derived from raw data, learned representations, and preference feedback. For each category, we highlight emerging trends in how FL can be used to enhance the reasoning capabilities of rLLMs, considering model effectiveness, communication cost, and privacy preservation. Finally, we envision future research directions and challenges based on insights from existing studies.
This research explores the integration of large language models (LLMs) into scientific data assimilation, focusing on combustion science as a case study. Leveraging foundational models integrated with a Retrieval-Augmented Generation (RAG) framework, the study introduces an approach to process diverse combustion research data, spanning experimental studies, simulations, and literature. The multifaceted nature of combustion research emphasizes the critical role of knowledge processing in navigating and extracting valuable information from a vast and diverse pool of sources. The developed approach minimizes computational and economic expenses while optimizing data privacy and accuracy. It incorporates prompt engineering and offline open-source LLMs, offering user autonomy in selecting base models. The study provides a thorough examination of text segmentation strategies, conducts comparative studies between LLMs, and explores various optimized prompts to demonstrate the effectiveness of the framework. By incorporating an external vector database, the framework outperforms a conventional LLM in generating accurate responses and constructing robust arguments. Additionally, the study investigates optimized prompt templates for efficiently extracting information from scientific literature. Furthermore, we present a targeted scaling study to quantify the algorithmic performance of the framework as the number of prompt tokens increases. The research addresses concerns related to hallucinations and false research articles by introducing a custom workflow with a detection algorithm to filter out inaccuracies. Despite identified areas for improvement, the framework consistently delivers accurate domain-specific responses with minimal human oversight. The prompt-agnostic approach introduced holds promise for future improvements. The study underscores the significance of integrating LLMs and knowledge processing techniques in scientific research, providing a foundation for advancements in data assimilation and utilization.
Magnesium alloys, known for their lightweight advantages, are increasingly in demand across a range of applications, from aerospace to the automotive industry. With rising requirements for strength and corrosion resistance, the development of new magnesium alloy systems has become critical. Phase diagrams play a crucial role in guiding magnesium alloy design by providing key insights into phase stability, composition, and temperature ranges, enabling the optimization of alloy properties and processing conditions. However, accessing and interpreting phase diagram data with thermodynamic calculation software can be complex and time-consuming, often requiring intricate calculations and iterative refinement based on thermodynamic models. To address this challenge, we introduce PDGPT, a ChatGPT-based large language model designed to streamline the acquisition of magnesium alloy phase diagram information with high efficiency and accuracy. Enhanced by prompt engineering, supervised fine-tuning, and retrieval-augmented generation, PDGPT leverages the predictive and reasoning capabilities of large language models along with computational phase diagram data. By combining large language models with traditional phase diagram research tools, PDGPT not only improves the accessibility of critical phase diagram information but also sets the stage for future advancements in applying large language models to materials science.
Funding: Supported by the Young and Middle-aged Research Fund Project of Shenzhen People's Hospital (Grant No. SYHL2024-N0010) and the Shenzhen Basic Research Program (General Program, Grant No. JCYJ20240813104409013).
Abstract: Objective: This study aimed to develop a Nursing Retrieval-Augmented Generation (NurRAG) system based on large language models (LLMs) and to evaluate its accuracy and clinical applicability in nursing question answering. Methods: A multidisciplinary team of nursing experts, artificial intelligence researchers, and information engineers collaboratively designed the NurRAG framework following the principles of retrieval-augmented generation. The system included four functional modules: 1) construction of a nursing knowledge base through document normalization, embedding, and vector indexing; 2) nursing question filtering using a supervised classifier; 3) semantic retrieval and re-ranking for evidence selection; and 4) evidence-conditioned language model generation to produce citation-based nursing answers. The system was securely deployed on hospital intranet servers using Docker containers. Performance was evaluated with 1,000 expert-verified nursing question-answer pairs. Semantic fidelity was assessed using Recall-Oriented Understudy for Gisting Evaluation, Longest Common Subsequence (ROUGE-L), and clinical correctness was measured using Accuracy. Results: The NurRAG system achieved significant improvements in both semantic fidelity and answer accuracy compared with conventional large language models. For ChatGLM2-6B, ROUGE-L increased from (30.73±1.48)% to (64.27±0.27)%, and accuracy increased from (49.08±0.92)% to (75.83±0.35)%. For LLaMA2-7B, ROUGE-L increased from (28.76±0.89)% to (60.33±0.21)%, and accuracy increased from (43.27±0.83)% to (73.29±0.33)%. All differences were statistically significant (P<0.001). A quantitative case analysis further demonstrated that NurRAG effectively reduced hallucinated outputs and generated evidence-based, guideline-concordant nursing responses. Conclusion: The NurRAG system integrates domain-specific retrieval with LLM generation to provide accurate, reliable, and traceable evidence-based nursing answers. The findings demonstrate the system's feasibility and its potential to improve the accuracy of clinical knowledge access, support evidence-based nursing decision-making, and promote the safe application of artificial intelligence in nursing practice.
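The ROUGE-L metric named in the abstract above can be illustrated with a minimal sketch. It assumes whitespace tokenization, lowercasing, and a balanced F-measure (`beta=1.0`); the abstract does not state which tokenizer or weighting the authors used, so the function names and defaults here are illustrative only.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference, beta=1.0):
    """ROUGE-L F-score between a generated answer and a reference answer.

    Assumption: tokens are lowercased whitespace-separated words.
    """
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return (1 + beta ** 2) * precision * recall / (recall + beta ** 2 * precision)


score = rouge_l("the cat sat on the mat", "the cat is on the mat")
```

Scores reported as percentages, as in the abstract, would correspond to this value multiplied by 100 and averaged over the evaluation set.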
Abstract: This article examines the implementation of a virtual health assistant powered by Retrieval-Augmented Generation (RAG) and GPT-4, aimed at enhancing clinical support through personalized, real-time interactions with patients. The system is hypothesized to improve healthcare accessibility, operational efficiency, and patient outcomes by automating routine tasks and delivering accurate health information. The assistant leverages natural language processing and real-time data retrieval models to respond to patient inquiries, schedule appointments, provide medication reminders, assist with symptom triage, and answer insurance-related questions. By integrating RAG-based virtual care, the system reduces the burden on healthcare specialists and helps mitigate healthcare disparities, particularly in rural areas where traditional care is limited. Although the initial scope of testing did not validate all potential benefits, the results demonstrated high patient satisfaction and strong response accuracy, both critical for systems of this nature. These findings underscore the transformative potential of AI-driven virtual health assistants in enhancing patient engagement, streamlining operational workflows, and improving healthcare accessibility, ultimately contributing to better outcomes and more cost-effective care delivery.
Funding: Supported by the State Key Program of National Natural Science of China (No. 61533018), the National Natural Science Foundation of China (No. 61402220), the Philosophy and Social Science Foundation of Hunan Province (No. 16YBA323), the Natural Science Foundation of Hunan Province (Nos. 2020JJ4525, 2022JJ30495, and 2025JJ50384), the Scientific Research Fund of Hunan Provincial Education Department (Nos. 18B279, 19A439, and 22A0316), and the CCF-Zhipu AI Large Model Fund.
Abstract: Large language models (LLMs) excel in various natural language processing tasks and are increasingly applied in specialized fields like medicine. However, their deployment in the medical domain is challenged by limited domain-specific data and the tendency to generate inaccurate information, known as “hallucinations.” While domain-specific fine-tuning has improved open-source LLMs, they still underperform compared to proprietary models like ChatGPT and PaLM. To address this gap, retrieval-augmented generation (RAG) techniques have been explored to enhance LLMs by integrating external knowledge bases. Nevertheless, the success of RAG depends on the quality of retrieved documents, and its application within the medical field remains in the early stages. In this paper, we introduce the “Bailicai” framework as an exploratory approach to integrating RAG with LLMs in the medical field. The framework employs fine-tuning to improve the RAG process, where “falsely relevant” and “completely irrelevant” interference documents are intentionally included in the training data. This enables Bailicai to develop the ability to assess the quality of retrieved documents and selectively incorporate them. The framework is organized into four modules: (1) medical knowledge injection, (2) self-knowledge boundary identification, (3) directed acyclic graph task decomposition, and (4) retrieval-augmented generation. Through the synergy of these modules, Bailicai achieves superior performance on multiple medical benchmarks, outperforming existing large models in the medical domain, RAG-based methods, and proprietary models such as GPT-3.5. Furthermore, Bailicai effectively mitigates the hallucination problem common in LLMs applied to medical tasks and enhances the robustness of RAG when dealing with irrelevant or misleading documents, enabling more accurate information retrieval and integration.
Abstract: The emergence of Medical Large Language Models has significantly transformed healthcare. Medical Large Language Models (Med-LLMs) serve as transformative tools that enhance clinical practice through applications in decision support, documentation, and diagnostics. This evaluation examines the performance of leading Med-LLMs, including GPT-4Med, Med-PaLM, MEDITRON, PubMedGPT, and MedAlpaca, across diverse medical datasets. It provides graphical comparisons of their effectiveness in distinct healthcare domains. The study introduces a domain-specific categorization system that aligns these models with optimal applications in clinical decision-making, documentation, drug discovery, research, patient interaction, and public health. The paper addresses deployment challenges of Med-LLMs, emphasizing trustworthiness and explainability as essential requirements for healthcare AI. It presents current evaluation techniques that improve model transparency in high-stakes medical contexts and analyzes regulatory frameworks using benchmarking datasets such as MedQA, MedMCQA, PubMedQA, and MIMIC. By identifying ongoing challenges in bias mitigation, reliability, and ethical compliance, this work serves as a resource for selecting appropriate Med-LLMs and outlines future directions in the field. This analysis offers a roadmap for developing Med-LLMs that balance technological innovation with the trust and transparency required for clinical integration, a perspective often overlooked in existing literature.
Abstract: The emergence of large natural language models in artificial intelligence has opened new opportunities for deeply empowering industry. Research on the key technologies and applications of a railway large natural language model is of great significance for promoting and coordinating the development of railway artificial intelligence. Based on the application requirements of railway artificial intelligence, this paper proposes application scenarios for a railway large natural language model; designs its overall architecture on top of the railway artificial intelligence platform; studies the key technologies of large natural language models; builds a railway-industry large model oriented to intelligent question answering; and verifies the model with actual data. Finally, the paper discusses prospects for the development and application of railway large natural language models in railway traffic organization, railway operation safety, and passenger service.
Funding: Supported by the National Natural Science Foundation of China (Nos. 62276063, U23B2057, and 62176185), the Natural Science Foundation of Jiangsu Province (No. BK20221457), the Natural Science Foundation of Beijing Municipality (No. L247008), the Tongji University Innovative Design and Intelligent Manufacturing Discipline Group Project, the Tongji University Construction Project of the National Artificial Intelligence Industry-Academia Collaborative Innovation Platform, and the Tongji University 2023 Interdisciplinary Collaborative Research Project.
Abstract: Retrieval-Augmented Generation (RAG) enhances Large Language Models (LLMs) by integrating external knowledge, leading to significant improvements in both factual accuracy and task performance. However, existing dense retrievers face considerable challenges when handling numerical constraints, particularly in queries requiring precise filtering conditions. To systematically explore these issues, we introduce Numerical Constraint Question (NumConQ), a comprehensive multi-domain benchmark dataset that contains more than 6500 queries covering healthcare, finance, education, sports, and movies. Empirical analysis reveals that state-of-the-art dense retrievers achieve only 16.3% accuracy in numerical constraint satisfaction, significantly underperforming relative to their semantic matching capabilities. To address these limitations, we propose Numerical Constraint-aware Retriever (NC-Retriever), which features: (1) a two-phase contrastive learning framework that combines in-batch negative sampling with progressively introduced hard negatives, and (2) a hybrid numerical representation scheme for consistent tokenization. Extensive experiments show that NC-Retriever achieves a relative improvement of 65.84% in recall@10 and a 78.28% increase in precision@10 compared to current state-of-the-art methods. The code and benchmark dataset are available at https://github.com/Tongji-KGLLM/NumConQ.
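For readers unfamiliar with the retrieval metrics quoted above, the following is a minimal sketch of recall@k, precision@k, and relative improvement under their standard definitions. It is not taken from the NumConQ repository; the function names and the toy document IDs are hypothetical, and ranked IDs are assumed to be unique.

```python
def precision_at_k(ranked_ids, relevant_ids, k=10):
    """Fraction of the top-k retrieved documents that are relevant."""
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids)
    return hits / k


def recall_at_k(ranked_ids, relevant_ids, k=10):
    """Fraction of all relevant documents that appear in the top-k results."""
    if not relevant_ids:
        return 0.0
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids)
    return hits / len(relevant_ids)


def relative_improvement(new_score, baseline_score):
    """Relative gain over a baseline, in percent."""
    return (new_score - baseline_score) / baseline_score * 100


# Toy example with hypothetical document IDs.
ranked = ["d3", "d7", "d1", "d9"]
relevant = {"d1", "d2", "d3"}
p_at_2 = precision_at_k(ranked, relevant, k=2)
r_at_4 = recall_at_k(ranked, relevant, k=4)
```

A relative improvement is measured against the baseline score, so a 65.84% relative gain in recall@10 is not the same as a 65.84-point absolute gain.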
Funding: Supported by the Fundamental Research Funds for the Central Universities (No. 2232025D-34), the Noncommunicable Chronic Diseases-National Science and Technology Major Project (Nos. 2024ZD0532400 and 2024ZD0532403), the Sichuan Provincial Science and Technology Program Key Research and Development Project (No. 2024YFFK0443), and the National Natural Science Foundation of China (Nos. 62477006 and 61975124).
Abstract: Radiology report generation aims to produce textual reports automatically from input images, a critical process that aids accurate diagnosis and lightens the workload of radiologists. Following recent advances in Large Language Models (LLMs), several Retrieval-Augmented Generation (RAG) based report generation models have been proposed. Despite continuously improving performance, these models often suffer from two main limitations: interference from irrelevant information, and a lack of alignment between the input image and the generated report. In this study, we propose the Semantic feedback based RAG Radiology report generation model, namely RAGSemRad. RAGSemRad comprises two key components: the fine-grained semantic retrieval module and the semantic assessment module. The fine-grained semantic retrieval module is designed to retrieve adequate and relevant prompt information while ignoring irrelevant interference. This is achieved by clustering the data at the semantic level and leveraging the domain knowledge within a large pre-trained visual-language model, thus alleviating the issues of hallucination and data bias. Further, the semantic assessment module raises the performance upper bound by improving the alignment between the input image and the generated report, utilizing supervision signals derived from paired image-label data. Experimental evaluations are conducted on two benchmarks, IU X-Ray and MIMIC-CXR, to assess the performance of RAGSemRad. The results demonstrate that RAGSemRad exhibits competitive performance compared to state-of-the-art methods, showcasing its potential to advance automatic radiology report generation.
Funding: Supported by the Noncommunicable Chronic Diseases-National Science and Technology Major Project (Nos. 2024ZD0532400 and 2024ZD0532403) and the National Key Research and Development Plan Project (No. 2022YFC3600901).
Abstract: Drug-drug interactions (DDIs) can significantly impact drug efficacy and safety, potentially leading to severe adverse effects. Existing works on DDI event prediction have typically relied on labels of specific events for supervision, neglecting the importance of mining textual descriptions. This limits their ability to address two challenges: (1) the lack of observable data for new drugs, hindering meaningful feature extraction; (2) the highly imbalanced event distribution, which causes models to overfit to common categories and struggle with rare interactions. To address these challenges, we propose RADDI, a retrieval-augmented DDI prediction method. This approach improves prediction accuracy and adapts to the dynamic nature of new drug discovery. Specifically, to solve the first challenge, RADDI introduces a collaborative prediction strategy that integrates general knowledge transfer with specialized knowledge retrieval. This approach uses pretrained language models to generate embeddings for drug descriptions at a coarse level, enabling broad interaction classification. At a finer level, RADDI incorporates retrieval augmentation, using drug pair descriptions as retrieval keys and interaction categories as retrieval targets, thereby enhancing semantic understanding. For the second challenge, we design a class-aware probability distribution strategy to mitigate class imbalance. By leveraging the prior distribution of event categories, RADDI adjusts the retrieval sample weights and normalizes category probabilities, thereby improving the prediction accuracy for rare-class interactions while reducing over-reliance on high-frequency categories. Experiments on benchmark datasets demonstrate that RADDI excels in zero-shot DDI prediction scenarios, effectively balancing generalization to new drugs and maintaining high accuracy across various interaction categories.
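The idea of adjusting retrieval sample weights by a class prior and then normalizing category probabilities can be sketched as follows. This is not RADDI's published formula, which the abstract does not give; it is one simple inverse-prior weighting chosen for illustration. The function name, the `eps` smoothing term, and the toy labels are all assumptions, and similarities are assumed to be non-negative.

```python
from collections import defaultdict


def class_aware_probabilities(neighbors, class_prior, eps=1e-9):
    """Turn retrieved (label, similarity) pairs into category probabilities.

    Each neighbor's similarity is divided by the prior frequency of its label,
    so rare categories are up-weighted; scores are then normalized to sum to 1.
    Illustrative assumption only, not the method from the paper.
    """
    scores = defaultdict(float)
    for label, similarity in neighbors:
        scores[label] += similarity / (class_prior.get(label, 0.0) + eps)
    total = sum(scores.values())
    if total == 0:
        return {}
    return {label: score / total for label, score in scores.items()}


# Toy example: two neighbors from a frequent category, one from a rare category.
retrieved = [("common", 0.9), ("common", 0.8), ("rare", 0.7)]
prior = {"common": 0.9, "rare": 0.1}
probs = class_aware_probabilities(retrieved, prior)
```

With a plain similarity-weighted vote the frequent category would win in this example; dividing by the prior shifts most of the probability mass to the rare category, which is the over-reliance on high-frequency categories that the abstract says the strategy reduces.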
Abstract: In response to the growing mismatch between nursing workforce demand and constrained clinical teaching resources, this study proposes a Generative Multi-Agent Virtual Patient (GMVP) framework for high-fidelity nursing education. Grounded in situated learning, cognitive apprenticeship, and distributed cognition, GMVP employs a triadic agent architecture comprising narrative, physiological, and evaluator agents to reconstruct social interaction, physiological coherence, and formative assessment in virtual clinical environments. A design-based research methodology guides iterative development and classroom deployment aligned with outcome-based education standards. To address hallucination risks in high-stakes medical content, the system integrates retrieval-augmented generation with modular validation and physiological consistency checks. The framework supports scalable case generation, learning analytics, and equitable access to complex clinical training scenarios.
Funding: Funded by the CAMS Fund (Grant No. 2024-ZHCH630-01).
Abstract: With the proliferation of data and the increasing complexity of clinical decision-making in the medical field, powerful computational tools are needed to assist physicians in making precise and reliable decisions. While Large Language Models (LLMs) with billions of parameters have achieved notable results in a broad range of biomedical and healthcare applications, issues of reliability and stability still need to be addressed. To this end, we propose MedRad, a framework that combines LLMs, knowledge engineering, Chain of Thought (CoT) reasoning, Retrieval-Augmented Generation (RAG) techniques, and intelligent agents (Agents) to improve the reliability of clinical decision-making. Building on fine-tuned LLMs and existing studies in the biomedical and healthcare domain, we concentrate on how these techniques can be used to achieve highly reliable clinical decision-making in scenarios of varying complexity, such as medical knowledge QA and clinical diagnosis recommendations. Experimental results demonstrate that MedRad can provide high-quality decision paths in these scenarios and, through its loosely coupled design, has the potential to extend to more biomedical and healthcare scenarios.
Funding: Supported by the U.S. Department of Energy's Office of Energy Efficiency and Renewable Energy (EERE) through Battelle Memorial Institute under Contract No. DE-AC05-76RL01830.
Abstract: This paper investigates the transformative potential of Generative AI (Gen-AI) technologies, particularly large language models, within the building industry. By leveraging these advanced AI tools, the study explores their application across key areas such as automated compliance checking and building design assistance. The research highlights how Gen-AI can automate labor-intensive processes, significantly improving efficiency and reducing costs in building practices. The paper first discusses the two widely applied fundamental models, the Transformer and the Diffusion model, and summarizes current pathways for accessing Gen-AI models and the most common techniques for customizing them. It then explores applications for text generation, such as compliance checking, control support, data mining, and building simulation input file editing. Additionally, it examines image generation, including direct generation through diffusion models and indirect generation through language model-supported template creation based on existing Computer-Aided Design or other design tools with rendering. The paper concludes with a comprehensive analysis of the current capabilities of Gen-AI in the building industry, outlining future directions for research and development, with the goal of paving the way for smarter, more effective, and responsive design, construction, and operational practices.
Funding: Partially supported by the National Natural Science Foundation of China (NSFC) (Grant Nos. 62425202, U21A20516, 62336003), the Beijing Natural Science Foundation (Z230001), the Fundamental Research Funds for the Central Universities (No. JK2024-03), the Didi Collaborative Research Program, and the State Key Laboratory of Complex & Critical Software Environment (SKLCCSE); also supported by the Chow Sang Sang Group Research Fund (No. 9229139).
Abstract: Reasoning has long been regarded as a distinctive hallmark of human cognition, and recent advances in the artificial intelligence community have increasingly focused on reasoning large language models (rLLMs). However, due to strict privacy regulations, domain-specific reasoning knowledge is often distributed across multiple data owners, limiting an rLLM's ability to fully leverage such valuable resources. In this context, federated learning (FL) has gained increasing attention in both academia and industry as a promising privacy-preserving paradigm for addressing the challenges of data-efficient rLLM training. In this paper, we conduct a comprehensive survey of federated rLLMs and propose a novel taxonomy based on training signals, including signals derived from raw data, learned representations, and preference feedback. For each category, we highlight emerging trends in how FL is used to enhance the reasoning capabilities of rLLMs, considering model effectiveness, communication cost, and privacy preservation. Finally, we envision future research directions and challenges based on insights from existing studies.
Funding: Supported by the Defense Threat Reduction Agency (DTRA) under Grant No. HDTRA12110012, with Dr. Richard Fry as the Program Officer, with partial project support from the Air Force Office of Scientific Research (AFOSR) under Grant No. FA9550-24-1-0017, with Dr. Chiping Li as the Program Officer.
Abstract: This research explores the integration of large language models (LLMs) into scientific data assimilation, focusing on combustion science as a case study. Leveraging foundational models integrated with a Retrieval-Augmented Generation (RAG) framework, the study introduces an approach to process diverse combustion research data, spanning experimental studies, simulations, and literature. The multifaceted nature of combustion research emphasizes the critical role of knowledge processing in navigating and extracting valuable information from a vast and diverse pool of sources. The developed approach minimizes computational and economic expenses while optimizing data privacy and accuracy. It incorporates prompt engineering and offline open-source LLMs, offering user autonomy in selecting base models. The study provides a thorough examination of text segmentation strategies, conducts comparative studies between LLMs, and explores various optimized prompts to demonstrate the effectiveness of the framework. By incorporating an external vector database, the framework outperforms a conventional LLM in generating accurate responses and constructing robust arguments. Additionally, the study investigates optimized prompt templates for efficient extraction of information from scientific literature. Furthermore, we present a targeted scaling study to quantify the algorithmic performance of the framework as the number of prompt tokens increases. The research addresses concerns related to hallucinations and false research articles by introducing a custom workflow developed with a detection algorithm to filter out inaccuracies. Despite identified areas for improvement, the framework consistently delivers accurate domain-specific responses with minimal human oversight. The prompt-agnostic approach introduced holds promise for future improvements. The study underscores the significance of integrating LLMs and knowledge processing techniques in scientific research, providing a foundation for advancements in data assimilation and utilization.
Funding: Financial support was provided by the National Natural Science Foundation of China (Grant Nos. 52425101, 52401216, 52471012). Hongbin Zhang also acknowledges funding by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation), Project-ID 405553726, TRR 270.
Abstract: Magnesium alloys, known for their lightweight advantages, are increasingly in demand across a range of applications, from aerospace to the automotive industry. With rising requirements for strength and corrosion resistance, the development of new magnesium alloy systems has become critical. Phase diagrams play a crucial role in guiding magnesium alloy design by providing key insights into phase stability, composition, and temperature ranges, enabling the optimization of alloy properties and processing conditions. However, accessing and interpreting phase diagram data with thermodynamic calculation software can be complex and time-consuming, often requiring intricate calculations and iterative refinement based on thermodynamic models. To address this challenge, we introduce PDGPT, a ChatGPT-based large language model designed to streamline the acquisition of magnesium alloy phase diagram information with high efficiency and accuracy. Enhanced by prompt engineering, supervised fine-tuning, and retrieval-augmented generation, PDGPT leverages the predictive and reasoning capabilities of large language models along with computational phase diagram data. By combining large language models with traditional phase diagram research tools, PDGPT not only improves the accessibility of critical phase diagram information but also sets the stage for future advancements in applying large language models to materials science.