In this paper, a class of real-time parallel combined methods (RTPCM) for the digital simulation of a partitioned large system is presented. By combining parallelism across the system with parallelism across the method, stiff and non-stiff subsystems are solved in parallel on a parallel computer by a parallel Rosenbrock method and a parallel RK method, respectively. Their construction, convergence, and numerical stability are discussed, and digital simulation experiments are conducted.
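For readers unfamiliar with Rosenbrock-type schemes, a minimal sequential sketch follows. It is the single-stage linearly implicit Euler step, the simplest member of the Rosenbrock family, not the paper's parallel combined method, and the test problem is illustrative.

```python
import numpy as np

def rosenbrock1(f, jac, y0, t0, t1, n):
    """One-stage Rosenbrock-type step (linearly implicit Euler):
    solve (I - h*J) k = f(y_n), then y_{n+1} = y_n + h*k.
    Only one linear solve per step, well suited to stiff systems."""
    h = (t1 - t0) / n
    y = np.atleast_1d(np.asarray(y0, dtype=float))
    I = np.eye(len(y))
    for _ in range(n):
        k = np.linalg.solve(I - h * jac(y), f(y))
        y = y + h * k
    return y

# Stiff test problem y' = -50*(y - 1); the exact solution tends to 1.
f = lambda y: -50.0 * (y - 1.0)
jac = lambda y: np.array([[-50.0]])
y_end = rosenbrock1(f, jac, y0=[0.0], t0=0.0, t1=1.0, n=20)
```

With h = 0.05, an explicit Euler step would be unstable here (h*lambda = -2.5), while the linearly implicit step converges smoothly, which is the motivation for using Rosenbrock methods on the stiff subsystems.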
Aiming at the problem of reliability allocation for a complicated large system, a new approach is proposed. Reliability allocation should be a kind of decision-making behavior; therefore, the more information is used when apportioning a reliability index, the more reasonable the resulting allocation. Reliability allocation for a complicated large system consists of two processes: a bottom-up reliability information reporting process and a top-down reliability index apportioning process. A typical example illustrates the concrete process of the reliability allocation algorithms.
This paper studies the effect of phase noise and fronthaul compression on a downlink cloud radio access network (C-RAN), where several remote radio heads (RRHs) are coordinated by a baseband unit (BBU) on the cloud server to communicate with users. In this system, the baseband signals are precoded at the BBU and then compressed before being transmitted to the RRHs through capacity-limited fronthaul links, which introduces compressive quantization noise. We assume that regularized zero-forcing precoding is performed with imperfect channel state information and that a compression strategy is applied at the BBU. The phase noise arising from non-ideal local oscillators at both the RRHs and the users is also considered. We derive an approximate expression for the downlink ergodic sum-rate of the considered C-RAN using large-dimensional random matrix theory in the large-system regime. Simulation results validate the accuracy of the approximation, which in turn allows the effect of phase noise and fronthaul compression to be analyzed theoretically.
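As a minimal numpy sketch of the regularized zero-forcing step mentioned above (the baseline precoder, with phase noise and fronthaul quantization omitted; the dimensions and regularization value are illustrative, not from the paper):

```python
import numpy as np

def rzf_precoder(H, reg):
    """Regularized zero-forcing: W = H^H (H H^H + reg*I)^{-1},
    normalized to unit total transmit power. H is K x N (users x antennas)."""
    K = H.shape[0]
    G = H @ H.conj().T + reg * np.eye(K)
    W = H.conj().T @ np.linalg.inv(G)   # N x K precoding matrix
    return W / np.linalg.norm(W)        # total-power normalization

# Toy example: 4 RRH antennas serving 2 users over a random Rayleigh channel.
rng = np.random.default_rng(0)
H = (rng.standard_normal((2, 4)) + 1j * rng.standard_normal((2, 4))) / np.sqrt(2)
W = rzf_precoder(H, reg=0.1)
```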
Background: Assess the effectiveness of ChatGPT and Bard in the initial identification of articles for Otolaryngology-Head and Neck Surgery systematic literature reviews. Methods: Three PRISMA-based systematic reviews (Jabbour et al. 2017, Wong et al. 2018, and Wu et al. 2021) were replicated using ChatGPT v3.5 and Bard. Outputs (author, title, publication year, and journal) were compared to the original references and cross-referenced with medical databases for authenticity and recall. Results: Several themes emerged when comparing Bard and ChatGPT across the three reviews. Bard generated more outputs and had greater recall in Wong et al.'s review, with a broader date range in Jabbour et al.'s review. In Wu et al.'s review, ChatGPT-2 had higher recall and identified more authentic outputs than Bard-2. Conclusion: Large language models (LLMs) failed to fully replicate peer-reviewed methodologies, producing outputs with inaccuracies but identifying relevant, especially recent, articles missed by the references. While human-led PRISMA-based reviews remain the gold standard, refining LLMs for literature reviews shows potential.
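The recall comparison described above can be sketched in a few lines; the reference titles and model outputs below are hypothetical placeholders, not the study's data.

```python
def recall(retrieved, relevant):
    """Fraction of the reference list that the model recovered."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    return len(set(retrieved) & relevant) / len(relevant)

# Hypothetical reference list and LLM output (titles as identifiers).
references = ["paper_a", "paper_b", "paper_c", "paper_d"]
llm_output = ["paper_b", "paper_d", "paper_x"]   # paper_x is a hallucination
score = recall(llm_output, references)           # 2 of 4 references found
```

Cross-referencing the outputs against medical databases (the authenticity check in the study) is what separates genuinely recalled articles from hallucinated entries such as `paper_x` here.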
Recommendation systems are key to boosting user engagement, satisfaction, and retention, particularly on media platforms where personalized content is vital. Sequential recommendation systems learn from user-item interactions to predict future items of interest. However, many current methods rely on unique user and item IDs, limiting their ability to represent users and items effectively, especially in zero-shot learning scenarios where training data is scarce. With the rapid development of Large Language Models (LLMs), researchers are exploring their potential to enhance recommendation systems. However, there is a semantic gap between the linguistic semantics of LLMs and the collaborative semantics of recommendation systems, where items are typically indexed by IDs. Moreover, most research focuses on item representations, neglecting personalized user modeling. To address these issues, we propose a sequential recommendation framework using LLMs, called CIT-Rec, a model that integrates Collaborative semantics for user representation and Image and Text information for item representation to enhance Recommendations. Specifically, by aligning intuitive image information with text containing semantic features, we can represent items more accurately, improving item representation quality. We focus not only on item representations but also on user representations. To capture users' personalized preferences more precisely, we train traditional sequential recommendation models on users' historical interaction data, effectively capturing behavioral patterns. Finally, by combining LLMs and traditional sequential recommendation models, we allow the LLM to understand linguistic semantics while capturing collaborative semantics. Extensive evaluations on real-world datasets show that our model outperforms baseline methods, effectively combining user interaction history with item visual and textual modalities to provide personalized recommendations.
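As a rough illustration of multimodal item fusion (not the authors' CIT-Rec implementation; the shared dimension and the averaging rule are assumptions for the sketch), one can project ID, text, and image embeddings to a common dimension and combine them:

```python
import numpy as np

def fuse_item_embedding(id_emb, text_emb, image_emb):
    """Illustrative late fusion: truncate each modality to a shared
    dimension, L2-normalize, and average. A real system would learn
    the projection and alignment rather than truncate."""
    d = min(len(id_emb), len(text_emb), len(image_emb))
    parts = [v[:d] / (np.linalg.norm(v[:d]) + 1e-12)
             for v in (id_emb, text_emb, image_emb)]
    return np.mean(parts, axis=0)

# Toy embeddings of different sizes for one item.
item = fuse_item_embedding(np.ones(8), np.ones(16), np.ones(12))
```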
Building reliable intent-based, task-oriented dialog systems typically requires substantial manual effort: designers must derive intents, entities, responses, and control logic from raw conversational data, then iterate until the assistant behaves consistently. This paper investigates how far large language models (LLMs) can automate this development. We use two reference corpora, Let's Go (English, public transport) and MEDIA (French, hotel booking), to prompt four LLM families (GPT-4o, Claude, Gemini, Mistral Small) and generate the core specifications required by the Rasa platform. These include intent sets with example utterances, entity definitions with slot mappings, response templates, and basic dialog flows. To structure this process, we introduce a model- and platform-agnostic pipeline with two phases. The first normalizes and validates LLM-generated artifacts, enforcing cross-file consistency and making slot usage explicit. The second uses a lightweight dialog harness that runs scripted tests and incrementally patches failure points until conversations complete reliably. Across eight projects, all models required some targeted repairs before training. After applying our pipeline, all reached ≥70% task completion (many above 84%), while NLU performance ranged from mid-0.6 to 1.0 macro-F1 depending on domain breadth. These results show that, with modest guidance, current LLMs can produce workable end-to-end dialog prototypes directly from raw transcripts. Our main contributions are: (i) a reusable bootstrap method aligned with industry domain-specific languages (DSLs), (ii) a small set of high-impact corrective patterns, and (iii) a simple but effective harness for closed-loop refinement across conversational platforms.
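The closed-loop harness idea (run scripted tests, patch the first failure, retry) can be sketched generically; everything below is an illustrative toy, not the paper's harness, and `bot`/`patch` are hypothetical interfaces.

```python
def run_harness(bot, scripts, patch, max_rounds=5):
    """Run scripted (user turn, expected reply) pairs; on failure,
    apply a patch to the bot and retry until all scripts pass or
    the round budget is exhausted."""
    for _ in range(max_rounds):
        failures = [(turn, expected) for turn, expected in scripts
                    if bot(turn) != expected]
        if not failures:
            return True, bot
        bot = patch(bot, failures[0])   # repair the first failure point
    return False, bot

# Toy bot: a lookup table, patched by adding the missing mapping.
table = {"hi": "hello"}
bot = table.get
def patch(_, failure):
    turn, expected = failure
    table[turn] = expected
    return table.get

ok, _ = run_harness(bot, [("hi", "hello"), ("bye", "goodbye")], patch)
```

In the paper's setting the "patch" corresponds to targeted repairs of the generated Rasa artifacts (intents, slot mappings, flows) rather than a lookup-table fix.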
The design of the casting gating system directly determines the solidification sequence, defect severity, and overall quality of the casting. A novel machine learning strategy was developed to design the counter-pressure casting gating system of a large thin-walled cabin casting. A high-quality dataset was established through orthogonal experiments combined with design criteria for the gating system. Spearman's correlation analysis was used to select high-quality features. The gating system dimensions were predicted using a gated recurrent unit (GRU) recurrent neural network and an elastic net model. Using the EasyCast and ProCAST casting software, a comparative analysis of the flow, temperature, and solidification fields demonstrated that steady filling and top-down sequential solidification were achieved. Compared with the empirical formula method, this approach eliminates trial-and-error iterations, reduces porosity, cuts the casting defect volume from 11.23 cubic centimeters to 2.23 cubic centimeters, and eliminates internal casting defects through the incorporation of an internally cooled iron, fulfilling the goal of intelligent gating system design.
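The Spearman screening step used for feature selection can be sketched with the standard no-ties rank formula; the target and feature values below are made-up numbers, not the casting dataset.

```python
def spearman(x, y):
    """Spearman rank correlation, no-ties formula:
    rho = 1 - 6 * sum(d_i^2) / (n * (n^2 - 1))."""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0] * len(v)
        for rank, i in enumerate(order, start=1):
            r[i] = rank
        return r
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))

# Hypothetical screening: keep features with high |rho| against the target.
target   = [1.0, 2.0, 3.0, 4.0, 5.0]
feature1 = [2.1, 2.9, 4.2, 4.8, 6.0]   # monotone with target -> rho = 1
feature2 = [5.0, 1.0, 4.0, 2.0, 3.0]   # weak relation
rho1, rho2 = spearman(target, feature1), spearman(target, feature2)
```

Because Spearman works on ranks, it captures monotone (not just linear) relations, which is why it is a common choice for screening process parameters against quality outcomes.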
We propose an AI-assisted framework for integrated natural disaster prevention and emergency response, leveraging the DeepSeek large language model (LLM) to advance intelligent decision-making in geohazard management. We systematically analyze the technical pathways for deploying LLMs in disaster scenarios, emphasizing three breakthrough directions: (1) knowledge graph-driven dynamic risk modeling, (2) reinforcement learning-optimized emergency decision systems, and (3) secure local deployment architectures. The DeepSeek model demonstrates unique advantages through its hybrid reasoning mechanism combining semantic analysis with geospatial pattern recognition, enabling cost-effective processing of multi-source data spanning historical disaster records, real-time IoT sensor feeds, and socio-environmental parameters. A modular system architecture is designed to achieve three critical objectives: (a) automated construction of domain-specific knowledge graphs through unsupervised learning of disaster physics relationships, (b) scenario-adaptive resource allocation using risk simulations, and (c) privacy-preserving emergency coordination via federated learning across distributed response nodes. The proposed local deployment paradigm addresses critical data security concerns in cross-border disaster management while complying with the FAIR principles (Findable, Accessible, Interoperable, Reusable) for geoscientific data governance. This work establishes a methodological foundation for next-generation AI-earth science convergence in disaster mitigation.
The rapid advancement of artificial intelligence (AI) has ushered in a new era of medical multimodal large language models (MLLMs), which integrate diverse data modalities such as text, imaging, physiological signals, and genomics to enhance clinical decision-making. This systematic review explores the core methodologies and applied research frontiers of medical MLLMs, focusing on their architecture, training methods, evaluation techniques, and applications. We highlight the transformative potential of MLLMs in achieving cross-modal semantic alignment, medical knowledge integration, and robust clinical reasoning. Despite their promise, challenges such as data heterogeneity, hallucination, and computational efficiency persist. By reviewing state-of-the-art solutions and future directions, this paper provides a comprehensive technical guide for developing reliable and interpretable medical MLLMs, ultimately aiming to bridge the gap between AI and clinical practice.
For computer science majors in higher education institutions, programming courses are among the most important professional foundation courses. Proficiency in independent programming is of great help both for subsequent courses and for students' personal development. In the teaching of programming courses, online judge systems are often used to improve students' programming level. Traditional online judge systems, however, offer little guidance, and inexperienced students often find it difficult to locate and correct the errors in their code by themselves. We propose an online judge system that integrates a large model for error correction to help students find errors and improve their programming skills.
In this paper, we establish some strong laws of large numbers for non-independent random variables under the framework of sublinear expectations. One of our main results concerns blockwise m-dependent random variables, and another concerns sub-orthogonal random variables. Both extend the strong law of large numbers for independent random variables under sublinear expectations to the non-independent case.
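For context, the independent-case law that these results extend can be stated as follows (a standard formulation in Peng's sublinear expectation framework, written here with upper mean $\overline{\mu}=\hat{\mathbb{E}}[X_1]$, lower mean $\underline{\mu}=-\hat{\mathbb{E}}[-X_1]$, and the associated lower capacity $\mathcal{V}$; the exact moment conditions vary by formulation):

```latex
\mathcal{V}\!\left(\underline{\mu}\;\le\;\liminf_{n\to\infty}\frac{S_n}{n}\;\le\;\limsup_{n\to\infty}\frac{S_n}{n}\;\le\;\overline{\mu}\right)=1,
\qquad S_n=\sum_{k=1}^{n}X_k .
```

When $\overline{\mu}=\underline{\mu}$ this reduces to the classical strong law; the paper's contribution is to obtain such conclusions without independence, for blockwise m-dependent and sub-orthogonal sequences.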
In view of deficiencies in the advisory circular's failure rate requirements and analysis assumptions, this paper investigates the sources of high safety requirements and a top-down design method for the flight control system life cycle. Corresponding measures are proposed, including tightening the safety target to 10^(−10) per flight hour and implementing development assurance. To address the shortcomings of mainstream aircraft flight control systems, such as weak backup capability and complex fault reconfiguration logic, improvements are made to the system's operating modes, control channel allocation, and common-mode failure mitigation schemes based on the existing flight control architecture. Flight control design trends and philosophies are analyzed. A flight control system architecture is proposed that includes three operating modes with multi-level voters/monitors, three main control channels, and a backup system independent of the main control system; the scheme has been confirmed through functional modeling simulations. The proposed method plays an important role in the architecture design of safety-critical flight control systems.
UAV geophysical surveys can adapt to complex ground exploration environments and greatly reduce the safety risk to operators; they may be applied in geophysical surveys, geological investigations, resource exploration, and other fields. The rotary-wing UAV is characterized by its flexible start-stop mode, high safety profile, and night-navigation capability. In this paper, based on the DY-115 rotary-wing UAV, an aeromagnetic measurement system with a large load capacity of 115 kg was designed and integrated, and a magnetic compensation flight and a test flight were successively carried out. The data satisfied the requirements of the technical specifications. By comparing the test aeromagnetic anomaly data with the field magnetic data, the overall trend of the contours was observed to be basically the same in shape. The aeromagnetic anomaly was found to be smoother and more continuous, consistent with the interpretation and inversion of the anomaly, further verifying the stability, reliability, and practicability of the large-load rotary-wing UAV aeromagnetic measurement system.
To obtain a certificate of airworthiness, it is essential to conduct a full-scale aircraft static test. During such a test, accurate and comprehensive wing deformation measurement is crucial for assessing strength, stiffness, and bearing capability. This paper proposes a novel and cost-effective videogrammetric method using a multi-camera system to achieve non-contact, high-precision, 3D measurement of the overall static deformation of a large-scale wing structure. To overcome the difficulties of making, carrying, and employing large 2D or 3D targets for calibrating cameras with a large field of view, a flexible stereo camera calibration method combining a 1D target and epipolar geometry is proposed. A global calibration method, aided by a total station, is employed to unify the 3D data obtained from the various binocular subsystems. A series of static load tests using a 10-meter-long large-scale wing were conducted to validate the proposed system and methods. Furthermore, the proposed method was applied to the practical deformation measurement of both wings, with a wingspan of 33.6 m, in a full-size civil aircraft static test. The overall 3D profile and displacement data of the tested wing under various loads can be accurately obtained. The maximum error of distance and displacement measurement is less than 4.5 mm within the 35 m measurement range in all load cases. These results demonstrate that the proposed method achieves effective, high-accuracy, on-site, and visualized wing deformation measurement, making it a promising approach for full-scale aircraft wing static tests.
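The core 3D reconstruction step in any binocular subsystem is triangulation from two calibrated views. A minimal linear (DLT) triangulation sketch follows; the camera poses and the 3D point are toy values, not the paper's calibration data.

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one point observed by two
    calibrated cameras with 3x4 projection matrices P1, P2.
    x1, x2 are normalized image coordinates (u, v)."""
    A = np.vstack([x1[0] * P1[2] - P1[0],
                   x1[1] * P1[2] - P1[1],
                   x2[0] * P2[2] - P2[0],
                   x2[1] * P2[2] - P2[1]])
    _, _, Vt = np.linalg.svd(A)          # null vector of A is the point
    X = Vt[-1]
    return X[:3] / X[3]                  # de-homogenize

# Two toy cameras: identity pose, and a 1-unit baseline along x.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
X_true = np.array([0.5, 0.2, 4.0])
x1 = X_true[:2] / X_true[2]                            # image in camera 1
x2 = (X_true - np.array([1.0, 0, 0]))[:2] / X_true[2]  # image in camera 2
X = triangulate(P1, P2, x1, x2)
```

Wing displacement under load is then simply the difference between triangulated point positions before and after loading, once all subsystems are unified in one coordinate frame.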
Recent advancements in large language models (LLMs) have driven remarkable progress in text processing, opening new avenues for medical knowledge discovery. In this study, we present ERQA, a mEdical knowledge Retrieval and Question-Answering framework powered by an enhanced LLM that integrates a semantic vector database and a curated literature repository. The ERQA framework leverages domain-specific incremental pretraining and supervised fine-tuning on medical literature, enabling retrieval and question-answering (QA) tasks to be completed with high precision. Performance evaluations on the coronavirus disease 2019 (COVID-19) and TripClick datasets demonstrate the robust capabilities of ERQA across multiple tasks. On the COVID-19 dataset, ERQA-13B achieves state-of-the-art retrieval metrics, with a normalized discounted cumulative gain at top 10 (NDCG@10) of 0.297, recall at top 10 (Recall@10) of 0.347, and a mean reciprocal rank (MRR) of 0.370; it also attains strong abstract summarization performance, with a recall-oriented understudy for gisting evaluation (ROUGE)-1 score of 0.434, and QA performance, with a bilingual evaluation understudy (BLEU)-1 score of 7.851. The comparable performance achieved on the TripClick dataset further underscores the adaptability of ERQA across diverse medical topics. These findings suggest that ERQA represents a significant step toward efficient biomedical knowledge retrieval and QA.
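For readers unfamiliar with the retrieval metric reported above, NDCG@k is straightforward to compute from a ranked list of graded relevance scores; the example ranking is made up.

```python
import math

def ndcg_at_k(relevances, k=10):
    """NDCG@k: discounted cumulative gain of the ranking, divided by
    the DCG of the ideal (sorted-descending) ranking."""
    def dcg(rels):
        return sum(r / math.log2(i + 2) for i, r in enumerate(rels[:k]))
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

# A ranking that places its only relevant document at position 2.
score = ndcg_at_k([0, 1, 0, 0], k=10)
```

A perfect ranking scores 1.0; burying relevant documents lower in the list discounts their contribution logarithmically, which is what NDCG@10 = 0.297 summarizes over the whole query set.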
Objective: Generative artificial intelligence (AI) technology, represented by large language models (LLMs), has gradually been developed for traditional Chinese medicine (TCM); however, challenges remain in effectively enhancing AI applications for TCM. This study is therefore the first systematic review to retrospectively analyze LLMs in TCM, focusing on and summarizing the evidence of their performance in generative tasks. Methods: We extensively searched electronic databases for articles published until June 2024 to identify publicly available studies on LLMs in TCM. Two investigators independently selected the studies and extracted the related information and evaluation metrics. Based on the available data, this study used descriptive analysis for a comprehensive systematic review of LLM technology related to TCM. Results: Ten studies published between 2023 and 2024 met our eligibility criteria and were included in this review: 40% presented LLMs in the TCM vertical domain, 40% contained TCM data, and 20% acknowledged TCM contributions, with foundational model parameters ranging from 1.8 to 33 billion. All included studies used manual or automatic evaluation metrics to assess model performance and fully discussed the challenges and contributions through an overview of LLMs in TCM. Conclusions: LLMs have achieved significant advantages in TCM applications and can effectively address intelligent TCM tasks. Further in-depth development of LLMs is needed in various vertical TCM fields, including clinical and fundamental research. It is essential to focus on the functionally segmented development of generative AI technologies in TCM application scenarios to meet the practical, needs-oriented demands of TCM digitalization.
Model evaluation using benchmark datasets is an important way to measure the capability of large language models (LLMs) in specific domains, mainly assessing their knowledge and reasoning abilities. To better assess the capability of LLMs in the agricultural domain, Agri-Eval was proposed as a benchmark for evaluating the agricultural knowledge and reasoning ability of LLMs. The assessment dataset used in Agri-Eval covers seven major disciplines in the agricultural domain: crop science, horticulture, plant protection, animal husbandry, forest science, aquaculture science, and grass science, and contains a total of 2283 questions. Among domestic general-purpose LLMs, DeepSeek R1 performed best, with an accuracy rate of 75.49%. Among international general-purpose LLMs, Gemini 2.0 pro exp 0205 stood out as the top performer, achieving an accuracy rate of 74.28%. As an agricultural vertical-domain LLM, Shennong V2.0 outperformed all LLMs in China, and its accuracy on agricultural knowledge questions exceeded that of all existing general-purpose LLMs. The launch of Agri-Eval helps LLM developers comprehensively evaluate model capability in agriculture through a variety of tasks and tests, promoting the development of LLMs in this field.
To the Editor, Artificial intelligence (AI) usage has been increasing. Many fields have implemented the use of AI and Large Language Models (LLMs), especially medicine. Furthermore, many patients have increasingly been using AI; often, they will prompt AI with questions before even stepping into a physician's office. The question lies in whether the information produced by AI is reliable, and whether this information is concise and easy to read across all patient populations.
This study evaluated the accuracy, completeness, and comprehensibility of responses from mainstream large language models (LLMs) to hepatitis C virus (HCV)-related questions, aiming to assess their performance in addressing patient queries about disease and lifestyle behaviors. The models selected were ChatGPT-4o, Gemini 2.0 Pro, Claude 3.5 Sonnet, and DeepSeek V3, with 12 questions chosen by two HCV experts from the domains of prevention, diagnosis, and treatment.
It is known that correlation does not imply causality. Some relationships identified in data analysis are coincidental or of unknown origin, while others are produced by real-world causality, and there is a need to differentiate between these two scenarios. Until recently, the proper (semantic) causality of a relationship could be determined only by human experts in the domain of the studied data. This has changed with the advance of large language models, which are often utilized as surrogates for such human experts, making the process automated and readily available to all data analysts. This motivates the main objective of this work: to introduce the design and implementation of a large language model-based semantic causality evaluator built on correlation analysis, together with its visual analysis model, called the Causal heatmap. After the implementation itself, the model is evaluated from three points of view: the quality of the visual model, the quality of the LLM-based causal evaluation, and a comparative analysis. The results of the study highlight the usability of large language models for this task and the potential of the proposed approach in the analysis of unknown datasets; the experimental evaluation demonstrates the usefulness of the Causal heatmap method, clearly highlighting interesting relationships while suppressing irrelevant ones.
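The filtering step behind such a Causal heatmap can be sketched as masking a correlation matrix with per-pair causal verdicts; the matrix values and the "LLM verdicts" below are hypothetical stand-ins, not the paper's data or prompts.

```python
import numpy as np

def causal_heatmap(corr, causal_mask):
    """Keep only the correlations that the (LLM-judged) mask marks as
    causal; suppress the rest to zero so a heatmap highlights them."""
    return np.where(causal_mask, corr, 0.0)

# Hypothetical pairwise correlations among three variables.
corr = np.array([[1.0, 0.8, 0.7],
                 [0.8, 1.0, 0.1],
                 [0.7, 0.1, 1.0]])
# Hypothetical LLM verdicts: only the (0, 1) pair is judged causal.
mask = np.array([[True,  True,  False],
                 [True,  True,  False],
                 [False, False, True]])
filtered = causal_heatmap(corr, mask)
```

Rendering `filtered` with any heatmap plotting routine then shows strong, causally plausible relationships while coincidental correlations (such as the 0.7 here) are suppressed.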
Funding: This publication is part of the TrustBoost project, which has received funding from MICIU/AEI/10.13039/501100011033 and from FEDER, UE. It is a coordinated project by a multidisciplinary team from the Universidad Politécnica de Madrid (UPM) and the University of Granada (UGR), with two subprojects that address TrustBoost's objectives: "Enhancing Trustworthiness in Conversational AI through Multimodal Affective Awareness" (TrustBoost-UPM, ref. PID2023-150584OB-C21) and "Breaking the Duality of Conversational AI: Going beyond Guided Conversations While Ensuring Compliance with Domain Rules and Constraints" (TrustBoost-UGR, ref. PID2023-150584OB-C22).
Abstract: Building reliable intent-based, task-oriented dialog systems typically requires substantial manual effort: designers must derive intents, entities, responses, and control logic from raw conversational data, then iterate until the assistant behaves consistently. This paper investigates how far large language models (LLMs) can automate this development. We use two reference corpora, Let's Go (English, public transport) and MEDIA (French, hotel booking), to prompt four LLM families (GPT-4o, Claude, Gemini, Mistral Small) and generate the core specifications required by the Rasa platform. These include intent sets with example utterances, entity definitions with slot mappings, response templates, and basic dialog flows. To structure this process, we introduce a model- and platform-agnostic pipeline with two phases. The first normalizes and validates LLM-generated artifacts, enforcing cross-file consistency and making slot usage explicit. The second uses a lightweight dialog harness that runs scripted tests and incrementally patches failure points until conversations complete reliably. Across eight projects, all models required some targeted repairs before training. After applying our pipeline, all reached ≥70% task completion (many above 84%), while NLU performance ranged from mid-0.6 to 1.0 macro-F1 depending on domain breadth. These results show that, with modest guidance, current LLMs can produce workable end-to-end dialog prototypes directly from raw transcripts. Our main contributions are: (i) a reusable bootstrap method aligned with industry domain-specific languages (DSLs), (ii) a small set of high-impact corrective patterns, and (iii) a simple but effective harness for closed-loop refinement across conversational platforms.
Funding: Supported by the National Natural Science Foundation of China (Nos. 52074246, 52275390, and 52375394), the National Defense Basic Scientific Research Program of China (No. JCKY2020408B002), and the Key R&D Program of Shanxi Province (No. 202102050201011).
Abstract: The design of the casting gating system directly determines the solidification sequence, defect severity, and overall quality of a casting. A novel machine learning strategy was developed to design the counter-pressure casting gating system of a large thin-walled cabin casting. A high-quality dataset was established through orthogonal experiments combined with design criteria for the gating system. Spearman's correlation analysis was used to select high-quality features. The gating system dimensions were predicted using a gated recurrent unit (GRU) recurrent neural network and an elastic net model. Using the EasyCast and ProCAST casting software, a comparative analysis of the flow, temperature, and solidification fields demonstrated that steady filling and top-down sequential solidification were achieved. Compared with the empirical formula method, this approach eliminates trial-and-error iterations, reduces porosity, cuts the casting defect volume from 11.23 cm³ to 2.23 cm³, and eliminates internal casting defects through the incorporation of an internally cooled iron (chill), fulfilling the goal of intelligent gating system design.
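The feature-selection step above uses Spearman's rank correlation. As a minimal pure-Python sketch of the idea (the data, threshold, and function names here are illustrative, not taken from the paper), one can rank both variables and take the Pearson correlation of the ranks:

```python
def ranks(xs):
    """Average 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of equal values.
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(x, y):
    """Spearman's rho = Pearson correlation computed on the rank vectors."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)
```

A feature would then be retained when |rho| between it and the target dimension exceeds a chosen cutoff; in practice `scipy.stats.spearmanr` does the same computation with tie handling built in.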
Funding: Funded by the Chongqing Water Resources Bureau, China (Project No. CQS24C00836).
Abstract: We propose an AI-assisted framework for integrated natural disaster prevention and emergency response, leveraging the DeepSeek large language model (LLM) to advance intelligent decision-making in geohazard management. We systematically analyze the technical pathways for deploying LLMs in disaster scenarios, emphasizing three breakthrough directions: (1) knowledge graph-driven dynamic risk modeling, (2) reinforcement learning-optimized emergency decision systems, and (3) secure local deployment architectures. The DeepSeek model demonstrates unique advantages through its hybrid reasoning mechanism combining semantic analysis with geospatial pattern recognition, enabling cost-effective processing of multi-source data spanning historical disaster records, real-time IoT sensor feeds, and socio-environmental parameters. A modular system architecture is designed to achieve three critical objectives: (a) automated construction of domain-specific knowledge graphs through unsupervised learning of disaster physics relationships, (b) scenario-adaptive resource allocation using risk simulations, and (c) privacy-preserving emergency coordination via federated learning across distributed response nodes. The proposed local deployment paradigm addresses critical data security concerns in cross-border disaster management while complying with the FAIR principles (Findable, Accessible, Interoperable, Reusable) for geoscientific data governance. This work establishes a methodological foundation for next-generation AI-Earth science convergence in disaster mitigation.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 62172458).
Abstract: The rapid advancement of artificial intelligence (AI) has ushered in a new era of medical multimodal large language models (MLLMs), which integrate diverse data modalities such as text, imaging, physiological signals, and genomics to enhance clinical decision-making. This systematic review explores the core methodologies and applied research frontiers of medical MLLMs, focusing on their architecture, training methods, evaluation techniques, and applications. We highlight the transformative potential of MLLMs in achieving cross-modal semantic alignment, medical knowledge integration, and robust clinical reasoning. Despite their promise, challenges such as data heterogeneity, hallucination, and computational efficiency persist. By reviewing state-of-the-art solutions and future directions, this paper provides a comprehensive technical guide for developing reliable and interpretable medical MLLMs, ultimately aiming to bridge the gap between AI and clinical practice.
Funding: Supported by the project "Research and Construction of an Experimental Teaching Aid Platform for Programming" under the Teaching Reform Research Project of Shandong University.
Abstract: For computer science majors in higher education institutions, programming courses are among the most important professional foundation courses. Proficiency in independent programming is of great help both to subsequent coursework and to students' personal development. In teaching programming courses, online judge systems are often used to improve students' programming level. Traditional online judge systems offer students little guidance, and inexperienced students often find it difficult to locate and correct errors in their code on their own. We propose an online judge system that integrates a large language model for error correction, helping students find errors and improve their programming skills.
Abstract: In this paper, we establish strong laws of large numbers for non-independent random variables under the framework of sublinear expectations. One of our main results covers blockwise m-dependent random variables, and another covers sub-orthogonal random variables. Both extend the strong law of large numbers for independent random variables under sublinear expectations to the non-independent case.
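For orientation, the independent-case law that both results extend is usually stated as follows in the sublinear-expectation framework (this is the standard formulation recalled from Peng's theory, not quoted from the paper):

```latex
% X_1, X_2, \dots i.i.d. under a sublinear expectation \hat{\mathbb{E}},
% with upper and lower means and partial sums
\overline{\mu} = \hat{\mathbb{E}}[X_1], \qquad
\underline{\mu} = -\hat{\mathbb{E}}[-X_1], \qquad
S_n = X_1 + \cdots + X_n,
% the strong law of large numbers asserts, for the lower capacity v,
v\!\left( \underline{\mu} \;\le\; \liminf_{n\to\infty} \frac{S_n}{n}
      \;\le\; \limsup_{n\to\infty} \frac{S_n}{n} \;\le\; \overline{\mu} \right) = 1 .
```

When the expectation is linear, the upper and lower means coincide and this collapses to the classical Kolmogorov strong law.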
Abstract: In view of deficiencies in the advisory circular's failure-rate requirements and analysis assumptions, this paper investigates the sources of high safety requirements and the top-down design method for the flight control system life cycle. Correspondingly, measures are proposed, including raising the safety target to 10^(−10) per flight hour and implementing development assurance. In view of the shortcomings of mainstream aircraft flight control systems, such as weak backup capability and complex fault-reconfiguration logic, improvements are made to the system's operating modes, control channel allocation, and common-mode failure mitigation schemes based on the existing flight control architecture. Flight control design trends and philosophies are analyzed. A flight control system architecture is proposed that includes three operating modes with multi-level voters/monitors, three main control channels, and a backup system independent of the main control system; it has been confirmed through functional modeling and simulation. The proposed method plays an important role in the architecture design of safety-critical flight control systems.
Funding: Supported by the project "Development of High-Temperature Superconducting Aeromagnetic Full Tensor Gradient Measurement System and Research on Error Compensation Methods" (Grant No. XZ202501ZY0136).
Abstract: UAV geophysical surveys can adapt to complex ground exploration environments and greatly reduce the safety risk to operators; they may be applied in geophysical surveys, geological investigations, resource exploration, and other fields. Rotary-wing UAVs are characterized by a flexible start-stop mode, a high safety profile, and night-flight capability. In this paper, based on the DY-115 rotary-wing UAV, an aeromagnetic measurement system with a large 115 kg load capacity was designed and integrated, and a magnetic compensation flight and a test flight were carried out in succession. The data satisfied the requirements of the technical specifications. Comparing the test aeromagnetic anomaly data with the ground magnetic data, the overall trend of the contours was found to be essentially the same. The aeromagnetic anomaly was smoother and more continuous, which aligned with the interpretation and inversion of the anomaly, further verifying the stability, reliability, and practicability of the large-load rotary-wing UAV aeromagnetic measurement system.
Abstract: To obtain a certificate of airworthiness, it is essential to conduct a full-scale aircraft static test. During such a test, accurate and comprehensive wing deformation measurement is crucial for assessing strength, stiffness, and bearing capability. This paper proposes a novel, cost-effective videogrammetric method using a multi-camera system to achieve non-contact, high-precision, 3D measurement of the overall static deformation of a large-scale wing structure. To overcome the difficulties of making, carrying, and employing large 2D or 3D targets for calibrating cameras with a large field of view, a flexible stereo camera calibration method combining a 1D target and epipolar geometry is proposed. A global calibration method, aided by a total station, is employed to unify the 3D data obtained from the various binocular subsystems. A series of static load tests using a 10-meter-long large-scale wing was conducted to validate the proposed system and methods. Furthermore, the proposed method was applied to practical wing deformation measurement of both wings, with a wingspan of 33.6 m, in a full-size civil aircraft static test. The overall 3D profile and displacement data of the tested wing under various loads can be accurately obtained. The maximum error of distance and displacement measurement is less than 4.5 mm within the 35 m measurement range in all load cases. These results demonstrate that the proposed method achieves effective, high-accuracy, on-site, and visualized wing deformation measurement, making it a promising approach for full-scale aircraft wing static tests.
Funding: Supported by the Innovation Fund for Medical Sciences of the Chinese Academy of Medical Sciences (2021-I2M-1-033) and the National Key Research and Development Program of China (2022YFF0711900).
Abstract: Recent advancements in large language models (LLMs) have driven remarkable progress in text processing, opening new avenues for medical knowledge discovery. In this study, we present ERQA, a mEdical knowledge Retrieval and Question-Answering framework powered by an enhanced LLM that integrates a semantic vector database and a curated literature repository. The ERQA framework leverages domain-specific incremental pretraining and supervised fine-tuning on medical literature, enabling retrieval and question-answering (QA) tasks to be completed with high precision. Performance evaluations on the coronavirus disease 2019 (COVID-19) and TripClick datasets demonstrate the robust capabilities of ERQA across multiple tasks. On the COVID-19 dataset, ERQA-13B achieves state-of-the-art retrieval metrics, with a normalized discounted cumulative gain at top 10 (NDCG@10) of 0.297, a recall at top 10 (Recall@10) of 0.347, and a mean reciprocal rank (MRR) of 0.370; it also attains strong abstract summarization performance, with a Recall-Oriented Understudy for Gisting Evaluation (ROUGE)-1 score of 0.434, and QA performance, with a Bilingual Evaluation Understudy (BLEU)-1 score of 7.851. The comparable performance achieved on the TripClick dataset further underscores the adaptability of ERQA across diverse medical topics. These findings suggest that ERQA represents a significant step toward efficient biomedical knowledge retrieval and QA.
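The retrieval metrics reported above (NDCG@10, MRR) have compact standard definitions. A minimal sketch of how such scores are computed (the relevance lists below are illustrative, not data from the paper):

```python
import math

def dcg_at_k(rels, k):
    """DCG@k with the standard discount: rel_i / log2(i + 1) at 1-based rank i."""
    return sum(r / math.log2(i + 2) for i, r in enumerate(rels[:k]))

def ndcg_at_k(rels, k):
    """Normalize DCG by the ideal (descending-relevance) ordering's DCG."""
    ideal = dcg_at_k(sorted(rels, reverse=True), k)
    return dcg_at_k(rels, k) / ideal if ideal > 0 else 0.0

def mrr(ranked_hits):
    """Mean reciprocal rank: average over queries of 1 / (rank of first hit)."""
    total = 0.0
    for hits in ranked_hits:
        for i, h in enumerate(hits):
            if h:
                total += 1.0 / (i + 1)
                break
    return total / len(ranked_hits)
```

A perfectly ordered result list gives NDCG@k = 1.0, so the reported 0.297 reflects how far the ranking falls short of the ideal ordering averaged over queries.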
Funding: Supported by the National Multidisciplinary Innovation Team of Traditional Chinese Medicine (ZYYCXTD-D-202204), the China Postdoctoral Science Foundation (2023M742627), the Postdoctoral Fellowship Program of CPSF (GZC20231928), and the Foundation of the State Key Laboratory of Component-Based Chinese Medicine (CBCM2023201).
Abstract: Objective: Generative artificial intelligence (AI) technology, represented by large language models (LLMs), has gradually been developed for traditional Chinese medicine (TCM); however, challenges remain in effectively enhancing AI applications for TCM. This study is therefore the first systematic review to retrospectively analyze LLMs in TCM, focusing on and summarizing the evidence of their performance in generative tasks. Methods: We extensively searched electronic databases for articles published until June 2024 to identify publicly available studies on LLMs in TCM. Two investigators independently selected the studies and extracted the related information and evaluation metrics. Based on the available data, this study used descriptive analysis for a comprehensive systematic review of LLM technology related to TCM. Results: Ten studies published between 2023 and 2024 met our eligibility criteria and were included in this review: 40% concerned LLMs in the TCM vertical domain, 40% contained TCM data, and 20% acknowledged a TCM contribution, with foundational model parameters ranging from 1.8 to 33 billion. All included studies used manual or automatic evaluation metrics to assess model performance and fully discussed the challenges and contributions through an overview of LLMs in TCM. Conclusions: LLMs have achieved significant advantages in TCM applications and can effectively address intelligent TCM tasks. Further in-depth development of LLMs is needed in various vertical TCM fields, including clinical and fundamental research. It is essential to focus the development of generative AI technologies on functionally segmented TCM application scenarios, so as to meet the practical, needs-oriented demands of TCM digitalization.
Abstract: Model evaluation using benchmark datasets is an important way to measure the capability of large language models (LLMs) in specific domains, mainly their knowledge and reasoning abilities. To better assess the capability of LLMs in the agricultural domain, Agri-Eval is proposed as a benchmark for assessing the knowledge and reasoning ability of LLMs in agriculture. The Agri-Eval assessment dataset covers seven major disciplines in the agricultural domain: crop science, horticulture, plant protection, animal husbandry, forest science, aquaculture science, and grass science, containing a total of 2,283 questions. Among domestic general-purpose LLMs, DeepSeek R1 performed best, with an accuracy of 75.49%. Among international general-purpose LLMs, Gemini 2.0 pro exp 0205 stood out as the top performer, achieving an accuracy of 74.28%. As an agricultural-domain LLM, Shennong V2.0 outperformed all domestic LLMs, and its accuracy on agricultural knowledge questions exceeded that of all existing general-purpose LLMs. The launch of Agri-Eval helps LLM developers comprehensively evaluate model capability in agriculture through a variety of tasks and tests, promoting the development of LLMs in the agricultural field.
Abstract: To the Editor, Artificial intelligence (AI) usage has been increasing. Many fields have implemented the use of AI and Large Language Models (LLMs), especially medicine. Furthermore, many patients have increasingly been using AI; often, they will prompt AI with questions before even stepping into a physician's office. The question lies in whether the information produced by AI is reliable, and whether this information is concise and easy to read across all patient populations.
Funding: Funded by the National Key Research and Development Program of China (No. 2021YFA1100500), the National Natural Science Foundation of China (No. 82370662), and the Key Research & Development Plan of Zhejiang Province (No. 2024C03051).
Abstract: This study evaluated the accuracy, completeness, and comprehensibility of responses from mainstream large language models (LLMs) to hepatitis C virus (HCV)-related questions, aiming to assess their performance in addressing patient queries about the disease and lifestyle behaviors. The models selected were ChatGPT-4o, Gemini 2.0 Pro, Claude 3.5 Sonnet, and DeepSeek V3, with 12 questions chosen by two HCV experts from the domains of prevention, diagnosis, and treatment.
Funding: Supported by the University Grant Agency of Matej Bel University in Banská Bystrica, project number UGA-14-PDS-2025.
Abstract: It is known that correlation does not imply causality. Some relationships identified in data analysis are coincidental or of unknown origin, while others are produced by real-world causality; this is problematic, since the two scenarios need to be differentiated. Until recently, the proper (semantic) causality of a relationship could be determined only by human experts in the area of expertise of the studied data. This has changed with the advance of large language models, which are often utilized as surrogates for such human experts, making the process automated and readily available to all data analysts. This motivates the main objective of this work: to introduce the design and implementation of a large language model-based semantic causality evaluator built on correlation analysis, together with its visual analysis model, called the Causal heatmap. After the implementation itself, the model is evaluated in terms of the quality of the visual model, the quality of causal evaluation based on large language models, and a comparative analysis; the results highlight the usability of large language models for the task and the potential of the proposed approach in the analysis of unknown datasets. The results of the experimental evaluation demonstrate the usefulness of the Causal heatmap method, supported by its evident highlighting of interesting relationships while suppressing irrelevant ones.
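The first stage of such an evaluator is purely statistical: screen all column pairs by correlation, then pass only the strongly correlated pairs to the LLM for a semantic causality judgment. A minimal sketch of that screening stage (the dataset, threshold, and function names are illustrative assumptions, not the paper's implementation):

```python
def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy) if sx and sy else 0.0

def candidate_pairs(columns, threshold=0.7):
    """Screen all column pairs; only pairs with |r| >= threshold would be
    forwarded to the LLM for the semantic causal/coincidental verdict."""
    names = list(columns)
    out = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            r = pearson(columns[names[i]], columns[names[j]])
            if abs(r) >= threshold:
                out.append((names[i], names[j], round(r, 3)))
    return out
```

The heatmap described in the paper then colors each cell by the LLM's verdict rather than by the raw coefficient, which is what lets it suppress high-correlation but coincidental pairs.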