1. Strong Laws of Large Numbers for Sequences of Blockwise m-Dependent and Sub-Orthogonal Random Variables under Sublinear Expectations (Cited by: 1)
Author: Jialiang FU. Journal of Mathematical Research with Applications, 2026, Issue 1, pp. 103-118 (16 pages).
In this paper, we establish several strong laws of large numbers for non-independent random variables in the framework of sublinear expectations. One of our main results concerns blockwise m-dependent random variables, and another concerns sub-orthogonal random variables. Both extend the strong law of large numbers for independent random variables under sublinear expectations to the non-independent case.
Keywords: sublinear expectations; strong law of large numbers; blockwise m-dependent; sub-orthogonal random variables
2. Agri-Eval: Multi-level Large Language Model Valuation Benchmark for Agriculture
Authors: WANG Yaojun, GE Mingliang, XU Guowei, ZHANG Qiyu, BIE Yuhui. 《农业机械学报》 (Peking University Core Journal), 2026, Issue 1, pp. 290-299 (10 pages).
Model evaluation using benchmark datasets is an important method for measuring the capability of large language models (LLMs) in specific domains, and it is mainly used to assess the knowledge and reasoning abilities of LLMs. To better assess the capability of LLMs in the agricultural domain, Agri-Eval was proposed as a benchmark for assessing the knowledge and reasoning ability of LLMs in agriculture. The assessment dataset used in Agri-Eval covered seven major disciplines in the agricultural domain (crop science, horticulture, plant protection, animal husbandry, forest science, aquaculture science, and grass science) and contained a total of 2283 questions. Among domestic general-purpose LLMs, DeepSeek R1 performed best, with an accuracy of 75.49%. Among international general-purpose LLMs, Gemini 2.0 pro exp 0205 stood out as the top performer, achieving an accuracy of 74.28%. As an LLM for the agricultural vertical, Shennong V2.0 outperformed all the LLMs in China, and its answer accuracy on agricultural knowledge exceeded that of all existing general-purpose LLMs. The launch of Agri-Eval helps LLM developers comprehensively evaluate a model's capability in agriculture through a variety of tasks and tests, promoting the development of LLMs in this field.
Keywords: large language models; assessment systems; agricultural knowledge; agricultural datasets
3. Evaluating Large Language Model Adherence to Targeted Fifth-Grade Readability Standards in Patient Education on Chronic Conditions
Authors: Faheed Shafau, Chase Wahl, Marcus Kado, Garrett Miedema. Chronic Diseases and Translational Medicine, 2026, Issue 1, pp. 73-74 (2 pages).
To the Editor, Artificial intelligence (AI) usage has been increasing. Many fields have implemented the use of AI and large language models (LLMs), especially medicine. Furthermore, many patients have increasingly been using AI; often, they will prompt AI with questions before even stepping into a physician's office. The question is whether the information produced by AI is reliable and whether it is concise and easy to read across all patient populations.
Keywords: large language models (LLMs); fifth-grade readability standards; artificial intelligence; patient education; chronic conditions; prompt AI; readability
4. Hepatitis C Patient Education: Large Language Models Show Promise in Disseminating Guidelines
Authors: Jinyan Chen, Ruijie Zhao, Chiyu He, Huigang Li, Yajie You, Zuyuan Lin, Ze Xiang, Jianyong Zhuo, Wei Shen, Zhihang Hu, Shusen Zheng, Xiao Xu, Di Lu. Journal of Clinical and Translational Hepatology, 2026, Issue 1, pp. 116-119 (4 pages).
This study evaluated the accuracy, completeness, and comprehensibility of responses from mainstream large language models (LLMs) to hepatitis C virus (HCV)-related questions, aiming to assess their performance in addressing patient queries about disease and lifestyle behaviors. The models selected were ChatGPT-4o, Gemini 2.0 Pro, Claude 3.5 Sonnet, and DeepSeek V3, with 12 questions chosen by two HCV experts from the domains of prevention, diagnosis, and treatment.
Keywords: addressing patient queries; disease; lifestyle behaviors; large language models (LLMs); guidelines; hepatitis C; accuracy; patient education; comprehensibility
5. Semantic Causality Evaluation of Correlation Analysis Utilizing Large Language Models
Author: Adam Dudáš. Computers, Materials & Continua, 2026, Issue 5, pp. 2246-2269 (24 pages).
It is known that correlation does not imply causality. Some relationships identified in data analysis are coincidental or unknown, while others are produced by real-world causality, so there is a need to differentiate between these two scenarios. Until recently, the proper (semantic) causality of a relationship could be determined only by human experts in the domain of the studied data. This has changed with the advance of large language models, which are often used as surrogates for such human experts, making the process automated and readily available to all data analysts. This motivates the main objective of this work: to introduce the design and implementation of a large language model-based semantic causality evaluator built on correlation analysis, together with its visual analysis model, called the Causal heatmap. The model is then evaluated in terms of the quality of the visual model, the quality of the causal evaluation based on large language models, and a comparative analysis. The results highlight the usability of large language models for the task and the potential of the proposed approach in the analysis of unknown datasets. The experimental evaluation demonstrates the usefulness of the Causal heatmap method, which clearly highlights interesting relationships while suppressing irrelevant ones.
Keywords: correlation; causality; correlation analysis; large language models; visualization
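The correlation-analysis step that such an evaluator starts from can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, the 0.8 threshold, and the toy columns are all assumptions, and the step where candidate pairs are passed to a large language model for a semantic plausibility judgment is deliberately left out.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def strong_pairs(columns, threshold=0.8):
    """Return (name_a, name_b, r) for every column pair with |r| >= threshold.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    pairs = []
    for a, b in combinations(sorted(columns), 2):
        r = pearson(columns[a], columns[b])
        if abs(r) >= threshold:
            pairs.append((a, b, round(r, 3)))
    return pairs


# Toy data (hypothetical, for illustration only).
data = {
    "ice_cream_sales": [10, 20, 30, 40, 50],
    "temperature": [15, 20, 25, 30, 35],
    "shoe_size": [42, 38, 44, 39, 41],
}

# Strongly correlated pairs are the candidates whose causal plausibility
# a domain expert (or an LLM acting as a surrogate) would then assess.
candidates = strong_pairs(data)
print(candidates)
```

Only the strongly correlated pairs survive the filter, which keeps the number of questions put to the expert or model small.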
6. Scalable Fabrication of Large-Scale Electrochromic Smart Windows for Superior Solar Radiation Regulation and Energy Savings
Authors: Yanbang Tang, Junyu Yuan, Rongzong Zheng, Chunyang Jia. Nano-Micro Letters, 2026, Issue 6, pp. 823-839 (17 pages).
Electrochromic smart windows (ESWs) can significantly reduce building energy consumption, but their high cost hinders large-scale production. Here, tungsten oxide (WO₃) films are grown in situ by a simple immersion process: the silver nanowires (AgNWs) are oxidized to Ag⁺ ions through electron loss, and the liberated electrons provide the driving force for the deposition of WO₄²⁻. This enabled the fabrication of large-area WO₃ films and ESWs under minimal laboratory conditions, demonstrating the economic feasibility, efficiency, and reliability of industrial production. Structural characterization and density functional theory calculations were combined to confirm that AgNWs effectively regulate the oxygen vacancies of WO₃ films and promote the in situ growth process. The optimized WO₃ exhibits a maximum transmittance modulation of 90.8% and excellent cycling stability over 20,000 cycles. The large-scale WO₃-based ESWs can save up to 140.0 MJ m⁻² of building energy compared with traditional windows in tropical regions, as verified by simulations of more than 40 global cities. This research provides a new approach for improving the performance and industrial production of ESWs, offering an understanding and a development direction that shorten the path to commercial ESW production.
Keywords: electrochromic; smart window; tungsten oxide; silver nanowire; large area
7. Assessing Large Language Models for Early Article Identification in Otolaryngology-Head and Neck Surgery Systematic Reviews
Authors: Ajibola B. Bakare, Young Lee, Jhuree Hong, Claus-Peter Richter, Jonathan P. Kuriakose. Health Care Science, 2026, Issue 1, pp. 19-28 (10 pages).
Background: To assess the effectiveness of ChatGPT and Bard in the initial identification of articles for Otolaryngology-Head and Neck Surgery systematic literature reviews. Methods: Three PRISMA-based systematic reviews (Jabbour et al. 2017, Wong et al. 2018, and Wu et al. 2021) were replicated using ChatGPT v3.5 and Bard. Outputs (author, title, publication year, and journal) were compared to the original references and cross-referenced with medical databases for authenticity and recall. Results: Several themes emerged when comparing Bard and ChatGPT across the three reviews. Bard generated more outputs and had greater recall in Wong et al.'s review, with a broader date range in Jabbour et al.'s review. In Wu et al.'s review, ChatGPT-2 had higher recall and identified more authentic outputs than Bard-2. Conclusion: Large language models (LLMs) failed to fully replicate peer-reviewed methodologies, producing outputs with inaccuracies but identifying relevant, especially recent, articles missed by the references. While human-led PRISMA-based reviews remain the gold standard, refining LLMs for literature reviews shows potential.
Keywords: artificial intelligence; Bard; ChatGPT; large language models; systematic review
8. When Large Language Models and Machine Learning Meet Multi-Criteria Decision Making: Fully Integrated Approach for Social Media Moderation
Authors: Noreen Fuentes, Janeth Ugang, Narcisan Galamiton, Suzette Bacus, Samantha Shane Evangelista, Fatima Maturan, Lanndon Ocampo. Computers, Materials & Continua, 2026, Issue 1, pp. 2137-2162 (26 pages).
This study demonstrates a novel integration of large language models, machine learning, and multicriteria decision-making to investigate self-moderation in small online communities, a topic under-explored compared to user behavior and platform-driven moderation on social media. The proposed methodological framework (1) utilizes large language models for social media post analysis and categorization, (2) employs k-means clustering for content characterization, and (3) incorporates the TODIM (Tomada de Decisão Interativa Multicritério) method to determine moderation strategies based on expert judgments. In general, the fully integrated framework leverages the strengths of these intelligent systems in a more systematic evaluation of large-scale decision problems. When applied to social media moderation, this approach promotes nuanced and context-sensitive self-moderation by taking into account factors such as cultural background and geographic location. The application of this framework is demonstrated within Facebook groups. Eight distinct content clusters encompassing safety, harassment, diversity, and misinformation are identified. The analysis revealed a preference for content removal across all clusters, suggesting a cautious approach towards potentially harmful content. However, the framework also highlights the use of other moderation actions, such as account suspension, depending on the content category. These findings contribute to the growing body of research on self-moderation and offer valuable insights for creating safer and more inclusive online spaces within smaller communities.
Keywords: self-moderation; user-generated content; k-means clustering; TODIM; large language models
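The k-means step named in the framework above can be sketched with a plain implementation of Lloyd's algorithm. This is only an illustration under stated assumptions: the 2-D points stand in for whatever numeric features would be derived from the posts, the initial centroids are simply the first k points, and none of this reflects the authors' actual features or settings.

```python
def kmeans(points, k, iterations=20):
    """Cluster points with Lloyd's algorithm.

    Initial centroids are the first k points (a simplifying assumption;
    practical implementations usually use randomized or k-means++ seeding).
    Returns (assignments, centroids).
    """
    centroids = [list(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        for i, p in enumerate(points):
            assignments[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignments, centroids


# Hypothetical 2-D feature vectors for six posts (illustration only).
posts = [(0, 0), (10, 10), (0, 1), (1, 0), (10, 11), (11, 10)]
labels, centers = kmeans(posts, k=2)
print(labels)
```

In the framework described in the abstract, the resulting clusters would then be the alternatives whose moderation strategy is ranked with TODIM; that ranking step is not shown here.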
9. Turbulence-induced disturbances and their evolution to stall onset in a compressor cascade using large eddy simulation
Authors: Tianyu PAN, Teng LI, Zhaoqi YAN, Qiushi LI. Chinese Journal of Aeronautics, 2026, Issue 2, pp. 1-19 (19 pages).
This study investigates turbulence-induced disturbances and the stall precursor triggering mechanism in a NACA65-18(10) cascade based on large eddy simulations. The results indicate that the disturbances exist under various operating conditions along the performance curve. The shear layer is the physical structure responsible for the generation, propagation, and dissipation of disturbances. When operating near stall, the separation on the suction surface intensifies, and strong unsteady backflow occurs at the trailing edge of the passage. Under the influence of inlet disturbances, unsteady behaviors between passages form specific phase differences, leading the entire system to oscillate in a first-order mode. As the flow develops from near-stall to stall, axial momentum decreases further, reducing the main flow's ability to drive blockages downstream through convection. Consequently, the blockage accumulates during the circumferential propagation process until stall onset. Based on this mechanism, the study proposes factors describing the size of the backflow zone, shedding frequency, and convection velocity to characterize blockage dynamics, identifying critical values that represent stall onset.
Keywords: stall onset; pre-stall; disturbances in cascade; stall indicator; large eddy simulation
10. PROMPTx-PE: Adaptive Optimization of Prompt Engineering Strategies for Accuracy and Robustness in Large Language Models
Authors: Talha Farooq Khan, Fahad Ali, Majid Hussain, Lal Khan, Hsien-Tsung Chang. Computers, Materials & Continua, 2026, Issue 5, pp. 685-715 (31 pages).
The rapid growth in applications of large language models (LLMs) demonstrates the importance of adaptive and efficient prompt engineering strategies. Existing methods may not be adaptable, robust, and efficient across different domains. This study introduces a prompt optimization framework, named PROMPTx-PE, that aims to yield greater accuracy and robustness on LLM-based tasks. The proposed system features a prompt selection scheme informed by reinforcement learning, a contextual layer, and a dynamic weighting module regulated by Lyapunov-based stability guidelines. PROMPTx-PE dynamically varies the exploration and exploitation of the prompt space, depending on real-time feedback and multi-objective reward development. Extensive testing on both benchmark (GLUE, SuperGLUE) and domain-specific data (Healthcare-QA and Industrial-NER) demonstrates a best performance of 89.4% and strong robustness, with under 3% computational overhead. The results confirm the effectiveness, consistency, and scalability of PROMPTx-PE as a platform for adaptive prompt engineering in recent uses of LLMs.
Keywords: prompt engineering; large language models; adaptive optimization; robustness; multi-objective optimization; reinforcement learning; natural language processing
11. Laparoscopic ureterolithotomy combined with flexible cystoscopy for the treatment of large impacted ureteral calculi with renal stones
Authors: Zhenghui Wang, Mingchao Wang, Jie Yuan, Liwei Xu. Laparoscopic, Endoscopic and Robotic Surgery, 2026, Issue 1, pp. 52-55 (4 pages).
Impacted upper ureteral stones are defined as calculi that remain lodged in the same location within the upper ureter for more than two months, and they are typically associated with inflammation, mucosal edema, and fibrosis of the surrounding ureteral wall. These stones often lead to significant clinical consequences, including persistent flank pain, hydronephrosis, infection, impaired renal function, and, in severe cases, irreversible kidney damage.
Keywords: large impacted ureteral calculi; mucosal edema; inflammation; renal stones; flexible cystoscopy; fibrosis; laparoscopic ureterolithotomy; upper ureteral stones
12. Harnessing computational power for intelligent oncology in the age of large models: Status, challenges, and prospects
Authors: Kexin Xu, Yueran Xu, Qing Shi. Intelligent Oncology, 2026, Issue 1, pp. 51-63 (13 pages).
The integration of large-scale foundation models (e.g., GPT series and AlphaFold) into oncology is fundamentally transforming both research methodologies and clinical practices, driven by unprecedented advancements in computational power. This review synthesizes recent progress in the application of large language models to core oncological tasks, including medical imaging analysis, genomic interpretation, and personalized treatment planning. Underpinned by advanced computational infrastructures, such as graphics processing unit/tensor processing unit clusters, heterogeneous computing, and cloud platforms, these models enable superior representation learning and generalization across multimodal data sources. This review examines how these infrastructures overcome key bottlenecks in intelligent oncology through scalable optimization strategies, including mixed-precision training, memory optimization, and heterogeneous computing. Alongside these technical advancements, the review explores pressing challenges, such as data heterogeneity, limited model interpretability, regulatory uncertainties, and the environmental impact of artificial intelligence (AI) systems. Special emphasis is placed on emerging solutions, encompassing green AI and edge computing, which offer promising approaches for low-resource deployment scenarios. Additionally, the review highlights the critical role of interdisciplinary collaboration among oncology, computer science, ethics, and policy to ensure that AI systems are not only powerful but also transparent, safe, and clinically relevant. Finally, the review outlines potential avenues for future research aimed at developing robust, scalable, and human-centered frameworks for intelligent oncology.
Keywords: large language models; intelligent oncology; medical AI; computational infrastructure; high-performance computing
13. Decision-making performance of large language models vs. human physicians in challenging lung cancer cases: A real-world case-based study
Authors: Ning Yang, Kailai Li, Baiyang Liu, Xiting Chen, Aimin Jiang, Chang Qi, Wenyi Gan, Lingxuan Zhu, Weiming Mou, Dongqiang Zeng, Mingjia Xiao, Guangdi Chu, Shengkun Peng, Hank Z.H. Wong, Lin Zhang, Hengguo Zhang, Xinpei Deng, Quan Cheng, Bufu Tang, Anqi Lin, Juan Zhou, Peng Luo. Intelligent Oncology, 2026, Issue 1, pp. 15-24 (10 pages).
Background: Despite the promise shown by large language models (LLMs) for standardized tasks, their multidimensional performance in real-world oncology decision-making remains unevaluated. This study aims to introduce a framework for evaluating LLM and physician decisions in challenging lung cancer cases. Methods: We curated 50 challenging lung cancer cases (25 local and 25 published) classified as complex, rare, or refractory. Blinded three-dimensional, five-point Likert evaluations (1–5 for comprehensiveness, specificity, and readability) compared standalone LLMs (DeepSeek R1, Claude 3.5, Gemini 1.5, and GPT-4o), physicians by experience level (junior, intermediate, and senior), and AI-assisted juniors; intergroup differences and augmentation effects were analyzed statistically. Results: Of 50 challenging cases (18 complex, 17 rare, and 15 refractory) rated by three experts, DeepSeek R1 achieved scores of 3.95±0.33, 3.71±0.53, and 4.26±0.18 for comprehensiveness, specificity, and readability, respectively, positioning it between intermediate (3.68, 3.68, 3.75) and senior (4.50, 4.64, 4.53) physicians. GPT-4o and Claude 3.5 reached intermediate physician-level comprehensiveness (3.76±0.39, 3.60±0.39) but junior-to-intermediate physician-level specificity (3.39±0.39, 3.39±0.49). All LLMs scored higher on rare cases than intermediate physicians but fell below junior physicians in refractory-case specificity. AI-assisted junior physicians showed marked gains in rare cases, with comprehensiveness rising from 2.32 to 4.29 (84.8%), specificity from 2.24 to 4.26 (90.8%), and readability from 2.76 to 4.59 (66.0%), while specificity declined by 3.2% (3.17 to 3.07) in refractory cases. Error analysis showed complementary strengths, with physicians demonstrating reasoning stability and LLMs excelling in knowledge updating and risk management. Conclusions: LLMs performed variably in clinical decision-making tasks depending on case type, performing better in rare cases and worse in refractory cases requiring longitudinal reasoning. Complementary strengths between LLMs and physicians support case- and task-tailored human–AI collaboration.
Keywords: large language models; clinical evaluation; decision-making; lung cancer
14. The Combined Immune Effects of Perfluorooctanoic Acid (PFOA) and Perfluorobutanoic Acid (PFBA) on Intestinal Microbiota of Large Yellow Croaker (Larimichthys crocea)
Authors: XUE Yadong, HAN Ping, LIU Xiumei, CHEN Jianming, YUAN Mingzhe, WANG Xubo. Journal of Ocean University of China, 2026, Issue 1, pp. 312-322 (11 pages).
Polyfluoroalkyl substances (PFAS) have emerged as persistent environmental contaminants because of their chemical stability, degradation resistance, and bioaccumulation potential. However, current studies mainly focus on the toxicity of single PFAS such as perfluorooctanoic acid (PFOA) and perfluorobutanoic acid (PFBA), and knowledge of their combined effects is relatively limited. In this study, we explored the immune response of the gut in large yellow croaker (Larimichthys crocea) under the combined stress of PFOA and PFBA. Histological analyses revealed that the combined exposure induced intestinal vacuolization and decreased the length of intestinal villi. It also significantly activated pro-inflammatory pathways, with marked upregulation of tnfα, il1β, il6, and myd88 expression, particularly after 14 days of exposure. Gut microbiota analysis revealed substantial dysbiosis, including (1) reduced alpha diversity, (2) increased abundance of potentially pathogenic taxa (Proteobacteria and Spirochaetota), and (3) depletion of beneficial Firmicutes. PICRUSt-based functional prediction indicated temporal metabolic shifts, with upregulation of DNA repair pathways at day 3 and enhanced bacterial motility protein activity at days 7 and 14 post-exposure. Pearson correlation analysis further indicated that these immune genes had significant positive correlations with Vibrio and Brevinema, and negative correlations with Streptococcus. The present study provides novel insights into microbiome-mediated immunomodulation in large yellow croaker exposed to combined PFAS, which will be helpful for the healthy farming of economically important marine species.
Keywords: large yellow croaker; gut; combined stress; immune response
15. Multiphysics Implicit Coupling Method for Fluid, Particles, and Large-Deformation Structures
Authors: Xiangxiang Wang, Hualong Xie, Yue Yu, Min Li, Yubin Wang, Fei Xing. Computer Modeling in Engineering & Sciences, 2026, Issue 2, pp. 367-401 (35 pages).
This study presents an implicit multiphysics coupling method integrating Computational Fluid Dynamics (CFD), the Multiphase Particle-in-Cell (MPPIC) model, and the Finite Element Method (FEM), implemented with OpenFOAM, CalculiX, and preCICE to simulate fluid-particle-structure interactions with large deformations. Mesh motion in the fluid field is handled using the radial basis function (RBF) method. The particle phase is modeled by MPPIC, where fluid-particle interaction is described through momentum exchange, and inter-particle collisions are characterized by collision stress. The structural field is solved by nonlinear FEM to capture large deformations induced by geometric nonlinearity. Coupling among fields is realized through a partitioned, parallel, and non-intrusive iterative strategy, ensuring stable transfer and convergence of interface forces and displacements. Notably, the influence of particles on the structure is not direct but mediated by the fluid, while structural motion directly affects particle dynamics. The results demonstrate that the proposed approach effectively captures multiphysics interaction processes and provides a valuable reference for numerical modeling of coupled fluid-particle-structure systems.
Keywords: fluid-particle-structure interaction; large deformation; partitioned method; non-intrusive coupling
16. Detection of Maliciously Disseminated Hate Speech in Spanish Using Fine-Tuning and In-Context Learning Techniques with Large Language Models
Authors: Tomás Bernal-Beltrán, Ronghao Pan, José Antonio García-Díaz, María del Pilar Salas-Zárate, Mario Andrés Paredes-Valverde, Rafael Valencia-García. Computers, Materials & Continua, 2026, Issue 4, pp. 353-390 (38 pages).
The malicious dissemination of hate speech via compromised accounts, automated bot networks, and malware-driven social media campaigns has become a growing cybersecurity concern. Automatically detecting such content in Spanish is challenging due to linguistic complexity and the scarcity of annotated resources. In this paper, we compare two predominant AI-based approaches for the forensic detection of malicious hate speech: (1) fine-tuning encoder-only models that have been trained in Spanish and (2) In-Context Learning techniques (Zero- and Few-Shot Learning) with large-scale language models. Our approach goes beyond binary classification, proposing a comprehensive, multidimensional evaluation that labels each text by (1) type of speech, (2) recipient, (3) level of intensity (ordinal), and (4) targeted group (multi-label). Performance is evaluated on an annotated Spanish corpus using standard metrics such as precision, recall, and F1-score, together with stability-oriented metrics that assess the transition from zero-shot to few-shot prompting (Zero-to-Few Shot Retention and Zero-to-Few Shot Gain). The results indicate that fine-tuned encoder-only models (notably MarIA and BETO variants) consistently deliver the strongest and most reliable performance: in our experiments their macro F1-scores lie roughly in the range 46%–66%, depending on the task. Zero-shot approaches are much less stable and typically yield substantially lower performance (observed F1-scores range approximately 0%–39%), often producing invalid outputs in practice. Few-shot prompting (e.g., Qwen 38B, Mistral 7B) generally improves stability and recall relative to pure zero-shot, bringing F1-scores into a moderate range of approximately 20%–51% but still falling short of fully fine-tuned models. These findings highlight the importance of supervised adaptation and discuss the potential of both paradigms as components in AI-powered cybersecurity and malware forensics systems designed to identify and mitigate coordinated online hate campaigns.
Keywords: hate speech detection; malicious communication campaigns; AI-driven cybersecurity; social media analytics; large language models; prompt-tuning; fine-tuning; in-context learning; natural language processing
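The macro F1-score reported in this abstract averages the per-class F1 without weighting by class frequency. A minimal sketch of that standard definition follows; the function name and the toy labels are illustrative assumptions and are not taken from the paper's corpus or code.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        # F1 is the harmonic mean of precision and recall (0 if both are 0).
        if precision + recall:
            scores.append(2 * precision * recall / (precision + recall))
        else:
            scores.append(0.0)
    return sum(scores) / len(scores)


# Hypothetical gold labels and predictions (illustration only).
gold = ["hate", "hate", "none", "none"]
pred = ["hate", "none", "none", "none"]
score = macro_f1(gold, pred)
print(round(score, 4))
```

Because every class counts equally, a model that ignores a rare class is penalized more heavily than it would be under plain accuracy, which matters for imbalanced hate speech data.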
Beyond Accuracy:Evaluating and Explaining the Capability Boundaries of Large Language Models in Syntax-Preserving Code Translation
17
作者 Yaxin Zhao Qi Han +1 位作者 Hui Shu Yan Guang 《Computers, Materials & Continua》 2026年第2期1371-1394,共24页
LargeLanguageModels(LLMs)are increasingly appliedinthe fieldof code translation.However,existing evaluation methodologies suffer from two major limitations:(1)the high overlap between test data and pretraining corpora... LargeLanguageModels(LLMs)are increasingly appliedinthe fieldof code translation.However,existing evaluation methodologies suffer from two major limitations:(1)the high overlap between test data and pretraining corpora,which introduces significant bias in performance evaluation;and(2)mainstream metrics focus primarily on surface-level accuracy,failing to uncover the underlying factors that constrain model capabilities.To address these issues,this paper presents TCode(Translation-Oriented Code Evaluation benchmark)—a complexity-controllable,contamination-free benchmark dataset for code translation—alongside a dedicated static feature sensitivity evaluation framework.The dataset is carefully designed to control complexity along multiple dimensions—including syntactic nesting and expression intricacy—enabling both broad coverage and fine-grained differentiation of sample difficulty.This design supports precise evaluation of model capabilities across a wide spectrum of translation challenges.The proposed evaluation framework introduces a correlation-driven analysis mechanism based on static program features,enabling predictive modeling of translation success from two perspectives:Code Form Complexity(e.g.,code length and character density)and Semantic Modeling Complexity(e.g.,syntactic depth,control-flow nesting,and type system complexity).Empirical evaluations across representative LLMs—including Qwen2.5-72B and Llama3.3-70B—demonstrate that even state-of-the-art models achieve over 80% compilation success on simple samples,but their accuracy drops sharply below 40% on complex cases.Further correlation analysis indicates that Semantic Modeling Complexity alone is correlated with up to 60% of the variance in translation success,with static program 
features exhibiting nonlinear threshold effects that highlight clear capability boundaries.This study departs fromthe traditional accuracy-centric evaluation paradigm and,for the first time,systematically characterizes the capabilities of large languagemodels in translation tasks through the lens of programstatic features.The findings provide actionable insights for model refinement and training strategy development. 展开更多
Keywords: large language models (LLMs); code translation; compiler testing; program analysis; complexity-based evaluation
Prompt Injection Attacks on Large Language Models: A Survey of Attack Methods, Root Causes, and Defense Strategies
18
Authors: Tongcheng Geng, Zhiyuan Xu, Yubin Qu, W. Eric Wong 《Computers, Materials & Continua》 2026, Issue 4, pp. 134-185 (52 pages)
Large language models (LLMs) have revolutionized AI applications across diverse domains. However, their widespread deployment has introduced critical security vulnerabilities, particularly prompt injection attacks that manipulate model behavior through malicious instructions. Following Kitchenham's guidelines, this systematic review synthesizes 128 peer-reviewed studies from 2022 to 2025 to provide a unified understanding of this rapidly evolving threat landscape. Our findings reveal a swift progression from simple direct injections to sophisticated multimodal attacks, which achieve success rates of over 90% against unprotected systems. In response, defense mechanisms show varying effectiveness: input preprocessing achieves 60%–80% detection rates, and advanced architectural defenses demonstrate up to 95% protection against known patterns, though significant gaps persist against novel attack vectors. We identified 37 distinct defense approaches across three categories, but standardized evaluation frameworks remain limited. Our analysis attributes these vulnerabilities to fundamental LLM architectural limitations, such as the inability to distinguish instructions from data and weaknesses in the attention mechanism. This highlights critical research directions such as formal verification methods, standardized evaluation protocols, and architectural innovations for inherently secure LLM designs.
Keywords: prompt injection attacks; large language models; defense mechanisms; security evaluation
A Deep Learning–Based Bias Correction Model for Tropical Cyclone Track and Intensity towards Forecasting of the TianXing Large Weather Model
19
Authors: Shijin YUAN, Xingzhou WANG, Bin MU, Guansong WANG, Zeyi NIU, Hao LI 《Advances in Atmospheric Sciences》 2026, Issue 3, pp. 612-630 (19 pages)
Accurate forecasting of tropical cyclone (TC) tracks and intensities is essential. Although the TianXing large weather model, a six-hourly forecasting model, outperforms operational forecasts, its TC forecasts still require improvement. Prediction errors persist due to biases in the training data and smoothing effects in data-driven methods. To address this, we introduce CycloneBCNet, a deep-learning model designed to correct TianXing's TC forecast biases by leveraging spatial and temporal data. CycloneBCNet uses the SimVP (simpler yet better video prediction) framework with spatial attention to highlight cyclone core regions in forecast fields. It also incorporates TC trend information (center position, maximum wind speed, and minimum sea level pressure) via an LSTM (long short-term memory) module. These TC vectors are derived from post-processed TianXing forecasts. By fusing features from forecast fields and TC vectors, CycloneBCNet corrects biases across multiple lead times. At a 96-h lead time, the track error is reduced from 162.4 to 86.4 km, the wind speed error from 17.2 to 6.69 m/s, and the pressure error from 22.2 to 9.36 hPa. Interpretability analysis shows that CycloneBCNet adjusts its attention across forecast lead times. Intensity corrections prioritize inner-core dynamics, particularly the eye and eyewall, while track corrections shift from lower-level variables and the cyclone's core to broader environmental factors and mid- to upper-level features as the forecast duration increases. These findings demonstrate that CycloneBCNet effectively captures key TC dynamics consistent with meteorological principles, including the dominance of near-surface conditions for intensity and the increasing influence of steering currents on track prediction.
Keywords: tropical cyclone; TianXing large weather model; bias correction; interpretability analysis; deep learning-based model
Clinical decision and prescription generation for diarrhea in traditional Chinese medicine based on large language model
20
Authors: Jiaze Wu, Hao Liang, Haoran Dai, Hongliang Rui, Baoli Liu 《Digital Chinese Medicine》 2026, Issue 1, pp. 13-30 (18 pages)
Objective: To develop a clinical decision and prescription generation system (CDPGS) specifically for diarrhea in traditional Chinese medicine (TCM), utilizing a specialized large language model (LLM), Qwen-TCM-Dia, to standardize diagnostic processes and prescription generation. Methods: Two primary datasets were constructed: an evaluation benchmark and a fine-tuning dataset consisting of fundamental diarrhea knowledge, medical records, and chain-of-thought (CoT) reasoning datasets. After an initial evaluation of 16 open-source LLMs across inference time, accuracy, and output quality, Qwen2.5 was selected as the base model due to its superior overall performance. We then employed a two-stage low-rank adaptation (LoRA) fine-tuning strategy, integrating continued pre-training on domain-specific knowledge with instruction fine-tuning using CoT-enriched medical records. This approach was designed to embed the clinical logic (symptoms → pathogenesis → therapeutic principles → prescriptions) into the model's reasoning capabilities. The resulting fine-tuned model, specialized for TCM diarrhea, was designated Qwen-TCM-Dia. Model performance was evaluated for disease diagnosis and syndrome type differentiation using accuracy, precision, recall, and F1-score. Furthermore, the quality of the generated prescriptions was compared with that of established open-source TCM LLMs. Results: Qwen-TCM-Dia outperformed both the base Qwen2.5 model and five other open-source TCM LLMs. It achieved 97.05% accuracy and a 91.48% F1-score in disease diagnosis, and 74.54% accuracy and a 74.21% F1-score in syndrome type differentiation. Compared with existing open-source TCM LLMs (BianCang, HuangDi, LingDan, TCMLLM-PR, and ZhongJing), Qwen-TCM-Dia exhibited higher fidelity in reconstructing the "symptoms → pathogenesis → therapeutic principles → prescriptions" logic chain. It provided complete prescriptions, whereas other models often omitted dosages or generated mismatched prescriptions. Conclusion: By integrating continued pre-training, CoT reasoning, and a two-stage fine-tuning strategy, this study establishes a CDPGS for diarrhea in TCM. The results demonstrate the synergistic effect of strengthening domain representation through pre-training and activating logical reasoning via CoT. This research not only provides critical technical support for the standardized diagnosis and treatment of diarrhea but also offers a scalable paradigm for the digital inheritance of expert TCM experience and the intelligent transformation of TCM.
Keywords: diarrhea; traditional Chinese medicine; large language model; clinical decision and prescription generation; natural language processing