期刊文献+
共找到659篇文章
< 1 2 33 >
每页显示 20 50 100
Global open source and international standards promote the inclusive development of large models
1
作者 Lin Yonghua 《China Standardization》 2025年第5期25-25,共1页
In the era of AI,especially large models,the importance of open source has become increasingly prominent.First,open source allows innovation to avoid starting from scratch.Through iterative innovation,it promotes tech... In the era of AI,especially large models,the importance of open source has become increasingly prominent.First,open source allows innovation to avoid starting from scratch.Through iterative innovation,it promotes technical exchanges and learning globally.Second,resources required for large model R&D are difficult for a single institution to obtain.The evaluation of general large models also requires the participation of experts from various industries.Third,without open source collaboration,it is difficult to form a unified upper-layer software ecosystem.Therefore,open source has become an important cooperation mechanism to promote the development of AI and large models.There are two cases to illustrate how open source and international standards interact with each other. 展开更多
关键词 open source large model international standards inclusive development iterative innovationit large modelsthe evaluation general large models large models
原文传递
Harnessing computational power for intelligent oncology in the age of large models: Status, challenges, and prospects
2
作者 Kexin Xu Yueran Xu Qing Shi 《Intelligent Oncology》 2026年第1期51-63,共13页
The integration of large-scale foundation models(e.g.,GPT series and AlphaFold)into oncology is fundamentally transforming both research methodologies and clinical practices,driven by unprecedented advancements in com... The integration of large-scale foundation models(e.g.,GPT series and AlphaFold)into oncology is fundamentally transforming both research methodologies and clinical practices,driven by unprecedented advancements in computational power.This review synthesizes recent progress in the application of large language models to core oncological tasks,including medical imaging analysis,genomic interpretation,and personalized treatment planning.Underpinned by advanced computational infrastructures,such as graphics processing unit/tensor processing unit clusters,heterogeneous computing,and cloud platforms,these models enable superior representation learning and generalization across multimodal data sources.This review examines how these infrastructures overcome key bottlenecks in intelligent oncology through scalable optimization strategies,including mixed-precision training,memory optimization,and heterogeneous computing.Alongside these technical advancements,the review explores pressing challenges,such as data heterogeneity,limited model interpretability,regulatory uncertainties,and the environmental impact of artificial intelligence(AI)systems.Special emphasis is placed on emerging solutions,encompassing green AI and edge computing,which offer promising approaches for low-resource deployment scenarios.Additionally,the review highlights the critical role of interdisciplinary collaboration among oncology,computer science,ethics,and policy to ensure that AI systems are not only powerful but also transparent,safe,and clinically relevant.Finally,the review outlines potential avenues for future research aimed at developing robust,scalable,and human-centered frameworks for intelligent oncology. 展开更多
关键词 large language models Intelligent oncology Medical AI Computational infrastructure High-performance computing
在线阅读 下载PDF
A new human-computer interaction paradigm: Agent interaction model based on large models and its prospects
3
作者 Yang LIU 《虚拟现实与智能硬件(中英文)》 2025年第3期237-266,共30页
This study examines the advent of agent interaction(AIx)as a transformative paradigm in humancomputer interaction(HCI),signifying a notable evolution beyond traditional graphical interfaces and touchscreen interaction... This study examines the advent of agent interaction(AIx)as a transformative paradigm in humancomputer interaction(HCI),signifying a notable evolution beyond traditional graphical interfaces and touchscreen interactions.Within the context of large models,AIx is characterized by its innovative interaction patterns and a plethora of application scenarios that hold great potential.The paper highlights the pivotal role of AIx in shaping the future landscape of the large model industry,emphasizing its adoption and necessity from a user's perspective.This study underscores the pivotal role of AIx in dictating the future trajectory of a large model industry by emphasizing the importance of its adoption and necessity from a user-centric perspective.The fundamental drivers of AIx include the introduction of novel capabilities,replication of capabilities(both anthropomorphic and superhuman),migration of capabilities,aggregation of intelligence,and multiplication of capabilities.These elements are essential for propelling innovation,expanding the frontiers of capability,and realizing the exponential superposition of capabilities,thereby mitigating labor redundancy and addressing a spectrum of human needs.Furthermore,this study provides an in-depth analysis of the structural components and operational mechanisms of agents supported by large models.Such advancements significantly enhance the capacity of agents to tackle complex problems and provide intelligent services,thereby facilitating a more intuitive,adaptive,and personalized engagement between humans and machines.The study further delineates four principal categories of interaction patterns that encompass eight distinct modalities of interaction,corresponding to twenty-one specific scenarios,including applications in smart home systems,health assistance,and elderly care.This emphasizes the significance of this new paradigm in advancing HCI,fostering technological advancements,and redefining user experiences.However,it also acknowledges the challenges and ethical considerations that accompany this paradigm shift,recognizing the need for a balanced approach to harness the full potential of AIx in modern society. 展开更多
关键词 Interaction paradigm Agent interaction large models
在线阅读 下载PDF
Special Topic on Security of Large Models
4
作者 SU Zhou DU Linkang 《ZTE Communications》 2025年第3期1-2,共2页
Large models,such as large language models(LLMs),vision-language models(VLMs),and multimodal agents,have become key elements in artificial intelli⁃gence(AI)systems.Their rapid development has greatly improved percepti... Large models,such as large language models(LLMs),vision-language models(VLMs),and multimodal agents,have become key elements in artificial intelli⁃gence(AI)systems.Their rapid development has greatly improved perception,generation,and decision-making in various fields.However,their vast scale and complexity bring about new security challenges.Issues such as backdoor vulnerabilities during training,jailbreaking in multimodal rea⁃soning,and data provenance and copyright auditing have made security a critical focus for both academia and industry. 展开更多
关键词 large modelssuch SECURITY multimodal agentshave multimodal rea soningand large language models llms vision language data provenance copyright auditing backdoor vulnerabilities vision language models
在线阅读 下载PDF
Envisioning the blueprint:Aeronautics in large models era 被引量:1
5
作者 Weiwei ZHANG Shule ZHAO 《Chinese Journal of Aeronautics》 2025年第8期139-141,共3页
Following the groundbreaking introduction of the Transformer architecture in 2017,the development of Large Language Models(LLMs)formally commenced.In May 2020,Chat GPT-3,with over one hundred billion parameters,entere... Following the groundbreaking introduction of the Transformer architecture in 2017,the development of Large Language Models(LLMs)formally commenced.In May 2020,Chat GPT-3,with over one hundred billion parameters,entered the public eye,marking a significant milestone in LLM advancement. 展开更多
关键词 AERONAUTICS large languagemodels transformer architecture transformerarchitecture llms chatgpt large language models llms formally
原文传递
Dataset Copyright Auditing for Large Models:Fundamentals,Open Problems,and Future Directions
6
作者 DU Linkang SU Zhou YU Xinyi 《ZTE Communications》 2025年第3期38-47,共10页
The unprecedented scale of large models,such as large language models(LLMs)and text-to-image diffusion models,has raised critical concerns about the unauthorized use of copyrighted data during model training.These con... The unprecedented scale of large models,such as large language models(LLMs)and text-to-image diffusion models,has raised critical concerns about the unauthorized use of copyrighted data during model training.These concerns have spurred a growing demand for dataset copyright auditing techniques,which aim to detect and verify potential infringements in the training data of commercial AI systems.This paper presents a survey of existing auditing solutions,categorizing them across key dimensions:data modality,model training stage,data overlap scenarios,and model access levels.We highlight major trends,including the prevalence of black-box auditing methods and the emphasis on fine-tuning rather than pre-training.Through an in-depth analysis of 12 representative works,we extract four key observations that reveal the limitations of current methods.Furthermore,we identify three open challenges and propose future directions for robust,multimodal,and scalable auditing solutions.Our findings underscore the urgent need to establish standardized benchmarks and develop auditing frameworks that are resilient to low watermark densities and applicable in diverse deployment settings. 展开更多
关键词 dataset copyright auditing large language models diffusion models multimodal auditing membership inference
在线阅读 下载PDF
The Synergy of Seeing and Saying: Revolutionary Advances in Multi-modality Medical Vision-Language Large Models
7
作者 Xiang LI Yu SUN +3 位作者 Jia LIN Like LI Ting FENG Shen YIN 《Artificial Intelligence Science and Engineering》 2025年第2期79-97,共19页
The application of visual-language large models in the field of medical health has gradually become a research focus.The models combine the capability for image understanding and natural language processing,and can si... The application of visual-language large models in the field of medical health has gradually become a research focus.The models combine the capability for image understanding and natural language processing,and can simultaneously process multi-modality data such as medical images and medical reports.These models can not only recognize images,but also understand the semantic relationship between images and texts,effectively realize the integration of medical information,and provide strong support for clinical decision-making and disease diagnosis.The visual-language large model has good performance for specific medical tasks,and also shows strong potential and high intelligence in the general task models.This paper provides a comprehensive review of the visual-language large model in the field of medical health.Specifically,this paper first introduces the basic theoretical basis and technical principles.Then,this paper introduces the specific application scenarios in the field of medical health,including modality fusion,semi-supervised learning,weakly supervised learning,unsupervised learning,cross-domain model and general models.Finally,the challenges including insufficient data,interpretability,and practical deployment are discussed.According to the existing challenges,four potential future development directions are given. 展开更多
关键词 large language models vision-language models medical health multimodality models
在线阅读 下载PDF
Neuromorphic Computing in the Era of Large Models
8
作者 Haoxuan SHAN Chiyue WEI +4 位作者 Nicolas RAMOS Xiaoxuan YANG Cong GUO Hai(Helen)LI Yiran CHEN 《Artificial Intelligence Science and Engineering》 2025年第1期17-30,共14页
The rapid advancement of deep learning and the emergence of largescale neural models,such as bidirectional encoder representations from transformers(BERT),generative pre-trained transformer(GPT),and large language mod... The rapid advancement of deep learning and the emergence of largescale neural models,such as bidirectional encoder representations from transformers(BERT),generative pre-trained transformer(GPT),and large language model Meta AI(LLaMa),have brought significant computational and energy challenges.Neuromorphic computing presents a biologically inspired approach to addressing these issues,leveraging event-driven processing and in-memory computation for enhanced energy efficiency.This survey explores the intersection of neuromorphic computing and large-scale deep learning models,focusing on neuromorphic models,learning methods,and hardware.We highlight transferable techniques from deep learning to neuromorphic computing and examine the memoryrelated scalability limitations of current neuromorphic systems.Furthermore,we identify potential directions to enable neuromorphic systems to meet the growing demands of modern AI workloads. 展开更多
关键词 neuromorphic computing spiking neural networks large deep learning models
在线阅读 下载PDF
Research status and application of artificial intelligence large models in the oil and gas industry 被引量:5
9
作者 LIU He REN Yili +6 位作者 LI Xin DENG Yue WANG Yongtao CAO Qianwen DU Jinyang LIN Zhiwei WANG Wenjie 《Petroleum Exploration and Development》 SCIE 2024年第4期1049-1065,共17页
This article elucidates the concept of large model technology,summarizes the research status of large model technology both domestically and internationally,provides an overview of the application status of large mode... This article elucidates the concept of large model technology,summarizes the research status of large model technology both domestically and internationally,provides an overview of the application status of large models in vertical industries,outlines the challenges and issues confronted in applying large models in the oil and gas sector,and offers prospects for the application of large models in the oil and gas industry.The existing large models can be briefly divided into three categories:large language models,visual large models,and multimodal large models.The application of large models in the oil and gas industry is still in its infancy.Based on open-source large language models,some oil and gas enterprises have released large language model products using methods like fine-tuning and retrieval augmented generation.Scholars have attempted to develop scenario-specific models for oil and gas operations by using visual/multimodal foundation models.A few researchers have constructed pre-trained foundation models for seismic data processing and interpretation,as well as core analysis.The application of large models in the oil and gas industry faces challenges such as current data quantity and quality being difficult to support the training of large models,high research and development costs,and poor algorithm autonomy and control.The application of large models should be guided by the needs of oil and gas business,taking the application of large models as an opportunity to improve data lifecycle management,enhance data governance capabilities,promote the construction of computing power,strengthen the construction of“artificial intelligence+energy”composite teams,and boost the autonomy and control of large model technology. 展开更多
关键词 foundation model large language mode visual large model multimodal large model large model of oil and gas industry pre-training fine-tuning
在线阅读 下载PDF
Implementation of Digital Classroom State System for Teachers and Students Based on Large Models
10
作者 Wenbo Lyu Guangmin Zhu +2 位作者 Ziyi Qin Mengting Yan Jie Zhang 《Journal of Contemporary Educational Research》 2024年第11期101-106,共6页
Deep learning has become a hot field of artificial intelligence,and the deep learning large model framework has become a bridgehead for the active layout of Chinese and foreign technology companies.Large models play a... Deep learning has become a hot field of artificial intelligence,and the deep learning large model framework has become a bridgehead for the active layout of Chinese and foreign technology companies.Large models play a significant role in the application field,greatly improving the efficiency of training and optimization,and contributing to the landing of many innovative artificial intelligence tools.Based on the Chinese PaddlePaddle large model framework,an application system is designed in combination with the intelligent classroom teaching scenario,which uses machine vision algorithms to distinguish and present teachers’and students’behaviors,that is,the digitization and multi-classification scheme of class character states.After having digital data,data analysis can be carried out to evaluate the class status of teachers and students,and the traditional subjective judgment such as peacetime grades and teaching ability can be upgraded to the objective judgment of artificial intelligence. 展开更多
关键词 large model Machine vision DIGITALIZATION STATUS Technical realization
在线阅读 下载PDF
Hepatitis C Patient Education:Large Language Models Show Promise in Disseminating Guidelines
11
作者 Jinyan Chen Ruijie Zhao +10 位作者 Chiyu He Huigang Li Yajie You Zuyuan Lin Ze Xiang Jianyong Zhuo Wei Shen Zhihang Hu Shusen Zheng Xiao Xu Di Lu 《Journal of Clinical and Translational Hepatology》 2026年第1期116-119,共4页
This study evaluated the accuracy,completeness,and comprehensibility of responses from mainstream large language models(LLMs)to hepatitis C virus(HCV)-related questions,aiming to assess their performance in addressing... This study evaluated the accuracy,completeness,and comprehensibility of responses from mainstream large language models(LLMs)to hepatitis C virus(HCV)-related questions,aiming to assess their performance in addressing patient queries about disease and lifestyle behaviors.The models selected were ChatGPT-4o,Gemini 2.0 Pro,Claude 3.5 Sonnet,and DeepSeek V3,with 12 questions chosen by two HCV experts from the domains of prevention,diagnosis,and treatment. 展开更多
关键词 addressing patient queries disease lifestyle behaviorsthe large language models large language models llms GUIDELINES hepatitis C accuracy patient education COMPREHENSIBILITY
原文传递
Semantic Causality Evaluation of Correlation Analysis Utilizing Large Language Models
12
作者 Adam Dudáš 《Computers, Materials & Continua》 2026年第5期2246-2269,共24页
It is known that correlation does not imply causality.Some relationships identified in the analysis of data are coincidental or unknown,and some are produced by real-world causality of the situation,which is problemat... It is known that correlation does not imply causality.Some relationships identified in the analysis of data are coincidental or unknown,and some are produced by real-world causality of the situation,which is problematic,since there is a need to differentiate between these two scenarios.Until recently,the proper−semantic−causality of the relationship could have been determined only by human experts from the area of expertise of the studied data.This has changed with the advance of large language models,which are often utilized as surrogates for such human experts,making the process automated and readily available to all data analysts.This motivates the main objective of this work,which is to introduce the design and implementation of a large language model-based semantic causality evaluator based on correlation analysis,together with its visual analysis model called Causal heatmap.After the implementation itself,the model is evaluated from the point of view of the quality of the visual model,from the point of view of the quality of causal evaluation based on large language models,and from the point of view of comparative analysis,while the results reached in the study highlight the usability of large language models in the task and the potential of the proposed approach in the analysis of unknown datasets.The results of the experimental evaluation demonstrate the usefulness of the Causal heatmap method,supported by the evident highlighting of interesting relationships,while suppressing irrelevant ones. 展开更多
关键词 CORRELATION CAUSALITY correlation analysis large language models VISUALIZATION
在线阅读 下载PDF
When Large Language Models and Machine Learning Meet Multi-Criteria Decision Making: Fully Integrated Approach for Social Media Moderation
13
作者 Noreen Fuentes Janeth Ugang +4 位作者 Narcisan Galamiton Suzette Bacus Samantha Shane Evangelista Fatima Maturan Lanndon Ocampo 《Computers, Materials & Continua》 2026年第1期2137-2162,共26页
This study demonstrates a novel integration of large language models,machine learning,and multicriteria decision-making to investigate self-moderation in small online communities,a topic under-explored compared to use... This study demonstrates a novel integration of large language models,machine learning,and multicriteria decision-making to investigate self-moderation in small online communities,a topic under-explored compared to user behavior and platform-driven moderation on social media.The proposed methodological framework(1)utilizes large language models for social media post analysis and categorization,(2)employs k-means clustering for content characterization,and(3)incorporates the TODIM(Tomada de Decisão Interativa Multicritério)method to determine moderation strategies based on expert judgments.In general,the fully integrated framework leverages the strengths of these intelligent systems in a more systematic evaluation of large-scale decision problems.When applied in social media moderation,this approach promotes nuanced and context-sensitive self-moderation by taking into account factors such as cultural background and geographic location.The application of this framework is demonstrated within Facebook groups.Eight distinct content clusters encompassing safety,harassment,diversity,and misinformation are identified.Analysis revealed a preference for content removal across all clusters,suggesting a cautious approach towards potentially harmful content.However,the framework also highlights the use of other moderation actions,like account suspension,depending on the content category.These findings contribute to the growing body of research on self-moderation and offer valuable insights for creating safer and more inclusive online spaces within smaller communities. 展开更多
关键词 Self-moderation user-generated content k-means clustering TODIM large language models
在线阅读 下载PDF
Assessing Large Language Models for Early Article Identification in Otolaryngology—Head and Neck Surgery Systematic Reviews
14
作者 Ajibola B.Bakare Young Lee +2 位作者 Jhuree Hong Claus-Peter Richter Jonathan P.Kuriakose 《Health Care Science》 2026年第1期19-28,共10页
Background:Assess ChatGPT and Bard's effectiveness in the initial identification of articles for Otolaryngology—Head and Neck Surgery systematic literature reviews.Methods:Three PRISMA-based systematic reviews(Ja... Background:Assess ChatGPT and Bard's effectiveness in the initial identification of articles for Otolaryngology—Head and Neck Surgery systematic literature reviews.Methods:Three PRISMA-based systematic reviews(Jabbour et al.2017,Wong et al.2018,and Wu et al.2021)were replicated using ChatGPTv3.5 and Bard.Outputs(author,title,publication year,and journal)were compared to the original references and cross-referenced with medical databases for authenticity and recall.Results:Several themes emerged when comparing Bard and ChatGPT across the three reviews.Bard generated more outputs and had greater recall in Wong et al.'s review,with a broader date range in Jabbour et al.'s review.In Wu et al.'s review,ChatGPT-2 had higher recall and identified more authentic outputs than Bard-2.Conclusion:Large language models(LLMs)failed to fully replicate peer-reviewed methodologies,producing outputs with inaccuracies but identifying relevant,especially recent,articles missed by the references.While human-led PRISMA-based reviews remain the gold standard,refining LLMs for literature reviews shows potential. 展开更多
关键词 artificial intelligence BARD ChatGPT large language models systematic review
暂未订购
PROMPTx-PE:Adaptive Optimization of Prompt Engineering Strategies for Accuracy and Robustness in Large Language Models
15
作者 Talha Farooq Khan Fahad Ali +2 位作者 Majid Hussain Lal Khan Hsien-Tsung Chang 《Computers, Materials & Continua》 2026年第5期685-715,共31页
The outstanding growth in the applications of large language models(LLMs)demonstrates the significance of adaptive and efficient prompt engineering tactics.The existing methods may not be variable,vigorous and streaml... The outstanding growth in the applications of large language models(LLMs)demonstrates the significance of adaptive and efficient prompt engineering tactics.The existing methods may not be variable,vigorous and streamlined in different domains.The offered study introduces an immediate optimization outline,named PROMPTx-PE,that is going to yield a greater level of precision and strength when it comes to the assignments that are premised on LLM.The proposed systemfeatures a timely selection schemewhich is informed by reinforcement learning,a contextual layer and a dynamic weighting module which is regulated by Lyapunov-based stability guidelines.The PROMPTx-PE dynamically varies the exploration and exploitation of the prompt space,depending on real-time feedback and multi-objective reward development.Extensive testing on both benchmark(GLUE,SuperGLUE)and domain-specific data(Healthcare-QA and Industrial-NER)demonstrates a large best performance to be 89.4%and a strong robustness disconnect with under 3%computation expense.The results confirm the effectiveness,consistency,and scalability of PROMPTx-PE as a platform of adaptive prompt engineering based on recent uses of LLMs. 展开更多
关键词 Prompt engineering large language models adaptive optimization ROBUSTNESS multi-objective optimization reinforcement learning natural language processing
在线阅读 下载PDF
Decision-making performance of large language models vs.human physicians in challenging lung cancer cases:A real-world case-based study
16
作者 Ning Yang Kailai Li +19 位作者 Baiyang Liu Xiting Chen Aimin Jiang Chang Qi Wenyi Gan Lingxuan Zhu Weiming Mou Dongqiang Zeng Mingjia Xiao Guangdi Chu Shengkun Peng Hank ZHWong Lin Zhang Hengguo Zhang Xinpei Deng Quan Cheng Bufu Tang Anqi Lin Juan Zhou Peng Luo 《Intelligent Oncology》 2026年第1期15-24,共10页
Background:Despite the promise shown by large language models(LLMs)for standardized tasks,their multidimensional performance in real-world oncology decision-making remains unevaluated.This study aims to introduce a fr... Background:Despite the promise shown by large language models(LLMs)for standardized tasks,their multidimensional performance in real-world oncology decision-making remains unevaluated.This study aims to introduce a framework for evaluating LLMs and physician decisions in challenging lung cancer cases.Methods:We curated 50 challenging lung cancer cases(25 local and 25 published)classified as complex,rare,or refractory.Blinded three-dimensional,five-point Likert evaluations(1–5 for comprehensiveness,specificity,and readability)compared standalone LLMs(DeepSeek R1,Claude 3.5,Gemini 1.5,and GPT-4o),physicians by experience level(junior,intermediate,and senior),and AI-assisted juniors;intergroup differences and augmentation effects were analyzed statistically.Results:Of 50 challenging cases(18 complex,17 rare,and 15 refractory)rated by three experts,DeepSeek R1 achieved scores of 3.95±0.33,3.71±0.53,and 4.26±0.18 for comprehensiveness,specificity,and readability,respectively,positioning it between intermediate(3.68,3.68,3.75)and senior(4.50,4.64,4.53)physicians.GPT-4o and Claude 3.5 reached intermediate physician–level comprehensiveness(3.76±0.39,3.60±0.39)but junior-to-intermediate physician–level specificity(3.39±0.39,3.39±0.49).All LLMs scored higher on rare cases than intermediate physicians but fell below junior physicians in refractory-case specificity.AIassisted junior physicians showed marked gains in rare cases,with comprehensiveness rising from 2.32 to 4.29(84.8%),specificity from 2.24 to 4.26(90.8%),and readability from 2.76 to 4.59(66.0%),while specificity declined by 3.2%(3.17 to 3.07)in refractory cases.Error analysis showed complementary strengths,with physicians demonstrating reasoning stability and LLMs excelling in knowledge updating and risk management.Conclusions:LLMs performed variably in clinical decision-making tasks depending on case type,performing better in rare cases and worse in refractory cases requiring longitudinal reasoning.Complementary strengths between LLMs and physicians support case-and task-tailored human–AI collaboration. 展开更多
关键词 large language models Clinical evaluation DECISION-MAKING Lung cancer
暂未订购
Beyond Accuracy:Evaluating and Explaining the Capability Boundaries of Large Language Models in Syntax-Preserving Code Translation
17
作者 Yaxin Zhao Qi Han +1 位作者 Hui Shu Yan Guang 《Computers, Materials & Continua》 2026年第2期1371-1394,共24页
LargeLanguageModels(LLMs)are increasingly appliedinthe fieldof code translation.However,existing evaluation methodologies suffer from two major limitations:(1)the high overlap between test data and pretraining corpora... LargeLanguageModels(LLMs)are increasingly appliedinthe fieldof code translation.However,existing evaluation methodologies suffer from two major limitations:(1)the high overlap between test data and pretraining corpora,which introduces significant bias in performance evaluation;and(2)mainstream metrics focus primarily on surface-level accuracy,failing to uncover the underlying factors that constrain model capabilities.To address these issues,this paper presents TCode(Translation-Oriented Code Evaluation benchmark)—a complexity-controllable,contamination-free benchmark dataset for code translation—alongside a dedicated static feature sensitivity evaluation framework.The dataset is carefully designed to control complexity along multiple dimensions—including syntactic nesting and expression intricacy—enabling both broad coverage and fine-grained differentiation of sample difficulty.This design supports precise evaluation of model capabilities across a wide spectrum of translation challenges.The proposed evaluation framework introduces a correlation-driven analysis mechanism based on static program features,enabling predictive modeling of translation success from two perspectives:Code Form Complexity(e.g.,code length and character density)and Semantic Modeling Complexity(e.g.,syntactic depth,control-flow nesting,and type system complexity).Empirical evaluations across representative LLMs—including Qwen2.5-72B and Llama3.3-70B—demonstrate that even state-of-the-art models achieve over 80% compilation success on simple samples,but their accuracy drops sharply below 40% on complex cases.Further correlation analysis indicates that Semantic Modeling Complexity alone is correlated with up to 60% of the variance in translation success,with static program features exhibiting nonlinear threshold effects that highlight clear capability boundaries.This study departs fromthe traditional accuracy-centric evaluation paradigm and,for the first time,systematically characterizes the capabilities of large languagemodels in translation tasks through the lens of programstatic features.The findings provide actionable insights for model refinement and training strategy development. 展开更多
关键词 large language models(LLMs) code translation compiler testing program analysis complexity-based evaluation
在线阅读 下载PDF
Command-agent:Reconstructing warfare simulation and command decision-making using large language models
18
作者 Mengwei Zhang Minchi Kuang +3 位作者 Heng Shi Jihong Zhu Jingyu Zhu Xiao Jiang 《Defence Technology(防务技术)》 2026年第2期294-313,共20页
War rehearsals have become increasingly important in national security due to the growing complexity of international affairs.However,traditional rehearsal methods,such as military chess simulations,are inefficient an... War rehearsals have become increasingly important in national security due to the growing complexity of international affairs.However,traditional rehearsal methods,such as military chess simulations,are inefficient and inflexible,with particularly pronounced limitations in command and decision-making.The overwhelming volume of information and high decision complexity hinder the realization of autonomous and agile command and control.To address this challenge,an intelligent warfare simulation framework named Command-Agent is proposed,which deeply integrates large language models(LLMs)with digital twin battlefields.By constructing a highly realistic battlefield environment through real-time simulation and multi-source data fusion,the natural language interaction capabilities of LLMs are leveraged to lower the command threshold and to enable autonomous command through the Observe-Orient-Decide-Act(OODA)feedback loop.Within the Command-Agent framework,a multimodel collaborative architecture is further adopted to decouple the decision-generation and command-execution functions of LLMs.By combining specialized models such as Deep Seek-R1 and MCTool,the limitations of single-model capabilities are overcome.MCTool is a lightweight execution model fine-tuned for military Function Calling tasks.The framework also introduces a Vector Knowledge Base to mitigate hallucinations commonly exhibited by LLMs.Experimental results demonstrate that Command-Agent not only enables natural language-driven simulation and control but also deeply understands commander intent.Leveraging the multi-model collaborative architecture,during red-blue UAV confrontations involving 2 to 8 UAVs,the integrated score is improved by an average of 41.8%compared to the single-agent system(MCTool),accompanied by a 161.8%optimization in the battle loss ratio.Furthermore,when compared with multi-agent systems lacking the knowledge base,the inclusion of the Vector Knowledge Base further improves overall performance by 16.8%.In comparison with the general model(Qwen2.5-7B),the fine-tuned MCTool leads by 5%in execution efficiency.Therefore,the proposed Command-Agent introduces a novel perspective to the military command system and offers a feasible solution for intelligent battlefield decision-making. 展开更多
关键词 Digital twin battlefield large language models Multi-agent system Military command
在线阅读 下载PDF
Detection of Maliciously Disseminated Hate Speech in Spanish Using Fine-Tuning and In-Context Learning Techniques with Large Language Models
19
作者 Tomás Bernal-Beltrán RonghaoPan +3 位作者 JoséAntonio García-Díaz María del Pilar Salas-Zárate Mario Andrés Paredes-Valverde Rafael Valencia-García 《Computers, Materials & Continua》 2026年第4期353-390,共38页
The malicious dissemination of hate speech via compromised accounts,automated bot networks and malware-driven social media campaigns has become a growing cybersecurity concern.Automatically detecting such content in S... The malicious dissemination of hate speech via compromised accounts,automated bot networks and malware-driven social media campaigns has become a growing cybersecurity concern.Automatically detecting such content in Spanish is challenging due to linguistic complexity and the scarcity of annotated resources.In this paper,we compare two predominant AI-based approaches for the forensic detection of malicious hate speech:(1)finetuning encoder-only models that have been trained in Spanish and(2)In-Context Learning techniques(Zero-and Few-Shot Learning)with large-scale language models.Our approach goes beyond binary classification,proposing a comprehensive,multidimensional evaluation that labels each text by:(1)type of speech,(2)recipient,(3)level of intensity(ordinal)and(4)targeted group(multi-label).Performance is evaluated using an annotated Spanish corpus,standard metrics such as precision,recall and F1-score and stability-oriented metrics to evaluate the stability of the transition from zero-shot to few-shot prompting(Zero-to-Few Shot Retention and Zero-to-Few Shot Gain)are applied.The results indicate that fine-tuned encoder-only models(notably MarIA and BETO variants)consistently deliver the strongest and most reliable performance:in our experiments their macro F1-scores lie roughly in the range of approximately 46%–66%depending on the task.Zero-shot approaches are much less stable and typically yield substantially lower performance(observed F1-scores range approximately 0%–39%),often producing invalid outputs in practice.Few-shot prompting(e.g.,Qwen 38B,Mistral 7B)generally improves stability and recall relative to pure zero-shot,bringing F1-scores into a moderate range of approximately 20%–51%but still falling short of fully fine-tuned models.These findings highlight the importance of supervised adaptation and discuss the potential of both paradigms as components in AI-powered cybersecurity and malware forensics systems designed to identify and mitigate coordinated online hate campaigns. 展开更多
关键词 Hate speech detection malicious communication campaigns AI-driven cybersecurity social media analytics large language models prompt-tuning fine-tuning in-context learning natural language processing
在线阅读 下载PDF
Prompt Injection Attacks on Large Language Models:A Survey of Attack Methods,Root Causes,and Defense Strategies
20
作者 Tongcheng Geng Zhiyuan Xu +1 位作者 Yubin Qu W.Eric Wong 《Computers, Materials & Continua》 2026年第4期134-185,共52页
Large language models(LLMs)have revolutionized AI applications across diverse domains.However,their widespread deployment has introduced critical security vulnerabilities,particularly prompt injection attacks that man... Large language models(LLMs)have revolutionized AI applications across diverse domains.However,their widespread deployment has introduced critical security vulnerabilities,particularly prompt injection attacks that manipulate model behavior through malicious instructions.Following Kitchenham’s guidelines,this systematic review synthesizes 128 peer-reviewed studies from 2022 to 2025 to provide a unified understanding of this rapidly evolving threat landscape.Our findings reveal a swift progression from simple direct injections to sophisticated multimodal attacks,achieving over 90%success rates against unprotected systems.In response,defense mechanisms show varying effectiveness:input preprocessing achieves 60%–80%detection rates and advanced architectural defenses demonstrate up to 95%protection against known patterns,though significant gaps persist against novel attack vectors.We identified 37 distinct defense approaches across three categories,but standardized evaluation frameworks remain limited.Our analysis attributes these vulnerabilities to fundamental LLM architectural limitations,such as the inability to distinguish instructions from data and attention mechanism vulnerabilities.This highlights critical research directions such as formal verification methods,standardized evaluation protocols,and architectural innovations for inherently secure LLM designs. 展开更多
关键词 Prompt injection attacks large language models defense mechanisms security evaluation
在线阅读 下载PDF
上一页 1 2 33 下一页 到第
使用帮助 返回顶部