Journal Articles
120,012 articles found
An LLM-Based Payload Generation Method for SQL Injection Vulnerability Detection
1
Authors: 顾兆军, 李丽, 隋翯. 《信息网络安全》 (PKU Core), 2026, Issue 2, pp. 274-290 (17 pages)
To address the limited robustness and poorly targeted test cases of existing SQL injection detection methods, this paper proposes an LLM-based payload generation method for SQL injection vulnerability detection. The method detects SQL injection vulnerabilities by generating targeted detection payloads: prompt engineering with the DeepSeek-V3 model automatically extracts and uniformly constructs vulnerability features; contribution analysis selects among these features to build the model's core input; organizing the key features into a chain of thought promotes the fusion of multi-dimensional vulnerability representations; and low-rank adaptation is used for domain-adaptive supervised fine-tuning of a Qwen model. Experiments on several public vulnerability test ranges compare the Qwen model against SqliGPT, GPT-2-web, and SQLMap in detection performance and generation quality, and analyze DeepSeek-V3's feature extraction on complex SQL injection data. Results show the Qwen model reaches an average detection accuracy above 75%, improving on SqliGPT, GPT-2-web, and SQLMap by 49.18%, 59.64%, and 15.19% respectively, with payload generation quality clearly superior to existing models, demonstrating the effectiveness and superiority of generating detection payloads with an LLM for SQL injection detection.
Keywords: large language models; SQL injection vulnerability; code generation; detection payload
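The feature-selection and chain-of-thought steps this abstract describes can be sketched roughly as follows. The feature names, contribution scores, and prompt wording below are invented for illustration; the paper's actual features come from DeepSeek-V3 extraction and the completion would come from the fine-tuned Qwen model.

```python
# Hypothetical sketch: selected vulnerability features, ranked by contribution,
# are ordered into a chain-of-thought prompt that a fine-tuned model would
# complete with a detection payload.

def select_features(features: dict[str, float], k: int = 3) -> list[str]:
    """Keep the k features with the highest contribution scores."""
    return [name for name, _ in sorted(features.items(), key=lambda kv: -kv[1])[:k]]

def build_cot_prompt(features: dict[str, float]) -> str:
    """Arrange the selected features into a step-by-step reasoning prompt."""
    steps = [f"Step {i + 1}: consider the '{name}' characteristic of the target."
             for i, name in enumerate(select_features(features))]
    return "\n".join(
        ["Task: generate a SQL injection detection payload."] + steps +
        ["Now produce one targeted payload."])

# Illustrative contribution scores for hypothetical features
contributions = {"error_echo": 0.91, "quote_filtering": 0.67,
                 "numeric_context": 0.55, "comment_stripping": 0.12}
prompt = build_cot_prompt(contributions)
print(prompt)
```

The low-contribution feature is dropped before prompting, so the model's core input stays focused on the characteristics that matter most.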
SQL Generation Based on Structure Awareness and Monte Carlo Tree Search
2
Authors: 富宇, 李浩冉. 《计算机技术与发展》, 2026, Issue 3, pp. 118-123, 117 (7 pages)
The natural-language-to-SQL (Text-to-SQL) task maps user queries to executable SQL statements and is the core technology for natural-language interaction with databases. Mainstream large language models often produce structural errors, semantic drift, and execution failures on complex structures, multi-table joins, and nested logic, limiting their reliability and generalization. This paper proposes Struct-MCTS, a Text-to-SQL generation framework based on structure awareness and Monte Carlo tree search (MCTS). The framework models SQL generation as a sequence of fine-grained structured actions, and combines parallel generation by multiple models with collaborative debate to dynamically score candidate paths, improving the robustness and consistency of the generated results. Under zero-shot conditions, Struct-MCTS achieves leading execution accuracy on complex benchmarks such as Spider and BIRD, showing strong generalization and practical potential.
Keywords: Text-to-SQL; large language models; structure awareness; Monte Carlo tree search; multi-model debate; zero-shot learning
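The selection rule at the heart of an MCTS-style search over structured actions can be illustrated with the standard UCT formula. The toy "actions" and the noisy scorer standing in for the paper's multi-model debate are assumptions, not Struct-MCTS itself.

```python
import math, random

# Minimal UCT illustration: among candidate structured actions (e.g. which SQL
# clause to expand next), pick the one balancing average score (exploitation)
# against visit count (exploration).

def uct(avg_reward, visits, parent_visits, c=1.4):
    if visits == 0:
        return float("inf")  # always try unvisited actions first
    return avg_reward + c * math.sqrt(math.log(parent_visits) / visits)

random.seed(0)
actions = {"SELECT": [0, 0.0], "JOIN": [0, 0.0], "WHERE": [0, 0.0]}  # visits, total reward
true_score = {"SELECT": 0.9, "JOIN": 0.4, "WHERE": 0.6}  # hidden quality of each action

for t in range(1, 201):
    # selection: highest UCT value
    name = max(actions, key=lambda a: uct(
        actions[a][1] / actions[a][0] if actions[a][0] else 0.0,
        actions[a][0], t))
    # simulation + backpropagation with a noisy reward
    reward = true_score[name] + random.uniform(-0.1, 0.1)
    actions[name][0] += 1
    actions[name][1] += reward

best = max(actions, key=lambda a: actions[a][0])
print(best)  # visits concentrate on the genuinely best action
```

After a few hundred simulations the visit counts concentrate on the highest-quality action, which is what makes the search robust to individual noisy scores.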
Hint-SQL: A Text-to-SQL Prompting Method Based on Automatic Hint Generation
3
Authors: 谭钊, 刘喜平, 舒晴, 万齐智, 刘德喜, 万常选, 廖国琼. 《计算机学报》 (PKU Core), 2026, Issue 3, pp. 700-720 (21 pages)
Text-to-SQL aims to translate natural-language questions into SQL statements executable by a database system, making data querying easier. With the development of large language models (LLMs), LLM-based prompting has become the mainstream approach in this field. Recently, researchers have added hints to LLM prompts to convey concrete Text-to-SQL advice that guides SQL generation. However, existing hints are mostly hand-written from general characteristics of the Text-to-SQL task; they are too broad, hard to adjust to a specific task, and cannot fit every Text-to-SQL task. This paper proposes Hint-SQL, a Text-to-SQL prompting method based on automatic hint generation, which produces semantic, operation, and structure hints tailored to the current task and thereby guides LLMs to generate semantically consistent, structurally correct SQL. To generate task-customized hints, we build a hint-generation agent (HAgent), fine-tuned from open-source LLMs with a two-stage fine-tuning framework that automatically synthesizes the required training data without manual annotation, supporting both supervised fine-tuning and preference-learning optimization. Hint-SQL can be used standalone or to enhance existing methods. Large-scale experiments show that standalone Hint-SQL is competitive with mainstream methods and that it significantly boosts existing ones: on the BIRD dataset, Hint-SQL raises the accuracy of the current best method to 71.58%, an improvement of 4.37%. This work reveals the important role of hints in Text-to-SQL and provides a reference for follow-up research.
Keywords: natural language processing; Text-to-SQL; large language models; prompt engineering; hints
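The three hint types named in this abstract (semantic, operation, structure) can be pictured as labeled sections prepended to an ordinary Text-to-SQL prompt. The hint texts, schema, and prompt layout below are our assumptions; in Hint-SQL the hints would come from the HAgent, not be hand-written.

```python
# Illustrative hint-augmented prompt assembly in the spirit of Hint-SQL.

def render_hints(hints: dict[str, str]) -> str:
    order = ["semantic", "operation", "structure"]  # fixed, readable ordering
    return "\n".join(f"[{kind} hint] {hints[kind]}" for kind in order if kind in hints)

def hint_prompt(schema: str, question: str, hints: dict[str, str]) -> str:
    parts = [f"Database schema:\n{schema}", render_hints(hints),
             f"Question: {question}", "Write one SQL query."]
    return "\n\n".join(p for p in parts if p)  # hints section may be empty

# Invented example hints for a toy question
hints = {"semantic": "'top' here means highest salary, not first row.",
         "operation": "Use ORDER BY ... LIMIT rather than MAX in a subquery.",
         "structure": "A single-table query suffices; no join is needed."}
print(hint_prompt("employee(id, name, salary)", "Who is the top earner?", hints))
```

Because the hints section is optional, the same assembly function covers both the standalone use and the "enhance an existing method" use the abstract mentions.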
A SQL Injection Attack Detection Model Fusing GAT with Explainable DQN
4
Authors: 邓钰洋, 芦天亮, 李知皓, 孟昊阳, 马远声. 《信息网络安全》 (PKU Core), 2026, Issue 1, pp. 150-167 (18 pages)
With the continued evolution of Web applications and the wide deployment of database-driven systems, SQL injection remains a highly covert and destructive class of network attack and an important subject of Web security research. To address the complex structure and diverse semantics of SQL injection statements and the scarcity of attack samples, this paper proposes a detection method that combines graph-structure modeling with reinforcement learning. The method models SQL statements as graphs, fuses node and edge syntactic features through an improved graph attention network (GAT), and builds a multi-agent reinforcement learning framework of four specialized detection experts for dynamic ensemble decisions. An adversarial-sample generation module targeting the obfuscation tactics of SQL injection strengthens the model against complex, mutated attacks, and LIME and SHAP analyses make the detection results explainable, improving the system's transparency and practicality. Experiments show that the method mitigates detection bias caused by sample imbalance and diverse attack patterns while keeping computational cost low, reaching a detection accuracy of 0.955 and an AUC of 0.978 on a comprehensive SQL injection dataset, clearly outperforming existing baselines and providing an effective solution for intelligent SQL injection detection.
Keywords: SQL injection attack detection; graph attention network; multi-agent; DQN; explainable reinforcement learning
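The "SQL statement as a graph" step can be made concrete with a toy version: tokens become nodes, consecutive tokens are linked by edges, and a softmax-normalized score per edge mimics what one attention head might assign. The tokenizer, edge rule, and hand-crafted scoring function are stand-ins for the paper's trained GAT, not its method.

```python
import math

def tokenize(sql: str) -> list[str]:
    # crude tokenizer: isolate quotes and equals signs, then split on whitespace
    return sql.replace("'", " ' ").replace("=", " = ").split()

def build_graph(tokens):
    # nodes are token indices; edges link consecutive tokens
    return [(i, i + 1) for i in range(len(tokens) - 1)]

def edge_attention(tokens, edges):
    # hand-crafted score: edges touching classic injection markers get a boost
    suspicious = {"OR", "--", "'", "UNION"}
    scores = [1.0 + (tokens[a] in suspicious) + (tokens[b] in suspicious)
              for a, b in edges]
    z = sum(math.exp(s) for s in scores)
    return [math.exp(s) / z for s in scores]  # softmax over all edges

toks = tokenize("SELECT * FROM users WHERE name = '' OR 1 = 1 --")
edges = build_graph(toks)
weights = edge_attention(toks, edges)
top_edge = edges[max(range(len(weights)), key=weights.__getitem__)]
print(toks[top_edge[0]], toks[top_edge[1]])
```

Even this crude scoring concentrates attention weight around the quote/`OR` region of a classic tautology payload, which is the intuition behind letting a GAT learn such edge features.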
MG-SQL: A SQL Generation Framework with Enhanced Schema Linking and Multi-Generator Collaboration
5
Authors: 吴定佳, 崔喆. 《计算机应用》 (PKU Core), 2026, Issue 3, pp. 723-731 (9 pages)
To address the limitations of large language models (LLMs) in generating Structured Query Language (SQL) over complex multi-table databases, this paper proposes MG-SQL (Multi-Generator SQL), a Text-to-SQL framework based on multi-generator collaboration. First, to counter the noise introduced by irrelevant schema information, an enhanced schema-linking method is proposed that generates an initial SQL query and combines it with semantic-similarity retrieval. Second, to improve the quality and diversity of candidate SQL, a multi-strategy generation framework is built over the reduced schema: 1) an experience generator retrieves dynamic examples; 2) a chain-of-thought generator strengthens logical reasoning; 3) a query-plan generator simulates the database's execution flow; 4) a progressive generator iteratively refines the query. A voting mechanism then selects the best SQL. Finally, a reflection-learning mechanism compares generated results against reference SQL to form reflection samples, dynamically building a domain experience base for continual learning. On the BIRD benchmark with the lightweight GPT-4o-mini model, the framework's schema linking achieves a strict recall rate (SRR) of 98.89% while filtering out 44.91% of irrelevant columns; the generated SQL reaches 69.69% execution accuracy (EX) and a valid efficiency score (VES) of 79.59%, surpassing mainstream GPT-4o-based methods and validating the framework's effectiveness in complex scenarios.
Keywords: schema linking; large language models; Text-to-SQL; retrieval augmentation; in-context learning
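The final voting step over candidates from the four generators can be sketched simply: lightly normalize each candidate so cosmetic differences do not split the vote, then keep the plurality winner. The normalization rules are illustrative; MG-SQL itself votes over generator outputs (and in practice such systems often compare execution results rather than text).

```python
from collections import Counter

def normalize(sql: str) -> str:
    # collapse whitespace, drop a trailing semicolon, ignore keyword case
    return " ".join(sql.strip().rstrip(";").split()).lower()

def vote(candidates: list[str]) -> str:
    counts = Counter(normalize(c) for c in candidates)
    winner, _ = counts.most_common(1)[0]
    # return an original spelling of the winning candidate
    return next(c for c in candidates if normalize(c) == winner)

# Hypothetical outputs from three of the generators
candidates = [
    "SELECT name FROM t WHERE id = 1;",
    "select name from t where id = 1",
    "SELECT name FROM t WHERE id = 2;",
]
print(vote(candidates))  # the two equivalent id = 1 queries outvote id = 2
```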
Multi-Class SQL Injection Detection Based on Feature Fusion
6
Authors: 姜珍珍, 杨彬彬, 薛峰. 《合肥工业大学学报(自然科学版)》 (PKU Core), 2026, Issue 2, pp. 167-172, 193 (7 pages)
SQL injection is a common network security threat, making its detection an important research topic in network security. Traditional detection methods suffer from low accuracy and cannot identify the specific type of a SQL injection attack. This paper proposes a feature fusion-based multi-class SQL injection detection method (FMCSID). Experimental results show that the method not only reaches 99.99% accuracy but also identifies the specific attack type, giving security personnel a more concrete description of the attack and its intent so that they can devise more targeted countermeasures and strengthen network defenses.
Keywords: SQL injection detection; network security; multi-class classification; feature fusion; deep learning; SQL normalization
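The keyword list mentions SQL normalization; a common form of it replaces concrete literals with placeholders so that payloads differing only in values map to the same feature pattern. The two regex rules below are a simplified guess at such a preprocessing step, not FMCSID's actual pipeline.

```python
import re

def normalize_sql(query: str) -> str:
    q = " ".join(query.split()).lower()            # collapse whitespace, fold case
    q = re.sub(r"'[^']*'", "STR", q)               # string literals -> STR
    q = re.sub(r"\b\d+(\.\d+)?\b", "NUM", q)       # numeric literals -> NUM
    return q

a = normalize_sql("SELECT * FROM users WHERE id = 7 OR '1'='1'")
b = normalize_sql("select * from users where id = 42 or 'x'='x'")
print(a)       # select * from users where id = NUM or STR=STR
print(a == b)  # True: both collapse to one injection pattern
```

Collapsing value-only variation like this is what lets a multi-class model see two superficially different tautology payloads as instances of the same attack type.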
Design and Implementation of MDB-Format Pipeline Deliverable Conversion Software Based on C# and SQL
7
Authors: 张济楷, 郭世极, 李鑫. 《城市勘测》, 2026, Issue 1, pp. 96-100 (5 pages)
To unify multi-source pipeline survey deliverables, this paper analyzes the data structure and attribute fields of MDB-format pipeline deliverables and develops a pipeline MDB conversion tool in C# and SQL. A development approach of functional decoupling and modular design is proposed to reduce development effort and improve the software's compatibility and flexibility. Driven by practical engineering needs, a data-conversion quality-check mechanism is designed that effectively guarantees the accuracy and reliability of the converted data. A municipal pipeline survey project is used as a case study to verify the feasibility of the conversion method and the practical value of the software, offering a useful reference for similar problems.
Keywords: pipeline survey deliverables; MDB; data conversion; C#; SQL
Agri-Eval: Multi-Level Large Language Model Evaluation Benchmark for Agriculture
8
Authors: WANG Yaojun, GE Mingliang, XU Guowei, ZHANG Qiyu, BIE Yuhui. 《农业机械学报》 (PKU Core), 2026, Issue 1, pp. 290-299 (10 pages)
Model evaluation using benchmark datasets is an important way to measure the capability of large language models (LLMs) in specific domains, mainly assessing their knowledge and reasoning abilities. To better assess LLM capability in the agricultural domain, Agri-Eval is proposed as a benchmark for evaluating the agricultural knowledge and reasoning ability of LLMs. Its evaluation dataset covers seven major disciplines of the agricultural domain — crop science, horticulture, plant protection, animal husbandry, forest science, aquaculture science, and grass science — and contains a total of 2,283 questions. Among domestic general-purpose LLMs, DeepSeek R1 performed best with an accuracy of 75.49%; among international general-purpose LLMs, Gemini 2.0 Pro Exp 0205 was the top performer at 74.28%. As an agricultural vertical LLM, Shennong V2.0 outperformed all domestic LLMs, with an agricultural-knowledge answer accuracy exceeding that of all existing general-purpose LLMs. Agri-Eval helps LLM developers comprehensively evaluate model capability in agriculture through a variety of tasks and tests, promoting the development of LLMs in the agricultural field.
Keywords: large language models; assessment systems; agricultural knowledge; agricultural datasets
LinguTimeX: A Framework for Multilingual CTC Detection Using Explainable AI and Natural Language Processing
9
Authors: Omar Darwish, Shorouq Al-Eidi, Abdallah Al-Shorman, Majdi Maabreh, Anas Alsobeh, Plamen Zahariev, Yahya Tashtoush. 《Computers, Materials & Continua》, 2026, Issue 1, pp. 2231-2251 (21 pages)
Covert timing channels (CTC) exploit network resources to establish hidden communication pathways, posing significant risks to data security and policy compliance. Detecting such hidden and dangerous threats therefore remains one of the open security challenges. This paper proposes LinguTimeX, a new framework that combines natural language processing with artificial intelligence, along with explainable Artificial Intelligence (AI), not only to detect CTCs but also to provide insight into the decision process. LinguTimeX performs multidimensional feature extraction by fusing linguistic attributes with temporal network patterns to identify covert channels precisely. It demonstrates strong effectiveness in detecting CTCs across multiple languages, namely English, Arabic, and Chinese. Specifically, the LSTM and RNN models achieved F1-scores of 90% on the English dataset, 89% on the Arabic dataset, and 88% on the Chinese dataset, showcasing superior performance and an ability to generalize across languages regardless of the linguistic or cultural context of the data. In contrast, the DeepForest model produced F1-scores ranging from 86% to 87% across the same datasets, further confirming its effectiveness in CTC detection. Although other algorithms also showed reasonable accuracy, the LSTM and RNN models consistently outperformed them in multilingual settings, suggesting that deep learning models may be better suited to this particular problem.
Keywords: Arabic language; Chinese language; covert timing channel; cybersecurity; deep learning; English language; language processing; machine learning
Design of a SQL Agent Application for Ship Design Management
10
Authors: 周泽鹏, 董柏廷, 陈刚, 张延昌, 袁飞晖, 李思远. 《无线互联科技》, 2026, Issue 6, pp. 77-84 (8 pages)
Ship design management stores its data in many different table and form formats, which scatters design data, makes process status hard to track, and complicates cross-discipline information extraction. To address these bottlenecks, this paper designs an application scheme for a Structured Query Language agent (SQL Agent) system in the ship design management setting. For background-knowledge management, question-answer pairs and basic form information are stored in a vector database so that background knowledge relevant to an input question can be retrieved quickly and injected into the SQL Agent's system prompt, assisting its subsequent tool selection and information generation. The SQL Agent's execution process is designed around business-flow management, with key business nodes made explicit to improve the reliability and accuracy of SQL generation. SQL Agent systems are implemented on both the Vanna and LangGraph development frameworks, realizing natural-language data querying for ship design scenarios, and their performance is compared and analyzed, providing a technical reference for further engineering applications.
Keywords: vector database; SQL Agent; ship design management; workflow management
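The background-knowledge step described here — retrieve the stored Q&A pair most similar to the user's question and inject it into the system prompt — can be sketched with a bag-of-words cosine similarity standing in for the vector database's embedding search. The stored pairs, table names, and prompt wording are invented examples.

```python
import math
from collections import Counter

def cos(a: str, b: str) -> float:
    """Cosine similarity over word counts; a crude proxy for embedding search."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    na = math.sqrt(sum(v * v for v in va.values()))
    nb = math.sqrt(sum(v * v for v in vb.values()))
    return dot / (na * nb) if na and nb else 0.0

# Invented Q&A pairs playing the role of the vector database's contents
KNOWLEDGE = [
    ("Which table stores hull section weights?", "Table hull_sections, column weight_t."),
    ("How is design review status tracked?", "Table design_reviews, column status."),
]

def system_prompt(question: str) -> str:
    best_q, best_a = max(KNOWLEDGE, key=lambda qa: cos(question, qa[0]))
    return (f"You translate ship-design questions into SQL.\n"
            f"Relevant background: Q: {best_q} A: {best_a}\n"
            f"User question: {question}")

print(system_prompt("What is the review status of the design?"))
```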
Connecting Hearts, Not Just Dots: A Lao Language Editor's Perspective on China-Laos Relations
11
Authors: Yang Chunmei. 《China Report ASEAN》, 2026, Issue 1, pp. 41-43 (3 pages)
As an ordinary Yunnan local, I never imagined becoming so closely connected to the exotic land of Laos. The luckiest event of my life was probably my choice to tick a box on a 2007 college entrance examination application form, indicating my willingness to enroll in a major other than my preference, which led me into the world of the Lao language.
Keywords: education; China; language learning; personal experience; relations; Lao language; Laos
CIT-Rec: Enhancing Sequential Recommendation System with Large Language Models
12
Authors: Ziyu Li, Zhen Chen, Xuejing Fu, Tong Mo, Weiping Li. 《Computers, Materials & Continua》, 2026, Issue 3, pp. 2328-2343 (16 pages)
Recommendation systems are key to boosting user engagement, satisfaction, and retention, particularly on media platforms where personalized content is vital. Sequential recommendation systems learn from user-item interactions to predict future items of interest. However, many current methods rely on unique user and item IDs, limiting their ability to represent users and items effectively, especially in zero-shot learning scenarios where training data is scarce. With the rapid development of large language models (LLMs), researchers are exploring their potential to enhance recommendation systems. However, there is a semantic gap between the linguistic semantics of LLMs and the collaborative semantics of recommendation systems, where items are typically indexed by IDs; moreover, most research focuses on item representations while neglecting personalized user modeling. To address these issues, we propose CIT-Rec, a sequential recommendation framework using LLMs that integrates Collaborative semantics for user representation and Image and Text information for item representation to enhance Recommendations. Specifically, by aligning intuitive image information with text containing semantic features, items can be represented more accurately, improving item-representation quality. We focus not only on item representations but also on user representations: to capture users' personalized preferences more precisely, traditional sequential recommendation models are trained on users' historical interaction data, effectively capturing behavioral patterns. Finally, by combining LLMs with traditional sequential recommendation models, the LLM understands linguistic semantics while also capturing collaborative semantics. Extensive evaluations on real-world datasets show that our model outperforms baseline methods, effectively combining user interaction history with item visual and textual modalities to provide personalized recommendations.
Keywords: large language models; vision language models; sequential recommendation; instruction tuning
PROMPTx-PE: Adaptive Optimization of Prompt Engineering Strategies for Accuracy and Robustness in Large Language Models
13
Authors: Talha Farooq Khan, Fahad Ali, Majid Hussain, Lal Khan, Hsien-Tsung Chang. 《Computers, Materials & Continua》, 2026, Issue 5, pp. 685-715 (31 pages)
The rapid growth in applications of large language models (LLMs) underscores the need for adaptive and efficient prompt engineering strategies. Existing methods are often not flexible, robust, or efficient across different domains. This study introduces a prompt optimization framework, PROMPTx-PE, designed to deliver greater precision and robustness on LLM-based tasks. The proposed system features a prompt selection scheme informed by reinforcement learning, a contextual layer, and a dynamic weighting module regulated by Lyapunov-based stability guidelines. PROMPTx-PE dynamically balances exploration and exploitation of the prompt space based on real-time feedback and multi-objective reward shaping. Extensive testing on both general benchmarks (GLUE, SuperGLUE) and domain-specific data (Healthcare-QA and Industrial-NER) demonstrates a best performance of 89.4% and strong robustness with under 3% computational overhead. The results confirm the effectiveness, consistency, and scalability of PROMPTx-PE as a platform for adaptive prompt engineering in current LLM applications.
Keywords: prompt engineering; large language models; adaptive optimization; robustness; multi-objective optimization; reinforcement learning; natural language processing
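The exploration/exploitation balance over prompt strategies that this abstract describes can be illustrated with a simple epsilon-greedy bandit. The strategy names, accuracies, and simulated reward model are entirely our assumptions, standing in for PROMPTx-PE's reinforcement-learning selection scheme.

```python
import random

random.seed(1)
STRATEGIES = ["zero-shot", "few-shot", "chain-of-thought"]
TRUE_ACCURACY = {"zero-shot": 0.55, "few-shot": 0.70, "chain-of-thought": 0.80}
stats = {s: [0, 0.0] for s in STRATEGIES}  # pulls, cumulative reward

def pick(eps: float = 0.1) -> str:
    # explore with probability eps (or while nothing has been tried yet),
    # otherwise exploit the strategy with the best observed success rate
    if random.random() < eps or all(n == 0 for n, _ in stats.values()):
        return random.choice(STRATEGIES)
    return max(stats, key=lambda s: stats[s][1] / stats[s][0] if stats[s][0] else 0.0)

for _ in range(1000):
    s = pick()
    reward = 1.0 if random.random() < TRUE_ACCURACY[s] else 0.0  # simulated task success
    stats[s][0] += 1
    stats[s][1] += reward

best = max(stats, key=lambda s: stats[s][0])
print(best)  # pulls concentrate on whichever strategy looks best
```

The greedy arm accumulates the overwhelming share of pulls, which is the basic mechanism any RL-driven prompt selector refines with richer state and reward signals.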
Clinical decision and prescription generation for diarrhea in traditional Chinese medicine based on large language model
14
作者 Jiaze Wu Hao Liang +2 位作者 Haoran Dai Hongliang Rui Baoli Liu 《Digital Chinese Medicine》 2026年第1期13-30,共18页
Objective To develop a clinical decision and prescription generation system(CDPGS)specifically for diarrhea in traditional Chinese medicine(TCM),utilizing a specialized large language model(LLM),Qwen-TCM-Dia,to standa... Objective To develop a clinical decision and prescription generation system(CDPGS)specifically for diarrhea in traditional Chinese medicine(TCM),utilizing a specialized large language model(LLM),Qwen-TCM-Dia,to standardize diagnostic processes and prescription generation.Methods Two primary datasets were constructed:an evaluation benchmark and a fine-tuning dataset consisting of fundamental diarrhea knowledge,medical records,and chain-ofthought(CoT)reasoning datasets.After an initial evaluation of 16 open-source LLMs across inference time,accuracy,and output quality,Qwen2.5 was selected as the base model due to its superior overall performance.We then employed a two-stage low-rank adaptation(LoRA)fine-tuning strategy,integrating continued pre-training on domain-specific knowledge with instruction fine-tuning using CoT-enriched medical records.This approach was designed to embed the clinical logic(symptoms→pathogenesis→therapeutic principles→prescriptions)into the model’s reasoning capabilities.The resulting fine-tuned model,specialized for TCM diarrhea,was designated as Qwen-TCM-Dia.Model performance was evaluated for disease diagnosis and syndrome type differentiation using accuracy,precision,recall,and F1-score.Furthermore,the quality of the generated prescriptions was compared with that of established open-source TCM LLMs.Results Qwen-TCM-Dia achieved peak performance compared to both the base Qwen2.5 model and five other open-source TCM LLMs.It achieved 97.05%accuracy and 91.48%F1-score in disease diagnosis,and 74.54%accuracy and 74.21%F1-score in syndrome type differentiation.Compared with existing open-source TCM LLMs(BianCang,HuangDi,LingDan,TCMLLM-PR,and ZhongJing),Qwen-TCM-Dia exhibited higher fidelity in reconstructing 
the“symptoms→pathogenesis→therapeutic principles→prescriptions”logic chain.It provided complete prescriptions,whereas other models often omitted dosages or generated mismatched prescriptions.Conclusion By integrating continued pre-training,CoT reasoning,and a two-stage fine-tuning strategy,this study establishes a CDPGS for diarrhea in TCM.The results demonstrate the synergistic effect of strengthening domain representation through pre-training and activating logical reasoning via CoT.This research not only provides critical technical support for the standardized diagnosis and treatment of diarrhea but also offers a scalable paradigm for the digital inheritance of expert TCM experience and the intelligent transformation of TCM. 展开更多
关键词 DIARRHEA Traditional Chinese medicine Large language model Clinical decision and prescription generation Natural language processing
暂未订购
Detection of Maliciously Disseminated Hate Speech in Spanish Using Fine-Tuning and In-Context Learning Techniques with Large Language Models
15
Authors: Tomás Bernal-Beltrán, Ronghao Pan, José Antonio García-Díaz, María del Pilar Salas-Zárate, Mario Andrés Paredes-Valverde, Rafael Valencia-García. 《Computers, Materials & Continua》, 2026, Issue 4, pp. 353-390 (38 pages)
The malicious dissemination of hate speech via compromised accounts, automated bot networks, and malware-driven social media campaigns has become a growing cybersecurity concern. Automatically detecting such content in Spanish is challenging due to linguistic complexity and the scarcity of annotated resources. In this paper, we compare two predominant AI-based approaches to the forensic detection of malicious hate speech: (1) fine-tuning encoder-only models trained on Spanish, and (2) in-context learning techniques (zero- and few-shot learning) with large-scale language models. Our approach goes beyond binary classification, proposing a comprehensive, multidimensional evaluation that labels each text by (1) type of speech, (2) recipient, (3) level of intensity (ordinal), and (4) targeted group (multi-label). Performance is evaluated on an annotated Spanish corpus using standard metrics such as precision, recall, and F1-score, together with stability-oriented metrics that assess the transition from zero-shot to few-shot prompting (Zero-to-Few-Shot Retention and Zero-to-Few-Shot Gain). The results indicate that fine-tuned encoder-only models (notably MarIA and BETO variants) consistently deliver the strongest and most reliable performance: in our experiments their macro F1-scores lie roughly in the range of 46%-66% depending on the task. Zero-shot approaches are much less stable and typically yield substantially lower performance (observed F1-scores range roughly 0%-39%), often producing invalid outputs in practice. Few-shot prompting (e.g., Qwen3 8B, Mistral 7B) generally improves stability and recall relative to pure zero-shot, bringing F1-scores into a moderate range of roughly 20%-51%, but still falls short of fully fine-tuned models. These findings highlight the importance of supervised adaptation and discuss the potential of both paradigms as components in AI-powered cybersecurity and malware forensics systems designed to identify and mitigate coordinated online hate campaigns.
Keywords: hate speech detection; malicious communication campaigns; AI-driven cybersecurity; social media analytics; large language models; prompt-tuning; fine-tuning; in-context learning; natural language processing
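The abstract introduces two stability metrics for the zero-shot to few-shot transition but does not give their formulas, so the definitions below are plausible guesses, clearly not the paper's: retention as the fraction of items solved zero-shot that stay solved few-shot, and gain as the fraction of items failed zero-shot that few-shot newly solves.

```python
def zfs_retention(zero_ok: list[bool], few_ok: list[bool]) -> float:
    """Assumed definition: share of zero-shot successes preserved by few-shot."""
    kept = sum(z and f for z, f in zip(zero_ok, few_ok))
    base = sum(zero_ok)
    return kept / base if base else 0.0

def zfs_gain(zero_ok: list[bool], few_ok: list[bool]) -> float:
    """Assumed definition: share of zero-shot failures newly solved by few-shot."""
    gained = sum((not z) and f for z, f in zip(zero_ok, few_ok))
    base = sum(not z for z in zero_ok)
    return gained / base if base else 0.0

zero = [True, True, False, False, False]  # per-item zero-shot correctness
few = [True, False, True, True, False]    # same items under few-shot prompting
print(zfs_retention(zero, few), zfs_gain(zero, few))  # 0.5 and 2/3
```

Separating retention from gain distinguishes a few-shot setup that adds new wins from one that merely trades old wins for new ones.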
Hepatitis C Patient Education: Large Language Models Show Promise in Disseminating Guidelines
16
Authors: Jinyan Chen, Ruijie Zhao, Chiyu He, Huigang Li, Yajie You, Zuyuan Lin, Ze Xiang, Jianyong Zhuo, Wei Shen, Zhihang Hu, Shusen Zheng, Xiao Xu, Di Lu. 《Journal of Clinical and Translational Hepatology》, 2026, Issue 1, pp. 116-119 (4 pages)
This study evaluated the accuracy, completeness, and comprehensibility of responses from mainstream large language models (LLMs) to hepatitis C virus (HCV)-related questions, aiming to assess their performance in addressing patient queries about the disease and lifestyle behaviors. The models selected were ChatGPT-4o, Gemini 2.0 Pro, Claude 3.5 Sonnet, and DeepSeek V3, with 12 questions chosen by two HCV experts from the domains of prevention, diagnosis, and treatment.
Keywords: patient queries; disease; lifestyle behaviors; large language models (LLMs); guidelines; hepatitis C; accuracy; patient education; comprehensibility
Automating the Initial Development of Intent-Based Task-Oriented Dialog Systems Using Large Language Models: Experiences and Challenges
17
Authors: Ksenia Kharitonova, David Pérez-Fernández, Zoraida Callejas, David Griol. 《Computers, Materials & Continua》, 2026, Issue 5, pp. 1021-1062 (42 pages)
Building reliable intent-based, task-oriented dialog systems typically requires substantial manual effort: designers must derive intents, entities, responses, and control logic from raw conversational data, then iterate until the assistant behaves consistently. This paper investigates how far large language models (LLMs) can automate this development. We use two reference corpora, Let's Go (English, public transport) and MEDIA (French, hotel booking), to prompt four LLM families (GPT-4o, Claude, Gemini, Mistral Small) and generate the core specifications required by the Rasa platform: intent sets with example utterances, entity definitions with slot mappings, response templates, and basic dialog flows. To structure this process, we introduce a model- and platform-agnostic pipeline with two phases. The first normalizes and validates LLM-generated artifacts, enforcing cross-file consistency and making slot usage explicit. The second uses a lightweight dialog harness that runs scripted tests and incrementally patches failure points until conversations complete reliably. Across eight projects, all models required some targeted repairs before training. After applying our pipeline, all reached ≥70% task completion (many above 84%), while NLU performance ranged from mid-0.6 to 1.0 macro-F1 depending on domain breadth. These results show that, with modest guidance, current LLMs can produce workable end-to-end dialog prototypes directly from raw transcripts. Our main contributions are: (i) a reusable bootstrap method aligned with industry domain-specific languages (DSLs), (ii) a small set of high-impact corrective patterns, and (iii) a simple but effective harness for closed-loop refinement across conversational platforms.
Keywords: task-oriented dialog systems; large language models (LLMs); Rasa; dialog automation; natural language understanding (NLU); slot filling; conversational AI; human-in-the-loop NLP
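One concrete form the first phase's cross-file consistency checking could take: verify that every slot a response template references is declared among the generated entity definitions. The artifact contents and check below are invented for illustration; they are not the paper's actual validation rules or Rasa's file format.

```python
import re

entities = {"city", "date"}  # hypothetical LLM-generated entity definitions
responses = {
    "utter_confirm": "Booking a hotel in {city} on {date}.",
    "utter_price": "The rate for {room_type} is unavailable.",  # undeclared slot
}

def undeclared_slots(responses, entities):
    """Return (template name, slot) pairs whose slot has no entity definition."""
    problems = []
    for name, template in responses.items():
        for slot in re.findall(r"\{(\w+)\}", template):
            if slot not in entities:
                problems.append((name, slot))
    return problems

print(undeclared_slots(responses, entities))  # [('utter_price', 'room_type')]
```

Catching such mismatches before training is what the abstract means by "targeted repairs": a template referencing an undefined slot would otherwise fail only at runtime.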
Semantic Causality Evaluation of Correlation Analysis Utilizing Large Language Models
18
Authors: Adam Dudáš. 《Computers, Materials & Continua》, 2026, Issue 5, pp. 2246-2269 (24 pages)
It is known that correlation does not imply causality. Some relationships identified in data analysis are coincidental or unknown, and some are produced by real-world causality of the situation; the problem is that these two scenarios need to be distinguished. Until recently, the proper (semantic) causality of a relationship could be determined only by human experts in the domain of the studied data. This has changed with the advance of large language models, which are often used as surrogates for such human experts, making the process automated and readily available to all data analysts. This motivates the main objective of this work: to introduce the design and implementation of a large-language-model-based semantic causality evaluator built on correlation analysis, together with its visual analysis model, the Causal heatmap. After the implementation itself, the model is evaluated on three fronts: the quality of the visual model, the quality of the LLM-based causal evaluation, and a comparative analysis. The results highlight the usability of large language models for this task and the potential of the proposed approach in the analysis of unknown datasets. The experimental evaluation demonstrates the usefulness of the Causal heatmap method, clearly highlighting interesting relationships while suppressing irrelevant ones.
Keywords: correlation; causality; correlation analysis; large language models; visualization
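The Causal-heatmap idea — pairwise correlations annotated with a semantic causality verdict — can be sketched as below. The Pearson computation is standard; the per-pair verdicts are a hard-coded stub standing in for the LLM call the paper uses, and the dataset is an invented toy example.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

data = {
    "temperature": [30, 32, 35, 31, 29],
    "ice_cream_sales": [60, 66, 74, 63, 57],
    "shark_sightings": [3, 4, 6, 4, 2],
}
llm_verdict = {("temperature", "ice_cream_sales"): "causal",
               ("temperature", "shark_sightings"): "causal",
               ("ice_cream_sales", "shark_sightings"): "coincidental"}  # stubbed LLM output

# each heatmap cell: variable pair, correlation strength, semantic verdict
cells = []
names = list(data)
for i, a in enumerate(names):
    for b in names[i + 1:]:
        cells.append((a, b, round(pearson(data[a], data[b]), 2), llm_verdict[(a, b)]))

for cell in cells:
    print(cell)
```

Rendering the correlation as cell color and the verdict as an overlay is what separates a pair that merely co-varies from one worth investigating causally.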
Preferences of Chinese Dermatologists for Large Language Model Responses in Clinical Psoriasis Scenarios: A Nationwide Cross-Sectional Survey in China
19
Authors: Jungang Yang, Jingkai Xu, Xuejiao Song, Chengxu Li, Lili Chen, Lingbo Bi, Tingting Jiang, Xianbo Zuo, Yong Cui. 《Health Care Science》, 2026, Issue 1, pp. 40-48 (9 pages)
Background: Large language models (LLMs) have shown considerable promise in supporting clinical decision-making; however, their adoption and evaluation in dermatology remain limited. This study aimed to explore the preferences of Chinese dermatologists regarding LLM-generated responses in clinical psoriasis scenarios and to assess how they prioritize key quality dimensions, including accuracy, traceability, and logicality. Methods: A cross-sectional, web-based survey was conducted between December 25, 2024, and January 22, 2025, following the Checklist for Reporting Results of Internet E-Surveys guidelines. A total of 1,247 valid responses were collected from practicing dermatologists across 33 of China's provincial-level administrative divisions. Participants evaluated responses to five categories of clinical questions (etiology, clinical presentation, differential diagnosis, treatment, and case study) generated by five LLMs: ChatGPT-4o, Kimi.ai, Doubao, ZuoYiGPT, and Lingyi-agent. Statistical associations between participant characteristics and model preferences were examined using chi-square tests. Results: ChatGPT-4o (Model 1) emerged as the most preferred model across all clinical tasks, consistently receiving the highest number of votes for case study (n=740), clinical presentation (n=666), differential diagnosis (n=707), etiology (n=602), and treatment (n=656). Significant variation in model preference by professional title was observed only for the differential diagnosis task (χ²=21.13, df=12, p=0.0485), while no significant differences were found across hospital tiers (p>0.05). Among the evaluation dimensions, accuracy was most frequently rated as "very important" (n=635). A significant association existed between hospital tier and the most valued dimension (χ²=27.667, df=9, p=0.0011), with dermatologists in primary hospitals prioritizing traceability more than their peers in higher-tier hospitals; no significant associations were found across professional titles (p=0.127). Conclusions: Chinese dermatologists show a strong preference for ChatGPT-4o over domestic LLMs in psoriasis-related clinical tasks. While accuracy remains the primary criterion, traceability and logicality are also critical, particularly for clinicians in lower-tier hospitals. These findings suggest that future clinical LLMs should prioritize not only content accuracy but also source transparency and structural clarity to meet the diverse needs of different clinical settings.
Keywords: dermatology; large language model; model evaluation
On the Evolutionary Logic of Chinese Culture's Integration Into Foreign Language Education in China: A Bibliometric Study of CSSCI Source Journals (1980-2025)
20
Authors: ZOU Yanqun. 《Sino-US English Teaching》, 2026, Issue 1, pp. 1-9 (9 pages)
This paper systematically reviews the development of research on integrating Chinese culture into foreign language education in China from the 1980s to 2025, dividing it into three stages — cultural attachment, cultural compensation, and cultural symbiosis — and reveals the research's logical shift from the dominance of target-language culture to the construction of the subjectivity of Chinese culture. Through quantitative and qualitative analysis of 435 CSSCI papers, three core themes are extracted: what to integrate, why to integrate, and how to integrate. The paper critically analyzes three pairs of contradictions: the imbalance between instrumentality and humanism, the separation of national narrative and individual expression, and the disconnection between traditional inheritance and modern transformation. It proposes that future research should reconstruct the educational logic based on the Chinese context, integrate the national and individual dimensions, and build a dialogue mechanism between tradition and modernity, so as to provide theoretical and practical reference for constructing a foreign language education system with Chinese characteristics.
Keywords: Chinese culture; foreign language education; cultural integration