Journal Articles
66 articles found
Optimizing Fine-Tuning in Quantized Language Models: An In-Depth Analysis of Key Variables
1
Authors: Ao Shen, Zhiquan Lai, Dongsheng Li, Xiaoyu Hu. Computers, Materials & Continua (SCIE, EI), 2025, No. 1, pp. 307-325 (19 pages)
Large-scale Language Models (LLMs) have achieved significant breakthroughs in Natural Language Processing (NLP), driven by the pre-training and fine-tuning paradigm. While this approach allows models to specialize in specific tasks with reduced training costs, the substantial memory requirements during fine-tuning present a barrier to broader deployment. Parameter-Efficient Fine-Tuning (PEFT) techniques, such as Low-Rank Adaptation (LoRA), and parameter quantization methods have emerged as solutions to address these challenges by optimizing memory usage and computational efficiency. Among these, QLoRA, which combines PEFT and quantization, has demonstrated notable success in reducing memory footprints during fine-tuning, prompting the development of various QLoRA variants. Despite these advancements, the quantitative impact of key variables on the fine-tuning performance of quantized LLMs remains underexplored. This study presents a comprehensive analysis of these key variables, focusing on their influence across different layer types and depths within LLM architectures. Our investigation uncovers several critical findings: (1) Larger layers, such as MLP layers, can maintain performance despite reductions in adapter rank, while smaller layers, like self-attention layers, are more sensitive to such changes; (2) The effectiveness of balancing factors depends more on specific values than on layer type or depth; (3) In quantization-aware fine-tuning, larger layers can effectively utilize smaller adapters, whereas smaller layers struggle to do so. These insights suggest that layer type is a more significant determinant of fine-tuning success than layer depth when optimizing quantized LLMs. Moreover, for the same reduction in trainable parameters, reducing the trainable parameters in a larger layer is more effective in preserving fine-tuning accuracy than in a smaller one. This study provides valuable guidance for more efficient fine-tuning strategies and opens avenues for further research into optimizing LLM fine-tuning in resource-constrained environments.
Keywords: Large-scale Language Model Parameter-Efficient fine-tuning parameter quantization key variable trainable parameters experimental analysis
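The finding that adapter rank can be cut more aggressively in large MLP layers than in self-attention layers maps directly onto how a QLoRA-style setup is configured. The sketch below is a minimal, illustrative example using Hugging Face transformers and peft; the checkpoint name, ranks, and per-layer rank pattern are assumptions for demonstration, not the configuration used in the paper.

```python
# Minimal QLoRA-style sketch: a 4-bit quantized base model plus LoRA adapters whose
# rank differs by layer type, reflecting the finding that large MLP projections
# tolerate a smaller rank than self-attention projections.
# The checkpoint name, ranks, and module names are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                       # parameter quantization of the frozen base model
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf", quantization_config=bnb_config)
base = prepare_model_for_kbit_training(base)

lora_config = LoraConfig(
    r=16,                                    # default adapter rank (used by attention projections)
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    rank_pattern={"gate_proj": 4, "up_proj": 4, "down_proj": 4},  # smaller rank for the MLP layers
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()           # inspect the trainable-parameter budget
```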
Optimizing Airline Review Sentiment Analysis:A Comparative Analysis of LLaMA and BERT Models through Fine-Tuning and Few-Shot Learning
2
Authors: Konstantinos I. Roumeliotis, Nikolaos D. Tselikas, Dimitrios K. Nasiopoulos. Computers, Materials & Continua, 2025, No. 2, pp. 2769-2792 (24 pages)
In the rapidly evolving landscape of natural language processing (NLP) and sentiment analysis, improving the accuracy and efficiency of sentiment classification models is crucial. This paper investigates the performance of two advanced models, the Large Language Model (LLM) LLaMA and the BERT NLP model, in the context of airline review sentiment analysis. Through fine-tuning, domain adaptation, and the application of few-shot learning, the study addresses the subtleties of sentiment expressions in airline-related text data. Employing predictive modeling and comparative analysis, the research evaluates the effectiveness of Large Language Model Meta AI (LLaMA) and Bidirectional Encoder Representations from Transformers (BERT) in capturing sentiment intricacies. Fine-tuning, including domain adaptation, enhances the models' performance in sentiment classification tasks. Additionally, the study explores the potential of few-shot learning to improve model generalization using minimal annotated data for targeted sentiment analysis. By conducting experiments on a diverse airline review dataset, the research quantifies the impact of fine-tuning, domain adaptation, and few-shot learning on model performance, providing valuable insights for industries aiming to predict recommendations and enhance customer satisfaction through a deeper understanding of sentiment in user-generated content (UGC). This research contributes to refining sentiment analysis models, ultimately fostering improved customer satisfaction in the airline industry.
Keywords: Sentiment classification review sentiment analysis user-generated content domain adaptation customer satisfaction LLaMA model BERT model airline reviews LLM classification fine-tuning
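For readers who want to reproduce the BERT branch of such a comparison, the following is a minimal fine-tuning sketch with the Hugging Face Trainer; the CSV files, column names, label set, and hyperparameters are illustrative assumptions rather than the paper's setup.

```python
# Minimal sketch of fine-tuning BERT for airline review sentiment classification.
# File names, column names ("review_text", "label"), label count, and hyperparameters
# are illustrative assumptions.
from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

data = load_dataset("csv", data_files={"train": "airline_reviews_train.csv",
                                       "test": "airline_reviews_test.csv"})
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

def tokenize(batch):
    # Each row is assumed to carry a "review_text" string and an integer "label".
    return tokenizer(batch["review_text"], truncation=True, max_length=256)

data = data.map(tokenize, batched=True)
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=3)       # e.g., negative / neutral / positive

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="bert-airline-sentiment",
                           num_train_epochs=3,
                           per_device_train_batch_size=16),
    train_dataset=data["train"],
    eval_dataset=data["test"],
    tokenizer=tokenizer,
)
trainer.train()
print(trainer.evaluate())
```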
An Analytical Review of Large Language Models Leveraging KDGI Fine-Tuning, Quantum Embeddings, and Multimodal Architectures
3
Authors: Uddagiri Sirisha, Chanumolu Kiran Kumar, Revathi Durgam, Poluru Eswaraiah, G Muni Nagamani. Computers, Materials & Continua, 2025, No. 6, pp. 4031-4059 (29 pages)
A complete examination of Large Language Models' strengths, problems, and applications is needed due to their rising use across disciplines. Current studies frequently focus on single-use situations and lack a comprehensive understanding of LLM architectural performance, strengths, and weaknesses. This gap precludes finding the appropriate models for task-specific applications and limits awareness of emerging LLM optimization and deployment strategies. In this research, 50 studies on 25+ LLMs, including GPT-3, GPT-4, Claude 3.5, DeepKet, and hybrid multimodal frameworks like ContextDET and GeoRSCLIP, are thoroughly reviewed. We propose an LLM application taxonomy by grouping techniques by task focus: healthcare, chemistry, sentiment analysis, agent-based simulations, and multimodal integration. Advanced methods like parameter-efficient tuning (LoRA), quantum-enhanced embeddings (DeepKet), retrieval-augmented generation (RAG), and safety-focused models (GalaxyGPT) are evaluated for dataset requirements, computational efficiency, and performance measures. Frameworks for ethical issues, data-limited hallucinations, and KDGI-enhanced fine-tuning like Woodpecker's post-remedy corrections are highlighted. The investigation's scope and methods are described, but the primary results are not. The work reveals that domain-specialized fine-tuned LLMs employing RAG and quantum-enhanced embeddings perform better for context-heavy applications. In medical text normalization, ChatGPT-4 outperforms previous models, while multimodal frameworks such as GeoRSCLIP improve remote sensing performance. Parameter-efficient tuning technologies like LoRA have minimal computing cost and similar performance, demonstrating the necessity for adaptive models in multiple domains. The review aims to discover the optimum domain-specific models, explain domain-specific fine-tuning, and present quantum and multimodal LLMs to address scalability and cross-domain issues. The framework helps academics and practitioners identify, adapt, and innovate LLMs for different purposes. This work advances the field of efficient, interpretable, and ethical LLM application research.
Keywords: Large language models quantum embeddings fine-tuning techniques multimodal architectures ethical AI scenarios
Fine-tuning a large language model for automating computational fluid dynamics simulations
4
Authors: Zhehao Dong, Zhen Lu, Yue Yang. Theoretical & Applied Mechanics Letters, 2025, No. 3, pp. 219-225 (7 pages)
Configuring computational fluid dynamics (CFD) simulations typically demands extensive domain expertise, limiting broader access. Although large language models (LLMs) have advanced scientific computing, their use in automating CFD workflows is underdeveloped. We introduce a novel approach centered on domain-specific LLM adaptation. Fine-tuning Qwen2.5-7B-Instruct on NL2FOAM, our custom dataset of 28,716 natural language-to-OpenFOAM configuration pairs with chain-of-thought (CoT) annotations, enables direct translation from natural language descriptions to executable CFD setups. A multi-agent system orchestrates the process, autonomously verifying inputs, generating configurations, running simulations, and correcting errors. Evaluation on a benchmark of 21 diverse flow cases demonstrates state-of-the-art performance, achieving 88.7% solution accuracy and an 82.6% first-attempt success rate. This significantly outperforms larger general-purpose models such as Qwen2.5-72B-Instruct, DeepSeek-R1, and Llama3.3-70B-Instruct, while also requiring fewer correction iterations and maintaining high computational efficiency. The results highlight the critical role of domain-specific adaptation in deploying LLM assistants for complex engineering workflows. Our code and fine-tuned model have been deposited at https://github.com/YYgroup/AutoCFD.
Keywords: Large language models fine-tuning Computational fluid dynamics Automated CFD Multi-agent system
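To make the data side of this pipeline concrete, the snippet below sketches what one NL2FOAM-style training record could look like: a natural-language case description paired with chain-of-thought reasoning and an OpenFOAM configuration fragment, serialized as a chat-format JSONL line. The field names, the lid-driven cavity example, and the configuration text are illustrative guesses, not the released dataset's actual schema.

```python
# Illustrative sketch of one instruction-tuning record in the spirit of NL2FOAM:
# a natural-language case description paired with chain-of-thought reasoning and an
# OpenFOAM configuration snippet. Field names and contents are assumptions, not the
# dataset's actual schema.
import json

record = {
    "messages": [
        {"role": "user",
         "content": "Set up an incompressible, steady-state simulation of laminar "
                    "flow in a 2D lid-driven cavity at Re = 100."},
        {"role": "assistant",
         "content": (
             "<think>Steady incompressible laminar flow -> simpleFoam. "
             "Re = U*L/nu, so with U = 1 m/s and L = 0.1 m choose nu = 1e-3 m2/s.</think>\n"
             "// constant/transportProperties\n"
             "transportModel  Newtonian;\n"
             "nu              [0 2 -1 0 0 0 0] 1e-3;\n"
             "// system/controlDict\n"
             "application     simpleFoam;\n"
             "endTime         1000;\n"
         )},
    ]
}

with open("nl2foam_example.jsonl", "w") as f:
    f.write(json.dumps(record) + "\n")   # one JSON object per line, as SFT tooling expects
```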
Fine-tuning electronic structure of N-doped graphitic carbon-supported Co- and Fe-incorporated Mo_(2)C to achieve ultrahigh electrochemical water oxidation activity (Cited by 2)
5
Authors: Md. Selim Arif Sher Shah, Hyeonjung Jung, Vinod K. Paidi, Kug-Seung Lee, Jeong Woo Han, Jong Hyeok Park. Carbon Energy (SCIE, EI, CAS, CSCD), 2024, No. 7, pp. 134-149 (16 pages)
Mo_(2)C is an excellent electrocatalyst for the hydrogen evolution reaction (HER). However, Mo_(2)C is a poor electrocatalyst for the oxygen evolution reaction (OER). Herein, two different elements, namely Co and Fe, are incorporated in Mo_(2)C, which therefore has a finely tuned electronic structure that is not achievable by incorporation of either metal alone. Consequently, the resulting electrocatalyst Co_(0.8)Fe_(0.2)-Mo_(2)C-80 displayed excellent OER catalytic performance, evidenced by a low overpotential of 214.0 (and 246.5) mV to attain a current density of 10 (and 50) mA cm^(-2), an ultralow Tafel slope of 38.4 mV dec^(-1), and long-term stability in alkaline medium. Theoretical data demonstrate that Co_(0.8)Fe_(0.2)-Mo_(2)C-80 requires the lowest overpotential (1.00 V) for OER and that the Co centers are the active sites. The ultrahigh catalytic performance of the electrocatalyst is attributed to its excellent intrinsic catalytic activity arising from the high Brunauer-Emmett-Teller specific surface area, large electrochemically active surface area, small Tafel slope, and low charge-transfer resistance.
Keywords: fine-tuning electronic structures heteronanostructures Mo_(2)C multimetal(Co/Fe) oxygen evolution reaction
Comparing Fine-Tuning, Zero and Few-Shot Strategies with Large Language Models in Hate Speech Detection in English
6
Authors: Ronghao Pan, José Antonio García-Díaz, Rafael Valencia-García. Computer Modeling in Engineering & Sciences (SCIE, EI), 2024, No. 9, pp. 2849-2868 (20 pages)
Large Language Models (LLMs) are increasingly demonstrating their ability to understand natural language and solve complex tasks, especially through text generation. One of the relevant capabilities is contextual learning, which involves the ability to receive instructions in natural language or task demonstrations to generate expected outputs for test instances without the need for additional training or gradient updates. In recent years, the popularity of social networking has provided a medium through which some users can engage in offensive and harmful online behavior. In this study, we investigate the ability of different LLMs, ranging from zero-shot and few-shot learning to fine-tuning. Our experiments show that LLMs can identify sexist and hateful online texts using zero-shot and few-shot approaches through information retrieval. Furthermore, it is found that the Zephyr model achieves the best results with the fine-tuning approach, scoring 86.811% on the Explainable Detection of Online Sexism (EDOS) test set and 57.453% on the Multilingual Detection of Hate Speech Against Immigrants and Women in Twitter (HatEval) test set. Finally, it is confirmed that the evaluated models perform well in hate text detection, as they beat the best result on the HatEval task leaderboard. The error analysis shows that contextual learning had difficulty distinguishing between types of hate speech and figurative language. However, the fine-tuned approach tends to produce many false positives.
Keywords: Hate speech detection zero-shot few-shot fine-tuning natural language processing
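The few-shot setting described above boils down to prepending a handful of labelled demonstrations to each test instance before querying the model. A minimal sketch follows; the model checkpoint, demonstrations, and label set are illustrative assumptions, not the paper's prompts.

```python
# Minimal few-shot classification sketch: k labelled demonstrations are prepended to
# the test sentence and a generative LLM completes the label. The checkpoint,
# demonstrations, and labels are illustrative assumptions.
from transformers import pipeline

generator = pipeline("text-generation", model="HuggingFaceH4/zephyr-7b-beta")

demonstrations = [
    ("Women shouldn't be allowed to drive.", "sexist"),
    ("The weather is lovely today.", "not sexist"),
]

def classify(text: str) -> str:
    prompt = "Label each sentence as 'sexist' or 'not sexist'.\n\n"
    for sentence, label in demonstrations:            # the few-shot demonstrations
        prompt += f"Sentence: {sentence}\nLabel: {label}\n\n"
    prompt += f"Sentence: {text}\nLabel:"
    completion = generator(prompt, max_new_tokens=5, do_sample=False,
                           return_full_text=False)[0]["generated_text"]
    return completion.strip().splitlines()[0]

print(classify("Girls are too emotional to be good engineers."))
```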
Optimizing Enterprise Conversational AI: Accelerating Response Accuracy with Custom Dataset Fine-Tuning
7
Author: Yash Kishore. Intelligent Information Management, 2024, No. 2, pp. 65-76 (12 pages)
As the realm of enterprise-level conversational AI continues to evolve, it becomes evident that while generalized Large Language Models (LLMs) like GPT-3.5 bring remarkable capabilities, they also bring forth formidable challenges. These models, honed on vast and diverse datasets, have undoubtedly pushed the boundaries of natural language understanding and generation. However, they often stumble when faced with the intricate demands of nuanced enterprise applications. This research advocates for a strategic paradigm shift, urging enterprises to embrace a fine-tuning approach as a means to optimize conversational AI. While generalized LLMs are linguistic marvels, their inability to cater to the specific needs of businesses across various industries poses a critical challenge. This strategic shift involves empowering enterprises to seamlessly integrate their own datasets into LLMs, a process that extends beyond linguistic enhancement. The core concept of this approach centers on customization, enabling businesses to fine-tune the AI's functionality to fit precisely within their unique business landscapes. By immersing the LLM in industry-specific documents, customer interaction records, internal reports, and regulatory guidelines, the AI transcends its generic capabilities to become a sophisticated conversational partner aligned with the intricacies of the enterprise's domain. The transformative potential of this fine-tuning approach cannot be overstated. It enables a transition from a universal AI solution to a highly customizable tool. The AI evolves from being a linguistic powerhouse to a contextually aware, industry-savvy assistant. As a result, it not only responds with linguistic accuracy but also with depth, relevance, and resonance, significantly elevating user experiences and operational efficiency. In the subsequent sections, this paper delves into the intricacies of fine-tuning, exploring the multifaceted challenges and abundant opportunities it presents. It addresses the technical intricacies of data integration, ethical considerations surrounding data usage, and the broader implications for the future of enterprise AI. The journey embarked upon in this research holds the potential to redefine the role of conversational AI in enterprises, ushering in an era where AI becomes a dynamic, deeply relevant, and highly effective tool, empowering businesses to excel in an ever-evolving digital landscape.
Keywords: fine-tuning DATASET AI CONVERSATIONAL ENTERPRISE LLM
Rotary-scaling fine-tuning (RSFT) method for optimizing railway wheel profiles and its application to a locomotive (Cited by 13)
8
Authors: Yunguang Ye, Yayun Qi, Dachuan Shi, Yu Sun, Yichang Zhou, Markus Hecht. Railway Engineering Science, 2020, No. 2, pp. 160-183 (24 pages)
The existing multi-objective wheel profile optimization methods mainly consist of three sub-modules: (1) wheel profile generation, (2) multi-body dynamics simulation, and (3) an optimization algorithm. For the first module, a comparably conservative rotary-scaling fine-tuning (RSFT) method, which introduces two design variables and an empirical formula, is proposed to fine-tune the traditional wheel profiles for improving their engineering applicability. For the second module, for the TRAXX locomotives serving on the Blankenburg–Rubeland line, an optimization function representing the relationship between the wheel profile and the wheel–rail wear number is established based on a Kriging surrogate model (KSM). For the third module, a method combining the regression capability of KSM with the iterative computing power of particle swarm optimization (PSO) is proposed to quickly and reliably implement the task of optimizing wheel profiles. Finally, with the RSFT–KSM–PSO method, we propose two wear-resistant wheel profiles for the TRAXX locomotives serving on the Blankenburg–Rubeland line, namely S1002-S and S1002-M. The S1002-S profile minimizes the total wear number by 30%, while the S1002-M profile makes the wear distribution more uniform through a proper sacrifice of the tread wear number, and the total wear number is reduced by 21%. The quasi-static and hunting stability tests further demonstrate that the profile designed by the RSFT–KSM–PSO method is promising for practical engineering applications.
Keywords: Wheel profile optimization Wear reduction Rotary-scaling fine-tuning Particle swarm optimization Kriging surrogate model
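The KSM plus PSO combination can be illustrated with a small self-contained loop: sample the expensive objective, fit a Gaussian-process (Kriging) surrogate over the two RSFT design variables, and let a plain particle swarm search the surrogate. The objective below is a toy placeholder for the multi-body simulation, and the bounds and PSO settings are assumptions, not the paper's values.

```python
# Sketch of the KSM + PSO idea: a Gaussian-process (Kriging) surrogate of the wear
# number is fitted over two design variables, and a simple particle swarm searches it.
# The sampled objective, bounds, and PSO settings are placeholders.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)
bounds = np.array([[-1.0, 1.0], [0.9, 1.1]])         # rotation / scaling design variables (assumed)

def wear_number(x):                                   # toy stand-in for the MBS simulation output
    return (x[:, 0] - 0.2) ** 2 + 5 * (x[:, 1] - 1.02) ** 2

X_train = rng.uniform(bounds[:, 0], bounds[:, 1], size=(40, 2))
y_train = wear_number(X_train)
surrogate = GaussianProcessRegressor(kernel=RBF(length_scale=0.3)).fit(X_train, y_train)

# Plain particle swarm optimization over the surrogate.
n, iters, w, c1, c2 = 30, 100, 0.7, 1.5, 1.5
pos = rng.uniform(bounds[:, 0], bounds[:, 1], size=(n, 2))
vel = np.zeros_like(pos)
pbest, pbest_val = pos.copy(), surrogate.predict(pos)
gbest = pbest[np.argmin(pbest_val)]
for _ in range(iters):
    r1, r2 = rng.random((n, 2)), rng.random((n, 2))
    vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    pos = np.clip(pos + vel, bounds[:, 0], bounds[:, 1])
    val = surrogate.predict(pos)
    improved = val < pbest_val
    pbest[improved], pbest_val[improved] = pos[improved], val[improved]
    gbest = pbest[np.argmin(pbest_val)]

print("estimated optimal design variables:", gbest)
```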
Railway wheel profile fine-tuning system for profile recommendation (Cited by 3)
9
Authors: Yunguang Ye, Jonas Vuitton, Yu Sun, Markus Hecht. Railway Engineering Science, 2021, No. 1, pp. 74-93 (20 pages)
This paper develops a wheel profile fine-tuning system (WPFTS) that comprehensively considers the influence of wheel profile on wheel damage, vehicle stability, vehicle safety, and passenger comfort. WPFTS can recommend one or more optimized wheel profiles according to train operators' needs, e.g., reducing wheel wear, mitigating the development of wheel out-of-roundness (OOR), or improving the shape stability of the wheel profile. Specifically, WPFTS includes four modules: (I) a wheel profile generation module based on the rotary-scaling fine-tuning (RSFT) method; (II) a multi-objective generation module consisting of a rigid multi-body dynamics simulation (MBS) model, an analytical model, and a rigid–flexible MBS model, for generating 11 objectives related to wheel damage, vehicle stability, vehicle safety, and passenger comfort; (III) a weight assignment module consisting of an adaptive weight assignment strategy and a manual weight assignment strategy; and (IV) an optimization module based on radial basis function (RBF) and particle swarm optimization (PSO). Finally, three cases are introduced to show how WPFTS recommends a wheel profile according to train operators' needs. Among them, a wheel profile with high shape stability, a wheel profile for mitigating the development of wheel OOR, and a wheel profile considering hunting stability and derailment safety are developed, respectively.
Keywords: Wheel profile fine-tuning system Optimization RECOMMENDATION WEAR Contact concentration index Multi-body dynamics simulation(MBS) Railway wheel
Fine-tuning of cortical progenitor proliferation by thalamic afferents
10
Authors: Katrin Gerstmann, Geraldine Zimmer. Neural Regeneration Research (SCIE, CAS, CSCD), 2015, No. 6, pp. 887-888 (2 pages)
During cerebral cortex neurogenesis, two major types of progenitors generate a variety of morphologically and functionally diverse projection neurons destined for the different cortical layers in non-gyrified mice. Radial glia cells (RGCs) undergo mitosis in the cortical ventricular zone and exhibit an apical-basal cell polarity, whereas non-polar intermediate progenitor cells (IPCs) divide basally in the subventricular zone (Franco and Muller, 2013; Taverna et al., 2014).
Keywords: Eph fine-tuning of cortical progenitor proliferation by thalamic afferents
New approach to assess sperm DNA fragmentation dynamics: Fine-tuning mathematical models
11
Authors: Isabel Ortiz, Jesus Dorado, Jane Morrell, Jaime Gosalvez, Francisco Crespo, Juan M. Jimenez, Manuel Hidalgo. Journal of Animal Science and Biotechnology (SCIE, CAS, CSCD), 2017, No. 3, pp. 592-600 (9 pages)
Background: Sperm DNA fragmentation (sDF) has been proved to be an important parameter in order to predict in vitro the potential fertility of a semen sample. Colloid centrifugation could be a suitable technique to select those donkey sperm more resistant to DNA fragmentation after thawing. Previous studies have shown that to elucidate the latent damage of the DNA molecule, sDF should be assessed dynamically, where the rate of fragmentation between treatments indicates how resistant the DNA is to iatrogenic damage. The rate of fragmentation is calculated using the slope of a linear regression equation. However, it has not been studied whether sDF dynamics fit this model. The objectives of this study were to evaluate the effect of different after-thawing centrifugation protocols on sperm DNA fragmentation and to elucidate the most accurate mathematical model (linear regression, exponential or polynomial) for DNA fragmentation over time in frozen-thawed donkey semen. Results: After submitting post-thaw semen samples to no centrifugation (UDC), sperm washing (SW) or single layer centrifugation (SLC) protocols, sDF values after 6 h of incubation were significantly lower in SLC samples than in SW or UDC. Coefficient of determination (R^2) values were significantly higher for a second-order polynomial model than for linear or exponential models. The highest values for acceleration of fragmentation (aSDF) were obtained for SW, followed by SLC and UDC. Conclusion: SLC after thawing seems to preserve longer DNA longevity in comparison to UDC and SW. Moreover, the fine-tuning of models has shown that sDF dynamics in frozen-thawed donkey semen fit a second-order polynomial model, which implies that the fragmentation rate is not constant and fragmentation acceleration must be taken into account to elucidate hidden damage in the DNA molecule.
Keywords: Colloid centrifugation Dynamics fine-tuning Mathematical models Sperm DNA fragmentation
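The model comparison reported above can be reproduced in a few lines: fit linear, exponential, and second-order polynomial curves to sDF measured over incubation time, compare R^2, and read the fragmentation acceleration off the quadratic coefficient. The sDF values below are made-up illustrative numbers, not the study's data.

```python
# Sketch of the model comparison: fit linear, exponential, and quadratic curves to
# sperm DNA fragmentation over incubation time and compare R^2.
# The sDF values are made-up illustrative numbers.
import numpy as np
from scipy.optimize import curve_fit

t = np.array([0.0, 1.0, 2.0, 4.0, 6.0])           # incubation time (h)
sdf = np.array([8.0, 9.5, 12.0, 19.5, 31.0])      # % fragmented sperm (illustrative)

def r2(y, y_hat):
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

lin = np.polyfit(t, sdf, 1)                                        # sDF = a*t + b
(a, b), _ = curve_fit(lambda x, a, b: a * np.exp(b * x), t, sdf,   # sDF = a*exp(b*t)
                      p0=(8.0, 0.2))
poly = np.polyfit(t, sdf, 2)                                       # sDF = a*t^2 + b*t + c

print("linear      R^2:", r2(sdf, np.polyval(lin, t)))
print("exponential R^2:", r2(sdf, a * np.exp(b * t)))
print("quadratic   R^2:", r2(sdf, np.polyval(poly, t)),
      " fragmentation acceleration aSDF =", 2 * poly[0])
```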
Fine-Tuning Bilateral Ties
12
Author: Ni Yanshuo. ChinAfrica, 2011, No. 2, pp. 14-17 (4 pages)
Chinese Vice Premier's visit to Africa continues to emphasize mutual cooperation, with a focus on agriculture. For many years, the Chinese Government has dispatched the minister of foreign affairs to Africa for the first official visit of a year. This year, however, that rule was broken when Hui Liangyu, Chinese Vice Premier, made the 14-day trip. On January 6-19, Hui paid official visits to Mauritius, Zambia, the Democratic Republic of Congo (DRC), Cameroon and Senegal, focusing on economic and agri-
Keywords: fine-tuning Bilateral Ties DRC
Decision-focused fine-tuning of time series foundation models for dispatchable feeder optimization
13
Authors: Maximilian Beichter, Nils Friederich, Janik Pinter, Dorina Werling, Kaleb Phipps, Sebastian Beichter, Oliver Neumann, Ralf Mikut, Veit Hagenmeyer, Benedikt Heidrich. Energy and AI, 2025, No. 3, pp. 466-479 (14 pages)
Time series foundation models provide a universal solution for generating forecasts to support optimization problems in energy systems. Those foundation models are typically trained in a prediction-focused manner to maximize forecast quality. In contrast, decision-focused learning directly improves the resulting value of the forecast in downstream optimization rather than merely maximizing forecasting quality. The practical integration of forecast values into forecasting models is challenging, particularly when addressing complex applications with diverse instances, such as buildings. This becomes even more complicated when instances possess specific characteristics that require instance-specific, tailored predictions to increase the forecast value. To tackle this challenge, we use decision-focused fine-tuning within time series foundation models to offer a scalable and efficient solution for decision-focused learning applied to the dispatchable feeder optimization problem. To obtain more robust predictions for scarce building data, we use Moirai as a state-of-the-art foundation model, which offers robust and generalized results with few-shot parameter-efficient fine-tuning. Comparing the decision-focused fine-tuned Moirai with a state-of-the-art classical prediction-focused fine-tuned Moirai, we observe an improvement of 9.45% in Average Daily Total Costs.
Keywords: Deep learning Decision-focused learning OPTIMIZATION Dispatchable feeder optimization Time series foundation models Parameter efficient fine-tuning
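The decision-focused idea can be sketched with a toy example: rather than training the forecaster on a pure error metric, the training loss is a differentiable surrogate of the downstream dispatch cost evaluated on the forecast. The linear forecaster, synthetic data, and cost terms below are placeholders, not the paper's Moirai model or its dispatchable-feeder formulation.

```python
# Toy sketch of decision-focused fine-tuning: back-propagate a differentiable surrogate
# of the downstream dispatch cost through the forecast instead of a pure error loss.
# The forecaster, data, and cost terms are placeholders.
import torch

torch.manual_seed(0)
history, horizon = 48, 24
forecaster = torch.nn.Linear(history, horizon)        # stand-in for a fine-tuned foundation model

def dispatch_cost(forecast, actual, price=0.3, penalty=1.0):
    # Energy is procured according to the forecast; shortfalls against the actual load are penalized.
    shortfall = torch.relu(actual - forecast)
    return (price * forecast + penalty * shortfall).sum(dim=-1).mean()

optimizer = torch.optim.Adam(forecaster.parameters(), lr=1e-3)
for step in range(200):
    x = torch.randn(32, history)                                   # synthetic load history
    y = x[:, -horizon:].abs() + 0.1 * torch.randn(32, horizon)     # synthetic future load
    loss = dispatch_cost(forecaster(x), y)                         # decision-focused objective
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print("final dispatch-cost surrogate:", float(loss))
```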
A Survey on Quality Evaluation of Instruction Fine-tuning Datasets for Large Language Models
14
Authors: Yitian Luo, Yu Liu, Lu Zhang, Feng Gao, Jinguang Gu. Data Intelligence, 2025, No. 3, pp. 527-566 (40 pages)
Instruction fine-tuning is a key method for adapting large language models (LLMs) to domain-specific tasks, and instruction quality significantly impacts model performance after fine-tuning. Hence, evaluating the quality of instructions and selecting high-quality instructions are essential steps in the process of LLM instruction fine-tuning. Although existing studies provide important theoretical foundations and techniques for this, there is still room for improvement in terms of generality, the relationship between methods, and experimental verification. Current methods for evaluating instruction quality can be classified into four main categories: human evaluation, statistics-based evaluation, model-based evaluation, and LLMs-based evaluation. Among these methods, human evaluation relies on the subjective judgment and domain expertise of the evaluators, which offers interpretability and is suitable for scenarios involving small-scale data and sufficient budgets. Statistics-based evaluation estimates the quality of instructions using indicators such as stopwords and lexical diversity, providing high efficiency and a suitable evaluation for large-scale data. Model-based evaluation employs specific models to quantify indicators such as perplexity (PPL) and instruction following difficulty (IFD), which is flexible and suitable for specific tasks. LLMs-based evaluation rates the quality of instructions through prompt-based interaction with LLMs, focusing on aspects such as accuracy and coherence, which is highly automated and customizable, simplifying the evaluation process. Finally, considering the limitations of current quality evaluation methods, some future research directions are proposed for improvement. These include refining instruction categories, extending evaluation indicators, enhancing human-AI interaction evaluation methods, applying agents in instruction quality evaluation, and developing a comprehensive evaluation framework.
Keywords: Large language models Instruction fine-tuning datasets Quality evaluation Human evaluation Statistics-based evaluation Model-based evaluation LLMs-based evaluation
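Two of the model-based indicators named above, perplexity and instruction following difficulty (IFD), can be sketched as a ratio of response losses with and without the instruction as context. The snippet below uses GPT-2 purely for illustration; the model choice, the example pair, and the exact IFD normalization are assumptions.

```python
# Sketch of two model-based quality indicators: response perplexity and IFD, taken here
# as the response loss conditioned on the instruction divided by the unconditioned
# response loss. Model choice and the example pair are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def response_loss(prefix: str, response: str) -> float:
    """Average token loss over the response, optionally conditioned on a prefix."""
    prefix_ids = tok(prefix, return_tensors="pt").input_ids if prefix else None
    resp_ids = tok(response, return_tensors="pt").input_ids
    input_ids = torch.cat([prefix_ids, resp_ids], dim=1) if prefix else resp_ids
    labels = input_ids.clone()
    if prefix:
        labels[:, :prefix_ids.shape[1]] = -100        # score only the response tokens
    with torch.no_grad():
        return float(model(input_ids, labels=labels).loss)

instruction = "Explain what parameter-efficient fine-tuning is in one sentence."
response = "It adapts a large model by training only a small set of extra parameters."

cond = response_loss(instruction + "\n", response)
uncond = response_loss("", response)
print("response perplexity:", torch.exp(torch.tensor(uncond)).item())
print("IFD score:", cond / uncond)                    # near 1 means the instruction helps little
```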
Few-shot exemplar-driven inpainting with parameter-efficient diffusion fine-tuning
15
Authors: Shiyuan YANG, Zheng GU, Wenyue HAO, Yi WANG, Huaiyu CAI, Xiaodong CHEN. Frontiers of Information Technology & Electronic Engineering, 2025, No. 8, pp. 1428-1440 (13 pages)
Text-to-image diffusion models have demonstrated impressive capabilities in image generation and have been effectively applied to image inpainting. While a text prompt provides intuitive guidance for conditional inpainting, users often seek the ability to inpaint a specific object with a customized appearance by providing an exemplar image. Unfortunately, existing methods struggle to achieve high fidelity in exemplar-driven inpainting. To address this, we use a plug-and-play low-rank adaptation (LoRA) module based on a pretrained text-driven inpainting model. The LoRA module is dedicated to learning the exemplar-specific concepts through few-shot fine-tuning, bringing improved fitting capability to customized exemplar images without intensive training on large-scale datasets. Additionally, we introduce GPT-4V prompting and prior noise initialization techniques to further improve the fidelity of the inpainting results. In brief, the denoising diffusion process first starts with the noise derived from a composite exemplar-background image, and is subsequently guided by an expressive prompt generated from the exemplar using the GPT-4V model. Extensive experiments demonstrate that our method achieves state-of-the-art performance, qualitatively and quantitatively, offering users an exemplar-driven inpainting tool with enhanced customization capability.
Keywords: Diffusion model Image inpainting Exemplar-driven Few-shot fine-tuning
Scientific Claim Recognition via Staged Fine-Tuning with LoRA
16
Authors: Xin Lin, Yajiao Wang, Zhixiong Zhang, Mengting Zhang. Data Intelligence, 2025, No. 2, pp. 303-335 (33 pages)
Scientific claims, which present propositions as facts, are fundamental to scientific knowledge. Despite their significance, current methods for scientific claim recognition are hindered by the scarcity of annotated datasets, particularly those covering full-text documents rather than just abstracts. To bridge this gap, this study aims to enhance scientific claim recognition by leveraging transfer learning through a staged fine-tuning approach. Specifically, we employ a large move prediction dataset (RCMR 280k) alongside the smaller SciClaim dataset we developed, to enhance our scientific claim recognition model's ability to distinguish between various types of scientific narratives and their roles within research papers. We converted the labeled sentences from both datasets into a question-answer format, aligning them with the fine-tuning requirements of large language models. During the fine-tuning process, we explore two distinct strategies for incorporating knowledge from previous phases. Results indicate that re-integrating the LoRA trained on the RCMR 280k dataset into the original model, followed by the creation of a new LoRA specifically for SciClaim training, produces the best outcomes. This staged fine-tuning approach efficiently adapts the model to the task of scientific claim recognition. Our model, SciClaim Miner, outperforms state-of-the-art approaches, achieving an F1-score of 90.96%. The ablation study demonstrates that both the dataset and prompt design, as well as the model training strategies, significantly enhance performance. This work advances scientific claim recognition by introducing a robust methodology that bridges the gap between limited data and effective model training.
Keywords: LoRA fine-tuning Scientific claim recognition Discourse predict
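The staged strategy that worked best (merge the stage-one LoRA into the base weights, then train a fresh LoRA on SciClaim) corresponds to a short peft recipe. The checkpoint and adapter paths below are placeholders, and the LoRA hyperparameters are illustrative.

```python
# Sketch of staged LoRA fine-tuning: merge a LoRA trained on the first stage back into
# the base model, then attach a fresh LoRA for the second-stage task.
# Model and adapter paths are placeholders.
from transformers import AutoModelForCausalLM
from peft import PeftModel, LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("base-llm")            # placeholder checkpoint
stage1 = PeftModel.from_pretrained(base, "lora-rcmr-280k")         # stage-1 adapter (placeholder path)
base = stage1.merge_and_unload()                                   # fold stage-1 knowledge into the weights

stage2_config = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                           target_modules=["q_proj", "v_proj"],
                           task_type="CAUSAL_LM")
model = get_peft_model(base, stage2_config)                        # fresh LoRA for the SciClaim stage
model.print_trainable_parameters()
# ...train `model` on the SciClaim question-answer records as usual...
```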
Fine-tuning growth in gold nanostructures from achiral 2D to chiral 3D geometries
17
Authors: Lili Tan, Zhi Chen, Chengyu Xiao, Zhiyong Geng, Yinran Jin, Chaoyang Wei, Fei Teng, Wenlong Fu, Peng-peng Wang. Nano Research (SCIE, EI, CSCD), 2024, No. 7, pp. 6654-6660 (7 pages)
Enriching the library of chiral plasmonic structures is of significant importance in advancing their applicability across diverse domains such as biosensing, nanophotonics, and catalysis. Here, employing triangle nanoplates as growth seeds, we synthesized a novel class of chiral-shaped plasmonic nanostructures through a wet chemical strategy with dipeptides as chiral inducers, including chiral tri-blade boomerangs, concave rhombic dodecahedrons, and nanoflowers. The structural diversity in chiral plasmonic nanostructures was elucidated through their continuous morphological evolution from two-dimensional to three-dimensional architectures. The fine-tuning of chiroptical properties was achieved by precisely manipulating crucial synthetic parameters, such as the amounts of chiral molecules, seeds, and gold precursor, that significantly influenced chiral structure formation. The findings provide a promising avenue for enriching chiral materials with highly sophisticated structures, facilitating a fundamental understanding of the relationship between structural nuances and chiroptical properties.
Keywords: plasmonic nanostructures geometric chirality circular dichroism fine-tuning
Fine-Tuning Channel-Pruned Deep Model via Knowledge Distillation
18
Authors: Chong Zhang, Hong-Zhi Wang, Hong-Wei Liu, Yi-Lin Chen. Journal of Computer Science & Technology (CSCD), 2024, No. 6, pp. 1238-1247 (10 pages)
Deep convolutional neural networks with high performance are hard to deploy in many real-world applications, since the computing resources of edge devices such as smart phones or embedded GPUs are limited. To alleviate this hardware limitation, the compression of deep neural networks from the model side becomes important. As one of the most popular methods in the spotlight, channel pruning of a deep convolutional model can effectively remove redundant convolutional channels from the CNN (convolutional neural network) without remarkably affecting the network's performance. Existing methods focus on pruning design, evaluating the importance of different convolutional filters in the CNN model. A fast and effective fine-tuning method to restore accuracy is urgently needed. In this paper, we propose a fine-tuning method, KDFT (Knowledge Distillation Based Fine-Tuning), which improves the accuracy of fine-tuned models with almost negligible training overhead by introducing knowledge distillation. Extensive experimental results on benchmark datasets with representative CNN models show that up to 4.86% accuracy improvement and 79% time saving can be obtained.
Keywords: model compression deep learning knowledge distillation fine-tuning
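The distillation component of such a fine-tuning scheme is typically a weighted sum of the usual cross-entropy loss and a temperature-softened KL term against the original unpruned model. The sketch below shows one training step in that style; the temperature and weighting are illustrative choices, not KDFT's reported settings.

```python
# Sketch of one distillation-based fine-tuning step for a channel-pruned student:
# cross-entropy on the hard labels plus a temperature-softened KL term against the
# original unpruned teacher. Temperature and weighting are illustrative.
import torch
import torch.nn.functional as F

def kd_fine_tune_step(student, teacher, images, labels, optimizer, T=4.0, alpha=0.7):
    teacher.eval()
    with torch.no_grad():
        teacher_logits = teacher(images)
    student_logits = student(images)
    ce = F.cross_entropy(student_logits, labels)                 # hard-label term
    kd = F.kl_div(F.log_softmax(student_logits / T, dim=1),      # soft-label term
                  F.softmax(teacher_logits / T, dim=1),
                  reduction="batchmean") * T * T
    loss = (1 - alpha) * ce + alpha * kd
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return float(loss)
```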
Mining and fine-tuning sugar uptake system for titer improvement of milbemycins in Streptomyces bingchenggensis (Cited by 2)
19
Authors: Pinjiao Jin, Shanshan Li, Yanyan Zhang, Liyang Chu, Hairong He, Zhuoxu Dong, Wensheng Xiang. Synthetic and Systems Biotechnology (SCIE), 2020, No. 3, pp. 214-221 (8 pages)
A dramatic decrease of sugar uptake is a general phenomenon in Streptomyces at stationary phase, when antibiotics are extensively produced. Milbemycins produced by Streptomyces bingchenggensis are a group of valuable macrolide biopesticides, while their low yield and titer impede broad applications in the agricultural field. Considering that inadequate sugar uptake generally hinders titer improvement of desired products, we mined the underlying sugar uptake systems and fine-tuned their expression in this work. First, we screened the candidates at both the genomic and transcriptomic level in S. bingchenggensis. Then, two ATP-binding cassette transporters named TP2 and TP5 were characterized to improve milbemycin titer and yield significantly. Next, appropriate native temporal promoters were selected and used to tune the expression of TP2 and TP5, resulting in a maximal milbemycin A3/A4 titer increase of 36.9% to 3321 mg/L. Finally, TP2 and TP5 were broadly fine-tuned in another two macrolide biopesticide producers, Streptomyces avermitilis and Streptomyces cyaneogriseus, leading to a maximal titer improvement of 34.1% and 52.6% for avermectin B1a and nemadectin, respectively. This work provides useful transporter tools and a corresponding engineering strategy for Streptomyces.
Keywords: STREPTOMYCES Sugar uptake system fine-tuning Titer improvement Milbemycins Macrolide biopesticides
Improving BERT Fine-Tuning via Self-Ensemble and Self-Distillation
20
Authors: 许一格, 邱锡鹏, 周浬皋, 黄萱菁. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2023, No. 4, pp. 853-866 (14 pages)
Fine-tuning pre-trained language models like BERT has become an effective way in natural language processing (NLP) and yields state-of-the-art results on many downstream tasks. Recent studies on adapting BERT to new tasks mainly focus on modifying the model structure, re-designing the pre-training tasks, and leveraging external data and knowledge. The fine-tuning strategy itself has yet to be fully explored. In this paper, we improve the fine-tuning of BERT with two effective mechanisms: self-ensemble and self-distillation. The self-ensemble mechanism utilizes the checkpoints from an experience pool to integrate the teacher model. In order to transfer knowledge from the teacher model to the student model efficiently, we further use knowledge distillation, which is called self-distillation because the distillation comes from the model itself through the time dimension. Experiments on the GLUE benchmark and the Text Classification benchmark show that our proposed approach can significantly improve the adaption of BERT without any external data or knowledge. We conduct exhaustive experiments to investigate the efficiency of the self-ensemble and self-distillation mechanisms, and our proposed approach achieves a new state-of-the-art result on the SNLI dataset.
Keywords: BERT deep learning fine-tuning natural language processing(NLP) pre-training model
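A simplified reading of the two mechanisms is: the teacher is a parameter average of recent student checkpoints (self-ensemble), and the student's loss gains a term pulling its logits toward that teacher (self-distillation). The sketch below follows that reading with a rolling checkpoint pool; the pool size, loss weight, and MSE choice are illustrative simplifications of the paper's method.

```python
# Simplified sketch of self-ensemble + self-distillation: the teacher's parameters are
# the average of recent student checkpoints, and the student is additionally pulled
# toward the teacher's logits. Pool size and loss weight are illustrative.
import copy
from collections import deque
import torch
import torch.nn.functional as F

def averaged_teacher(student, checkpoints):
    """Build a teacher whose float parameters are the mean of recent student state_dicts."""
    teacher = copy.deepcopy(student)
    avg = {}
    for name, tensor in checkpoints[0].items():
        if tensor.is_floating_point():
            avg[name] = torch.stack([sd[name] for sd in checkpoints]).mean(dim=0)
        else:
            avg[name] = tensor                        # integer buffers are copied as-is
    teacher.load_state_dict(avg)
    teacher.eval()
    return teacher

def train_step(student, checkpoints, batch, optimizer, lam=1.0):
    teacher = averaged_teacher(student, list(checkpoints))        # self-ensemble over the pool
    with torch.no_grad():
        teacher_logits = teacher(**batch).logits
    out = student(**batch)                                        # batch includes labels -> out.loss
    loss = out.loss + lam * F.mse_loss(out.logits, teacher_logits)  # task loss + self-distillation
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    checkpoints.append({k: v.detach().clone() for k, v in student.state_dict().items()})
    return float(loss)

# Usage: seed a rolling pool with the initial weights, e.g.
# checkpoints = deque([{k: v.clone() for k, v in model.state_dict().items()}], maxlen=3)
```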