Large-scale Language Models (LLMs) have achieved significant breakthroughs in Natural Language Processing (NLP), driven by the pre-training and fine-tuning paradigm. While this approach allows models to specialize in specific tasks with reduced training costs, the substantial memory requirements during fine-tuning present a barrier to broader deployment. Parameter-Efficient Fine-Tuning (PEFT) techniques, such as Low-Rank Adaptation (LoRA), and parameter quantization methods have emerged as solutions that optimize memory usage and computational efficiency. Among these, QLoRA, which combines PEFT and quantization, has demonstrated notable success in reducing memory footprints during fine-tuning, prompting the development of various QLoRA variants. Despite these advancements, the quantitative impact of key variables on the fine-tuning performance of quantized LLMs remains underexplored. This study presents a comprehensive analysis of these key variables, focusing on their influence across different layer types and depths within LLM architectures. Our investigation uncovers several critical findings: (1) larger layers, such as MLP layers, can maintain performance despite reductions in adapter rank, while smaller layers, like self-attention layers, are more sensitive to such changes; (2) the effectiveness of balancing factors depends more on the specific values chosen than on layer type or depth; (3) in quantization-aware fine-tuning, larger layers can effectively utilize smaller adapters, whereas smaller layers struggle to do so. These insights suggest that layer type is a more significant determinant of fine-tuning success than layer depth when optimizing quantized LLMs. Moreover, for the same reduction in trainable parameters, shrinking the adapter of a larger layer preserves fine-tuning accuracy better than shrinking that of a smaller one. This study provides valuable guidance for more efficient fine-tuning strategies and opens avenues for further research into optimizing LLM fine-tuning in resource-constrained environments.
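To make the layer-type finding concrete, the sketch below (a minimal, hypothetical setup, not the paper's code) uses Hugging Face peft to give the larger MLP projections a smaller LoRA rank than the self-attention projections on a 4-bit-quantized LLaMA-style base. The model name, module names, and `rank_pattern` support are assumptions about recent library versions.

```python
# A minimal sketch, assuming a LLaMA-style module layout and recent peft/transformers.
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # illustrative base model
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),  # QLoRA-style 4-bit base
)

config = LoraConfig(
    r=16,  # default rank, kept for the (smaller, more sensitive) attention projections
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    # Larger MLP layers tolerate a smaller rank, so shrink only those.
    rank_pattern={"gate_proj": 4, "up_proj": 4, "down_proj": 4},
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # inspect the resulting per-layer parameter budget
```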
In the rapidly evolving landscape of natural language processing (NLP) and sentiment analysis, improving the accuracy and efficiency of sentiment classification models is crucial. This paper investigates the performance of two advanced models, the Large Language Model (LLM) LLaMA and the NLP model BERT, in the context of airline review sentiment analysis. Through fine-tuning, domain adaptation, and the application of few-shot learning, the study addresses the subtleties of sentiment expressions in airline-related text data. Employing predictive modeling and comparative analysis, the research evaluates the effectiveness of Large Language Model Meta AI (LLaMA) and Bidirectional Encoder Representations from Transformers (BERT) in capturing sentiment intricacies. Fine-tuning, including domain adaptation, enhances the models' performance in sentiment classification tasks. Additionally, the study explores the potential of few-shot learning to improve model generalization using minimal annotated data for targeted sentiment analysis. By conducting experiments on a diverse airline review dataset, the research quantifies the impact of fine-tuning, domain adaptation, and few-shot learning on model performance, providing valuable insights for industries aiming to predict recommendations and enhance customer satisfaction through a deeper understanding of sentiment in user-generated content (UGC). This research contributes to refining sentiment analysis models, ultimately fostering improved customer satisfaction in the airline industry.
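The sketch below illustrates the kind of supervised fine-tuning described above, using BERT with Hugging Face transformers. The CSV file name, label scheme, and hyperparameters are illustrative assumptions, not the paper's setup.

```python
# A minimal sketch, assuming a CSV with `text` and `label` (0/1/2) columns.
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)
from datasets import load_dataset

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=3)  # negative / neutral / positive

ds = load_dataset("csv", data_files={"train": "airline_reviews.csv"})  # hypothetical file
ds = ds.map(lambda b: tok(b["text"], truncation=True, padding="max_length",
                          max_length=128), batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments("bert-airline", num_train_epochs=3,
                           per_device_train_batch_size=16),
    train_dataset=ds["train"],
)
trainer.train()
```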
A complete examination of Large Language Models' strengths, problems, and applications is needed due to their rising use across disciplines. Current studies frequently focus on single-use situations and lack a comprehensive understanding of LLM architectural performance, strengths, and weaknesses. This gap precludes finding the appropriate models for task-specific applications and limits awareness of emerging LLM optimization and deployment strategies. In this research, 50 studies on 25+ LLMs, including GPT-3, GPT-4, Claude 3.5, DeepKet, and hybrid multimodal frameworks like ContextDET and GeoRSCLIP, are thoroughly reviewed. We propose an LLM application taxonomy by grouping techniques by task focus: healthcare, chemistry, sentiment analysis, agent-based simulations, and multimodal integration. Advanced methods like parameter-efficient tuning (LoRA), quantum-enhanced embeddings (DeepKet), retrieval-augmented generation (RAG), and safety-focused models (GalaxyGPT) are evaluated for dataset requirements, computational efficiency, and performance measures. Frameworks for ethical issues, data-limited hallucinations, and KDGI-enhanced fine-tuning such as Woodpecker's post-remedy corrections are highlighted. The investigation's scope, aims, and methods are described, but the primary results are not. The work reveals that domain-specialized fine-tuned LLMs employing RAG and quantum-enhanced embeddings perform better for context-heavy applications. In medical text normalization, ChatGPT-4 outperforms previous models, while multimodal frameworks such as GeoRSCLIP improve remote sensing performance. Parameter-efficient tuning technologies like LoRA incur minimal computing cost at similar performance, demonstrating the necessity for adaptive models in multiple domains. The review aims to identify the optimum domain-specific models, explain domain-specific fine-tuning, and present quantum and multimodal LLMs to address scalability and cross-domain issues. The framework helps academics and practitioners identify, adapt, and innovate LLMs for different purposes. This work advances the field of efficient, interpretable, and ethical LLM application research.
Configuring computational fluid dynamics (CFD) simulations typically demands extensive domain expertise, limiting broader access. Although large language models (LLMs) have advanced scientific computing, their use in automating CFD workflows is underdeveloped. We introduce a novel approach centered on domain-specific LLM adaptation. We fine-tune Qwen2.5-7B-Instruct on NL2FOAM, our custom dataset of 28,716 natural-language-to-OpenFOAM configuration pairs with chain-of-thought (CoT) annotations, enabling direct translation from natural language descriptions to executable CFD setups. A multi-agent system orchestrates the process, autonomously verifying inputs, generating configurations, running simulations, and correcting errors. Evaluation on a benchmark of 21 diverse flow cases demonstrates state-of-the-art performance, achieving 88.7% solution accuracy and an 82.6% first-attempt success rate. This significantly outperforms larger general-purpose models such as Qwen2.5-72B-Instruct, DeepSeek-R1, and Llama3.3-70B-Instruct, while also requiring fewer correction iterations and maintaining high computational efficiency. The results highlight the critical role of domain-specific adaptation in deploying LLM assistants for complex engineering workflows. Our code and fine-tuned model have been deposited at https://github.com/YYgroup/AutoCFD.
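A schematic sketch of the verify-generate-run-correct loop described above is given below. All helper names are hypothetical placeholders for the paper's agents, not the released AutoCFD API; stubs are included only so the control flow is self-contained.

```python
# Schematic sketch of the multi-agent orchestration loop; helper names are hypothetical.
def verify_input(description: str) -> bool: ...        # input-verification agent (stub)
def llm_generate_case(description: str) -> str: ...    # fine-tuned LLM writes OpenFOAM files (stub)
def run_openfoam(case_dir: str): ...                   # launches the solver, returns (ok, log) (stub)
def llm_correct_case(case_dir: str, log: str) -> str: ...  # error-correction agent (stub)

def auto_cfd(description: str, max_iters: int = 5) -> str:
    if not verify_input(description):
        raise ValueError("under-specified problem statement")
    case_dir = llm_generate_case(description)
    for _ in range(max_iters):
        ok, log = run_openfoam(case_dir)
        if ok:
            return case_dir                         # executable configuration found
        case_dir = llm_correct_case(case_dir, log)  # feed solver errors back to the LLM
    raise RuntimeError("no runnable configuration within the iteration budget")
```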
Mo_(2)C is an excellent electrocatalyst for the hydrogen evolution reaction (HER). However, Mo_(2)C is a poor electrocatalyst for the oxygen evolution reaction (OER). Herein, two different elements, namely Co and Fe, are incorporated in Mo_(2)C, which therefore has a finely tuned electronic structure that is not achievable by incorporating either metal alone. Consequently, the resulting electrocatalyst Co_(0.8)Fe_(0.2)-Mo_(2)C-80 displayed excellent OER catalytic performance, evidenced by a low overpotential of 214.0 (and 246.5) mV to attain a current density of 10 (and 50) mA cm^(-2), an ultralow Tafel slope of 38.4 mV dec^(-1), and long-term stability in alkaline medium. Theoretical data demonstrate that Co_(0.8)Fe_(0.2)-Mo_(2)C-80 requires the lowest overpotential (1.00 V) for the OER and that the Co centers are the active sites. The ultrahigh catalytic performance of the electrocatalyst is attributed to its excellent intrinsic catalytic activity arising from the high Brunauer-Emmett-Teller specific surface area, large electrochemically active surface area, small Tafel slope, and low charge-transfer resistance.
Large Language Models (LLMs) are increasingly demonstrating their ability to understand natural language and solve complex tasks, especially through text generation. One of the relevant capabilities is contextual learning, which involves the ability to receive instructions in natural language or task demonstrations to generate expected outputs for test instances without the need for additional training or gradient updates. In recent years, the popularity of social networking has provided a medium through which some users can engage in offensive and harmful online behavior. In this study, we investigate the ability of different LLMs, ranging from zero-shot and few-shot learning to fine-tuning. Our experiments show that LLMs can identify sexist and hateful online texts using zero-shot and few-shot approaches through information retrieval. Furthermore, it is found that the Zephyr model achieves the best results with the fine-tuning approach, scoring 86.811% on the Explainable Detection of Online Sexism (EDOS) test set and 57.453% on the Multilingual Detection of Hate Speech Against Immigrants and Women in Twitter (HatEval) test set. Finally, it is confirmed that the evaluated models perform well in hate text detection, as they beat the best result on the HatEval task leaderboard. The error analysis shows that contextual learning had difficulty distinguishing between types of hate speech and figurative language. However, the fine-tuned approach tends to produce many false positives.
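The sketch below shows the zero-/few-shot pattern the study relies on: retrieved demonstrations are packed into the prompt and the model labels the test instance with no gradient updates. The model id is the public Zephyr checkpoint; the prompt wording and label set are illustrative assumptions.

```python
# A minimal sketch of few-shot in-context classification; prompt format is illustrative.
from transformers import pipeline

generator = pipeline("text-generation", model="HuggingFaceH4/zephyr-7b-beta")

def classify(text: str, shots: list[tuple[str, str]]) -> str:
    demos = "\n".join(f'Text: "{t}"\nLabel: {y}' for t, y in shots)
    prompt = (
        "Decide whether each text is 'sexist' or 'not sexist'.\n"
        f"{demos}\nText: \"{text}\"\nLabel:"
    )
    out = generator(prompt, max_new_tokens=3, do_sample=False)
    return out[0]["generated_text"][len(prompt):].strip()  # keep only the new label tokens

# Zero-shot is the same call with an empty demonstration list: classify(text, []).
```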
As the realm of enterprise-level conversational AI continues to evolve, it becomes evident that while generalized Large Language Models (LLMs) like GPT-3.5 bring remarkable capabilities, they also bring forth formidable challenges. These models, honed on vast and diverse datasets, have undoubtedly pushed the boundaries of natural language understanding and generation. However, they often stumble when faced with the intricate demands of nuanced enterprise applications. This research advocates for a strategic paradigm shift, urging enterprises to embrace a fine-tuning approach as a means to optimize conversational AI. While generalized LLMs are linguistic marvels, their inability to cater to the specific needs of businesses across various industries poses a critical challenge. This strategic shift involves empowering enterprises to seamlessly integrate their own datasets into LLMs, a process that extends beyond linguistic enhancement. The core concept of this approach centers on customization, enabling businesses to fine-tune the AI’s functionality to fit precisely within their unique business landscapes. By immersing the LLM in industry-specific documents, customer interaction records, internal reports, and regulatory guidelines, the AI transcends its generic capabilities to become a sophisticated conversational partner aligned with the intricacies of the enterprise’s domain. The transformative potential of this fine-tuning approach cannot be overstated. It enables a transition from a universal AI solution to a highly customizable tool. The AI evolves from being a linguistic powerhouse to a contextually aware, industry-savvy assistant. As a result, it not only responds with linguistic accuracy but also with depth, relevance, and resonance, significantly elevating user experiences and operational efficiency. In the subsequent sections, this paper delves into the intricacies of fine-tuning, exploring the multifaceted challenges and abundant opportunities it presents. It addresses the technical intricacies of data integration, ethical considerations surrounding data usage, and the broader implications for the future of enterprise AI. The journey embarked upon in this research holds the potential to redefine the role of conversational AI in enterprises, ushering in an era where AI becomes a dynamic, deeply relevant, and highly effective tool, empowering businesses to excel in an ever-evolving digital landscape.
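One common pattern for the dataset-integration step described above is continued (domain-adaptive) language-model training on an enterprise document corpus. The sketch below shows a minimal version with Hugging Face tools; the file path, model choice, and hyperparameters are placeholders, not a prescribed setup.

```python
# A minimal sketch, assuming a plain-text corpus of enterprise documents.
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)
from datasets import load_dataset

tok = AutoTokenizer.from_pretrained("gpt2")
tok.pad_token = tok.eos_token
model = AutoModelForCausalLM.from_pretrained("gpt2")

docs = load_dataset("text", data_files={"train": "enterprise_docs.txt"})  # hypothetical corpus
docs = docs.map(lambda b: tok(b["text"], truncation=True, max_length=512),
                batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments("domain-adapted", num_train_epochs=1,
                           per_device_train_batch_size=8),
    train_dataset=docs["train"],
    data_collator=DataCollatorForLanguageModeling(tok, mlm=False),  # next-token objective
)
trainer.train()
```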
The existing multi-objective wheel profile optimization methods mainly consist of three sub-modules: (1) wheel profile generation, (2) multi-body dynamics simulation, and (3) an optimization algorithm. For the first module, a comparably conservative rotary-scaling fine-tuning (RSFT) method, which introduces two design variables and an empirical formula, is proposed to fine-tune the traditional wheel profiles for improving their engineering applicability. For the second module, for the TRAXX locomotives serving on the Blankenburg–Rubeland line, an optimization function representing the relationship between the wheel profile and the wheel–rail wear number is established based on a Kriging surrogate model (KSM). For the third module, a method combining the regression capability of KSM with the iterative computing power of particle swarm optimization (PSO) is proposed to quickly and reliably implement the task of optimizing wheel profiles. Finally, with the RSFT–KSM–PSO method, we propose two wear-resistant wheel profiles for the TRAXX locomotives serving on the Blankenburg–Rubeland line, namely S1002-S and S1002-M. The S1002-S profile minimizes the total wear number by 30%, while the S1002-M profile makes the wear distribution more uniform through a proper sacrifice of the tread wear number, and the total wear number is reduced by 21%. The quasi-static and hunting stability tests further demonstrate that the profile designed by the RSFT–KSM–PSO method is promising for practical engineering applications.
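The sketch below illustrates the surrogate-assisted search idea: a Kriging (Gaussian-process) model regresses the wear number over the two RSFT design variables, and a plain global-best PSO loop minimizes the cheap surrogate instead of the expensive multi-body simulation. Bounds, sample data, and hyperparameters are invented for illustration, not the paper's values.

```python
# A compact sketch of KSM + PSO; the quadratic stand-in replaces real MBS wear data.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(30, 2))                    # sampled design variables
y = (X ** 2).sum(axis=1) + 0.01 * rng.standard_normal(30)   # stand-in for MBS wear number

ksm = GaussianProcessRegressor(kernel=RBF(length_scale=0.5)).fit(X, y)

n, w, c1, c2 = 40, 0.7, 1.5, 1.5                             # swarm size and PSO coefficients
pos = rng.uniform(-1.0, 1.0, size=(n, 2))
vel = np.zeros_like(pos)
pbest, pval = pos.copy(), ksm.predict(pos)
gbest = pbest[pval.argmin()]
for _ in range(100):
    r1, r2 = rng.random((n, 1)), rng.random((n, 1))
    vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    pos = np.clip(pos + vel, -1.0, 1.0)
    val = ksm.predict(pos)
    better = val < pval
    pbest[better], pval[better] = pos[better], val[better]
    gbest = pbest[pval.argmin()]
print("surrogate-optimal design variables:", gbest)
```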
This paper develops a wheel profile fine-tuning system (WPFTS) that comprehensively considers the influence of wheel profile on wheel damage, vehicle stability, vehicle safety, and passenger comfort. WPFTS can recommend one or more optimized wheel profiles according to train operators' needs, e.g., reducing wheel wear, mitigating the development of wheel out-of-roundness (OOR), or improving the shape stability of the wheel profile. Specifically, WPFTS includes four modules: (I) a wheel profile generation module based on the rotary-scaling fine-tuning (RSFT) method; (II) a multi-objective generation module consisting of a rigid multi-body dynamics simulation (MBS) model, an analytical model, and a rigid–flexible MBS model, for generating 11 objectives related to wheel damage, vehicle stability, vehicle safety, and passenger comfort; (III) a weight assignment module consisting of an adaptive weight assignment strategy and a manual weight assignment strategy; and (IV) an optimization module based on radial basis functions (RBF) and particle swarm optimization (PSO). Finally, three cases are introduced to show how WPFTS recommends a wheel profile according to train operators' needs. Among them, a wheel profile with high shape stability, a wheel profile for mitigating the development of wheel OOR, and a wheel profile considering hunting stability and derailment safety are developed, respectively.
During cerebral cortex neurogenesis, two major types of progenitors generate a variety of morphologically and functionally diverse projection neurons destined for the different cortical layers in non-gyrified mice. Radial glia cells (RGCs) undergo mitosis in the cortical ventricular zone and exhibit an apical-basal cell polarity, whereas non-polar intermediate progenitor cells (IPCs) divide basally in the subventricular zone (Franco and Muller, 2013; Taverna et al., 2014).
Background: Sperm DNA fragmentation (sDF) has been proved to be an important parameter for predicting in vitro the potential fertility of a semen sample. Colloid centrifugation could be a suitable technique to select those donkey sperm more resistant to DNA fragmentation after thawing. Previous studies have shown that, to elucidate the latent damage of the DNA molecule, sDF should be assessed dynamically, where the rate of fragmentation between treatments indicates how resistant the DNA is to iatrogenic damage. The rate of fragmentation is calculated using the slope of a linear regression equation. However, it has not been studied whether sDF dynamics fit this model. The objectives of this study were to evaluate the effect of different after-thawing centrifugation protocols on sperm DNA fragmentation and to elucidate the most accurate mathematical model (linear regression, exponential, or polynomial) for DNA fragmentation over time in frozen-thawed donkey semen. Results: After submitting post-thaw semen samples to no centrifugation (UDC), sperm washing (SW), or single layer centrifugation (SLC) protocols, sDF values after 6 h of incubation were significantly lower in SLC samples than in SW or UDC. Coefficient of determination (R²) values were significantly higher for a second-order polynomial model than for linear or exponential models. The highest values for acceleration of fragmentation (aSDF) were obtained for SW, followed by SLC and UDC. Conclusion: SLC after thawing seems to preserve DNA longevity longer in comparison to UDC and SW. Moreover, the fine-tuning of models has shown that sDF dynamics in frozen-thawed donkey semen fit a second-order polynomial model, which implies that the fragmentation rate is not constant and fragmentation acceleration must be taken into account to elucidate hidden damage in the DNA molecule.
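The model comparison at the heart of the study can be reproduced in a few lines: fit linear, exponential, and second-order polynomial curves to sDF-versus-time data and compare R². The data points below are invented for illustration only.

```python
# A small sketch of comparing candidate sDF(t) models by coefficient of determination.
import numpy as np
from scipy.optimize import curve_fit

t = np.array([0.0, 2.0, 4.0, 6.0])          # incubation time, h (illustrative)
sdf = np.array([12.0, 15.0, 21.0, 30.0])    # sperm DNA fragmentation, % (illustrative)

def r2(y, yhat):
    ss_res = ((y - yhat) ** 2).sum()
    ss_tot = ((y - y.mean()) ** 2).sum()
    return 1.0 - ss_res / ss_tot

lin = np.polyfit(t, sdf, 1)
poly = np.polyfit(t, sdf, 2)                # 2 * poly[0] is the fragmentation acceleration aSDF
exp_p, _ = curve_fit(lambda x, a, b: a * np.exp(b * x), t, sdf, p0=(10.0, 0.1))

print("linear R^2     :", r2(sdf, np.polyval(lin, t)))
print("polynomial R^2 :", r2(sdf, np.polyval(poly, t)))
print("exponential R^2:", r2(sdf, exp_p[0] * np.exp(exp_p[1] * t)))
```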
Chinese Vice Premier’s visit to Africa continues to emphasize the mutual cooperation, with a focus on agriculture. FOR many years, the Chinese Government has dispatched the minister of foreign affairs to Africa for the first official visit of a year. This year, however, that rule was broken when Hui Liangyu, Chinese Vice Premier, made the 14-day trip. On January 6-19, Hui paid official visits to Mauritius, Zambia, the Democratic Republic of Congo (DRC), Cameroon and Senegal, focusing on economic and agri-
Time series foundation models provide a universal solution for generating forecasts to support optimization problems in energy systems. These foundation models are typically trained in a prediction-focused manner to maximize forecast quality. In contrast, decision-focused learning directly improves the resulting value of the forecast in downstream optimization rather than merely maximizing forecasting quality. The practical integration of forecast values into forecasting models is challenging, particularly when addressing complex applications with diverse instances, such as buildings. This becomes even more complicated when instances possess specific characteristics that require instance-specific, tailored predictions to increase the forecast value. To tackle this challenge, we use decision-focused fine-tuning within time series foundation models to offer a scalable and efficient solution for decision-focused learning applied to the dispatchable feeder optimization problem. To obtain more robust predictions for scarce building data, we use Moirai as a state-of-the-art foundation model, which offers robust and generalized results with few-shot parameter-efficient fine-tuning. Comparing the decision-focused fine-tuned Moirai with a state-of-the-art classical prediction-focused fine-tuned Moirai, we observe an improvement of 9.45% in Average Daily Total Costs.
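The sketch below illustrates the decision-focused idea in isolation: the training signal is the downstream cost of the decision induced by the forecast, not a forecast-error metric. The toy dispatch rule and the cost weights are invented stand-ins for a differentiable surrogate of the dispatchable feeder optimization; nothing here is the Moirai API.

```python
# A conceptual sketch of a decision-focused loss for forecast fine-tuning.
import torch

def decision_focused_loss(forecast: torch.Tensor, demand: torch.Tensor) -> torch.Tensor:
    schedule = forecast.clamp(min=0.0)              # toy "decision": dispatch to the forecast
    shortfall = (demand - schedule).clamp(min=0.0)  # unmet demand is expensive
    surplus = (schedule - demand).clamp(min=0.0)    # over-dispatch is mildly expensive
    return (10.0 * shortfall + 1.0 * surplus).mean()

forecast = torch.tensor([1.2, 0.8, 1.0], requires_grad=True)  # would come from the forecaster
loss = decision_focused_loss(forecast, demand=torch.tensor([1.0, 1.0, 1.0]))
loss.backward()  # gradients flow into the forecaster's adapter weights via the decision cost
```

The key design choice is that asymmetric downstream costs (shortfall versus surplus) shape the forecaster directly, which a symmetric MSE loss cannot do.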
Instruction fine-tuning is a key method for adapting large language models (LLMs) to domain-specific tasks, and instruction quality significantly impacts model performance after fine-tuning. Hence, evaluating the quality of instructions and selecting high-quality instructions are essential steps in the process of LLM instruction fine-tuning. Although existing studies provide important theoretical foundations and techniques for this, there is still room for improvement in terms of generality, the relationship between methods, and experimental verification. Current methods for evaluating instruction quality can be classified into four main categories: human evaluation, statistics-based evaluation, model-based evaluation, and LLM-based evaluation. Among these methods, human evaluation relies on the subjective judgment and domain expertise of the evaluators, which offers interpretability and is suitable for scenarios involving small-scale data and sufficient budgets. Statistics-based evaluation estimates the quality of instructions using indicators such as stopwords and lexical diversity, providing high efficiency and a suitable evaluation for large-scale data. Model-based evaluation employs specific models to quantify indicators such as perplexity (PPL) and instruction-following difficulty (IFD), which is flexible and suitable for specific tasks. LLM-based evaluation rates the quality of instructions through prompt-based interaction with LLMs, focusing on aspects such as accuracy and coherence; it is highly automated and customizable, simplifying the evaluation process. Finally, considering the limitations of current quality evaluation methods, some future research directions are proposed for improvement. These include refining instruction categories, extending evaluation indicators, enhancing human-AI interaction evaluation methods, applying agents in instruction quality evaluation, and developing a comprehensive evaluation framework.
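The sketch below shows one way to compute the two model-based indicators named above, taking IFD as the ratio PPL(response | instruction) / PPL(response). The model choice is illustrative, and the prompt/answer token boundary is an approximation.

```python
# A minimal sketch of PPL and IFD scoring with a small causal LM.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
lm = AutoModelForCausalLM.from_pretrained("gpt2").eval()

@torch.no_grad()
def answer_ppl(prompt: str, answer: str) -> float:
    full = tok(prompt + answer, return_tensors="pt")
    n_prompt = tok(prompt, return_tensors="pt").input_ids.shape[1]  # approximate boundary
    labels = full.input_ids.clone()
    labels[:, :n_prompt] = -100            # score only the answer tokens
    loss = lm(**full, labels=labels).loss  # mean negative log-likelihood over answer tokens
    return float(torch.exp(loss))

instruction = "Summarize: the cat sat on the mat."
response = " A cat sat on a mat."
ifd = answer_ppl(instruction, response) / answer_ppl("", response)
print("IFD:", ifd)  # higher values indicate instructions that are harder to follow
```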
Text-to-image diffusion models have demonstrated impressive capabilities in image generation and have been effectively applied to image inpainting. While a text prompt provides intuitive guidance for conditional inpainting, users often seek the ability to inpaint a specific object with customized appearance by providing an exemplar image. Unfortunately, existing methods struggle to achieve high fidelity in exemplar-driven inpainting. To address this, we use a plug-and-play low-rank adaptation (LoRA) module based on a pretrained text-driven inpainting model. The LoRA module is dedicated to learning the exemplar-specific concepts through few-shot fine-tuning, bringing improved fitting capability to customized exemplar images without intensive training on large-scale datasets. Additionally, we introduce GPT-4V prompting and prior noise initialization techniques to further improve the fidelity of inpainting results. In brief, the denoising diffusion process first starts with the noise derived from a composite exemplar-background image, and is subsequently guided by an expressive prompt generated from the exemplar using the GPT-4V model. Extensive experiments demonstrate that our method achieves state-of-the-art performance, qualitatively and quantitatively, offering users an exemplar-driven inpainting tool with enhanced customization capability.
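A rough sketch of the prior-noise-initialization idea follows: the denoising trajectory starts from a noised VAE latent of the composite exemplar-background image rather than from pure Gaussian noise. This is an illustrative reconstruction with diffusers conventions, not the paper's released code; file names are placeholders, the GPT-4V prompt is hard-coded, and exact behavior of user-supplied latents depends on the diffusers version.

```python
# A hedged sketch of starting inpainting from a noised composite-image latent.
import torch
from diffusers import StableDiffusionInpaintPipeline
from PIL import Image

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
).to("cuda")

background = Image.open("background.png").convert("RGB")  # hypothetical 512x512 inputs
mask = Image.open("mask.png").convert("RGB")
composite = Image.open("composite.png").convert("RGB")    # exemplar pasted into the hole

# Encode the composite and noise it to a late timestep so coarse structure survives.
x = pipe.image_processor.preprocess(composite).to("cuda", torch.float16)
latents = pipe.vae.encode(x).latent_dist.sample() * pipe.vae.config.scaling_factor
t = torch.tensor([int(0.9 * pipe.scheduler.config.num_train_timesteps)], device="cuda")
latents = pipe.scheduler.add_noise(latents, torch.randn_like(latents), t)

gpt4v_prompt = "a fluffy orange tabby cat curled up on the sofa"  # placeholder for GPT-4V output
result = pipe(prompt=gpt4v_prompt, image=background, mask_image=mask,
              latents=latents).images[0]
result.save("inpainted.png")
```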
Scientific claims, which present propositions as facts, are fundamental to scientific knowledge. Despite their significance, current methods for scientific claim recognition are hindered by the scarcity of annotated datasets, particularly those covering full-text documents rather than just abstracts. To bridge this gap, this study aims to enhance scientific claim recognition by leveraging transfer learning through a staged fine-tuning approach. Specifically, we employ a large move prediction dataset (RCMR 280k) alongside the smaller SciClaim dataset we developed, to enhance our scientific claim recognition model's ability to distinguish between various types of scientific narratives and their roles within research papers. We converted the labeled sentences from both datasets into a question-answer format, aligning them with the fine-tuning requirements of large language models. During the fine-tuning process, we explore two distinct strategies for incorporating knowledge from previous phases. Results indicate that re-integrating the LoRA trained on the RCMR 280k dataset into the original model, followed by the creation of a new LoRA specifically for SciClaim training, produces the best outcomes. This staged fine-tuning approach efficiently adapts the model to the task of scientific claim recognition. Our model, SciClaim Miner, outperforms state-of-the-art approaches, achieving an F1-score of 90.96%. The ablation study demonstrates that the dataset and prompt design, as well as the model training strategies, significantly enhance performance. This work advances scientific claim recognition by introducing a robust methodology that bridges the gap between limited data and effective model training.
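The winning staged strategy can be sketched with peft as below: merge the stage-one LoRA back into the base weights, then attach a fresh LoRA for the small SciClaim set. Model names, adapter paths, and the target modules are hypothetical placeholders.

```python
# A sketch of merge-then-retrain staged LoRA fine-tuning; names are placeholders.
from transformers import AutoModelForCausalLM
from peft import PeftModel, LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("base-llm")      # hypothetical base model
stage1 = PeftModel.from_pretrained(base, "lora-rcmr-280k")   # LoRA from the RCMR 280k phase
merged = stage1.merge_and_unload()                           # fold stage-1 knowledge into the weights

stage2_cfg = LoraConfig(r=8, lora_alpha=16,
                        target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM")
model = get_peft_model(merged, stage2_cfg)                   # fresh LoRA for SciClaim
# ... fine-tune `model` on the question-answer-formatted SciClaim examples ...
```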
Enriching the library of chiral plasmonic structures is of significant importance in advancing their applicability across diverse domains such as biosensing, nanophotonics, and catalysis. Here, employing triangle nanoplates as growth seeds, we synthesized a novel class of chiral-shaped plasmonic nanostructures through a wet chemical strategy with dipeptides as chiral inducers, including chiral tri-blade boomerangs, concave rhombic dodecahedrons, and nanoflowers. The structural diversity in chiral plasmonic nanostructures was elucidated through their continuous morphological evolution from two-dimensional to three-dimensional architectures. The fine-tuning of chiroptical properties was achieved by precisely manipulating crucial synthetic parameters, such as the amounts of chiral molecules, seeds, and gold precursor, that significantly influenced chiral structure formation. The findings provide a promising avenue for enriching chiral materials with highly sophisticated structures, facilitating a fundamental understanding of the relationship between structural nuances and chiroptical properties.
Deep convolutional neural networks with high performance are hard to deploy in many real-world applications, since the computing resources of edge devices such as smartphones or embedded GPUs are limited. To alleviate this hardware limitation, the compression of deep neural networks from the model side becomes important. As one of the most popular methods in the spotlight, channel pruning can effectively remove redundant convolutional channels from a CNN (convolutional neural network) without remarkably affecting the network's performance. Existing methods focus on pruning design, evaluating the importance of different convolutional filters in the CNN model. A fast and effective fine-tuning method to restore accuracy is urgently needed. In this paper, we propose a fine-tuning method, KDFT (Knowledge Distillation Based Fine-Tuning), which improves the accuracy of fine-tuned models with almost negligible training overhead by introducing knowledge distillation. Extensive experimental results on benchmark datasets with representative CNN models show that up to 4.86% accuracy improvement and 79% time saving can be obtained.
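A distillation-based fine-tuning loss in the spirit of KDFT can be sketched as below: the pruned student is trained against a blend of the hard-label loss and a soft-target KL term from the unpruned teacher. The temperature and weighting are illustrative, not the paper's settings.

```python
# A minimal sketch of a knowledge-distillation fine-tuning loss for a pruned student.
import torch
import torch.nn.functional as F

def kd_finetune_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)                                  # rescale gradients for the temperature
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard   # blend soft teacher targets with hard labels
```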
A dramatic decrease in sugar uptake is a general phenomenon in Streptomyces at stationary phase, when antibiotics are extensively produced. Milbemycins produced by Streptomyces bingchenggensis are a group of valuable macrolide biopesticides, but their low yield and titer impede broad application in the agricultural field. Considering that inadequate sugar uptake generally hinders titer improvement of desired products, we mined the underlying sugar uptake systems and fine-tuned their expression in this work. First, we screened the candidates at both the genomic and transcriptomic levels in S. bingchenggensis. Then, two ATP-binding cassette transporters, named TP2 and TP5, were characterized that significantly improve milbemycin titer and yield. Next, appropriate native temporal promoters were selected and used to tune the expression of TP2 and TP5, resulting in a maximal milbemycin A3/A4 titer increase of 36.9%, to 3321 mg/L. Finally, TP2 and TP5 were broadly fine-tuned in two other macrolide biopesticide producers, Streptomyces avermitilis and Streptomyces cyaneogriseus, leading to maximal titer improvements of 34.1% and 52.6% for avermectin B1a and nemadectin, respectively. This work provides useful transporter tools and a corresponding engineering strategy for Streptomyces.
Fine-tuning pre-trained language models like BERT has become an effective approach in natural language processing (NLP) and yields state-of-the-art results on many downstream tasks. Recent studies on adapting BERT to new tasks mainly focus on modifying the model structure, re-designing the pre-training tasks, and leveraging external data and knowledge. The fine-tuning strategy itself has yet to be fully explored. In this paper, we improve the fine-tuning of BERT with two effective mechanisms: self-ensemble and self-distillation. The self-ensemble mechanism utilizes the checkpoints from an experience pool to integrate the teacher model. In order to transfer knowledge from the teacher model to the student model efficiently, we further use knowledge distillation, which is called self-distillation because the distillation comes from the model itself through the time dimension. Experiments on the GLUE benchmark and the Text Classification benchmark show that our proposed approach can significantly improve the adaptation of BERT without any external data or knowledge. We conduct exhaustive experiments to investigate the efficiency of the self-ensemble and self-distillation mechanisms, and our proposed approach achieves a new state-of-the-art result on the SNLI dataset.
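The two mechanisms can be sketched as follows: the teacher is a parameter average of recent checkpoints from the experience pool (self-ensemble), and the student matches the teacher's logits in addition to the task loss (self-distillation). The averaging scheme and loss weighting here are illustrative assumptions, not the paper's exact formulation.

```python
# A sketch of self-ensemble teacher construction and a self-distillation loss.
import copy
import torch
import torch.nn.functional as F

def ensemble_teacher(model, checkpoints):
    """Average the state dicts in the experience pool into a frozen teacher."""
    teacher = copy.deepcopy(model)
    avg = {k: torch.stack([c[k].float() for c in checkpoints]).mean(0)
           for k in checkpoints[0]}
    teacher.load_state_dict(avg)
    teacher.eval()
    return teacher

def self_distill_loss(student_logits, teacher_logits, labels, lam=1.0):
    task = F.cross_entropy(student_logits, labels)              # hard-label task loss
    distill = F.mse_loss(student_logits, teacher_logits.detach())  # match the averaged teacher
    return task + lam * distill
```

Because the teacher is derived from the student's own training trajectory, no external model or data is needed, which matches the claim in the abstract.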
基金supported by the National Key R&D Program of China(No.2021YFB0301200)National Natural Science Foundation of China(No.62025208).
文摘Large-scale Language Models(LLMs)have achieved significant breakthroughs in Natural Language Processing(NLP),driven by the pre-training and fine-tuning paradigm.While this approach allows models to specialize in specific tasks with reduced training costs,the substantial memory requirements during fine-tuning present a barrier to broader deployment.Parameter-Efficient Fine-Tuning(PEFT)techniques,such as Low-Rank Adaptation(LoRA),and parameter quantization methods have emerged as solutions to address these challenges by optimizing memory usage and computational efficiency.Among these,QLoRA,which combines PEFT and quantization,has demonstrated notable success in reducing memory footprints during fine-tuning,prompting the development of various QLoRA variants.Despite these advancements,the quantitative impact of key variables on the fine-tuning performance of quantized LLMs remains underexplored.This study presents a comprehensive analysis of these key variables,focusing on their influence across different layer types and depths within LLM architectures.Our investigation uncovers several critical findings:(1)Larger layers,such as MLP layers,can maintain performance despite reductions in adapter rank,while smaller layers,like self-attention layers,aremore sensitive to such changes;(2)The effectiveness of balancing factors depends more on specific values rather than layer type or depth;(3)In quantization-aware fine-tuning,larger layers can effectively utilize smaller adapters,whereas smaller layers struggle to do so.These insights suggest that layer type is a more significant determinant of fine-tuning success than layer depth when optimizing quantized LLMs.Moreover,for the same discount of trainable parameters,reducing the trainable parameters in a larger layer is more effective in preserving fine-tuning accuracy than in a smaller one.This study provides valuable guidance for more efficient fine-tuning strategies and opens avenues for further research into optimizing LLM fine-tuning in resource-constrained environments.
文摘In the rapidly evolving landscape of natural language processing(NLP)and sentiment analysis,improving the accuracy and efficiency of sentiment classification models is crucial.This paper investigates the performance of two advanced models,the Large Language Model(LLM)LLaMA model and NLP BERT model,in the context of airline review sentiment analysis.Through fine-tuning,domain adaptation,and the application of few-shot learning,the study addresses the subtleties of sentiment expressions in airline-related text data.Employing predictive modeling and comparative analysis,the research evaluates the effectiveness of Large Language Model Meta AI(LLaMA)and Bidirectional Encoder Representations from Transformers(BERT)in capturing sentiment intricacies.Fine-tuning,including domain adaptation,enhances the models'performance in sentiment classification tasks.Additionally,the study explores the potential of few-shot learning to improve model generalization using minimal annotated data for targeted sentiment analysis.By conducting experiments on a diverse airline review dataset,the research quantifies the impact of fine-tuning,domain adaptation,and few-shot learning on model performance,providing valuable insights for industries aiming to predict recommendations and enhance customer satisfaction through a deeper understanding of sentiment in user-generated content(UGC).This research contributes to refining sentiment analysis models,ultimately fostering improved customer satisfaction in the airline industry.
文摘A complete examination of Large Language Models’strengths,problems,and applications is needed due to their rising use across disciplines.Current studies frequently focus on single-use situations and lack a comprehensive understanding of LLM architectural performance,strengths,and weaknesses.This gap precludes finding the appropriate models for task-specific applications and limits awareness of emerging LLM optimization and deployment strategies.In this research,50 studies on 25+LLMs,including GPT-3,GPT-4,Claude 3.5,DeepKet,and hybrid multimodal frameworks like ContextDET and GeoRSCLIP,are thoroughly reviewed.We propose LLM application taxonomy by grouping techniques by task focus—healthcare,chemistry,sentiment analysis,agent-based simulations,and multimodal integration.Advanced methods like parameter-efficient tuning(LoRA),quantumenhanced embeddings(DeepKet),retrieval-augmented generation(RAG),and safety-focused models(GalaxyGPT)are evaluated for dataset requirements,computational efficiency,and performance measures.Frameworks for ethical issues,data limited hallucinations,and KDGI-enhanced fine-tuning like Woodpecker’s post-remedy corrections are highlighted.The investigation’s scope,mad,and methods are described,but the primary results are not.The work reveals that domain-specialized fine-tuned LLMs employing RAG and quantum-enhanced embeddings performbetter for context-heavy applications.In medical text normalization,ChatGPT-4 outperforms previous models,while two multimodal frameworks,GeoRSCLIP,increase remote sensing.Parameter-efficient tuning technologies like LoRA have minimal computing cost and similar performance,demonstrating the necessity for adaptive models in multiple domains.To discover the optimum domain-specific models,explain domain-specific fine-tuning,and present quantum andmultimodal LLMs to address scalability and cross-domain issues.The framework helps academics and practitioners identify,adapt,and innovate LLMs for different purposes.This work advances the field of efficient,interpretable,and ethical LLM application research.
基金supported by the National Natural Science Foundation of China(Grant Nos.52306126,22350710788,12432010,11988102,92270203)the Xplore Prize.
文摘Configuring computational fluid dynamics(CFD)simulations typically demands extensive domain expertise,limiting broader access.Although large language models(LLMs)have advanced scientific computing,their use in automating CFD workflows is underdeveloped.We introduce a novel approach centered on domain-specific LLM adaptation.By fine-tuning Qwen2.5-7B-Instruct on NL2FOAM,our custom dataset of 28,716 natural language-to-OpenFOAM configuration pairs with chain-of-thought(CoT)annotations enables direct translation from natural language descriptions to executable CFD setups.A multi-agent system orchestrates the process,autonomously verifying inputs,generating configurations,running simulations,and correcting errors.Evaluation on a benchmark of 21 diverse flow cases demonstrates state-of-the-art performance,achieving 88.7%solution accuracy and 82.6%first-attempt success rate.This significantly outperforms larger general-purpose models such as Qwen2.5-72B-Instruct,DeepSeek-R1,and Llama3.3-70B-Instruct,while also requiring fewer correction iterations and maintaining high computational efficiency.The results highlight the critical role of domain-specific adaptation in deploying LLM assistants for complex engineering workflows.Our code and fine-tuned model have been deposited at https://github.com/YYgroup/AutoCFD.
基金financial support from the SERB-SURE under file number of SUR/2022/003129Jong Hyeok Park acknowledges the support of the National Research Foundation of Korea (NRF)funded by the Ministry of Science and ICT (RS-2023-00302697,RS-2023-00268523).
文摘Mo_(2)C is an excellent electrocatalyst for hydrogen evolution reaction(HER).However,Mo_(2)C is a poor electrocatalyst for oxygen evolution reaction(OER).Herein,two different elements,namely Co and Fe,are incorporated in Mo_(2)C that,therefore,has a finely tuned electronic structure,which is not achievable by incorporation of any one of the metals.Consequently,the resulting electrocatalyst Co_(0.8)Fe_(0.2)-Mo_(2)C-80 displayed excellent OER catalytic performance,which is evidenced by a low overpotential of 214.0(and 246.5)mV to attain a current density of 10(and 50)mA cm^(-2),an ultralow Tafel slope of 38.4 mV dec^(-1),and longterm stability in alkaline medium.Theoretical data demonstrates that Co_(0.8)Fe_(0.2)-Mo_(2)C-80 requires the lowest overpotential(1.00 V)for OER and Co centers to be the active sites.The ultrahigh catalytic performance of the electrocatalyst is attributed to the excellent intrinsic catalytic activity due to high Brunauer-Emmett-Teller specific surface area,large electrochemically active surface area,small Tafel slope,and low chargetransfer resistance.
基金This work is part of the research projects LaTe4PoliticES(PID2022-138099OBI00)funded by MICIU/AEI/10.13039/501100011033the European Regional Development Fund(ERDF)-A Way of Making Europe and LT-SWM(TED2021-131167B-I00)funded by MICIU/AEI/10.13039/501100011033the European Union NextGenerationEU/PRTR.Mr.Ronghao Pan is supported by the Programa Investigo grant,funded by the Region of Murcia,the Spanish Ministry of Labour and Social Economy and the European Union-NextGenerationEU under the“Plan de Recuperación,Transformación y Resiliencia(PRTR).”。
文摘Large Language Models(LLMs)are increasingly demonstrating their ability to understand natural language and solve complex tasks,especially through text generation.One of the relevant capabilities is contextual learning,which involves the ability to receive instructions in natural language or task demonstrations to generate expected outputs for test instances without the need for additional training or gradient updates.In recent years,the popularity of social networking has provided a medium through which some users can engage in offensive and harmful online behavior.In this study,we investigate the ability of different LLMs,ranging from zero-shot and few-shot learning to fine-tuning.Our experiments show that LLMs can identify sexist and hateful online texts using zero-shot and few-shot approaches through information retrieval.Furthermore,it is found that the encoder-decoder model called Zephyr achieves the best results with the fine-tuning approach,scoring 86.811%on the Explainable Detection of Online Sexism(EDOS)test-set and 57.453%on the Multilingual Detection of Hate Speech Against Immigrants and Women in Twitter(HatEval)test-set.Finally,it is confirmed that the evaluated models perform well in hate text detection,as they beat the best result in the HatEval task leaderboard.The error analysis shows that contextual learning had difficulty distinguishing between types of hate speech and figurative language.However,the fine-tuned approach tends to produce many false positives.
文摘As the realm of enterprise-level conversational AI continues to evolve, it becomes evident that while generalized Large Language Models (LLMs) like GPT-3.5 bring remarkable capabilities, they also bring forth formidable challenges. These models, honed on vast and diverse datasets, have undoubtedly pushed the boundaries of natural language understanding and generation. However, they often stumble when faced with the intricate demands of nuanced enterprise applications. This research advocates for a strategic paradigm shift, urging enterprises to embrace a fine-tuning approach as a means to optimize conversational AI. While generalized LLMs are linguistic marvels, their inability to cater to the specific needs of businesses across various industries poses a critical challenge. This strategic shift involves empowering enterprises to seamlessly integrate their own datasets into LLMs, a process that extends beyond linguistic enhancement. The core concept of this approach centers on customization, enabling businesses to fine-tune the AI’s functionality to fit precisely within their unique business landscapes. By immersing the LLM in industry-specific documents, customer interaction records, internal reports, and regulatory guidelines, the AI transcends its generic capabilities to become a sophisticated conversational partner aligned with the intricacies of the enterprise’s domain. The transformative potential of this fine-tuning approach cannot be overstated. It enables a transition from a universal AI solution to a highly customizable tool. The AI evolves from being a linguistic powerhouse to a contextually aware, industry-savvy assistant. As a result, it not only responds with linguistic accuracy but also with depth, relevance, and resonance, significantly elevating user experiences and operational efficiency. In the subsequent sections, this paper delves into the intricacies of fine-tuning, exploring the multifaceted challenges and abundant opportunities it presents. It addresses the technical intricacies of data integration, ethical considerations surrounding data usage, and the broader implications for the future of enterprise AI. The journey embarked upon in this research holds the potential to redefine the role of conversational AI in enterprises, ushering in an era where AI becomes a dynamic, deeply relevant, and highly effective tool, empowering businesses to excel in an ever-evolving digital landscape.
基金the Assets4Rail Project which is funded by the Shift2Rail Joint Undertaking under the EU’s H2020 program(Grant No.826250)the Open Research Fund of State Key Laboratory of Traction Power of Southwest Jiaotong University(Grant No.TPL2011)+1 种基金part of the experiment data concerning the railway line is supported by the DynoTRAIN Project,funded by European Commission(Grant No.234079)The first author is also supported by the China Scholarship Council(Grant No.201707000113).
文摘The existing multi-objective wheel profile optimization methods mainly consist of three sub-modules:(1)wheel profile generation,(2)multi-body dynamics simulation,and(3)an optimization algorithm.For the first module,a comparably conservative rotary-scaling finetuning(RSFT)method,which introduces two design variables and an empirical formula,is proposed to fine-tune the traditional wheel profiles for improving their engineering applicability.For the second module,for the TRAXX locomotives serving on the Blankenburg–Rubeland line,an optimization function representing the relationship between the wheel profile and the wheel–rail wear number is established based on Kriging surrogate model(KSM).For the third module,a method combining the regression capability of KSM with the iterative computing power of particle swarm optimization(PSO)is proposed to quickly and reliably implement the task of optimizing wheel profiles.Finally,with the RSFT–KSM–PSO method,we propose two wear-resistant wheel profiles for the TRAXX locomotives serving on the Blankenburg–Rubeland line,namely S1002-S and S1002-M.The S1002-S profile minimizes the total wear number by 30%,while the S1002-M profile makes the wear distribution more uniform through a proper sacrifice of the tread wear number,and the total wear number is reduced by 21%.The quasi-static and hunting stability tests further demonstrate that the profile designed by the RSFT–KSM–PSO method is promising for practical engineering applications.
基金This work was supported by China Scholarship Council(Grant No.201707000113).
文摘This paper develops a wheel profile fine-tuning system(WPFTS)that comprehensively considers the influence of wheel profile on wheel damage,vehicle stability,vehicle safety,and passenger comfort.WPFTS can recommend one or more optimized wheel profiles according to train operators’needs,e.g.,reducing wheel wear,mitigating the development of wheel out-of-roundness(OOR),improving the shape stability of the wheel profile.Specifically,WPFTS includes four modules:(I)a wheel profile generation module based on the rotary-scaling finetuning(RSFT)method;(II)a multi-objective generation module consisting of a rigid multi-body dynamics simulation(MBS)model,an analytical model,and a rigid–flexible MBS model,for generating 11 objectives related to wheel damage,vehicle stability,vehicle safety,and passenger comfort;(III)a weight assignment module consisting of an adaptive weight assignment strategy and a manual weight assignment strategy;and(IV)an optimization module based on radial basis function(RBF)and particle swarm optimization(PSO).Finally,three cases are introduced to show how WPTFS recommends a wheel profile according to train operators’needs.Among them,a wheel profile with high shape stability,a wheel profile for mitigating the development of wheel OOR,and a wheel profile considering hunting stability and derailment safety are developed,respectively.
文摘During cerebral cortical cortex neurogenesis two major types of progenitors generate a variety of morphologically and functionally diverse projection neurons destined for the different cortical layers in non-gyrified mice. Radial glia cells (RGCs) undergo mitosis in the cortical ventricular zone and exhibit an apical-basal cell polarity, whereas non-polar intermediate progenitor cells (IPCs) divide basally in the subventricular zone (Franco and Muller, 2013; Taverna et al., 2014).
基金partially supported by grants RZ2009-00006-00-00(Instituto Nacional de Investigacion y Tecnología Agraria y Alimentaria,Ministerio de Ciencia e Innovación,Spain)AGL-2013-42726-R(Secretaria de Estado de Investigacion,Desarrollo e Innovacion,Ministerio de Economia y Competitividad,Spain)+1 种基金supported by a Ph.D.fellowship from the ceiA3(Andalucia,Spain)with funding provided by Banco Santander through its Global Division,Santander Universidadesfunded by the Swedish Foundation for Equine Research,Stockholm,Sweden(H14-47-008)
文摘Background: Sperm DNA fragmentation(sDF) has been proved to be an important parameter in order to predict in vitro the potential fertility of a semen sample. Colloid centrifugation could be a suitable technique to select those donkey sperm more resistant to DNA fragmentation after thawing. Previous studies have shown that to elucidate the latent damage of the DNA molecule, sDF should be assessed dynamically, where the rate of fragmentation between treatments indicates how resistant the DNA is to iatrogenic damage. The rate of fragmentation is calculated using the slope of a linear regression equation. However, it has not been studied if s DF dynamics fit this model. The objectives of this study were to evaluate the effect of different after-thawing centrifugation protocols on sperm DNA fragmentation and elucidate the most accurate mathematical model(linear regression, exponential or polynomial) for DNA fragmentation over time in frozen-thawed donkey semen.Results: After submitting post-thaw semen samples to no centrifugation(UDC), sperm washing(SW) or single layer centrifugation(SLC) protocols, sD F values after 6 h of incubation were significantly lower in SLC samples than in SW or UDC.Coefficient of determination(R-2) values were significantly higher for a second order polynomial model than for linear or exponential. The highest values for acceleration of fragmentation(aSDF) were obtained for SW, fol owed by SLC and UDC.Conclusion: SLC after thawing seems to preserve longer DNA longevity in comparison to UDC and SW. Moreover,the fine-tuning of models has shown that sDF dynamics in frozen-thawed donkey semen fit a second order polynomial model, which implies that fragmentation rate is not constant and fragmentation acceleration must be taken into account to elucidate hidden damage in the DNA molecule.
文摘Chinese Vice Premier’s visit to Africa continues to emphasize the mutual cooperation,with a focus on agriculture FOR many years,the Chinese Government has dispatched the minister of foreign affairs to Africa for the first official visit of a year.This year,however,that rule was broken when Hui Liangyu,Chinese Vice Premier,made the 14-day trip. On January 6-19,Hui paid official visits to Mauritius,Zambia,the Democratic Republic of Congo(DRC),Cameroon and Senegal,focusing on economic and agri-
基金funded by the Helmholtz Association’s Initiative and Networking Fund through Helmholtz AI,the Helmholtz Association under the Program“Energy System Design”the German Research Foundation(DFG)as part of the Research Training Group 2153“En-ergy Status Data:Informatics Methods for its Collection,Analysis and Exploitation”+1 种基金supported by the Helmholtz Association Initiative and Networking Fund on the HAICORE@KIT partitionsupport by the KIT-Publication Fund of the Karlsruhe Institute of Technology.
文摘Time series foundation models provide a universal solution for generating forecasts to support optimization problems in energy systems.Those foundation models are typically trained in a prediction-focused manner to maximize forecast quality.In contrast,decision-focused learning directly improves the resulting value of the forecast in downstream optimization rather than merely maximizing forecasting quality.The practical integration of forecast values into forecasting models is challenging,particularly when addressing complex applications with diverse instances,such as buildings.This becomes even more complicated when instances possess specific characteristics that require instance-specific,tailored predictions to increase the forecast value.To tackle this challenge,we use decision-focused fine-tuning within time series foundation models to offer a scalable and efficient solution for decision-focused learning applied to the dispatchable feeder optimization problem.To obtain more robust predictions for scarce building data,we use Moirai as a state-of-the-art foundation model,which offers robust and generalized results with few-shot parameter-efficient fine-tuning.Comparing the decision-focused fine-tuned Moirai with a state-of-the-art classical prediction-focused fine-tuning Moirai,we observe an improvement of 9.45%in Average Daily Total Costs.
基金supported by National Natural Science Foundation of China(No.62261023)National Natural Science Foundation of China(No.U1836118)Science and Technology Innovation 2030“New Generation of Artificial Intelligence”(2020AAA0108501).
文摘Instruction fine-tuning is a key method for adapting large language models(LLMs)to domain-specific tasks,and instruction quality significantly impacts model performance after fine-tuning.Hence,evaluating the quality of instruction and selecting high-quality instructions are essential steps in the process of LLM instruction fine-tuning.Although existing studies provide important theoretical foundations and techniques for this,there is still room for improvement in terms of generality,the relationship between methods and experimental verification.Current methods for evaluating instruction quality can be classified into four main categories:human evaluation,statistics-based evaluation,model-based evaluation,and LLMs-based evaluation.Among these methods,human evaluation relies on the subjective judgment and domain expertise of the evaluators,which offers interpretability and is suitable for scenarios involving small-scale data and sufficient budgets.Statistics-based evaluation estimates the quality of instructions using indicators such as stopwords and lexical diversity,providing high efficiency and a suitable evaluation for large-scale data.Model-based evaluation employs specific models to quantify indicators such as perplexity(PPL)and instruction following difficulty(IFD),which is flexible and suitable for specific tasks.The LLMs-based evaluation rates the quality of instructions through prompt-based interaction with LLMs,focusing on aspects such as accuracy and coherence,which is highly automated and customizable,simplifying the evaluation process.Finally,considering the limitations of current quality evaluation methods,some future research directions are proposed for improvement.These include refining instruction categories,extending evaluation indicators,enhancing human-AI interaction evaluation method,applying agents in instruction quality evaluation,and developing a comprehensive evaluation framework.
Funding: Project supported by the National Natural Science Foundation of China (No. 82027801).
Abstract: Text-to-image diffusion models have demonstrated impressive image generation capabilities and have been effectively applied to image inpainting. While a text prompt provides intuitive guidance for conditional inpainting, users often want to inpaint a specific object with a customized appearance by providing an exemplar image. Unfortunately, existing methods struggle to achieve high fidelity in exemplar-driven inpainting. To address this, we use a plug-and-play low-rank adaptation (LoRA) module on top of a pretrained text-driven inpainting model. The LoRA module learns exemplar-specific concepts through few-shot fine-tuning, improving the fit to customized exemplar images without intensive training on large-scale datasets. Additionally, we introduce GPT-4V prompting and prior noise initialization techniques to further improve the fidelity of inpainting results. In brief, the denoising diffusion process starts with noise derived from a composite exemplar-background image and is subsequently guided by an expressive prompt generated from the exemplar using the GPT-4V model. Extensive experiments demonstrate that our method achieves state-of-the-art performance, both qualitatively and quantitatively, offering users an exemplar-driven inpainting tool with enhanced customization capability.
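The following is a minimal sketch of the two mechanical ingredients described here: attaching a plug-and-play LoRA to an inpainting UNet, and initializing the reverse diffusion from noise derived from a composite image. The checkpoint name, LoRA target modules, and hyperparameters are assumptions for illustration; the paper's actual training recipe is not reproduced.

```python
import torch
from diffusers import StableDiffusionInpaintPipeline
from peft import LoraConfig, get_peft_model

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
).to("cuda")

# Plug-and-play LoRA over the UNet's attention projections: only these
# low-rank matrices are trained during few-shot fine-tuning on the exemplar.
lora_cfg = LoraConfig(r=8, lora_alpha=16,
                      target_modules=["to_q", "to_k", "to_v", "to_out.0"])
pipe.unet = get_peft_model(pipe.unet, lora_cfg)
pipe.unet.print_trainable_parameters()   # a tiny fraction of the UNet

@torch.no_grad()
def prior_latents(composite_image, strength=0.8):
    # Prior noise initialization: start denoising from a noised VAE latent of
    # the composite exemplar-background image instead of pure Gaussian noise.
    # `composite_image`: float tensor in [-1, 1], shape (1, 3, H, W),
    # on the pipeline's device and dtype.
    latents = pipe.vae.encode(composite_image).latent_dist.sample()
    latents = latents * pipe.vae.config.scaling_factor
    t = int(strength * pipe.scheduler.config.num_train_timesteps)
    noise = torch.randn_like(latents)
    return pipe.scheduler.add_noise(latents, noise, torch.tensor([t]))
```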
Funding: Supported by the major project of the National Social Science Foundation of China “Big Data-driven Semantic Evaluation System of Science and Technology Literature” (Project No. 21&ZD329).
Abstract: Scientific claims, which present propositions as facts, are fundamental to scientific knowledge. Despite their significance, current methods for scientific claim recognition are hindered by the scarcity of annotated datasets, particularly ones covering full-text documents rather than just abstracts. To bridge this gap, this study enhances scientific claim recognition through transfer learning with a staged fine-tuning approach. Specifically, we employ a large move-prediction dataset (RCMR 280k) alongside the smaller SciClaim dataset we developed to strengthen the model's ability to distinguish between various types of scientific narratives and their roles within research papers. We converted the labeled sentences from both datasets into a question-answer format, aligning them with the fine-tuning requirements of large language models. During fine-tuning, we explore two distinct strategies for incorporating knowledge from previous phases. Results indicate that re-integrating the LoRA trained on the RCMR 280k dataset into the original model, and then creating a new LoRA specifically for SciClaim training, produces the best outcomes. This staged fine-tuning approach efficiently adapts the model to scientific claim recognition. Our model, SciClaim Miner, outperforms state-of-the-art approaches, achieving an F1-score of 90.96%. An ablation study demonstrates that the dataset and prompt design, as well as the model training strategies, each significantly enhance performance. This work advances scientific claim recognition by introducing a robust methodology that bridges the gap between limited data and effective model training.
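The winning strategy — merging the stage-one LoRA into the base weights before training a fresh stage-two LoRA — maps directly onto a common PEFT pattern. Below is a hedged sketch of that pattern; the base model name and adapter paths are placeholders, not the paper's actual artifacts.

```python
from peft import LoraConfig, PeftModel, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

# Stage 1: assume a LoRA adapter was already trained on RCMR 280k and saved
# to the (hypothetical) path below; merge it back into the base weights.
stage1 = PeftModel.from_pretrained(base, "checkpoints/rcmr280k-lora")
base = stage1.merge_and_unload()      # stage-1 knowledge now lives in the weights

# Stage 2: a brand-new LoRA is attached and trained on SciClaim on top of
# the merged model; only the new adapter's parameters are trainable.
stage2_cfg = LoraConfig(r=16, lora_alpha=32, task_type="CAUSAL_LM",
                        target_modules=["q_proj", "v_proj"])
model = get_peft_model(base, stage2_cfg)
model.print_trainable_parameters()
```

Merging first avoids stacking two adapters at inference time and lets the second LoRA specialize purely on the small SciClaim data.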
Funding: Supported by the National Natural Science Foundation of China (Nos. 22001201 and 22075224) and the Science and Technology Agency of Shaanxi Province (No. 2022KWZ-21).
Abstract: Enriching the library of chiral plasmonic structures is of significant importance in advancing their applicability across diverse domains such as biosensing, nanophotonics, and catalysis. Here, employing triangular nanoplates as growth seeds, we synthesized a novel class of chiral-shaped plasmonic nanostructures through a wet-chemical strategy with dipeptides as chiral inducers, including chiral tri-blade boomerangs, concave rhombic dodecahedrons, and nanoflowers. The structural diversity of the chiral plasmonic nanostructures was elucidated through their continuous morphological evolution from two-dimensional to three-dimensional architectures. Fine-tuning of the chiroptical properties was achieved by precisely manipulating crucial synthetic parameters, such as the amounts of chiral molecules, seeds, and gold precursor, that significantly influence chiral structure formation. The findings provide a promising avenue for enriching chiral materials with highly sophisticated structures, facilitating a fundamental understanding of the relationship between structural nuances and chiroptical properties.
Funding: Supported by the National Natural Science Foundation of China under Grant No. U1866602.
Abstract: Deep convolutional neural networks with high performance are hard to deploy in many real-world applications, since the computing resources of edge devices such as smartphones or embedded GPUs are limited. To alleviate this hardware limitation, compressing deep neural networks on the model side becomes important. As one of the most popular methods in the spotlight, channel pruning can effectively remove redundant convolutional channels from a CNN (convolutional neural network) without noticeably degrading the network's performance. Existing methods focus on pruning design, i.e., evaluating the importance of different convolutional filters in the CNN model; a fast and effective fine-tuning method to restore accuracy is urgently needed. In this paper, we propose KDFT (Knowledge Distillation based Fine-Tuning), a fine-tuning method that improves the accuracy of fine-tuned models with almost negligible training overhead by introducing knowledge distillation. Extensive experimental results on benchmark datasets with representative CNN models show that up to 4.86% accuracy improvement and 79% time saving can be obtained.
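The core idea — restoring a pruned network's accuracy by distilling from its unpruned parent during fine-tuning — can be sketched as below. The temperature, mixing weight, and loss form are standard distillation choices used for illustration, not KDFT's reported settings.

```python
import torch
import torch.nn.functional as F

def kd_finetune_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    # Hard-label cross-entropy plus a softened KL term against the teacher.
    hard = F.cross_entropy(student_logits, labels)
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)                     # rescale gradients as in Hinton et al.
    return alpha * soft + (1 - alpha) * hard

def kd_step(student, teacher, optimizer, images, labels):
    # One fine-tuning step: the original, unpruned network acts as teacher,
    # and the channel-pruned network being restored acts as student.
    with torch.no_grad():
        t_logits = teacher(images)
    s_logits = student(images)
    loss = kd_finetune_loss(s_logits, t_logits, labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```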
Funding: This work was financially supported by the National Natural Science Foundation of China (Grant Nos. 31772242, 31972348, and 31672092).
Abstract: A dramatic decrease in sugar uptake is a general phenomenon in Streptomyces at stationary phase, when antibiotics are extensively produced. Milbemycins, produced by Streptomyces bingchenggensis, are a group of valuable macrolide biopesticides, but their low yield and titer impede broad agricultural application. Since inadequate sugar uptake generally hinders titer improvement of desired products, we mined the underlying sugar uptake systems and fine-tuned their expression in this work. First, we screened candidates at both the genomic and transcriptomic levels in S. bingchenggensis. Then, two ATP-binding cassette transporters, named TP2 and TP5, were characterized and shown to significantly improve milbemycin titer and yield. Next, appropriate native temporal promoters were selected and used to tune the expression of TP2 and TP5, increasing the maximal milbemycin A3/A4 titer by 36.9% to 3321 mg/L. Finally, TP2 and TP5 were fine-tuned in two other macrolide biopesticide producers, Streptomyces avermitilis and Streptomyces cyaneogriseus, leading to maximal titer improvements of 34.1% and 52.6% for avermectin B1a and nemadectin, respectively. This work provides useful transporter tools and a corresponding engineering strategy for Streptomyces.
Funding: Supported by the National Key Research and Development Program of China under Grant No. 2020AAA0106700 and the National Natural Science Foundation of China under Grant No. 62022027.
Abstract: Fine-tuning pre-trained language models like BERT has become an effective approach in natural language processing (NLP) and yields state-of-the-art results on many downstream tasks. Recent studies on adapting BERT to new tasks mainly focus on modifying the model structure, re-designing the pre-training tasks, and leveraging external data and knowledge; the fine-tuning strategy itself has yet to be fully explored. In this paper, we improve the fine-tuning of BERT with two effective mechanisms: self-ensemble and self-distillation. The self-ensemble mechanism builds the teacher model from checkpoints stored in an experience pool. To transfer knowledge from the teacher model to the student model efficiently, we further use knowledge distillation, called self-distillation because the distilled knowledge comes from the model itself across the time dimension. Experiments on the GLUE benchmark and the Text Classification benchmark show that our proposed approach significantly improves the adaptation of BERT without any external data or knowledge. We conduct exhaustive experiments to investigate the efficiency of the self-ensemble and self-distillation mechanisms, and our approach achieves a new state-of-the-art result on the SNLI dataset.
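A minimal sketch of the two mechanisms follows: the teacher is a parameter average over recent checkpoints of the model itself (self-ensemble), and the training loss adds a distillation term against that teacher (self-distillation). The pool size, the MSE distillation form, and the loss weight are illustrative assumptions, not the paper's exact configuration.

```python
import copy
import torch
import torch.nn.functional as F

class SelfEnsembleTeacher:
    """Teacher built by averaging recent checkpoints from an experience pool."""
    def __init__(self, model, pool_size=5):
        self.pool = []
        self.pool_size = pool_size
        self.teacher = copy.deepcopy(model).eval()

    def update(self, model):
        self.pool.append(copy.deepcopy(model.state_dict()))
        self.pool = self.pool[-self.pool_size:]
        avg = {}
        for k, v in self.pool[0].items():
            if v.is_floating_point():
                avg[k] = torch.stack([sd[k] for sd in self.pool]).mean(0)
            else:
                avg[k] = self.pool[-1][k]   # keep integer buffers as-is
        self.teacher.load_state_dict(avg)

    @torch.no_grad()
    def logits(self, **inputs):
        return self.teacher(**inputs).logits

def self_distill_loss(student_logits, teacher_logits, labels, lam=1.0):
    ce = F.cross_entropy(student_logits, labels)
    kd = F.mse_loss(student_logits, teacher_logits)   # distill from past self
    return ce + lam * kd
```

Because the teacher is rebuilt from the student's own trajectory, no external model or data is required, which matches the mechanism described in the abstract.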