Journal Articles
6 articles found
1. UniTrans: Unified Parameter-Efficient Transfer Learning and Multimodal Alignment for Large Multimodal Foundation Model
Authors: Jiakang Sun, Ke Chen, Xinyang He, Xu Liu, Ke Li, Cheng Peng. Computers, Materials & Continua, 2025, Issue 4, pp. 219-238 (20 pages)
With the advancements in parameter-efficient transfer learning techniques, it has become feasible to leverage large pre-trained language models for downstream tasks under low-cost and low-resource conditions. However, applying this technique to multimodal knowledge transfer introduces a significant challenge: ensuring alignment across modalities while minimizing the number of additional parameters required for downstream task adaptation. This paper introduces UniTrans, a framework aimed at facilitating efficient knowledge transfer across multiple modalities. UniTrans leverages Vector-based Cross-modal Random Matrix Adaptation to enable fine-tuning with minimal parameter overhead. To further enhance modality alignment, we introduce two key components: the Multimodal Consistency Alignment Module and the Query-Augmentation Side Network, specifically optimized for scenarios with extremely limited trainable parameters. Extensive evaluations on various cross-modal downstream tasks demonstrate that our approach surpasses state-of-the-art methods while using just 5% of their trainable parameters. Additionally, it achieves superior performance compared to fully fine-tuned models on certain benchmarks.
Keywords: parameter-efficient transfer learning; multimodal alignment; image captioning; image-text retrieval; visual question answering
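The abstract does not define Vector-based Cross-modal Random Matrix Adaptation in detail, but the name points at a VeRA-style scheme: frozen, shared random matrices combined with tiny trainable scaling vectors. Below is a minimal PyTorch sketch of that general mechanism; the class name, dimensions, and initialization are illustrative assumptions, not UniTrans's actual implementation.

```python
import torch
import torch.nn as nn

class RandomMatrixAdapter(nn.Module):
    """VeRA-style adapter sketch (illustrative, not the paper's code): the
    pretrained linear layer and a pair of shared random matrices (A, B) stay
    frozen; only two small scaling vectors (d, b) are trained, so each adapted
    layer adds r + d_out parameters instead of r * (d_in + d_out)."""

    def __init__(self, base: nn.Linear, A: torch.Tensor, B: torch.Tensor):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False       # freeze pretrained weights
        self.register_buffer("A", A)      # (r, d_in), shared and frozen
        self.register_buffer("B", B)      # (d_out, r), shared and frozen
        self.d = nn.Parameter(torch.ones(A.size(0)))   # scales rank dims
        self.b = nn.Parameter(torch.zeros(B.size(0)))  # scales outputs

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        delta = ((x @ self.A.T) * self.d) @ self.B.T * self.b
        return self.base(x) + delta

# One pair of random matrices can be shared by every adapted layer.
r, d_in, d_out = 8, 768, 768
A = torch.randn(r, d_in) / d_in ** 0.5
B = torch.randn(d_out, r) / r ** 0.5
layer = RandomMatrixAdapter(nn.Linear(d_in, d_out), A, B)
y = layer(torch.randn(2, d_in))
```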
2. Abnormal Action Detection Based on Parameter-Efficient Transfer Learning in Laboratory Scenarios
Authors: Changyu Liu, Hao Huang, Guogang Huang, Chunyin Wu, Yingqi Liang. Computers, Materials & Continua (SCIE, EI), 2024, Issue 9, pp. 4219-4242 (24 pages)
Laboratory safety is a critical area of broad societal concern, particularly in the detection of abnormal actions. To enhance the efficiency and accuracy of detecting such actions, this paper introduces a novel method called TubeRAPT (Tubelet Transformer based on Adapter and Prefix Training Module). This method primarily comprises three key components: the TubeR network, an adaptive clustering attention mechanism, and a prefix training module. These components work in synergy to address the challenge of knowledge preservation in models pretrained on large datasets while maintaining training efficiency. The TubeR network serves as the backbone for spatio-temporal feature extraction, while the adaptive clustering attention mechanism refines the focus on relevant information. The prefix training module facilitates efficient fine-tuning and knowledge transfer. Experimental results demonstrate the effectiveness of TubeRAPT, achieving a 68.44% mean Average Precision (mAP) on the CLA (Crazy Lab Activity) small-scale dataset, marking a significant improvement of 1.53% over the previous TubeR method. This research not only showcases the potential applications of TubeRAPT in the field of abnormal action detection but also offers innovative ideas and technical support for the future development of laboratory safety monitoring technologies. The proposed method has implications for improving safety management systems in various laboratory environments, potentially reducing accidents and enhancing overall workplace safety.
Keywords: parameter-efficient transfer learning; laboratory scenarios; TubeRAPT; abnormal action detection
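As a rough illustration of the prefix-training idea (learnable prefix tokens prepended to a frozen attention layer's keys and values), here is a minimal PyTorch sketch. It is not the TubeR/TubeRAPT architecture itself; routing the prefixes through the frozen key/value projections is a simplification, and all names and sizes are assumed.

```python
import torch
import torch.nn as nn

class PrefixedAttention(nn.Module):
    """Prefix-tuning sketch: learnable prefix tokens are prepended to the
    key/value inputs of a frozen self-attention layer, so only the prefix
    parameters (2 * prefix_len * dim values) are updated in fine-tuning."""

    def __init__(self, dim: int, num_heads: int, prefix_len: int = 10):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        for p in self.attn.parameters():
            p.requires_grad = False        # pretrained weights stay fixed
        self.prefix_k = nn.Parameter(torch.randn(prefix_len, dim) * 0.02)
        self.prefix_v = nn.Parameter(torch.randn(prefix_len, dim) * 0.02)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b = x.size(0)
        k = torch.cat([self.prefix_k.expand(b, -1, -1), x], dim=1)
        v = torch.cat([self.prefix_v.expand(b, -1, -1), x], dim=1)
        out, _ = self.attn(x, k, v, need_weights=False)
        return out

tokens = torch.randn(2, 196, 256)          # (batch, tokens, dim)
out = PrefixedAttention(dim=256, num_heads=8)(tokens)
```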
3. Optimizing Fine-Tuning in Quantized Language Models: An In-Depth Analysis of Key Variables
Authors: Ao Shen, Zhiquan Lai, Dongsheng Li, Xiaoyu Hu. Computers, Materials & Continua (SCIE, EI), 2025, Issue 1, pp. 307-325 (19 pages)
Large-scale Language Models (LLMs) have achieved significant breakthroughs in Natural Language Processing (NLP), driven by the pre-training and fine-tuning paradigm. While this approach allows models to specialize in specific tasks with reduced training costs, the substantial memory requirements during fine-tuning present a barrier to broader deployment. Parameter-Efficient Fine-Tuning (PEFT) techniques, such as Low-Rank Adaptation (LoRA), and parameter quantization methods have emerged as solutions to these challenges by optimizing memory usage and computational efficiency. Among these, QLoRA, which combines PEFT and quantization, has demonstrated notable success in reducing memory footprints during fine-tuning, prompting the development of various QLoRA variants. Despite these advancements, the quantitative impact of key variables on the fine-tuning performance of quantized LLMs remains underexplored. This study presents a comprehensive analysis of these key variables, focusing on their influence across different layer types and depths within LLM architectures. Our investigation uncovers several critical findings: (1) larger layers, such as MLP layers, can maintain performance despite reductions in adapter rank, while smaller layers, like self-attention layers, are more sensitive to such changes; (2) the effectiveness of balancing factors depends more on specific values than on layer type or depth; (3) in quantization-aware fine-tuning, larger layers can effectively utilize smaller adapters, whereas smaller layers struggle to do so. These insights suggest that layer type is a more significant determinant of fine-tuning success than layer depth when optimizing quantized LLMs. Moreover, for the same reduction in trainable parameters, shrinking the trainable parameters of a larger layer preserves fine-tuning accuracy better than shrinking those of a smaller one. This study provides valuable guidance for more efficient fine-tuning strategies and opens avenues for further research into optimizing LLM fine-tuning in resource-constrained environments.
Keywords: Large-scale Language Model; parameter-efficient fine-tuning; parameter quantization; key variables; trainable parameters; experimental analysis
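Findings (1) and (3) amount to a rank-allocation policy: spend adapter rank on the smaller self-attention projections and economize on the large MLP layers. A hedged sketch of plain LoRA with such a layer-type-dependent rank follows; the helper name, rank values, and layer dimensions are illustrative, not taken from the study.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Standard LoRA wrapper: y = W x + (alpha / r) * B A x, with the
    pretrained W frozen and only A, B trainable."""

    def __init__(self, base: nn.Linear, r: int, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

def pick_rank(layer_name: str) -> int:
    # Illustrative policy following the paper's finding: large MLP layers
    # tolerate small ranks; the smaller attention projections get more.
    return 4 if "mlp" in layer_name else 16

mlp_proj = LoRALinear(nn.Linear(512, 2048), r=pick_rank("mlp.up_proj"))
attn_q   = LoRALinear(nn.Linear(512, 512),  r=pick_rank("self_attn.q_proj"))
```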
4. AdaptForever: Elastic and Mutual Learning for Continuous NLP Task Mastery
Authors: Ke Chen, Cheng Peng, Xinyang He, Jiakang Sun, Xu Liu, Xiaolin Qin, Yong Zhong. Computers, Materials & Continua, 2025, Issue 3, pp. 4003-4019 (17 pages)
In natural language processing (NLP), managing multiple downstream tasks through fine-tuning pre-trained models often requires maintaining separate task-specific models, leading to practical inefficiencies. To address this challenge, we introduce AdaptForever, a novel approach that enables continuous mastery of NLP tasks through the integration of elastic and mutual learning strategies with a stochastic expert mechanism. Our method freezes the pre-trained model weights while incorporating adapters enhanced with mutual learning capabilities, facilitating effective knowledge transfer from previous tasks to new ones. By combining Elastic Weight Consolidation (EWC) for knowledge preservation with specialized regularization terms, AdaptForever successfully maintains performance on earlier tasks while acquiring new capabilities. Experimental results demonstrate that AdaptForever achieves superior performance across a continuous sequence of NLP tasks compared to existing parameter-efficient methods, while effectively preventing catastrophic forgetting and enabling positive knowledge transfer between tasks.
Keywords: adapter-tuning; large language model; pre-trained language model; parameter-efficient fine-tuning; continual learning; mutual learning; mixture of experts
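For context on the EWC term above: the regularizer penalizes movement of parameters that carried high Fisher information on earlier tasks. A minimal sketch follows, assuming diagonal Fisher estimates and a parameter snapshot saved after the previous task; AdaptForever's additional specialized regularization terms and mutual-learning machinery are not shown.

```python
import torch
import torch.nn as nn

def ewc_penalty(model: nn.Module,
                fisher: dict,       # name -> diagonal Fisher estimate
                old_params: dict,   # name -> parameter snapshot (old task)
                lam: float = 0.4) -> torch.Tensor:
    """EWC regularizer: quadratic penalty on moving parameters that were
    important (high Fisher information) for previously learned tasks."""
    loss = torch.zeros((), device=next(model.parameters()).device)
    for name, p in model.named_parameters():
        if p.requires_grad and name in fisher:
            loss = loss + (fisher[name] * (p - old_params[name]) ** 2).sum()
    return 0.5 * lam * loss

# During training on a new task:
#   total_loss = task_loss + ewc_penalty(model, fisher, old_params)
```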
5. Collaborative Knowledge Infusion for Low-Resource Stance Detection (Cited by 1)
Authors: Ming Yan, Joey Tianyi Zhou, Ivor W. Tsang. Big Data Mining and Analytics (EI, CSCD), 2024, Issue 3, pp. 682-698 (17 pages)
Stance detection identifies the view expressed towards a specific target in a given context (e.g., tweets or commercial reviews). Target-related knowledge is often needed to help stance detection models understand the target well and make correct predictions. However, prevailing knowledge-infused stance detection methods predominantly incorporate target knowledge from a single source, without knowledge verification and with limited domain coverage. Low-resource training data further increase the challenge for data-driven large models in this task. To address these challenges, we propose a collaborative knowledge infusion approach for low-resource stance detection, employing a combination of aligned knowledge enhancement and efficient parameter learning techniques. Specifically, our stance detection approach leverages target background knowledge collaboratively from different knowledge sources with the help of knowledge alignment. Additionally, we introduce a parameter-efficient collaborative adaptor with a staged optimization algorithm, which addresses the challenges of low-resource stance detection from both the network-structure and the learning perspectives. To assess the effectiveness of our method, we conduct extensive experiments on three public stance detection datasets, including low-resource and cross-target settings. The results demonstrate significant performance improvements over existing stance detection approaches.
Keywords: parameter-efficient learning; low-resource stance detection; knowledge infusion
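The staged optimization algorithm is not specified in the abstract; one plausible reading is that training alternates which parameter group is trainable while the backbone stays frozen. The sketch below is written under that assumption, with hypothetical module names and stage ordering.

```python
import torch.nn as nn

def set_trainable(module: nn.Module, flag: bool) -> None:
    """Toggle gradient updates for every parameter of a module."""
    for p in module.parameters():
        p.requires_grad = flag

def staged_optimization(backbone: nn.Module,
                        aligner: nn.Module,
                        adaptor: nn.Module,
                        train_one_stage) -> None:
    """Hypothetical two-stage schedule: the frozen backbone never updates;
    stage 1 fits the knowledge-alignment module, stage 2 the adaptor."""
    set_trainable(backbone, False)

    set_trainable(aligner, True)       # stage 1: align knowledge sources
    set_trainable(adaptor, False)
    train_one_stage(aligner)

    set_trainable(aligner, False)      # stage 2: fit the collaborative adaptor
    set_trainable(adaptor, True)
    train_one_stage(adaptor)
```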
6. VAGen: Waterbody Segmentation with Prompting for Visual In-Context Learning
Authors: Jiapei Zhao, Nobuyoshi Yabuki, Tomohiro Fukuda. AI in Civil Engineering, 2024, Issue 1, pp. 1-20 (20 pages)
Effective water management and flood prevention are critical challenges for both urban and rural areas, necessitating precise and prompt monitoring of waterbodies. As a fundamental step in the monitoring process, waterbody segmentation involves precisely delineating waterbody boundaries from imagery. Previous research using satellite images often lacks the resolution and contextual detail needed for local-scale analysis. This study addresses these challenges by leveraging common natural images, which are more easily accessible and provide higher resolution and richer context than satellite images. However, segmenting waterbodies from ordinary images faces several obstacles, including variations in lighting, occlusions from objects like trees and buildings, and reflections on the water surface, all of which can mislead algorithms. Additionally, the diverse shapes and textures of waterbodies, alongside complex backgrounds, further complicate the task. While large-scale vision models pre-trained on large datasets are typically valued for their generalizability across downstream tasks, their application to waterbody segmentation from ground-level images remains underexplored. Hence, this research proposes the Visual Aquatic Generalist (VAGen) as a countermeasure. A lightweight model for waterbody segmentation inspired by visual In-Context Learning (ICL) and Visual Prompting (VP), VAGen refines large visual models by adding learnable perturbations that enhance the quality of prompts in ICL. Experimental results show that VAGen improves the mean Intersection over Union (mIoU) metric by 22.38% over the baseline model without learnable prompts, and surpasses current state-of-the-art (SOTA) task-specific waterbody segmentation models by 6.20%. The performance evaluation and analysis of VAGen indicate its capacity to substantially reduce the number of trainable parameters and the computational overhead, and demonstrate its feasibility for deployment on cost-limited devices, including unmanned aerial vehicles (UAVs) and mobile computing platforms. This study thereby makes a valuable contribution to the field of computer vision, offering practical solutions for engineering applications related to urban flood monitoring, agricultural water resource management, and environmental conservation efforts.
Keywords: visual in-context learning; visual prompting; vision foundation model; parameter-efficient fine-tuning; waterbody segmentation; deep learning
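The core mechanism, adding a learnable perturbation to the prompt images used for visual in-context learning, can be sketched as follows. The perturbation shape, the clamping, and the use of a single shared tensor are assumptions for illustration, not VAGen's published design.

```python
import torch
import torch.nn as nn

class LearnablePromptPerturbation(nn.Module):
    """Visual-prompting sketch: a single learnable tensor is added to every
    prompt image before it enters a frozen large vision model, so only
    delta (C*H*W values) is optimized."""

    def __init__(self, c: int = 3, h: int = 224, w: int = 224):
        super().__init__()
        self.delta = nn.Parameter(torch.zeros(c, h, w))

    def forward(self, prompt_img: torch.Tensor) -> torch.Tensor:
        # Keep perturbed pixels in a valid normalized range.
        return (prompt_img + self.delta).clamp(0.0, 1.0)

perturb = LearnablePromptPerturbation()
prompt = torch.rand(4, 3, 224, 224)    # in-context example images
enhanced_prompt = perturb(prompt)      # fed to the frozen ICL model
```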