Lightweighting Large Models: A Review of Model Compression Techniques


Abstract: With the rapid development of deep learning, neural network models have achieved remarkable performance. However, their large scale and high computational demands still limit widespread deployment. Model compression techniques have therefore emerged, aiming to reduce computational complexity, memory usage, and energy overhead so that models meet practical deployment needs without sacrificing performance. This paper provides a systematic overview of recent advances in model compression, with particular emphasis on large-scale models such as vision-language models (VLMs) and large language models (LLMs). We first clarify the core concepts and objectives of model compression, then outline persistent challenges, including limited adaptability across different model variants and the trade-off between efficiency and accuracy. We examine five mainstream compression methods in detail: pruning, quantization, knowledge distillation, low-rank factorization, and parameter sharing. For each method, we analyze the guiding principles, strengths, weaknesses, and representative application scenarios. We also present a comparative analysis of trade-offs among compression ratio, accuracy retention, and computational efficiency. Moreover, we review commonly used benchmarks such as ImageNet, WikiText, GLUE, and MMLU, along with metrics for evaluating effectiveness in both vision and language tasks. Finally, we outline promising future directions, including automated pipelines, hybrid strategies, hardware-aware optimization, and cross-domain adaptability. This survey provides a comprehensive overview and roadmap for advancing model compression research.

Authors: Haoran Guan, Junhao Lv, Xi Chen, Lei Qi

Affiliation: Southeast University

Source: Data Intelligence (English-language journal), 2026, Issue 1, pp. 92-136 (45 pages)

Keywords: Model compression; Deep learning; Large language models; Literature review

Classification number: TP18 [Automation and Computer Technology]
