Abstract
Starting from the latest technological trends in large models, this paper first analyzes the architectural characteristics and computing power demands of multimodal, long-sequence, and mixture-of-experts models. It then focuses on the requirements these models place on massive computing scale and complex communication patterns, and quantitatively analyzes the development problems and technical challenges facing current large-model computing infrastructure from two aspects: computing power utilization efficiency and cluster interconnection technology. Finally, it proposes a development path for high-quality computing infrastructure that is application-oriented, system-centered, and efficiency-targeted.
Authors
张政
冯少飞
ZHANG Zheng; FENG Shaofei (IEIT SYSTEMS Co., Ltd., Beijing 100089, China)
Source
《信息通信技术与政策》
2024, Issue 6, pp. 2-9 (8 pages)
Information and Communications Technology and Policy
Keywords
multimodal model
long sequence model
mixture of experts model
computing power utilization efficiency
cluster interconnection
high-quality computing power