INTRODUCTION
In recent years, the development of large-scale foundation models (LFMs) has made great advances. However, high training costs and computational demands have long been a bottleneck for the widespread adoption of this technology. With technological advancements, this situation is undergoing a fundamental transformation. The recent release of DeepSeek-V3 [1] has sparked extensive discussion. Through innovative architectural design and efficient training strategies, it has significantly reduced training costs while achieving performance comparable to top-tier closed-source models. The pre-training cost of DeepSeek-V3 is only $5.576 million, far lower than the hundreds of millions of dollars required for models like GPT-4. As shown in Figure 1, this breakthrough not only marks the democratization of LFM technology but also opens up opportunities for more small- and medium-sized enterprises and research institutions to participate in AI innovation. In the future, LFMs will no longer be a game for the few.
Funding: Supported by the National Natural Science Foundation of China under grant nos. 62206266 and 62372430 and by the Youth Innovation Promotion Association CAS (no. 2023112).