Funding: Supported by the National Natural Science Foundation of China (62233005, 62293502, U2441245, 62176185, U23B2057, 62306112); the STCSM Science and Technology Innovation Action Plan Computational Biology Program (24JS2830400); the State Key Laboratory of Industrial Control Technology, China (ICT2024A22); the Shanghai Sailing Program (23YF1409400); and the National Science and Technology Major Project (2024ZD0532403).
Abstract: DeepSeek, a Chinese artificial intelligence (AI) startup, has released its V3 and R1 series models, which have attracted global attention for their low cost, high performance, and open-source availability. This paper begins by reviewing the evolution of large AI models, focusing on paradigm shifts, the mainstream large language model (LLM) paradigm, and the DeepSeek paradigm. It then highlights novel algorithms introduced by DeepSeek, including multi-head latent attention (MLA), mixture-of-experts (MoE), multi-token prediction (MTP), and group relative policy optimization (GRPO). The paper next explores DeepSeek's engineering breakthroughs in LLM scaling, training, inference, and system-level optimization architecture. It also analyzes the impact of DeepSeek models on the competitive AI landscape, comparing them with mainstream LLMs across various fields. Finally, the paper reflects on the insights gained from DeepSeek's innovations and discusses future trends in the technical and engineering development of large AI models, particularly in data, training, and reasoning.
Funding: Supported by the National Natural Science Foundation of China (Grant 62293545) and the Shenzhen Science and Technology Program (Grant ZDSYS20220323112000001).
Abstract: With the rapid development of large AI models, large decision models have further pushed past the limits of human cognition and driven innovation in decision-making paradigms across fields such as medicine and transportation. In this paper, we systematically review intelligent decision-making technology driven by large AI models and its prospects. Specifically, we first review the development of large AI models in recent years. Then, from the perspective of methods, we introduce important theories and technologies of large decision models, such as model architecture and model adaptation. Next, from the perspective of applications, we introduce cutting-edge applications of large decision models in various fields, such as autonomous driving and knowledge decision-making. Finally, we discuss existing challenges, such as security issues, decision bias, and hallucination, as well as future prospects, from the perspectives of both technology development and domain applications. We hope this review can help researchers understand the important progress of intelligent decision-making driven by large AI models.
Funding: Supported in part by the National Natural Science Foundation of China (Nos. 62406207 and 62476224); the Project of Basic Scientific Research of Central Universities of China (No. J2023-026); the project of Science and Technology Department in Sichuan Province (No. 25QNJJ5597); the Science and Technology Project of the Tibet Autonomous Region (No. XZ202401ZY0016); and the Project of Sichuan Province Engineering Technology Research Center of General Aircraft Maintenance (No. GAMRC2023YB06).
Abstract: In recent years, large-scale artificial intelligence (AI) models have become a focal point in technology, attracting widespread attention and acclaim. Notable examples include Google's BERT and OpenAI's GPT, which have scaled their parameter sizes to hundreds of billions or even tens of trillions. This growth has been accompanied by a significant increase in the amount of training data, substantially improving the capabilities and performance of these models. Unlike previous reviews, this paper provides a comprehensive discussion of the algorithmic principles of large-scale AI models and their industrial applications from multiple perspectives. We first outline the evolutionary history of these models, highlighting milestone algorithms while exploring their underlying principles and core technologies. We then evaluate the challenges and limitations of large-scale AI models, including computational resource requirements, model parameter inflation, data privacy concerns, and specific issues related to multi-modal AI models, such as reliance on text-image pairs, inconsistencies in understanding and generation capabilities, and the lack of true "multi-modality". Various industrial applications of these models are also presented. Finally, we discuss future trends, predicting further expansion of model scale and the development of cross-modal fusion. This study provides valuable insights to inform and inspire future research and practice.
Abstract: In recent years, the rapid advancement of artificial intelligence (AI) has fostered deep integration between large AI models and robotic technology. Robots such as robotic dogs capable of carrying heavy loads over mountainous terrain or performing waste disposal tasks, and humanoid robots that can execute high-precision component installations, have gradually come into public view, raising expectations for embodied intelligent robots.
Funding: Funded by the Chongqing Water Resources Bureau, China (Project No. CQS24C00836).
Abstract: We propose an AI-assisted framework for integrated natural disaster prevention and emergency response, leveraging the DeepSeek large language model (LLM) to advance intelligent decision-making in geohazard management. We systematically analyze the technical pathways for deploying LLMs in disaster scenarios, emphasizing three breakthrough directions: (1) knowledge graph-driven dynamic risk modeling, (2) reinforcement learning-optimized emergency decision systems, and (3) secure local deployment architectures. The DeepSeek model demonstrates unique advantages through its hybrid reasoning mechanism combining semantic analysis with geospatial pattern recognition, enabling cost-effective processing of multi-source data spanning historical disaster records, real-time IoT sensor feeds, and socio-environmental parameters. A modular system architecture is designed to achieve three critical objectives: (a) automated construction of domain-specific knowledge graphs through unsupervised learning of disaster physics relationships, (b) scenario-adaptive resource allocation using risk simulations, and (c) preserving emergency coordination via federated learning across distributed response nodes. The proposed local deployment paradigm addresses critical data security concerns in cross-border disaster management while complying with the FAIR principles (Findable, Accessible, Interoperable, Reusable) for geoscientific data governance. This work establishes a methodological foundation for next-generation AI-earth science convergence in disaster mitigation.
Funding: Supported by the National Key Research and Development Program of China (Grant No. 2020YFA0608000) and the National Natural Science Foundation of China (Grant No. 42030605).
Abstract: The rapid advancement of artificial intelligence technologies, particularly in recent years, has led to the emergence of several large-parameter artificial intelligence weather forecast models. These models represent a significant breakthrough, overcoming the limitations of traditional numerical weather prediction models and pointing to the emergence of tools with profound potential for atmosphere-ocean forecasts. This study explores the evolution of these advanced artificial intelligence forecast models and, based on the identified commonalities, proposes the "Three Large Rules" for large weather forecast models: a large number of parameters, a large number of predictands, and large potential applications. We discuss the capacity of artificial intelligence to revolutionize numerical weather prediction, briefly outlining the underlying reasons for the significant improvement in weather forecasting. While acknowledging the high accuracy, computational efficiency, and ease of deployment of large artificial intelligence forecast models, we also emphasize the irreplaceable value of traditional numerical forecasts and explore the challenges in the future development of large-scale artificial intelligence atmosphere-ocean forecast models. We believe that the optimal future of atmosphere-ocean weather forecasting lies in achieving a seamless integration of artificial intelligence and traditional numerical models. Such a synthesis is anticipated to offer a more advanced and reliable approach for improved atmosphere-ocean forecasts. Finally, we illustrate how forecasters can leverage large weather forecast models through an example of building an artificial intelligence model for global ocean wave forecasting.