The rise of foundation models has brought significant advances to artificial intelligence, especially in reasoning, commonsense understanding, and tool use. These capabilities, when integrated into agent systems, hold great promise for real-world applications such as vision-language navigation (VLN) and vision-language action (VLA). However, deploying such models in practice presents ongoing challenges, particularly in adapting and optimizing them across diverse and changing environments. This letter proposes a parallel deep foundation model (PDFM) framework to support continuous model evolution in cloud-edge-device systems. The framework establishes a co-evolution process between two complementary capabilities: embodied cognition, which reflects the model's grounded understanding and task adaptation in physical systems, and analogical imagination, which enables creative exploration and capacity expansion in virtual environments. Through three core processes (learning and training, experiment and evaluation, and management and control), the system supports iterative refinement and dynamic interaction between virtual and real spaces. This enables general-purpose models to gradually converge toward domain-specific intelligence, supporting long-term, adaptive deployment.
Funding: Supported by the National Natural Science Foundation of China (No. 62303460), the Science and Technology Development Fund of Macao SAR (Nos. 0145/2023/RIA3 and 0093/2023/RIA2), and the Young Elite Scientists Sponsorship Program of the China Association for Science and Technology (No. YESS20220372).