Funding: Supported in part by the Science and Technology Development Fund, Macao Special Administrative Region (SAR) (0145/2023/RIA3), and in part by the DeSciCPI Project from Obuda University, Hungary.
Abstract: Driven by advances in artificial intelligence technologies such as deep learning, core intelligent driving technologies like advanced driver assistance systems (ADAS) have progressed significantly. Some advanced ADAS, particularly in highway scenarios, have matched or even surpassed human drivers in precision and reliability [1]. This mainstream development path is based on a replacement paradigm, whose central goal is to relieve human drivers of monotonous, repetitive tasks such as highway commuting while maximizing traffic efficiency and safety [2]. This paradigm aims to replace error-prone human operators with tireless, consistent machine intelligence.
Funding: Supported by the National Natural Science Foundation of China (No. 62303460), the Science and Technology Development Fund of Macao SAR (Nos. 0145/2023/RIA3 and 0093/2023/RIA2), and the Young Elite Scientists Sponsorship Program of China Association for Science and Technology (No. YESS20220372).
Abstract: The rise of foundation models has brought significant advances to artificial intelligence, especially in reasoning, commonsense understanding, and tool use. These capabilities, when integrated into agent systems, hold great promise for real-world applications such as vision-language navigation (VLN) and vision-language action (VLA). However, deploying such models in practice presents ongoing challenges, particularly in adapting and optimizing them across diverse and changing environments. This letter proposes a parallel deep foundation model (PDFM) framework to support continuous model evolution in cloud-edge-device systems. The framework establishes a co-evolution process between two complementary capabilities: embodied cognition, which reflects the model's grounded understanding and task adaptation in physical systems, and analogical imagination, which enables creative exploration and capacity expansion in virtual environments. Through three core processes (learning and training, experiment and evaluation, and management and control), the system supports iterative refinement and dynamic interaction between virtual and real spaces. This enables general-purpose models to gradually converge toward domain-specific intelligence, supporting long-term, adaptive deployment.
Funding: Supported by the Science and Technology Development Fund, Macao Special Administrative Region (Nos. 0157/2024/RIA2, 0145/2023/RIA3, and 0093/2023/RIA2).
Abstract: The vision-language-action (VLA) paradigm is gradually becoming the core path of embodied intelligence. However, its training and validation, which rely on simulation environments, face serious sim2real challenges, such as navigation deviations in drones caused by wind speed differences between simulated and real-world environments. Existing iterative methods based on digital twins can alleviate the problem of virtual-real alignment to some extent, but their heavy dependence on twin consistency limits their adaptability and scalability in complex environments. To break through this bottleneck, this letter proposes the PiVLA framework, which reconstructs the VLA paradigm with parallel intelligence. Furthermore, we introduce the parallel deep foundation model (PDFM) and, based on it, propose model parallel control (MPC) and the parallel interaction protocol (PIP), establishing a unified interaction mechanism for disembodied and embodied agents. This provides a scalable and robust solution for complex tasks involving embodied intelligence.
Funding: Supported in part by the National Key Research and Development (No. 2021YFB2501200).
Abstract: Virtual simulation testing of Autonomous Vehicles (AVs) is gradually being accepted as a mandatory way to test the feasibility of AV driving strategies. Mainstream methods focus on improving testing efficiency by extracting critical scenarios from naturalistic driving datasets. However, because the criticalities defined in their testing tasks are based on fixed assumptions, the obtained scenarios cannot challenge AVs with different strategies. To fill this gap, we propose an intelligent testing method based on operable testing tasks. We found that the driving behavior of Surrounding Vehicles (SVs) has a critical impact on the AV, which can be used to adjust the testing task difficulty and find more challenging scenarios. To model different driving behaviors, we utilize behavioral utility functions with binary driving strategies. Further, we construct a vehicle interaction model, based on which we theoretically analyze the impact of changing the driving behaviors on the testing task difficulty. Finally, by adjusting the SVs' strategies, we can generate more corner cases when testing different AVs in a finite number of simulations.
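The idea of a behavioral utility function over binary driving strategies can be illustrated with a toy sketch. Everything below is an assumption made for illustration, not the authors' model: the action names, the `aggressiveness` weight, the form of `sv_utility`, and the `task_difficulty` measure are all hypothetical. The sketch only shows how changing an SV's behavior parameter could shift how often the SV contests the AV.

```python
# Toy illustration only: all names and formulas here are hypothetical,
# not taken from the paper summarized above.
YIELD, GO = 0, 1          # binary driving strategies
ACTIONS = (YIELD, GO)     # order matters: ties resolve to YIELD


def sv_utility(aggressiveness: float, sv_action: int, av_action: int) -> float:
    """Assumed SV utility: weighted progress gain minus weighted conflict risk."""
    progress = 1.0 if sv_action == GO else 0.0
    conflict = 1.0 if (sv_action == GO and av_action == GO) else 0.0
    return aggressiveness * progress - (1.0 - aggressiveness) * conflict


def best_response(aggressiveness: float, av_action: int) -> int:
    """SV picks the action with the highest utility against a given AV action."""
    return max(ACTIONS, key=lambda a: sv_utility(aggressiveness, a, av_action))


def task_difficulty(aggressiveness: float) -> float:
    """Assumed difficulty proxy: share of AV actions that the SV answers with GO."""
    contested = sum(best_response(aggressiveness, av) == GO for av in ACTIONS)
    return contested / len(ACTIONS)


if __name__ == "__main__":
    for w in (0.0, 0.2, 0.8):
        print(f"aggressiveness={w}: difficulty={task_difficulty(w)}")
```

In this toy setting, a cautious SV yields whenever the AV goes, while a more aggressive SV contests the AV in both cases, which is one simple way a tester could dial difficulty up or down by changing SV behavior.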