Funding: Supported by the National Natural Science Foundation of China (62376219 and 62006194), the Foundational Research Project in Specialized Discipline (Grant No. G2024WD0146), and the Faculty Construction Project (Grant No. 24GH0201148).
Abstract: Large language models (LLMs) have undergone significant expansion and have been increasingly integrated across various domains. Notably, in the realm of robot task planning, LLMs harness their advanced reasoning and language comprehension capabilities to formulate precise and efficient action plans based on natural language instructions. However, for embodied tasks, where robots interact with complex environments, text-only LLMs often face challenges due to a lack of compatibility with robotic visual perception. This study provides a comprehensive overview of the emerging integration of LLMs and multimodal LLMs into various robotic tasks. Additionally, we propose a framework that utilizes multimodal GPT-4V to enhance embodied task planning through the combination of natural language instructions and robot visual perceptions. Our results, based on diverse datasets, indicate that GPT-4V effectively enhances robot performance in embodied tasks. This extensive survey and evaluation of LLMs and multimodal LLMs across a variety of robotic tasks enriches the understanding of LLM-centric embodied intelligence and provides forward-looking insights towards bridging the gap in human-robot-environment interaction.
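The framework above pairs a natural-language instruction with the robot's camera frame in a single multimodal request. As a minimal sketch (the function name, prompt wording, and model identifier are illustrative assumptions, not the paper's actual implementation), the payload for an OpenAI-style vision-chat API might be assembled like this:

```python
import base64

def build_planning_request(instruction: str, image_bytes: bytes,
                           model: str = "gpt-4-vision-preview") -> dict:
    """Assemble an OpenAI-style multimodal chat payload that pairs a
    natural-language task instruction with the robot's camera frame."""
    image_b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": model,
        "messages": [
            {"role": "system",
             "content": "You are a robot task planner. Produce a numbered "
                        "action plan grounded in the observed scene."},
            {"role": "user",
             "content": [
                 # The text part carries the instruction...
                 {"type": "text", "text": instruction},
                 # ...and the image part carries the visual perception.
                 {"type": "image_url",
                  "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
             ]},
        ],
    }

# The resulting dict can be sent with any OpenAI-compatible client.
req = build_planning_request("Put the apple in the bowl.", b"\xff\xd8fake-jpeg")
```

The key design point is that instruction and observation travel in one user turn, so the model grounds each plan step in the current scene rather than in language alone.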
Abstract: At the start of 2025, DeepSeek sparked a global wave of general artificial intelligence (AI) applications in medicine.1-3 Currently, large language models (LLMs) such as GPT-4o, DeepSeek-R1, Gemini 2.0, Command-R, Claude 3, Qwen, and Grok 3 exhibit distinct characteristics but a shared strength in advanced logical reasoning. This strength has the potential to significantly impact medical decision-making models.
Abstract: With the widespread adoption of RESTful APIs in modern web services, security issues have become increasingly prominent. Existing mainstream API identification and vulnerability detection tools rely on API documentation or public paths for scanning; they are of limited effectiveness at identifying hidden or undocumented APIs and produce high false-positive rates in complex or dynamic API environments. To address these challenges, this paper proposes A2A (Agent to API vulnerability detection), an agent system for hidden-API discovery and vulnerability detection built on agents communicating seamlessly via the Model Context Protocol (MCP), automating the full pipeline from API discovery to vulnerability detection. A2A automatically identifies potential hidden API endpoints through adaptive enumeration and HTTP response analysis, and confirms discovered hidden APIs against service-specific API fingerprint libraries. For vulnerability detection, A2A combines a large language model (LLM) with retrieval-augmented generation (RAG) and an iterative feedback-optimization strategy to automatically generate high-quality test cases that verify whether a vulnerability exists. Experimental results show that A2A achieves an average API discovery rate of 91.9% with a false discovery rate of 7.8%, and successfully uncovers multiple hidden API vulnerabilities that NAUTILUS and RESTler fail to detect.
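The discovery stage described above combines adaptive enumeration with HTTP response analysis. A hedged sketch of the response-analysis step (function names, status-code buckets, and the baseline-404 comparison are illustrative assumptions, not A2A's actual code): each candidate path is probed, and responses that differ from the server's generic not-found behavior are flagged as hidden-API candidates for fingerprint confirmation.

```python
def classify_endpoint(status: int, body: str, not_found_body: str) -> str:
    """Classify one probe response against the server's baseline 404 page.

    Returns "confirmed", "candidate", or "absent".
    """
    if status in (200, 201, 204):
        return "confirmed"   # endpoint exists and responded normally
    if status in (401, 403, 405):
        return "candidate"   # exists but guarded: worth fingerprinting
    if status == 404 and body.strip() != not_found_body.strip():
        return "candidate"   # custom 404 differs from baseline: suspicious
    return "absent"

def enumerate_candidates(probe, wordlist, not_found_body):
    """Probe each candidate path; `probe(path)` returns (status, body)."""
    results = {}
    for path in wordlist:
        status, body = probe(path)
        label = classify_endpoint(status, body, not_found_body)
        if label != "absent":
            results[path] = label
    return results
```

Injecting `probe` as a callable keeps the analysis logic testable offline; in a real deployment it would wrap an HTTP client, and an adaptive wordlist would grow from paths observed in earlier responses.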
Abstract: 2024 was a milestone year for new artificial intelligence (AI), with a series of breakthrough advances in AI science. This article reviews the hot topics and trends in the AI field in 2024 across large models and embodied intelligence, Artificial Intelligence Generated Content (AIGC), AI agents, AI-driven scientific research (AI for Science, AI4S) and AI-related scientific research (Science for AI, S4AI), and AI-related policies and platforms. With the rapid progress of large models, agents, and related technologies, AI's application domains keep expanding and bring fresh impacts to every industry; policy and platform construction continues to improve, and new quality productive forces show ever greater development potential. Around autonomous-intelligence "new AI", more milestone AI work can be expected.
Funding: A result of the Jiangsu Provincial Social Science Foundation general project "Research on Agile Governance of Social Media AIGC Risks in Human-Machine Hybrid Information Environments" (Project No. 25XZB001), supported by the Fundamental Research Funds for the Central Universities, and of the first batch of special projects under Nanjing University's AI for Humanities and Social Sciences (AI for HASS) interdisciplinary research program, "Research on Risk Identification and Intelligent Governance of Online Information Content in Generative AI Environments" (Project No. 2025300123).