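Abstract: Existing wildlife recognition methods rely mainly on static datasets and adapt poorly to dynamic species migration and to newly added categories, which leads to low monitoring efficiency. To address this problem, a multi-granularity prompt-driven method for wildlife recognition (MGP-WILD) is proposed. A cloud-hosted large language model generates hierarchical semantic descriptions (coarse-grained biological taxonomy plus fine-grained morphological features), and edge nodes collaboratively maintain a dynamic knowledge table. Specifically, MGP-WILD uses the large language model to generate multi-granularity text prompts. Compared with traditional single-granularity prompt methods, this work deeply fuses coarse- and fine-grained features through multi-granularity semantic descriptions and, combined with the cross-modal alignment capability of vision-language models, achieves accurate zero-shot recognition. Experimental results show that the method yields substantial improvements on multiple datasets and adapts particularly well in open-set recognition tasks. The system has been deployed for wildlife habitat protection in Qinghai, where an animal image dataset based on real-world scenes was built, providing a new technical paradigm for biodiversity conservation in ecologically fragile regions. The code and part of the dataset will be released on GitHub.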
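The multi-granularity prompting idea in the MGP-WILD abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the species labels, the three-dimensional toy embeddings, the `classify` helper, and the weighted-sum fusion of coarse and fine similarities. The abstract does not say how MGP-WILD fuses the two granularities, and a real system would obtain the vectors from a vision-language model's image and text encoders.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def classify(image_vec, prompt_table, coarse_weight=0.5):
    """Score each label by blending coarse and fine prompt similarities.

    The weighted sum is an illustrative fusion rule (an assumption),
    not the fusion used by MGP-WILD.
    """
    scores = {}
    for label, prompts in prompt_table.items():
        coarse = cosine(image_vec, prompts["coarse"])  # taxonomy-level prompt
        fine = cosine(image_vec, prompts["fine"])      # morphology-level prompt
        scores[label] = coarse_weight * coarse + (1.0 - coarse_weight) * fine
    best = max(scores, key=scores.get)
    return best, scores

# Hypothetical knowledge table: one coarse and one fine prompt embedding per
# label. In practice these would be text-encoder outputs for LLM-written prompts.
prompt_table = {
    "snow leopard": {"coarse": [1.0, 0.0, 0.2], "fine": [0.9, 0.1, 0.8]},
    "Tibetan antelope": {"coarse": [0.0, 1.0, 0.2], "fine": [0.1, 0.9, 0.1]},
}

image_vec = [0.8, 0.1, 0.6]  # toy stand-in for an image-encoder output
best, scores = classify(image_vec, prompt_table)
print(best)
```

Because the table is a plain dictionary, adding a new species means adding one entry with its two prompt embeddings, with no retraining. That is the property a dynamically maintained knowledge table relies on for zero-shot recognition.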
Funding: Supported by grants from the Research Funds of the Center for Advanced Interdisciplinary Science and Biomedicine of IHM (QYPY20220018), the National Natural Science Foundation of China (31822026, 32271063, 31500860, and 32100821), and the National Science and Technology Innovation 2030 Major Project of China (2021ZD0203900).
Abstract: In the face of constantly changing environments, the central nervous system (CNS) rapidly and accurately calculates the body's needs, regulates feeding behavior, and maintains energy homeostasis. The arcuate nucleus of the hypothalamus (ARC) plays a key role in this process, serving as a critical brain region for detecting nutrition-related hormones and regulating appetite and energy homeostasis. Agouti-related protein (AgRP)/neuropeptide Y (NPY) neurons in the ARC are core elements that interact with other brain regions through a complex appetite-regulating network to comprehensively control energy homeostasis. In this review, we explore the discovery and research progress of AgRP neurons in regulating feeding and energy metabolism. In addition, recent advances in terms of feeding behavior and energy homeostasis, along with the redundant neural mechanisms involved in energy metabolism, are discussed. Finally, the challenges and opportunities in the field of neural regulation of feeding and energy metabolism are briefly discussed.
Abstract: In multimodal learning, Vision-Language Models (VLMs) have become a critical research focus, enabling the integration of textual and visual data. These models have shown significant promise across natural language processing tasks such as visual question answering and computer vision applications such as image captioning and image-text retrieval, highlighting their adaptability to complex, multimodal datasets. In this work, we review the landscape of Bootstrapping Language-Image Pre-training (BLIP) and other VLM techniques. A comparative analysis is conducted to assess VLMs' strengths, limitations, and applicability across tasks while examining challenges such as scalability, data quality, and fine-tuning complexities. The work concludes by outlining potential future directions in VLM research, focusing on enhancing model interpretability, addressing ethical implications, and advancing multimodal integration in real-world applications.
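Abstract: With the rapid development of autonomous driving, intelligent navigation, and related fields, the requirements on the accuracy and robustness of spatio-temporal trajectory prediction keep rising. Traditional trajectory prediction methods rely mainly on motion history data and ignore semantic information in the environment, so they often fail to achieve satisfactory results in complex scenes. This paper surveys research on trajectory prediction, in particular progress in trajectory prediction based on spatial semantic analysis. It focuses on the application of Vision Language Models (VLMs) and Large Language Models (LLMs) to trajectory prediction and introduces several trajectory prediction models based on spatial semantic analysis. Analysis of experimental results shows that VLMs and LLMs can significantly improve trajectory prediction accuracy. Future work on trajectory prediction based on spatial semantic analysis will consider multimodal fusion, improved model architectures, and faster inference to further improve large-scale trajectory prediction performance.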
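Abstract: The nanohardness of Hastelloy N alloy irradiated with xenon ions at room temperature was measured using the continuous stiffness measurement mode of a nanoindenter. The results show that the nanohardness of all irradiated samples is higher than that of the unirradiated sample, and that the nanohardness of the irradiated samples saturates over the dose range of 0.5 to 3.0 dpa. Based on the Nix-Gao model, the indentation size effects of the unirradiated and irradiated samples were separated, and the VLM (volume law of mixture) model was used to simulate the measured nanohardness. Because the plastically affected zone comes to include both the irradiation-damaged layer and the substrate as the indentation depth increases, an "interface parameter" (χ) is introduced into the VLM model to correct the deformation of the substrate. The modified model reproduces the nanoindentation results better.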
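As a rough illustration of the two models named in this abstract, the sketch below evaluates the standard Nix-Gao relation, H = H0 * sqrt(1 + h*/h), and a generic volume-weighted mixture of layer and substrate hardness. The function names, the numeric values, and in particular the way `chi` scales the substrate volume are assumptions made for illustration. The abstract does not give the exact form in which χ enters the authors' modified VLM model.

```python
import math

def nix_gao_hardness(depth, h0, h_star):
    """Nix-Gao indentation size effect: H = H0 * sqrt(1 + h*/h).

    depth and h_star share one length unit (e.g. nm); h0 is the
    depth-independent hardness (e.g. GPa).
    """
    return h0 * math.sqrt(1.0 + h_star / depth)

def mixture_hardness(h_layer, h_substrate, v_layer, v_substrate, chi=1.0):
    """Volume-weighted hardness of a damaged layer on a substrate.

    chi rescales the substrate's deforming volume. This placement of chi
    is an illustrative assumption, not the paper's stated formula.
    """
    v_sub_eff = chi * v_substrate
    return (v_layer * h_layer + v_sub_eff * h_substrate) / (v_layer + v_sub_eff)

# Hypothetical numbers, chosen only to make the arithmetic easy to follow.
print(nix_gao_hardness(depth=100.0, h0=3.0, h_star=300.0))   # 3 * sqrt(4)
print(mixture_hardness(5.0, 3.0, v_layer=1.0, v_substrate=1.0))
print(mixture_hardness(5.0, 3.0, v_layer=1.0, v_substrate=1.0, chi=3.0))
```

With `chi` above 1 the substrate contributes more of the deforming volume, which pulls the composite value toward the substrate hardness. With `chi=0` the function returns the layer hardness alone.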