Technical architecture, application progress, and future challenges of nutrition foundation models
Abstract: Nutrition informatics is moving from the traditional paradigm of rule-based methods and conventional machine learning toward a new stage centered on large language models (LLMs) and multimodal large language models (MLLMs). This article systematically reviews research progress on nutrition foundation models from 2019 to 2025 and summarizes key architectural and training techniques, including vision-language alignment, domain knowledge injection, retrieval-augmented generation (RAG), and interpretable reasoning. On this basis, it surveys in detail the current state of model applications in typical scenarios such as personalized dietary recommendation, nutritional status assessment, disease-related nutrition management, and automated dietary recording. It also summarizes the evolution of core datasets and evaluation benchmarks such as Nutrition5k and NutriBench. Finally, in view of challenges including model trustworthiness, data privacy, cross-cultural generalization, and clinical evidence support, it proposes that future research should deeply integrate clinical evidence, build high-quality multimodal data resources, and advance the deployment of human-AI collaborative precision nutrition services, so as to increase clinical translational value.

Nutrition informatics has undergone a significant paradigm shift in recent years. Approaches historically grounded in rule-based decision support and classical task-specific machine learning pipelines are increasingly being superseded by an ecosystem centered on large language models (LLMs) and multimodal vision-language foundation models. This review synthesizes research published between 2019 and 2025, with the objectives of clarifying architectural patterns that enable nutrition-oriented perception and reasoning, summarizing advances and identifying gaps across major application scenarios, and outlining strategic directions for reliable translational research in clinical and public health practice. Based on a systematic analysis of 92 representative studies, we organize the current landscape into three interrelated research trajectories: (1) Vision and multimodal modeling for dietary perception, focusing on food recognition, ingredient parsing, portion estimation, and nutrient prediction from meal images and videos. Recent methodologies increasingly adopt Transformer-based encoders and explicit vision-language alignment, leveraging depth cues and scale calibration to improve robustness under complex real-world conditions. (2) LLM-based nutrition agents for interactive guidance, supporting dietary counseling, meal planning, and health coaching. To mitigate challenges such as hallucinations and numerical inconsistency, current research emphasizes domain adaptation, tool-augmented computation, and retrieval-augmented generation (RAG) to ground model responses in reliable nutrition databases and clinical guidelines. (3) Personalization-oriented hybrid systems, which combine foundation models with structured components, such as knowledge graphs and causal inference frameworks, while integrating individual-level multi-omics signals, biomarkers, and lifestyle data. These systems aim to generate and optimize meal plans under strict constraints of safety, clinical feasibility, and patient adherence. Across these trajectories, interpretability has transitioned from an optional feature to a core system requirement, driven by the needs of clinical accountability and risk auditing. Concurrently, evaluation protocols are expanding from image-centric datasets (e.g., Nutrition5k) to comprehensive benchmarking suites designed for multimodal reasoning. Despite rapid progress, limitations persist regarding model factuality, privacy preservation, and external validity across diverse cuisines and socioeconomic settings. We advocate for evidence-grounded pipelines, standardized multimodal datasets with clinical endpoints, and unified evaluation frameworks spanning accuracy, safety, and bias. Human-in-the-loop deployment remains essential to quantify benefit-risk profiles and facilitate the regulatory adoption of AI-driven nutrition services.
Authors: ZHANG Cheng-Dong (张成东); KONG Hao-Nan (孔浩楠); YANG Yuan (杨元); YAN Yuan-Yuan (闫媛媛); TONG Tian-Lang (童天朗); WANG Hui (王慧) (School of Public Health, Shanghai Jiao Tong University, Shanghai 200025, China; Institute of Digital and Intelligent Medicine, Hainan International Medical Center, Shanghai Jiao Tong University School of Medicine, Qionghai 571400, China)
Source: Chinese Bulletin of Life Sciences (《生命科学》), 2026, Issue 1, pp. 1-17 (17 pages)
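Funding: Key Program of the National Natural Science Foundation of China (82030099); National Key Research and Development Program of China (2022YFD2101500).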
Keywords: nutrition large language model; multimodal learning; large language model; personalized nutrition; retrieval-augmented generation; evaluation benchmark