Abstract
This study proposes a multi-modal intelligent analysis framework that integrates large language models. It builds a scene-adaptive unmanned aerial vehicle (UAV) data acquisition system and combines multi-scale spatial-frequency feature extraction, reinforcement learning based on adversarial samples and domain knowledge, and cross-modal semantic understanding to detect typical small targets, such as hidden cracks in photovoltaic panels and cracks in water conservancy facilities, with high precision and efficiency. Large language models serve as the core for semantic understanding and knowledge guidance, and a three-layer "cloud-edge-device" collaborative computing architecture improves the system's real-time inference capability and engineering applicability.
Authors
HUANG Junda (黄俊达)
LI Zhuoxuan (黎卓轩)
LIU Junqiu (刘俊求)
Guangdong Kenuo Survey Engineering Co., Ltd., Guangzhou 510000, China
Source
Technology of IoT & AI (《智能物联技术》)
2026, No. 2, pp. 39-43 (5 pages)
Keywords
low-altitude remote sensing images
multi-modal intelligent analysis
small target detection