Construction of an Intelligent Retrieval System for the Virtual Space Science Observatory
Abstract [Background] As space science data grow rapidly and become increasingly multimodal, traditional retrieval based on metadata fields struggles to meet researchers' needs for complex semantic queries and for queries that were not defined in advance. Intelligent retrieval systems capable of semantic understanding are therefore urgently needed. [Objective] This study aims to build an intelligent retrieval system for space science data that addresses the limitations of conventional metadata-driven queries in semantic understanding and multimodal data retrieval, thereby improving the efficiency and accuracy with which researchers discover heterogeneous space science data. [Methods] The system uses a dynamic semantic parsing mechanism based on large language models and combines BM25 with dense vector retrieval to provide hybrid retrieval of datasets. For image and time-series data, content features are extracted with models such as DINOv2, VISTA, and Timer-XL to build a multimodal semantic index. The system adopts a layered architecture that integrates full-text search and a vector database, and it supports several query modes, including natural language, tags, and data samples. [Conclusion] By integrating multiple AI models, the intelligent retrieval system for the Virtual Space Science Observatory significantly improves the flexibility and accuracy of data discovery and offers a new paradigm for the efficient use of large-scale space science data.
Authors LI Yunlong (李云龙); JIAO Qirong (焦琦融); WANG Cifeng (王慈枫); ZOU Ziming (邹自明) (National Space Science Center, Chinese Academy of Sciences, Beijing 100190, China; Computer Network Information Center, Chinese Academy of Sciences, Beijing 100083, China)
Source Frontiers of Data & Computing (《数据与计算发展前沿(中英文)》), 2025, No. 4, pp. 20-32 (13 pages)
Funding National Key Research and Development Program of China, Key Special Project on "Basic Research Conditions and Major Scientific Instruments and Equipment Development" (2022YFF0711400).
Keywords space science; big data; intelligent retrieval
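
Illustrative sketch (not from the paper): the Methods part of the abstract describes hybrid retrieval that combines BM25 with dense vector search. The abstract does not say how the two rankings are merged, so the Python below assumes reciprocal rank fusion as one common choice. The function names, the example dataset descriptions, and in particular toy_embed (a hashed bag-of-words vector standing in for a real dense encoder) are hypothetical and are not the authors' implementation.

```python
import math
import zlib
from collections import Counter

# Hypothetical dataset descriptions, invented for illustration only.
EXAMPLE_DOCS = [
    "solar wind plasma and magnetic field measurements",
    "ionosphere total electron content maps",
    "lunar surface image mosaic",
]

def tokenize(text):
    return text.lower().split()

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score every document against the query with Okapi BM25."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    avgdl = sum(len(t) for t in tokenized) / n
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            denom = tf[term] + k1 * (1 - b + b * len(toks) / avgdl)
            score += idf * tf[term] * (k1 + 1) / denom
        scores.append(score)
    return scores

def toy_embed(text, dim=256):
    """ASSUMPTION: stand-in for a dense encoder (hashed bag of words, L2-normalised)."""
    vec = [0.0] * dim
    for tok in tokenize(text):
        vec[zlib.crc32(tok.encode("utf-8")) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def dense_scores(query, docs):
    """Cosine similarity between the query vector and each document vector."""
    q = toy_embed(query)
    return [sum(a * b for a, b in zip(q, toy_embed(d))) for d in docs]

def rrf_fuse(score_lists, k=60):
    """Reciprocal rank fusion: each ranking contributes 1 / (k + rank) per document."""
    fused = Counter()
    for scores in score_lists:
        order = sorted(range(len(scores)), key=lambda i: -scores[i])
        for rank, idx in enumerate(order, start=1):
            fused[idx] += 1.0 / (k + rank)
    return [idx for idx, _ in fused.most_common()]

def hybrid_search(query, docs, top_k=3):
    """Rank docs by fusing the BM25 ranking with the dense-vector ranking."""
    ranking = rrf_fuse([bm25_scores(query, docs), dense_scores(query, docs)])
    return [docs[i] for i in ranking[:top_k]]
```

Calling hybrid_search("solar wind magnetic field", EXAMPLE_DOCS) returns the example descriptions ordered by fused rank. Rank fusion is used here because BM25 scores and cosine similarities are on different scales, and fusing ranks avoids having to normalise them; a weighted sum of normalised scores would be an equally plausible alternative.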