Research on the Recognition of Figures and the Construction of Vector Databases in Scientific Journal Articles


Abstract: [Purpose/Significance] In scientific literature, figures and text are two common and complementary forms of information presentation that together constitute an important part of knowledge dissemination. Recognizing the figures in a document and their associated text makes it possible to integrate and use multimodal information, which improves the efficiency of knowledge mining and retrieval over scientific literature. This is significant for the construction of scientific literature resources and the development of multimodal knowledge services. [Method/Process] This paper proposes a method for constructing a vector database of figure knowledge units in scientific literature. The recognition model is designed around three dimensions of figure knowledge unit recognition: accuracy, completeness, and relevance. On this basis, a vector representation and storage scheme is designed to build the figure vector database. Finally, a figure retrieval system is built on the constructed database to provide multimodal knowledge services for scientific literature. [Result/Conclusion] With the proposed method, the recognition of figure knowledge units achieves an F1 score of 84.1%, and the recognition of figures and of their associated text fragments achieves F1 scores of 99.5% and 89.0%, respectively. The method was applied to 1.2 million scientific papers in chemistry and chemical engineering to build a million-scale figure retrieval system, providing foundational support for multimodal knowledge mining and knowledge services.

Authors: Huang Yuxin; Chang Zhijun; Qian Li; Qu Yunpeng; Guo Dan; Li Wenwen; Wu Yaoting; Wang Haolin

Affiliations: National Science Library, Chinese Academy of Sciences, Beijing 100190; Department of Information Resources Management, School of Economics and Management, University of Chinese Academy of Sciences, Beijing 100190

Source: Library and Information Service (《图书情报工作》, Peking University Core Journal), 2025, No. 13, pp. 32-42 (11 pages)

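Funding: This work is one of the research outputs of the National Social Science Fund of China projects "Research on Entity Relation Recognition Methods for Domain Literature Oriented to Evidence-Based Medicine" (Project No. 21BTQ106) and "Research on the Theoretical System and Construction Methods of the AI4S Scientific Literature Knowledge Base" (Project No. 24BTQ043).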

Keywords: scientific literature; figure knowledge unit; figure vector database; figure retrieval system; multimodal knowledge services

Classification codes: G250 [Culture and Science: Library Science]; TP391 [Automation and Computer Technology: Computer Application Technology]