Abstract
A data dictionary (DD) is an important part of database system design: it is a collection of data lists that describes the attributes, composition, and structure of the data in a database. When developing general-purpose information systems, designers and developers often face the problem of how to fuse and optimize existing heterogeneous data dictionaries. Because these dictionaries were designed without industry data standards or within a limited business scope, they differ markedly in how data are represented and defined and in how data composition and structure are designed, yet their underlying data content is highly amenable to fusion. Maintaining a fused data dictionary by hand consumes considerable time and resources. Taking grass-roots social grid governance as the business background, this paper addresses the pain points of heterogeneous data dictionary fusion that arise when digital applications for grass-roots social governance are developed and rolled out, and studies fusion optimization methods and related techniques. It designs data dictionary fusion methods and techniques that account for the completeness of data information and the integrity of data structure, covering four aspects: semantic deduplication and disambiguation, keyword extraction, similarity calculation, and data dictionary table structure fusion. Experimental verification on data dictionaries from the grass-roots social grid governance business shows that, compared with traditional data dictionary fusion methods, the approach significantly improves fusion efficiency and effectiveness.
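The abstract names similarity calculation as one step of the fusion method, and the keywords list edit distance, but neither says how the two are combined. As a minimal sketch only, the Python below assumes that field names from two dictionaries are compared by normalized Levenshtein distance and paired when the score clears a threshold. The function names, the sample field names, and the 0.6 threshold are all hypothetical and are not taken from the paper.

```python
from typing import List, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed with the two-row dynamic program."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Map edit distance to [0, 1]; 1.0 means the strings are identical."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def match_fields(left: List[str], right: List[str],
                 threshold: float = 0.6) -> List[Tuple[str, str, float]]:
    """Pair each field in `left` with its most similar field in `right`.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    pairs = []
    for name_l in left:
        best = max(right, key=lambda name_r: similarity(name_l, name_r),
                   default=None)
        if best is not None and similarity(name_l, best) >= threshold:
            pairs.append((name_l, best, round(similarity(name_l, best), 2)))
    return pairs
```

For example, `edit_distance("kitten", "sitting")` returns 3. Character-level distance of this kind only captures spelling variation; the semantic deduplication and keyword extraction steps mentioned in the abstract would need additional techniques that this sketch does not attempt.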
Authors
WANG Qing (王庆)
YANG Wanzhe (杨万哲)
ZHANG Cong (张聪)
College of Information Science and Engineering, Northeastern University, Shenyang 110000, China
Source
Computer Science (《计算机科学》)
PKU Core Journal (北大核心)
2025, Issue S1, pp. 577-583 (7 pages)
Funding
National Key Research and Development Program of China (2021YFC3300300).
Keywords
Data dictionary
Database design
Edit distance
Similarity calculation
Grass-roots social grid governance