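[Purpose/Significance] The W3C Data Catalog Vocabulary (DCAT) is the basis and origin of open data metadata standards in many countries. To meet a broader range of needs, the revised version of DCAT introduces substantial improvements, and a systematic introduction to it can offer a useful reference for building China's open government data metadata standards. [Method/Process] Using literature analysis and web-based investigation, this paper presents the background, main outcomes, and improvements of the DCAT revision carried out by the W3C Dataset Exchange Working Group. [Result/Conclusion] The revised W3C DCAT represents the latest international achievements and direction of development in metadata standards. China should draw on this experience in improving the interoperability of data catalogs, encouraging broad multidisciplinary participation, and integrating government standards with industry practice.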
The increasing availability of government data has prompted efforts to standardize data cataloging practices for enhanced accessibility and usability. The primary aim of this study is to descriptively assess data catalog and referenced dataset volume, metadata utilization, and thematic composition of United States DCAT-compliant data catalogs across Federal, State, County, City, and Territory entities. Data collection involved compiling a list of relevant government agencies and data resources to identify data catalogs. DCAT compliance was then assessed, and metadata from compliant catalogs was extracted. Thematic mapping used the Python packages RegEx and FuzzyWuzzy to categorize themes into eight standard categories. Descriptive statistics and one-way ANOVA tests were used to analyze dataset volume, metadata utilization, and reported themes. Of the 305 data catalogs identified, 259 were found to be DCAT compliant. Federal entities exhibited the highest DCAT compliance rate (92.3%), followed by County (88.1%), City (86.9%), and State (77.0%), while Territory entities (0%) had no compliant data catalogs. Descriptive analysis revealed that federal DCAT-compliant data catalogs (n=59) had the highest average number of data assets across their data catalogs (μ=1,133.2), with the predominant themes being transportation at 21.2% (n=14,785) and geospatial at 15.4% (n=10,761), while county data catalogs (n=52) had the lowest average (μ=232.6), with the most referenced themes being geospatial at 77.6% (n=8,450) and finance at 2.4% (n=270). After thematic mapping to the eight standard categories, the three most dominant themes across all entities were transportation at 38.1% (n=16,504), natural resources at 19.6% (n=8,501), and health and safety at 14.7% (n=6,367). These findings underscore the widespread adoption of the DCAT standard across government entities, with notable gaps at the territorial level. Federal and state entities exhibited the highest data catalog and dataset volumes, while metadata utilization remained relatively consistent across all entity levels. The thematic analysis highlights the importance of standardization efforts to enhance thematic consistency and facilitate effective data interpretation. Further collaboration and investment are warranted to address gaps in catalog coverage and establish standardized data cataloging practices to maximize the accessibility and usability of these data catalogs along with their referenced datasets.
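The thematic mapping step in the abstract above (regular expressions plus fuzzy string matching) could look roughly like the sketch below. This is only an illustration under stated assumptions, not the study's implementation: it swaps in the standard-library `difflib.SequenceMatcher` for FuzzyWuzzy so that it runs without third-party packages, and the category names, keyword lists, the `map_theme` helper, and the 0.8 threshold are all hypothetical, since the abstract does not list the eight standard categories or the matching rules.

```python
import re
from difflib import SequenceMatcher

# Hypothetical keyword lists: the abstract does not enumerate the
# eight standard categories, so only three illustrative ones appear here.
CATEGORY_KEYWORDS = {
    "transportation": ["transportation", "transit", "roads"],
    "natural resources": ["natural resources", "environment", "water"],
    "health and safety": ["health", "public safety"],
}


def normalize(theme: str) -> str:
    """Lowercase a reported theme and replace runs of non-alphanumerics with one space."""
    return re.sub(r"[^a-z0-9]+", " ", theme.lower()).strip()


def map_theme(theme: str, threshold: float = 0.8):
    """Return the best-matching category, or None if no keyword is similar enough.

    Similarity is difflib's ratio (0.0 to 1.0), used here as a stand-in
    for a FuzzyWuzzy score; the threshold is an assumed value.
    """
    cleaned = normalize(theme)
    best_category, best_score = None, 0.0
    for category, keywords in CATEGORY_KEYWORDS.items():
        for keyword in keywords:
            score = SequenceMatcher(None, cleaned, keyword).ratio()
            if score > best_score:
                best_category, best_score = category, score
    return best_category if best_score >= threshold else None


if __name__ == "__main__":
    for reported in ["Transportation", "Public-Safety!", "Transportaton", "zzz"]:
        print(reported, "->", map_theme(reported))
```

Normalizing before matching lets exact keyword hits survive differences in case and punctuation, while the similarity threshold absorbs small misspellings; themes that match nothing are left unmapped rather than forced into a category.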
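Metadata is an important component of national open government data action plans. After introducing the W3C metadata standard DCAT (Data Catalog Vocabulary), the United States' Project Open Data (POD), and the European Union's DCAT Application Profile (DCAT-AP), the article analyzes and summarizes the achievements and characteristics of open government data metadata development in the United States, the European Union, and Ireland. By comparing the open government data metadata efforts of three local governments in China (Beijing, Shanghai, and Zhejiang), it argues that China's open government data metadata development should strengthen policy support and legal safeguards, the completeness of the metadata system, and semantic enrichment, in order to overcome current shortcomings in normalization, standardization, and interoperability.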