Abstract: Purpose: The purpose of the paper is to provide a framework for addressing the disconnect between metadata and data science. Data science cannot progress without metadata research. This paper takes steps toward advancing the synergy between metadata and data science, and identifies pathways for developing a more cohesive metadata research agenda in data science. Design/methodology/approach: This paper identifies factors that challenge metadata research in the digital ecosystem, defines metadata and data science, and presents the concepts of big metadata, smart metadata, and metadata capital as part of a metadata lingua franca connecting to data science. Findings: The "utilitarian nature" and "historical and traditional views" of metadata are identified as two intersecting factors that have inhibited metadata research. Big metadata, smart metadata, and metadata capital are presented as part of a metadata lingua franca to help frame research in the data science research space. Research limitations: There are additional, intersecting factors to consider that likely inhibit metadata research, and other significant metadata concepts to explore. Practical implications: The immediate contribution of this work is that it may elicit response, critique, revision, or, more significantly, motivate research. The work presented can encourage more researchers to consider the significance of metadata as a research-worthy topic within data science and the larger digital ecosystem. Originality/value: Although metadata research has not kept pace with other data science topics, there is little attention directed to this problem. This is surprising, given that metadata is essential for data science endeavors. This examination synthesizes original and prior scholarship to provide new grounding for metadata research in data science.
Funding: Funded by the National Natural Science Foundation of China under Grant No. 62362019 and the Hainan Provincial Natural Science Foundation of China under Grant No. 624RC482.
Abstract: Metadata prefetching and data placement play a critical role in enhancing access performance for file systems operating over wide-area networks. However, developing effective strategies for metadata prefetching in environments with concurrent workloads and for data placement across distributed networks remains a significant challenge. This study introduces novel and efficient methodologies for metadata prefetching and data placement, leveraging fine-grained control of prefetching strategies and variable-sized data fragment writing to optimize the I/O bandwidth of distributed file systems. The proposed metadata prefetching technique employs dynamic workload analysis to identify dominant workload patterns and adaptively refines prefetching policies, thereby boosting metadata access efficiency under concurrent scenarios. Meanwhile, the data placement strategy improves write performance by storing data fragments locally within the nearest data center and transmitting only the fragment location metadata to the remote data center hosting the original file. Experimental evaluations using real-world system traces demonstrate that the proposed approaches reduce metadata access times by up to 33.5% and application data access times by 17.19% compared to state-of-the-art techniques.
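As a rough illustration of the adaptive prefetching idea in this abstract, the sketch below classifies a sliding window of recent metadata accesses as mostly sequential or not and picks a prefetch depth accordingly. It is not the paper's method: the window size, the 0.6 threshold, the depth values, and every name (`AdaptivePrefetcher`, `record`, `to_prefetch`) are assumptions made for illustration only.

```python
from collections import deque


class AdaptivePrefetcher:
    """Toy policy (hypothetical): prefetch deeper when recent accesses look sequential."""

    def __init__(self, window=8, threshold=0.6, deep=4, shallow=0):
        self.history = deque(maxlen=window)  # indices of recently accessed metadata entries
        self.threshold = threshold           # assumed cut-off for "dominantly sequential"
        self.deep = deep                     # entries to prefetch under a sequential pattern
        self.shallow = shallow               # entries to prefetch otherwise

    def record(self, index):
        """Remember one metadata access."""
        self.history.append(index)

    def sequential_ratio(self):
        """Share of consecutive access pairs that advance by exactly one entry."""
        items = list(self.history)
        if len(items) < 2:
            return 0.0
        steps = sum(1 for a, b in zip(items, items[1:]) if b == a + 1)
        return steps / (len(items) - 1)

    def prefetch_depth(self):
        """Refine the policy from the currently dominant pattern."""
        if self.sequential_ratio() >= self.threshold:
            return self.deep
        return self.shallow

    def to_prefetch(self, index):
        """Entries to fetch ahead of `index` under the current policy."""
        return [index + i for i in range(1, self.prefetch_depth() + 1)]
```

In this sketch a run of accesses to entries 0 through 4 leads to four entries being prefetched ahead, while a scattered access sequence leads to none; a real system would distinguish more patterns than these two.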
Funding: The Advanced University Action Plan of the Ministry of Education of China (2004XD-03).
Abstract: An ontology and metadata for online learning resource repository management are constructed. First, based on an analysis of the use-case diagram, the upper ontology, which includes a resource library ontology and a user ontology, is illustrated and evaluated in terms of its function and implementation; the corresponding class diagram, resource description framework (RDF) schema, and extensible markup language (XML) schema are then given. Second, the metadata for online learning resource repository management is proposed based on the Dublin Core Metadata Initiative and the IEEE Learning Technologies Standards Committee Learning Object Metadata Working Group. Finally, an inference instance is shown, which demonstrates the validity of the ontology and metadata in online learning resource repository management.
Funding: Supported by the Youth Innovation Fund of Fujian Academy of Agricultural Science (2010QB-17), the Science and Technology Bureau Project of Fujian Province (2008S1001), and the Financial Special Project of Fujian Province (STIF-Y07).
Abstract: [Objective] To study the information description of a vegetable planting metadata model. [Method] Based on an analysis of the data involved in every aspect of vegetable planting, this paper puts forward description schemes for vegetable planting metadata and constructs a vegetable planting metadata model by means of XML/XML Schema. [Result] A metadata model of vegetable planting was established, and the information description of the model was realized using XML Schema. The whole metadata model consists of 7 first-class classifications, including more than 800 information description points, which can completely record vegetable planting-related information. [Conclusion] Standards for data collection, management, and sharing are provided for agricultural applications such as GAP management of vegetable planting, facility vegetables, and food quality traceability.
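Abstract: Metadata is "data about data". This paper introduces the basics of metadata and discusses several metadata specifications for HTML and XML environments (including Dublin Core, PICS, Web Collections, CDF, MCF, and RDF). Because metadata plays a very important role in organizing and discovering Internet information resources, the author calls on researchers in China to strengthen research on metadata.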
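For readers unfamiliar with how one of the specifications named in this abstract appears in an HTML environment, the following minimal sketch renders a record as Dublin Core `<meta>` tags using the common `DC.` name prefix and a `schema.DC` link. The function name `dublin_core_meta` and the sample record are assumptions for illustration; the abstract itself gives no example.

```python
from html import escape


def dublin_core_meta(record):
    """Render a dict of Dublin Core element names and values as HTML <meta> tags."""
    # The schema link declares which vocabulary the "DC." prefix refers to.
    lines = ['<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/">']
    for element, value in record.items():
        name = escape(element, quote=True)
        content = escape(value, quote=True)  # escape quotes so the attribute stays valid
        lines.append(f'<meta name="DC.{name}" content="{content}">')
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical record, not taken from any of the papers listed here.
    print(dublin_core_meta({"title": "An example page", "creator": "Example Author"}))
```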
Funding: Supported by the Strategic Priority Research Program of the Chinese Academy of Sciences [grant number XDA19020201].
Abstract: Remote sensing data acquisition is one of the most essential processes in the field of Earth observation. However, traditional data acquisition methods do not satisfy the requirements of current applications, which require large-scale data processing. To address this issue, this paper proposes a data acquisition framework that carries out remote sensing metadata planning and then realizes the online acquisition of large amounts of data. First, the paper establishes a unified metadata cataloging model and catalogs metadata in a local database. Second, a coverage calculation model is presented, which shows users the data coverage in a selected geographical region under the data requirements of a specific application. Finally, based on the data retrieval results and the coverage calculation, a machine-to-machine interface is provided to acquire the target remote sensing data. Experiments were conducted to verify the availability and practicality of the proposed framework, and the results show that it overcomes deficiencies in traditional methods. It also achieves online automatic acquisition of large-scale heterogeneous remote sensing data, which can provide guidance for remote sensing data acquisition strategies.
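The coverage calculation mentioned in this abstract can be pictured with a simple approximation: sample the selected region on a grid and count the share of sample points that fall inside at least one scene footprint. The sketch below is only an assumed stand-in for the paper's model; it treats the region and footprints as axis-aligned bounding boxes, and the function name `coverage_fraction` and the `steps` parameter are hypothetical.

```python
def coverage_fraction(region, footprints, steps=100):
    """Approximate the share of `region` covered by the union of `footprints`.

    All boxes are (min_lon, min_lat, max_lon, max_lat) tuples. This is a
    grid-sampling approximation, assumed here for illustration only.
    """
    min_x, min_y, max_x, max_y = region
    dx = (max_x - min_x) / steps
    dy = (max_y - min_y) / steps
    covered = 0
    for i in range(steps):
        x = min_x + (i + 0.5) * dx          # centre of grid column i
        for j in range(steps):
            y = min_y + (j + 0.5) * dy      # centre of grid row j
            # A cell counts once, even if several footprints overlap it.
            if any(a <= x <= c and b <= y <= d for a, b, c, d in footprints):
                covered += 1
    return covered / (steps * steps)
```

Counting each grid cell at most once means overlapping scenes do not inflate the result, which is the property a coverage report for a region needs.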
Abstract: A uniform metadata representation is introduced for heterogeneous databases, multimedia information, and other information sources. Some features of metadata are analyzed. The limitations of existing metadata models are compared with the new model. The metadata model is described in XML, which is well suited to metadata denotation and exchange. Well-structured data, semi-structured data, and unstructured external file data are all described in the metadata model. The model offers a feasible and extensible basis for constructing a uniform metadata model for data warehouses.
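To make the idea of one XML description covering structured, semi-structured, and unstructured sources more concrete, here is a small sketch using Python's standard `xml.etree.ElementTree`. The element and attribute names (`metadata`, `source`, `kind`, `location`, `field`) and the sample sources are invented for illustration and are not the schema defined in the paper.

```python
import xml.etree.ElementTree as ET

# Assumed vocabulary for how structured a source is.
ALLOWED_KINDS = {"structured", "semi-structured", "unstructured"}


def describe_source(name, kind, location, fields=()):
    """Build one <source> element; `fields` is a sequence of (name, type) pairs."""
    if kind not in ALLOWED_KINDS:
        raise ValueError(f"unknown kind: {kind}")
    source = ET.Element("source", {"name": name, "kind": kind})
    ET.SubElement(source, "location").text = location
    for field_name, field_type in fields:
        ET.SubElement(source, "field", {"name": field_name, "type": field_type})
    return source


def build_catalog(sources):
    """Wrap the <source> elements in a single <metadata> document string."""
    catalog = ET.Element("metadata")
    catalog.extend(sources)
    return ET.tostring(catalog, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical sources: a database table and a plain file share one format.
    print(build_catalog([
        describe_source("orders", "structured", "db://sales/orders", [("id", "int")]),
        describe_source("notes", "unstructured", "file:///data/notes.txt"),
    ]))
```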
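Abstract: Traffic big data systems based on e-transportation science handle the storage and computation of massive traffic data by building cluster systems composed of large high-performance computers. Such systems not only demand a strictly controlled operating environment but are also costly, slow to deploy, and difficult to maintain. Moreover, as data volume, business complexity, and computational intensity grow, raising the capacity to process massive traffic data by adding servers becomes very difficult and may even require the cluster architecture to be redesigned and redeployed, which consumes substantial manpower and money and causes considerable waste. Metadata exchange and deployment capability has therefore become a focus of research on today's big-data-driven intelligent transportation systems. Given massive traffic data, how to store, manage, process, and apply metadata is a key question. The Traffic Big Data Metadata Exchange System (TBMES) proposed in this paper enables distributed exchange of, and mutual access to, traffic information. The architecture connects real-time traffic data to the traffic information big data platform in real time, so that traffic information is conveyed continuously and faithfully. It joins macroscopic and microscopic traffic data seamlessly, so that both the operating state of the road network and the traffic efficiency of key road nodes can be assessed, giving a full picture of regional traffic operations. It also makes traffic organization and management visual, quantifiable, systematic, and automated. The system's output can give decision-makers theoretical support and promote more scientific traffic decision-making.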