In recent years, large language model (LLM) technology, exemplified by ChatGPT, has developed rapidly. As model parameter scales continue to grow, building and applying large models places higher demands on data storage capacity and storage access efficiency, posing severe challenges to traditional storage systems. This paper first analyzes the storage access characteristics of large models in the data preparation, model training, and inference stages, and examines in depth the main problems and bottlenecks that traditional storage systems face in large-model scenarios. To address these challenges, it proposes and implements ScaleFS, a high-performance, scalable distributed metadata design. Through an architecture that decouples directory-tree metadata from attribute metadata, combined with a depth- and breadth-balanced hierarchical partitioning strategy for the directory tree, ScaleFS achieves efficient path resolution, load balancing, and system scalability, and can efficiently manage hundreds of billions of files. In addition, ScaleFS designs fine-grained metadata structures, optimizes metadata access patterns, and builds a metadata key-value storage layer optimized for file semantics, which significantly improves metadata access efficiency and reduces disk I/O operations. Experimental results show that ScaleFS achieves 1.04 to 7.12 times the operations per second (OPS) of HDFS, with latency of only 12.67% to 99.55% of that of HDFS. At the scale of hundreds of billions of files, most ScaleFS operations outperform HDFS at the scale of billions of files, demonstrating higher scalability and access efficiency and better meeting the demands of large-model scenarios for storing and efficiently accessing hundreds of billions of files.
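The decoupling idea above can be illustrated with a minimal sketch (invented class and field names, not the ScaleFS implementation): path resolution walks only a compact directory tree keyed by (parent_id, name), while per-file attributes live in a separate store and are read only when actually needed.

```python
# Minimal sketch of decoupling directory-tree metadata from attribute metadata.
# Names and structures are illustrative, not the ScaleFS design.

class DecoupledMetadataStore:
    def __init__(self):
        self.next_id = 1
        self.dentries = {}      # (parent_id, name) -> inode_id   (directory tree)
        self.attributes = {}    # inode_id -> dict of attributes  (attribute store)

    def create(self, parent_id: int, name: str, **attrs) -> int:
        inode_id = self.next_id
        self.next_id += 1
        self.dentries[(parent_id, name)] = inode_id
        self.attributes[inode_id] = attrs          # stored in a separate table
        return inode_id

    def resolve(self, path: str, root_id: int = 0) -> int:
        """Resolve a path by walking dentries only; no attribute I/O is involved."""
        inode_id = root_id
        for component in filter(None, path.split("/")):
            inode_id = self.dentries[(inode_id, component)]
        return inode_id

    def stat(self, path: str) -> dict:
        """Attributes are read only after the path is resolved."""
        return self.attributes[self.resolve(path)]


store = DecoupledMetadataStore()
d = store.create(0, "train_data", kind="dir")
store.create(d, "shard-00001.tfrecord", kind="file", size=1 << 30)
print(store.stat("/train_data/shard-00001.tfrecord")["size"])
```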
Funding: The MSIP (Ministry of Science, ICT & Future Planning), Korea, under the ITRC (Information Technology Research Center) support program (NIPA-2013-H0301-13-2006), supervised by the NIPA (National IT Industry Promotion Agency).
Abstract: Today's multimedia services go far beyond voice and data services: they have diversified tremendously, fueled by the advancement of network infrastructures as well as the sudden surge of multimedia data itself. Currently, research on metadata insertion, management, and transfer is progressing actively in order to provide a variety of services to users. In this paper, we propose design and implementation methods for a digital content metadata system for the insertion, storage, and retrieval of metadata. The performance evaluation shows that the proposed method performs better than the existing method.
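As a rough illustration of the kind of insertion/storage/retrieval workflow the abstract describes (not the authors' system; the table and field names are invented), a content metadata record can be stored and looked up as follows:

```python
# A minimal sketch of storing and retrieving multimedia content metadata with
# SQLite; schema and field names are illustrative assumptions.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE content_metadata (
           content_id TEXT PRIMARY KEY,
           title      TEXT,
           media_type TEXT,
           duration_s INTEGER
       )"""
)

def insert_metadata(content_id, title, media_type, duration_s):
    conn.execute(
        "INSERT INTO content_metadata VALUES (?, ?, ?, ?)",
        (content_id, title, media_type, duration_s),
    )

def retrieve_metadata(content_id):
    row = conn.execute(
        "SELECT title, media_type, duration_s FROM content_metadata WHERE content_id = ?",
        (content_id,),
    ).fetchone()
    return dict(zip(("title", "media_type", "duration_s"), row)) if row else None

insert_metadata("clip-001", "News opening", "video/mp4", 95)
print(retrieve_metadata("clip-001"))
```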
Abstract: The exponential growth of over-the-top (OTT) entertainment has fueled a surge in content consumption across diverse formats, especially in regional Indian languages. With the Indian film industry producing over 1500 films annually in more than 20 languages, personalized recommendations are essential to highlight relevant content. To overcome the limitations of traditional recommender systems, such as static latent vectors, poor handling of cold-start scenarios, and the absence of uncertainty modeling, we propose a deep Collaborative Neural Generative Embedding (C-NGE) model. C-NGE dynamically learns user and item representations by integrating rating information and metadata features in a unified neural framework. It uses metadata as sampled noise and applies the reparameterization trick to better capture latent patterns and to support predictions for new users or items without retraining. We evaluate C-NGE on the Indian Regional Movies (IRM) dataset, along with MovieLens 100K and 1M. Results show that our model consistently outperforms several existing methods, and its extensibility allows additional signals such as user reviews and multimodal data to be incorporated to enhance recommendation quality.
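A hedged sketch of the reparameterization trick the abstract mentions, with invented projection matrices and dimensions standing in for the learned C-NGE parameters; it shows how a cold-start item with only metadata can still receive a score:

```python
# Illustrative reparameterization sketch, NOT the authors' model: a metadata
# vector is mapped to a mean and log-variance, and a latent item embedding is
# sampled as z = mu + sigma * eps, keeping the sampling step differentiable.
import numpy as np

rng = np.random.default_rng(0)
META_DIM, LATENT_DIM = 16, 8

# Hypothetical learned projections (random here for illustration).
W_mu = rng.normal(size=(META_DIM, LATENT_DIM))
W_logvar = rng.normal(size=(META_DIM, LATENT_DIM))

def sample_item_embedding(metadata_vec: np.ndarray) -> np.ndarray:
    mu = metadata_vec @ W_mu
    logvar = metadata_vec @ W_logvar
    eps = rng.standard_normal(LATENT_DIM)          # sampled noise
    return mu + np.exp(0.5 * logvar) * eps         # reparameterized sample

def predict_rating(user_emb: np.ndarray, item_meta: np.ndarray) -> float:
    item_emb = sample_item_embedding(item_meta)
    return float(user_emb @ item_emb)              # dot-product score

# A cold-start item: no ratings yet, only metadata (e.g. genre/language flags).
cold_item_meta = rng.random(META_DIM)
user_emb = rng.standard_normal(LATENT_DIM)
print(predict_rating(user_emb, cold_item_meta))
```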
Funding: Funded by the National Natural Science Foundation of China under Grant No. 62362019 and the Hainan Provincial Natural Science Foundation of China under Grant No. 624RC482.
Abstract: Metadata prefetching and data placement play a critical role in enhancing access performance for file systems operating over wide-area networks. However, developing effective strategies for metadata prefetching under concurrent workloads and for data placement across distributed networks remains a significant challenge. This study introduces novel and efficient methodologies for metadata prefetching and data placement, leveraging fine-grained control of prefetching strategies and variable-sized data fragment writing to optimize the I/O bandwidth of distributed file systems. The proposed metadata prefetching technique employs dynamic workload analysis to identify dominant workload patterns and adaptively refines prefetching policies, thereby boosting metadata access efficiency in concurrent scenarios. Meanwhile, the data placement strategy improves write performance by storing data fragments locally within the nearest data center and transmitting only the fragment location metadata to the remote data center hosting the original file. Experimental evaluations using real-world system traces demonstrate that the proposed approaches reduce metadata access times by up to 33.5% and application data access times by 17.19% compared to state-of-the-art techniques.
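The adaptive-prefetching idea can be sketched as follows; the window size, the 0.6 threshold, and the policy names are illustrative assumptions, not the paper's parameters:

```python
# Hedged sketch: adapt a metadata prefetching policy to the dominant workload
# pattern observed in a sliding window of recent accesses.
from collections import Counter, deque

class AdaptivePrefetcher:
    def __init__(self, window: int = 256):
        self.recent_dirs = deque(maxlen=window)

    def record_access(self, path: str) -> None:
        self.recent_dirs.append(path.rsplit("/", 1)[0])

    def dominant_pattern(self) -> str:
        counts = Counter(self.recent_dirs)
        if not counts:
            return "none"
        top_dir, top_count = counts.most_common(1)[0]
        # If most accesses hit one directory, a directory scan is likely dominant.
        return "directory-scan" if top_count / len(self.recent_dirs) > 0.6 else "mixed"

    def plan_prefetch(self, path: str) -> list[str]:
        if self.dominant_pattern() == "directory-scan":
            parent = path.rsplit("/", 1)[0]
            return [f"{parent}/*"]        # prefetch sibling entries of the directory
        return []                         # conservative: no prefetch under mixed load

pf = AdaptivePrefetcher()
for i in range(200):
    pf.record_access(f"/project/logs/day-{i:03d}.log")
print(pf.plan_prefetch("/project/logs/day-200.log"))
```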
Funding: Project supported by the National Grand Fundamental Research 973 Program of China (Grant No. 2004CB318203) and the National Natural Science Foundation of China (Grant No. 60603074).
Abstract: The distribution of metadata in a metadata server cluster is important in mass storage systems. A good distribution algorithm has a significant influence on system performance, availability, and scalability. Subtree partitioning and hashing are two traditional metadata distribution algorithms used in distributed file systems, and both have a defect in system scalability. This paper proposes a new directory hash (DH) algorithm. By treating the directory as the key of the hash function, implementing concentrated storage of metadata, pipelining operations, and using prefetching technology, the DH algorithm can enhance system scalability without sacrificing system performance.
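A minimal sketch of the directory-hash idea, assuming a fixed server list and MD5 as the hash (both illustrative choices): every file in a directory maps to the same metadata server, which concentrates that directory's metadata and makes prefetching cheap.

```python
# Sketch of hashing on the directory path to pick a metadata server.
import hashlib

METADATA_SERVERS = ["mds-0", "mds-1", "mds-2", "mds-3"]   # assumed cluster

def server_for(path: str) -> str:
    directory = path.rsplit("/", 1)[0] or "/"
    digest = hashlib.md5(directory.encode()).hexdigest()
    return METADATA_SERVERS[int(digest, 16) % len(METADATA_SERVERS)]

# Files in the same directory land on the same server (concentrated storage),
# so a directory listing touches a single server.
print(server_for("/home/alice/thesis/ch1.tex"))
print(server_for("/home/alice/thesis/ch2.tex"))
```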
Abstract: The software industry has evolved to multi-product and multi-platform development based on a mix of proprietary and open-source components. Such integration has occurred in software ecosystems through a software product line engineering (SPLE) process. However, metadata are underused in SPLE, and interoperability remains a challenge. The proposed method is, first, a semantic metadata enrichment software ecosystem (SMESE) to support multi-platform metadata-driven applications, and second, based on mapping ontologies, SMESE aggregates and enriches metadata to create a semantic master metadata catalogue (SMMC). The proposed SPLE process uses a component-based software development approach for integrating distributed content management enterprise applications, such as digital libraries. To achieve interoperability between existing metadata models (such as Dublin Core, UNIMARC, MARC21, RDF/RDA, and BIBFRAME), SMESE implements an ontology mapping model. SMESE consists of the following sub-systems: 1) metadata initiatives and concordance rules; 2) harvesting of web metadata and data; 3) harvesting of authority metadata and data; 4) rule-based semantic metadata external enrichment; 5) rule-based semantic metadata internal enrichment; 6) semantic metadata external and internal enrichment synchronization; 7) user interest-based gateway; 8) semantic master catalogue. To conclude, this paper proposes a decision support process, called the SPLE decision support process (SPLE-DSP), which is then used by SMESE to support dynamic reconfiguration. SPLE-DSP consists of a dynamic and optimized metadata-based reconfiguration model and takes into account runtime metadata-based variability functionalities, context-awareness, and self-adaptation. The paper also presents the design and implementation of a working prototype of SMESE applied to a semantic digital library.
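The ontology/crosswalk mapping step can be illustrated with a toy table; the field names and the two source schemas below are examples, not the SMMC model:

```python
# Illustrative crosswalk: records in different source schemas are mapped onto
# one master-catalogue vocabulary. The mapping table is a toy example.
CROSSWALK = {
    "dublin_core": {"dc:title": "title", "dc:creator": "creator", "dc:date": "issued"},
    "marc21":      {"245$a": "title", "100$a": "creator", "260$c": "issued"},
}

def to_master_record(source_schema: str, record: dict) -> dict:
    mapping = CROSSWALK[source_schema]
    return {mapping[k]: v for k, v in record.items() if k in mapping}

dc_rec = {"dc:title": "Semantic Catalogues", "dc:creator": "Doe, J.", "dc:date": "2017"}
marc_rec = {"245$a": "Semantic Catalogues", "100$a": "Doe, J.", "260$c": "2017"}
# Both source records normalize to the same master-catalogue entry.
print(to_master_record("dublin_core", dc_rec) == to_master_record("marc21", marc_rec))
```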
Funding: Supported by the National Natural Science Foundation of China (No. 60873028), the National Basic Research Program (973) of China (No. 2004CB318201), the Program for New Century Excellent Talents in University (No. NCET-04-0693), and the Innovational Group Project (No. IRT0725), China.
Abstract: In an object-based storage system, a novel scheme named EAP (extending attributes page) is presented to enhance the metadata reliability of the system by adding a user object file information attributes page for each user object and storing the file-related attributes of each user object in object-based storage devices. The EAP scheme requires no additional hardware compared to the general method, which uses backup metadata servers to improve metadata reliability. Leveraging a Markov chain, this paper compares the metadata reliability of a system using the EAP scheme with that of a system using only metadata servers to provide the file metadata service. Our results demonstrate that the EAP scheme can dramatically enhance the reliability of storage system metadata.
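A toy availability calculation in the spirit of the Markov analysis; the rates and the simple two-state model are assumptions, not the paper's chain. The EAP-like variant is modeled as an extra repair path on top of metadata-server recovery:

```python
# Toy steady-state availability of a repairable service: availability = mu / (lam + mu).
# All rates below are assumed values for illustration only.
def availability(failure_rate: float, repair_rate: float) -> float:
    return repair_rate / (failure_rate + repair_rate)

lam = 1e-4            # metadata-service failures per hour (assumed)
mu_mds_only = 1e-2    # repair via metadata server recovery (assumed)
mu_rebuild = 5e-3     # extra recovery by scanning EAP-style attribute pages (assumed)

print(f"MDS only : {availability(lam, mu_mds_only):.6f}")
print(f"With EAP : {availability(lam, mu_mds_only + mu_rebuild):.6f}")
```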
Abstract: To construct the Agricultural Scientific and Technical Information Core Metadata (ASTICM) standard and its expansion principles, and to develop a register system based on ASTICM, the policies and methods of DC (Dublin Core) and SDBCM (Scientific Database Core Metadata) were studied. The construction of ASTICM started from the elements proposed by the DCMI (Dublin Core Metadata Initiative) and expanded DC and SDBCM according to the related expansion principles. ASTICM finally includes 75 metadata elements, five expansion principles, and seven application profile creation methods. Based on a requirement analysis of a large number of users of agricultural information, a register system built on ASTICM was developed. The ASTICM standard framework and its register system effectively support search, sharing, integration, exchange, and other applications.
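A hypothetical sketch of a metadata-element register with one expansion rule and keyword search; the element names are placeholders rather than the actual 75 ASTICM elements:

```python
# Toy metadata-element register: core elements can be extended into application
# profiles (illustrative rule: an extension must refine a registered element)
# and searched by keyword.
class ElementRegister:
    def __init__(self, core_elements):
        self.elements = {name: {"origin": "core"} for name in core_elements}

    def extend(self, name: str, refines: str) -> None:
        if refines not in self.elements:
            raise ValueError(f"{refines} is not registered")
        self.elements[name] = {"origin": "extension", "refines": refines}

    def search(self, keyword: str) -> list[str]:
        return [n for n in self.elements if keyword.lower() in n.lower()]

reg = ElementRegister(["title", "creator", "subject", "coverage"])
reg.extend("crop.variety", refines="subject")
print(reg.search("crop"))
```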
Funding: Supported by the Industrialized Foundation of Hebei Province (F020501).
Abstract: The reliability and high performance of the metadata service are crucial to the storage architecture. A novel design of a two-level metadata server file system (TTMFS) is presented, which delivers high reliability and performance. The merits of both centralized management and distributed management are considered in our design. In this file system, the advanced-metadata server is responsible for managing directory metadata and the whole namespace, while the double-metadata server is responsible for maintaining file metadata. The paper uses the Markov return model to analyze the reliability of the two-level metadata server. The experimental data indicate that the design can provide high throughput.
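A sketch of how a two-level design might route metadata operations; the operation sets and server names are assumed for illustration, not taken from TTMFS:

```python
# Illustrative routing: namespace/directory operations go to the advanced-metadata
# server, per-file metadata operations go to the double-metadata server.
DIRECTORY_OPS = {"mkdir", "rmdir", "readdir", "rename_dir", "lookup"}
FILE_OPS = {"open", "stat", "setattr", "unlink"}

def route(op: str) -> str:
    if op in DIRECTORY_OPS:
        return "advanced-metadata-server"    # directory metadata + whole namespace
    if op in FILE_OPS:
        return "double-metadata-server"      # file metadata
    raise ValueError(f"unknown metadata operation: {op}")

for op in ("mkdir", "stat", "readdir", "open"):
    print(op, "->", route(op))
```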
Abstract: There are differences in learning between individuals. An adaptive learning support system is a learning system that provides learning support suited to the characteristics of each individual according to these differences. In this paper, through an analysis of adaptive learning support systems, a system framework based on SOA is proposed, and the research methods for the metadata model are discussed in detail.
Funding: This work was supported by the National Natural Science Foundation of China (U2133208, U20A20161).
Abstract: With the popularization of the Internet and the development of technology, cyber threats are increasing day by day. Threats such as malware, hacking, and data breaches have had a serious impact on cybersecurity. The network security environment in the era of big data is characterized by large data volumes, high diversity, and high real-time requirements, and traditional security defense methods and tools are unable to cope with complex and changing network security threats. This paper proposes a machine-learning security defense algorithm based on metadata association features that emphasizes control over unauthorized users through privacy, integrity, and availability. A user model is established, and a mapping between the user model and the metadata of the data sources is generated. By analyzing the user model and its corresponding mapping relationships, a query against the user model can be decomposed into queries against various heterogeneous data sources, and the integration of heterogeneous data sources based on metadata association features can be realized. Customer information is defined and classified, sensitive data are automatically identified and perceived, a behavior audit and analysis platform is built, user behavior trajectories are analyzed, and the construction of a machine-learning customer information security defense system is completed. The experimental results show that when the data volume is 5×10³ bit, the data storage integrity of the proposed method is 92%, the data accuracy is 98%, and the success rate of data intrusion is only 2.6%. It can be concluded that the data storage method in this paper is safe, the data accuracy remains at a high level, and the data disaster recovery performance is good. The method can effectively resist data intrusion and has high air traffic control security. It can not only detect all viruses in user data storage but also realize integrated virus processing, further optimizing the security defense effect for user big data.
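One step the abstract describes, decomposing a user-model query into per-source sub-queries via a metadata mapping, can be sketched as follows; the source names and fields are invented for illustration:

```python
# Hedged sketch: split one user-model query into sub-queries against
# heterogeneous data sources using a metadata-association mapping.
USER_MODEL_MAPPING = {
    "login_history":  {"source": "auth_db",   "fields": ["user_id", "ts", "ip"]},
    "file_downloads": {"source": "audit_log", "fields": ["user_id", "object", "ts"]},
    "profile":        {"source": "crm_store", "fields": ["user_id", "email", "role"]},
}

def decompose_query(user_id: str, requested: list[str]) -> dict[str, dict]:
    """Group requested user-model items into per-source sub-queries."""
    plan: dict[str, dict] = {}
    for item in requested:
        meta = USER_MODEL_MAPPING[item]
        sub = plan.setdefault(meta["source"], {"filter": {"user_id": user_id}, "fields": set()})
        sub["fields"].update(meta["fields"])
    return plan

print(decompose_query("u-1024", ["login_history", "profile"]))
```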
Funding: Supported by the National Key Research and Development Program of China (2021YFB2600405).
Abstract: In view of the problems of inconsistent data semantics, inconsistent data formats, and difficult data quality assurance between the railway engineering design phase and the construction and operation phases, as well as the difficulty of fully realizing the value of design results, this paper proposes a design and implementation scheme for a railway engineering collaborative design platform. The platform mainly includes functional modules such as metadata management, design collaboration, design delivery management, a model component library, model rendering services, and Building Information Modeling (BIM) application services. On this basis, research is conducted on multi-disciplinary parameterized collaborative design technology for railway engineering, infrastructure data management and delivery technology, and design multi-source data fusion and application technology. The railway engineering collaborative design platform is compared with other railway design software to further validate its advantages and advanced features. The platform has been widely applied in multiple railway construction projects, greatly improving design and project management efficiency.
Funding: The Advanced University Action Plan of the Ministry of Education of China (2004XD-03).
Abstract: An ontology and metadata for online learning resource repository management are constructed. First, based on an analysis of the use-case diagram, the upper ontology, which includes a resource library ontology and a user ontology, is illustrated and evaluated in terms of its function and implementation; the corresponding class diagram, resource description framework (RDF) schema, and extensible markup language (XML) schema are then given. Secondly, the metadata for online learning resource repository management is proposed based on the Dublin Core Metadata Initiative and the IEEE Learning Technology Standards Committee Learning Object Metadata Working Group. Finally, an inference instance is shown, which demonstrates the validity of the ontology and metadata in online learning resource repository management.
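A small illustrative RDF description of a learning object using Dublin Core terms; this assumes the third-party rdflib package and an invented LOM-style namespace, and it is not the paper's schema:

```python
# Illustrative sketch: describe a learning object with Dublin Core terms in RDF.
# Requires the third-party rdflib package (assumed dependency).
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import DC, RDF

LOM = Namespace("http://example.org/lom#")     # hypothetical namespace
g = Graph()

res = URIRef("http://example.org/resource/linear-algebra-lecture-01")
g.add((res, RDF.type, LOM.LearningObject))
g.add((res, DC.title, Literal("Linear Algebra, Lecture 1")))
g.add((res, DC.creator, Literal("Dept. of Mathematics")))
g.add((res, LOM.interactivityType, Literal("expositive")))

# Print the resulting triples.
for s, p, o in g:
    print(s, p, o)
```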
Funding: Supported by the Youth Innovation Fund of the Fujian Academy of Agricultural Science (2010QB-17), the Science and Technology Bureau Project of Fujian Province (2008S1001), and the Financial Special Project of Fujian Province (STIF-Y07).
Abstract: [Objective] To study the information description of a vegetable planting metadata model. [Method] On the basis of analyzing the data involved in every aspect of vegetable planting, this paper put forward description schemes for vegetable planting metadata and constructed a vegetable planting metadata model by means of XML/XML Schema. [Result] A metadata model of vegetable planting was established, and the information description of the model was realized using XML Schema. The whole metadata model consists of 7 first-class classifications, including more than 800 information description points, which can completely record vegetable planting-related information. [Conclusion] Standards for data collection, management, and sharing were provided for agricultural applications in areas such as GAP management of vegetable planting, facility vegetables, and food quality traceability.
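A toy XML record built with the standard library gives the flavor of such a metadata model; the element names below are illustrative and do not reproduce the paper's seven first-class classifications or its XML Schema:

```python
# Sketch: record one vegetable-planting metadata record as XML using the
# standard library; element names are illustrative assumptions.
import xml.etree.ElementTree as ET

record = ET.Element("PlantingRecord")
ET.SubElement(record, "Crop").text = "tomato"
ET.SubElement(record, "Variety").text = "Jinpeng No.1"
env = ET.SubElement(record, "Environment")
ET.SubElement(env, "FacilityType").text = "greenhouse"
ET.SubElement(env, "SoilPH").text = "6.5"
ops = ET.SubElement(record, "FieldOperations")
ET.SubElement(ops, "Operation", date="2024-03-05", type="transplanting")

print(ET.tostring(record, encoding="unicode"))
```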
Abstract: Metadata is "data about data". This paper introduces the basics of metadata and discusses several metadata specifications for HTML and XML environments, including Dublin Core, PICS, Web Collections, CDF, MCF, and RDF. Since metadata plays a very important role in the organization and discovery of Internet information resources, the authors call for strengthened research on metadata in China.