Funding: Funded by the National Natural Science Foundation of China under Grant No. 62362019 and the Hainan Provincial Natural Science Foundation of China under Grant No. 624RC482.
Abstract: Metadata prefetching and data placement play a critical role in enhancing access performance for file systems operating over wide-area networks. However, developing effective strategies for metadata prefetching in environments with concurrent workloads and for data placement across distributed networks remains a significant challenge. This study introduces novel and efficient methodologies for metadata prefetching and data placement, leveraging fine-grained control of prefetching strategies and variable-sized data fragment writing to optimize the I/O bandwidth of distributed file systems. The proposed metadata prefetching technique employs dynamic workload analysis to identify dominant workload patterns and adaptively refines prefetching policies, thereby boosting metadata access efficiency under concurrent scenarios. Meanwhile, the data placement strategy improves write performance by storing data fragments locally within the nearest data center and transmitting only the fragment location metadata to the remote data center hosting the original file. Experimental evaluations using real-world system traces demonstrate that the proposed approaches reduce metadata access times by up to 33.5% and application data access times by 17.19% compared to state-of-the-art techniques.
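The placement idea in this abstract can be illustrated with a minimal sketch. Everything in it (the `DataCenter` class, `write_fragment`, `read_file`, and the in-memory dictionaries) is a hypothetical illustration and not the authors' implementation: a variable-sized fragment is stored in the nearest data center, and only a small location record is sent to the data center hosting the original file.

```python
from dataclasses import dataclass, field


@dataclass
class DataCenter:
    """Hypothetical in-memory stand-in for one data center."""
    name: str
    fragments: dict = field(default_factory=dict)  # fragment_id -> bytes
    locations: dict = field(default_factory=dict)  # file path -> location records


def write_fragment(path, offset, data, nearest, home):
    """Store the fragment locally; send only its location record to the home site."""
    fragment_id = f"{path}@{offset}"
    nearest.fragments[fragment_id] = data  # payload stays in the nearest data center
    record = {
        "fragment_id": fragment_id,
        "offset": offset,
        "length": len(data),  # fragments may have different sizes
        "stored_at": nearest.name,
    }
    home.locations.setdefault(path, []).append(record)  # only metadata crosses the WAN
    return record


def read_file(path, home, centers):
    """Reassemble a file by following its location records in offset order."""
    records = sorted(home.locations.get(path, []), key=lambda r: r["offset"])
    return b"".join(
        centers[r["stored_at"]].fragments[r["fragment_id"]] for r in records
    )
```

In this sketch the home data center never holds fragment payloads, only the records needed to find them, which is the property the abstract credits for the improved write performance.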
Abstract: Metadata, by definition, is information associated with data, covering where, how, when and by whom the data were acquired. The chance to take part in European projects such as the EU-SeaDataNet (the Pan-European infrastructure for ocean and marine data management) made it necessary to use XML (Extensible Markup Language) as a standard file format for sharing data and metadata. At present, the Italian National Oceanographic Data Centre (OGS/NODC) has all its data and metadata contained in an Oracle relational database, and the metadata is managed using XML formats and schemas (XSD; XML Schema Definition), giving common vocabularies for parameters, instruments, ships, etc. in agreement with European project standards. One problem with XML is the dynamic change in metadata schemas (XSD), necessitating development of a system which is flexible and capable of managing the changes. This paper describes a system for managing oceanographic metadata using XML files and the functionalities provided by the Oracle database. To better manage the XML format, we chose to load the whole XML file into the OGS/NODC database, using a dedicated field. The Oracle database gives us the flexibility to manage the XML format locally, within the OGS/NODC information system, using the Oracle XML DB (XML DataBase) features. This, through the use of XMLType, allows the inclusion of the XML in the database. Furthermore, through the use of the XQuery functions it is possible to create a set of views through which information contained in the XML can be viewed more immediately in a relational form. Using the XML and the XQuery functions, it is possible to store, extract and manage different kinds of information that might be exchanged at the European level. Moreover, with a RESTful (Representational State Transfer) Web Service, we have a simple and standard interface for rapidly and easily creating, modifying and deleting records containing XML files inside the database. Finally, through the use of a RESTful Web Service, it is possible to decouple the application from the database, so that through the use of software that manages HTTP URLs, such as Mikado (SeaDataNet project), the XML file can be inserted, updated and deleted inside the database without the need for a direct connection to it.
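The "whole XML file in a dedicated field, plus relational views" pattern described in this abstract can be sketched in a few lines. This is an illustration under stated assumptions, not the OGS/NODC system: it swaps Oracle XMLType and XQuery for Python's standard `sqlite3` and `xml.etree.ElementTree` modules, and the table name, element names, and sample record are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical metadata record; the element names are illustrative only
# and do not follow any SeaDataNet schema.
SAMPLE_XML = (
    "<cruise><ship>ExampleShip</ship>"
    "<parameter>temperature</parameter></cruise>"
)


def store_record(conn, xml_text):
    """Insert the whole XML document into a single dedicated column."""
    ET.fromstring(xml_text)  # raises ParseError if the document is not well-formed
    cur = conn.execute("INSERT INTO metadata (doc) VALUES (?)", (xml_text,))
    return cur.lastrowid


def relational_view(conn):
    """Extract selected fields from each stored document as flat rows."""
    rows = []
    for rec_id, doc in conn.execute("SELECT id, doc FROM metadata ORDER BY id"):
        root = ET.fromstring(doc)
        rows.append((rec_id, root.findtext("ship"), root.findtext("parameter")))
    return rows


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE metadata (id INTEGER PRIMARY KEY, doc TEXT NOT NULL)")
store_record(conn, SAMPLE_XML)
```

Because the document is stored unparsed, a schema change only affects the extraction step in `relational_view`, which mirrors the flexibility the abstract attributes to keeping the XML whole.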
Funding: This work was supported by the National Natural Science Foundation of China (U2133208, U20A20161).
Abstract: With the popularization of the Internet and the development of technology, cyber threats are increasing day by day. Threats such as malware, hacking, and data breaches have had a serious impact on cybersecurity. The network security environment in the era of big data is characterized by large amounts of data, high diversity, and high real-time requirements. Traditional security defense methods and tools have been unable to cope with the complex and changing network security threats. This paper proposes a machine-learning security defense algorithm based on metadata association features. The method emphasizes control over unauthorized users through privacy, integrity, and availability. The user model is established and the mapping between the user model and the metadata of the data source is generated. By analyzing the user model and its corresponding mapping relationship, the query of the user model can be decomposed into queries of various heterogeneous data sources, and the integration of heterogeneous data sources based on the metadata association characteristics can be realized. Customer information is defined and classified, sensitive data are automatically identified and perceived, a behavior audit and analysis platform is built, and user behavior trajectories are analyzed, completing the construction of a machine-learning customer information security defense system. The experimental results show that when the data volume is 5×10³ bit, the data storage integrity of the proposed method is 92%. The data accuracy is 98%, and the success rate of data intrusion is only 2.6%. It can be concluded that the data storage method in this paper is safe, the data accuracy is always at a high level, and the data disaster recovery performance is good. This method can effectively resist data intrusion and has high air traffic control security. It can not only detect all viruses in user data storage, but also realize integrated virus processing, and further optimize the security defense effect of user big data.
Funding: Supported by the National Key Research and Development Program of China (2021YFB2600405).
Abstract: In view of the problems of inconsistent data semantics, inconsistent data formats, and difficult data quality assurance between the railway engineering design phase and the construction and operation phase, as well as the difficulty in fully realizing the value of design results, this paper proposes a design and implementation scheme for a railway engineering collaborative design platform. The platform mainly includes functional modules such as metadata management, design collaboration, design delivery management, model component library, model rendering services, and Building Information Modeling (BIM) application services. Based on this, research is conducted on multi-disciplinary parameterized collaborative design technology for railway engineering, infrastructure data management and delivery technology, and design multi-source data fusion and application technology. The platform is compared with other railway design software to further validate its advantages and advanced features. The platform has been widely applied in multiple railway construction projects, greatly improving design and project management efficiency.