Funding: Supported by the National Natural Science Foundation of China (Grant No. 71401052), the National Social Science Foundation of China (Grant No. 17BGL156), and the Key Project of the National Social Science Foundation of China (Grant No. 14AZD024).
Abstract: Identification of security risk factors for small reservoirs is the basis for implementing early warning systems. How these factors are identified is of practical significance when data are incomplete. Existing grey relational models have some disadvantages in measuring the correlation between categorical data sequences. To this end, this paper introduces a new grey relational model to analyze heterogeneous data. In this study, a set of security risk factors for small reservoirs was first constructed based on theoretical analysis, and heterogeneous data of these factors were recorded as sequences. The sequences were regarded as random variables, and the information entropy and conditional entropy between sequences were measured to analyze the relational degree between risk factors. Then, a new grey relational analysis model for heterogeneous data was constructed, and a comprehensive security risk factor identification method was developed. A case study of small reservoirs in the Guangxi Zhuang Autonomous Region in China shows that the model constructed in this study is applicable to security risk factor identification for small reservoirs with heterogeneous and sparse data.
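The entropy quantities named in this abstract can be illustrated with a minimal sketch. The abstract does not give the paper's grey relational formula, so the `relational_degree` function below is only an assumed stand-in (the fraction of uncertainty in one sequence that is removed by knowing the other); all function names are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(seq):
    """Shannon entropy H(X) in bits of a categorical sequence."""
    n = len(seq)
    return -sum((c / n) * log2(c / n) for c in Counter(seq).values())


def conditional_entropy(x, y):
    """H(X | Y) = H(X, Y) - H(Y) for two equally long sequences."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    return entropy(list(zip(x, y))) - entropy(y)


def relational_degree(x, y):
    """Assumed measure, not the paper's: 1 - H(X | Y) / H(X).

    1.0 means y fully determines x; 0.0 means y says nothing about x.
    """
    hx = entropy(x)
    if hx == 0:
        return 1.0  # a constant sequence carries no uncertainty to explain
    return 1.0 - conditional_entropy(x, y) / hx


# Hypothetical categorical risk-factor sequences for four reservoirs.
dam_condition = ["good", "good", "poor", "poor"]
seepage = ["none", "none", "severe", "severe"]
operator = ["a", "b", "a", "b"]

print(relational_degree(dam_condition, seepage))
print(relational_degree(dam_condition, operator))
```

Because the measure depends only on category counts, it applies equally to nominal and ordinal sequences, which is the property that makes an entropy-based approach attractive for heterogeneous data.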
Funding: Project supported by the National Surveying Technical Fund (No. 200_07).
Abstract: In this paper, an entity-relation data model for integrating spatio-temporal data is designed. With this design, spatio-temporal data can be stored effectively and spatio-temporal analysis can be realized easily.
Abstract: In this paper, the authors present the development of a data modelling tool that visualizes the transformation of an Entity-Relationship Diagram (ERD) into a relational database schema. The authors' focus is the design of a tool for educational purposes and its implementation in an e-learning database course. The tool presents two stages of database design. In the first stage, the learner draws the ERD graphically and the tool validates it. In the second stage, the system automatically transforms the ERD into a relational database schema using common rules. Thus, the learner can understand more easily how to apply the theoretical material. A detailed description of the system functionalities and the conversion algorithm is given. Finally, the user interface and usage aspects are presented.
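The second stage described in this abstract can be sketched as follows. The abstract does not list the rules the tool applies, so this sketch assumes the standard textbook mapping: each entity becomes a table, a 1:N relationship becomes a foreign key on the "many" side, and an M:N relationship becomes a junction table. The class and function names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    name: str
    key: str                                   # primary-key attribute
    attributes: list = field(default_factory=list)


@dataclass
class Relationship:
    name: str
    left: str                                  # for "1:N", the "one" side
    right: str                                 # for "1:N", the "many" side
    cardinality: str                           # "1:N" or "M:N"


def erd_to_schema(entities, relationships):
    """Map an ERD to {table name: [column names]} using the assumed rules."""
    tables = {e.name: [e.key] + list(e.attributes) for e in entities}
    keys = {e.name: e.key for e in entities}
    for r in relationships:
        left_fk = f"{r.left}_{keys[r.left]}"
        right_fk = f"{r.right}_{keys[r.right]}"
        if r.cardinality == "1:N":
            tables[r.right].append(left_fk)    # foreign key on the many side
        elif r.cardinality == "M:N":
            tables[r.name] = [left_fk, right_fk]  # junction table
        else:
            raise ValueError(f"unsupported cardinality: {r.cardinality}")
    return tables


schema = erd_to_schema(
    [
        Entity("Department", "dept_id", ["title"]),
        Entity("Student", "id", ["name"]),
        Entity("Course", "code", ["title"]),
    ],
    [
        Relationship("belongs_to", "Department", "Student", "1:N"),
        Relationship("enrolls", "Student", "Course", "M:N"),
    ],
)
for table, columns in schema.items():
    print(table, columns)
```

A real tool would also handle 1:1 relationships, weak entities, and relationship attributes; they are omitted here to keep the sketch short.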
Funding: Science and Technology Innovation 2030-Major Project of “New Generation Artificial Intelligence”, granted by the Ministry of Science and Technology (Grant No. 2020AAA0109300).
Abstract: In the process of constructing domain-specific knowledge graphs, the task of relational triple extraction plays a critical role in transforming unstructured text into structured information. Existing relational triple extraction models face multiple challenges when processing domain-specific data, including insufficient utilization of semantic interaction information between entities and relations, difficulties in handling challenging samples, and the scarcity of domain-specific datasets. To address these issues, our study introduces three components: relation semantic enhancement, data augmentation, and a voting strategy, all designed to improve the model’s performance on domain-specific relational triple extraction tasks. We first propose an attention interaction module. This method enhances the semantic interaction between entities and relations by integrating semantic information from relation labels. Second, we propose a voting strategy that combines the strengths of large language models (LLMs) and fine-tuned small pre-trained language models (SLMs) to reevaluate challenging samples, thereby improving the model’s adaptability in specific domains. Additionally, we explore the use of LLMs for data augmentation, aiming to generate domain-specific datasets to alleviate the scarcity of domain data. Experiments conducted on three domain-specific datasets demonstrate that our model outperforms existing comparative models in several aspects, with F1 scores exceeding those of the state-of-the-art models by 2%, 1.6%, and 0.6%, respectively, validating the effectiveness and generalizability of our approach.
Abstract: We developed a parallel object-relational DBMS named PORLES. It uses the BSP model as its parallel computing model and monoid calculus as the basis of its data model. In this paper, we introduce its data model, parallel query optimization, transaction processing system, and parallel access method in detail.
Abstract: Data redundancy exists in both traditional databases and existing temporal databases, and the volume of temporal data is increasing rapidly. We put forward compressed storage tactics for temporal data that combine existing compression technology in order to resolve data redundancy during temporal data storage; temporal data of the slow-acting domain and the momentary-acting domain are accessed using the independent clock method and the mutual clock method. We also propose a grid storage strategy to address the rapid growth of temporal data.
Funding: Supported by the Commission of Science, Technology and Industry for National Defense (No. C192005C001).
Abstract: The data envelopment analysis (DEA) model is widely used to evaluate the relative efficiency of producers. It is an objective decision method with multiple indexes. However, the two basic models frequently used at present, the C2R model and the C2GS2 model, have limitations when used alone, resulting in evaluations that are often unsatisfactory. To solve this problem, a mixed DEA model is built and used to evaluate the validity of the business efficiency of listed companies. An explanation of how to use this mixed DEA model is offered and its feasibility is verified.
Abstract: In order to set up a conceptual data model that reflects the real world as accurately as possible, this paper first reviews and analyzes the disadvantages of previous conceptual data models used by traditional GIS in simulating geographic space, then offers a new interpretation of geographic space and analyzes its essential characteristics. Finally, this paper proposes several detailed key points for designing a new type of GIS data model and gives a simple holistic GIS data model.
Funding: Supported by the Provost Office’s Targeted Excellence Program at Kansas State University, the U.S. National Science Foundation (No. EPS0553722), and the United States Department of Agriculture/Agricultural Research Service (Co-operative Agreement 58-6209-3-018).
Abstract: Large-scale water pumping has caused a significant decline in groundwater level in the Upper Arkansas corridor region, which in turn has triggered a chain of hydrological and ecological impacts. A newly developed conceptualization groundwater data model was used to organize various datasets on the Upper Arkansas corridor groundwater system and to develop a MODFLOW model to simulate groundwater flow in the region from 1959 to 2005. The simulation results show a significant decline in groundwater level and the conversion of the Arkansas River from a gaining river to a losing river in the western two-thirds of the study area. Correlation analysis between the percentage of salt cedar and the hydrogeological conditions indicates that these hydrogeological changes at least partially account for the invasion of salt cedar, which is more drought tolerant. The analysis also illustrates the complexity of the interaction mechanisms between hydrological conditions and salt cedar distribution, and suggests the need for better data on salt cedar distribution with higher spatial resolution and across larger hydrological gradients.
Funding: Supported by the National Natural Science Foundation of China (No. 70231010/70321001) and the Bilateral Scientific and Technological Cooperation between China and Flanders (No. 174B0201).
Abstract: This paper concentrates on the problem of data redundancy under the extended-possibility-based model. Based on the information gain in data classification, a measure called relation redundancy is proposed to evaluate the degree to which a given relation is redundant as a whole. The properties of relation redundancy are also investigated. This new measure is useful in dealing with data redundancy.
Funding: This work was partly supported through the collaborative education projects of production and learning, the 2019 Sichuan teaching reform and research project, and the 2019 teaching reform and research project of the University of Electronic Science and Technology.
Abstract: The data model is the core knowledge of a database course. A deep understanding of data models is the key to mastering database design and application. The data models of NoSQL databases are categorized as key-value stores, column-oriented stores, document-oriented stores, and graph databases. This paper makes a comparative analysis of the characteristics of the relational data model and NoSQL data models, and gives the design and implementation of different data models combined with cases, so that students can master the relevant theories and application methods of database models.
Abstract: MatBase is a prototype data and knowledge base management expert intelligent system based on the Relational, Entity-Relationship, and (Elementary) Mathematical Data Models. Dyadic relationships are quite common in data modeling. Besides their relational-type constraints, they often exhibit mathematical properties that are not covered by the Relational Data Model. This paper presents and discusses the MatBase algorithm that assists database designers in discovering all non-relational constraints associated with them, as well as its algorithm for enforcing them, thus providing a significantly higher degree of data quality.
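To make concrete what "mathematical properties" of a dyadic relationship can look like, here is a minimal sketch that tests a binary relation, given as a set of pairs, for a few classical properties. It is not the MatBase algorithm, which the abstract does not describe; the choice of properties and the function name are assumptions made for illustration.

```python
def relation_properties(pairs, domain):
    """Check classical properties of a binary relation over `domain`."""
    r = set(pairs)
    return {
        "reflexive": all((a, a) in r for a in domain),
        "irreflexive": all((a, a) not in r for a in domain),
        "symmetric": all((b, a) in r for a, b in r),
        "asymmetric": all((b, a) not in r for a, b in r),
        # whenever (a, b) and (b, d) are present, (a, d) must be too
        "transitive": all(
            (a, d) in r for a, b in r for c, d in r if b == c
        ),
    }


# Hypothetical example: a strict "reports to" chain among three employees.
employees = {1, 2, 3}
reports_to = {(1, 2), (1, 3), (2, 3)}
print(relation_properties(reports_to, employees))
```

A design assistant could ask the designer which of these properties are meant to hold and then reject any insert or update that would violate a declared one.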
Abstract: Enterprise applications utilize relational databases and structured business processes, requiring slow and expensive conversion of inputs and outputs, from business documents such as invoices, purchase orders, and receipts, into known templates and schemas before processing. We propose a new LLM agent-based intelligent data extraction, transformation, and load (IntelligentETL) pipeline that not only ingests PDFs and detects inputs within them but also addresses the extraction of structured and unstructured data by developing tools that deal most efficiently and securely with the respective data types. We study the efficiency of our proposed pipeline and compare it with enterprise solutions that also utilize LLMs. We establish the superiority of our approach in timely and accurate data extraction and transformation for analyzing data from varied sources based on nested and/or interlinked input constraints.
Abstract: Within the new model of integrated medical and elderly care services, elderly-related data manifest a composite rights structure that integrates both public and private law dimensions. The granular and multi-dimensional nature and heightened sensitivity of such data, combined with the inherent vulnerability and dependency of elderly-related data subjects, render the regulatory landscape particularly complex. Existing mechanisms for data circulation reveal deficiencies, including fragmented legal norms, indeterminate allocation of data ownership, and supervisory inadequacy. This paper conducts a doctrinal inquiry into the legal relationships among multiple stakeholders across three principal dimensions: data service authorisation, data transmission and operation, and data supervision and safeguard. It proposes a regulatory framework based on a dual-track mechanism, combining top-down harmonisation of existing legal provisions with bottom-up implementation of data trusts, supported by a comprehensive oversight architecture involving government agencies, public interest organisations, and industry associations. This framework is intended to ensure the effective protection of the rights and interests of digitally vulnerable elderly individuals.