Individual Tree Detection and Counting (ITDC) is an important task in urban areas, and numerous methods have been proposed in this direction. Despite their many advantages, the proposed methods still fail to provide robust results because they mostly rely on direct field investigations. This paper presents a novel approach that combines high-resolution imagery with Canopy Height Model (CHM) data to solve the ITDC problem. The approach is studied in six urban scenes: farmland, woodland, park, industrial land, road, and residential areas. First, it identifies tree canopy regions in the high-resolution imagery using a deep learning network. It then uses the CHM data to detect treetops within the canopy regions with a local maximum algorithm and to delineate individual tree canopies with region growing. Finally, it calculates and reports the number of individual trees and tree canopies. The proposed approach was evaluated with data from Shanghai, China. Our results show that the individual tree detection method achieved an average overall accuracy of 0.953, with a precision of 0.987 for the woodland scene. Meanwhile, the R² values for canopy segmentation in the different urban scenes are greater than 0.780 and 0.779 for canopy area and canopy diameter, respectively. These results confirm that the proposed method is robust enough for urban tree planning and management.
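The treetop-detection step can be illustrated with a minimal local-maximum filter over a CHM raster. The 3×3 window, the height threshold, and the toy grid below are illustrative assumptions for the sketch, not parameters from the paper:

```python
# Minimal local-maximum treetop detector over a CHM raster (pure Python).
# A cell is a treetop if it exceeds a height threshold and is the strict
# maximum of its 3x3 neighbourhood. Window size and threshold are
# illustrative assumptions, not the paper's parameters.

def detect_treetops(chm, min_height=2.0):
    rows, cols = len(chm), len(chm[0])
    tops = []
    for r in range(rows):
        for c in range(cols):
            h = chm[r][c]
            if h < min_height:
                continue
            neighbours = [
                chm[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c)
            ]
            if all(h > n for n in neighbours):
                tops.append((r, c))
    return tops

# A toy 5x5 CHM (heights in metres) containing two distinct crowns.
chm = [
    [0.0, 0.5, 0.4, 0.0, 0.0],
    [0.5, 6.0, 1.0, 0.0, 0.0],
    [0.4, 1.0, 0.8, 1.2, 0.9],
    [0.0, 0.0, 1.1, 8.5, 1.3],
    [0.0, 0.0, 0.9, 1.2, 1.0],
]
treetops = detect_treetops(chm)  # two crowns -> two treetops
```

In the full pipeline each detected treetop would then seed the region-growing step that delineates its canopy.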
The transmission of scientific data over long distances is required to enable interplanetary science expeditions. Current approaches include transmitting all collected data, or transmitting low-resolution data to enable ground-controller review and selection of data for transmission. Model-based data transmission (MBDT) seeks to increase the amount of knowledge conveyed per unit of data transmitted by comparing high-resolution data collected in situ to a pre-existing (or potentially co-transmitted) model. This paper describes the application of MBDT to gravitational data and characterizes its utility and performance. This is performed by applying the MBDT technique to a selection of gravitational data previously collected for the Earth and comparing the transmission requirements to the levels required for raw data transmission and non-application-aware compression. Transmission reductions of up to 31.8% (without maximum-error thresholding) and up to 97.17% (with maximum-error thresholding) resulted. These levels significantly exceed what is possible with non-application-aware compression.
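The maximum-error-thresholding idea can be sketched in a few lines: only samples whose deviation from the shared model exceeds a tolerance are transmitted, and the receiver reconstructs the rest from the model alone. The model values, measurements, and threshold below are invented for illustration, not the paper's gravitational datasets:

```python
# Sketch of model-based data transmission (MBDT) with max-error
# thresholding: transmit only the samples whose deviation from the shared
# model exceeds a tolerance. Data, model, and threshold here are
# illustrative, not the paper's values.

def mbdt_select(observed, model, max_error):
    """Return (index, residual) pairs that must actually be transmitted."""
    return [
        (i, obs - mod)
        for i, (obs, mod) in enumerate(zip(observed, model))
        if abs(obs - mod) > max_error
    ]

model    = [9.80, 9.81, 9.82, 9.81, 9.80]   # pre-shared model predictions
observed = [9.80, 9.90, 9.82, 9.70, 9.80]   # in-situ measurements
deltas = mbdt_select(observed, model, max_error=0.05)
reduction = 1 - len(deltas) / len(observed)  # fraction of samples not sent
```

The trade-off the threshold controls is visible even in this toy case: a looser tolerance suppresses more samples at the cost of larger reconstruction error at the receiver.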
The study aimed to develop a customized Data Governance Maturity Model (DGMM) for the Ministry of Defence (MoD) in Kenya to address data governance challenges in military settings. Current frameworks lack specific requirements for the defence industry. The model uses Key Performance Indicators (KPIs) to enhance data governance procedures. Design Science Research guided the study, using qualitative and quantitative methods to gather data from MoD personnel. Major deficiencies were found in data integration, quality control, and adherence to data security regulations. The DGMM helps the MoD improve personnel, procedures, technology, and organizational elements related to data management. The model was tested against ISO/IEC 38500 and recommended for use in other government sectors with similar data governance issues. The DGMM has the potential to enhance data management efficiency, security, and compliance in the MoD and to guide further research in military data governance.
DNA microarray technology is an extremely effective technique for studying gene expression patterns in cells, and the main challenge it currently faces is how to analyze the large amount of gene expression data generated. To address this, this paper employs a mixed-effects model to analyze gene expression data. For data selection, 1176 genes from a white mouse gene expression dataset were chosen under two experimental conditions, pneumococcal infection and no infection, and a mixed-effects model was constructed. After preprocessing the gene chip information, the data were imported into the model, preliminary results were calculated, and permutation tests were performed, with GSEA used to biologically validate the preliminary results. The final dataset consists of 20 groups of gene expression data from pneumococcal infection; it categorizes functionally related genes based on the similarity of their expression profiles, facilitating the study of genes with unknown functions.
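The permutation-test step can be sketched for a single gene: the group labels are repeatedly shuffled and the observed difference in group means is compared against the shuffled differences. The expression values below are invented for illustration; the paper applies this kind of test across the 1176-gene dataset:

```python
# Minimal permutation test for one gene's differential expression between
# infected and uninfected groups. The expression values are invented for
# illustration; they are not from the paper's dataset.
import random

def permutation_p_value(infected, control, n_perm=10000, seed=42):
    rng = random.Random(seed)  # fixed seed for reproducibility
    observed = abs(sum(infected) / len(infected) - sum(control) / len(control))
    pooled = infected + control
    k = len(infected)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # break the link between label and value
        diff = abs(sum(pooled[:k]) / k - sum(pooled[k:]) / (len(pooled) - k))
        if diff >= observed:
            hits += 1
    return hits / n_perm  # fraction of shuffles at least as extreme

# Clearly separated toy groups -> a small p-value is expected.
p = permutation_p_value([8.1, 7.9, 8.4, 8.2], [5.0, 5.3, 4.9, 5.1])
```

With only four samples per group the permutation null is coarse, which is why approaches like GSEA are used downstream for biological validation.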
To improve the performance of traditional map matching algorithms in freeway traffic state monitoring systems that use low-logging-frequency GPS (global positioning system) probe data, a map matching algorithm based on the Oracle spatial data model is proposed. The algorithm uses the Oracle road network data model to analyze the spatial relationships between massive GPS positioning points and freeway networks, builds an N-shortest-path algorithm to efficiently find reasonable candidate routes between GPS positioning points, and uses a fuzzy logic inference system to determine the final matched traveling route. In an implementation with field data from Los Angeles, the computation speed of the algorithm is about 135 GPS positioning points per second and the accuracy is 98.9%. The results demonstrate the effectiveness and accuracy of the proposed algorithm for mapping massive GPS positioning data onto freeway networks with complex geometric characteristics.
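The elementary operation behind candidate generation in any map matcher, projecting a GPS fix onto the nearest road segment, can be sketched as follows. The segment names and coordinates are toy values, and the paper's N-shortest-path search and fuzzy inference stages are not reproduced here:

```python
# Projecting a GPS fix onto the nearest road segment -- the elementary
# operation behind candidate-route generation in map matching. The toy
# segments and point are illustrative; the paper's N-shortest-path and
# fuzzy-logic stages are not reproduced here.
import math

def project_to_segment(p, a, b):
    """Return (distance, projected point) from point p to segment a-b."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    seg_len2 = dx * dx + dy * dy
    # Clamp the projection parameter so the foot stays on the segment.
    t = 0.0 if seg_len2 == 0 else max(
        0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len2)
    )
    qx, qy = ax + t * dx, ay + t * dy
    return math.hypot(px - qx, py - qy), (qx, qy)

segments = {
    "segment A": ((0.0, 0.0), (10.0, 0.0)),
    "segment B": ((5.0, 5.0), (5.0, 15.0)),
}
gps_fix = (4.0, 1.0)
best = min(segments, key=lambda s: project_to_segment(gps_fix, *segments[s])[0])
```

In the full algorithm, several such nearby candidates would be kept and scored by the fuzzy inference system rather than picking the single closest one.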
The cooling process of iron ore pellets in a circular cooler has great impact on pellet quality and systematic energy exploitation. However, the many variables and lack of visualization of this gray system are unfavorable to efficient production. Thus, the cooling process of iron ore pellets was optimized using a mathematical model and data mining techniques. A mathematical model was established and validated against steady-state production data, and the results show that the calculated values coincide very well with the measured values. Based on the proposed model, the effects of important process parameters on gas-pellet temperature profiles within the circular cooler were analyzed to better understand the entire cooling process. Two data mining techniques, association rule induction and clustering, were also applied to the steady-state production data to obtain expert operating rules and optimized targets. Finally, an optimized control strategy for the circular cooler was proposed and an operation guidance system was developed. The system realizes visualization of the thermal process at steady state and provides operation guidance to optimize the circular cooler.
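The rule-induction step rests on two simple statistics, support and confidence, computed over discretized production records. The records and the candidate rule below are invented illustrations, not the plant data from the paper:

```python
# Computing support and confidence for a candidate operating rule over
# steady-state production records -- the kernel of association rule
# induction. The records and the rule below are invented illustrations,
# not the plant data from the paper.

def rule_metrics(records, antecedent, consequent):
    """support(A and B) and confidence(A -> B) over a list of record sets."""
    n = len(records)
    n_a = sum(1 for r in records if antecedent <= r)
    n_ab = sum(1 for r in records if (antecedent | consequent) <= r)
    support = n_ab / n
    confidence = n_ab / n_a if n_a else 0.0
    return support, confidence

# Each record is the set of discretized conditions observed in one period.
records = [
    {"fan_speed_high", "bed_thick_low", "pellet_temp_ok"},
    {"fan_speed_high", "bed_thick_low", "pellet_temp_ok"},
    {"fan_speed_high", "bed_thick_high", "pellet_temp_hot"},
    {"fan_speed_low", "bed_thick_low", "pellet_temp_hot"},
]
support, confidence = rule_metrics(
    records, {"fan_speed_high", "bed_thick_low"}, {"pellet_temp_ok"}
)
```

Rules whose support and confidence clear chosen thresholds become the expert operating rules fed into the guidance system.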
This paper focuses on object-oriented data modelling in computer aided design (CAD) databases. Starting from a discussion of data modelling requirements for CAD applications, appropriate data modelling features are introduced. A feasible approach to selecting the "best" data model for an application is to analyze the data that have to be stored in the database: a data model is appropriate for a given task if the information of the application environment can be easily mapped to it. Thus, the data involved are analyzed, and an object-oriented data model appropriate for CAD applications is derived. Based on a review of object-oriented techniques applied in CAD, object-oriented data modelling in CAD is addressed in detail. Finally, 3D geometrical data models and the implementation of their data model using the object-oriented method are presented.
A uniform metadata representation is introduced for heterogeneous databases, multimedia information, and other information sources. Some features of metadata are analyzed, and the limitations of existing metadata models are compared with the new one. The metadata model is described in XML, which is well suited to metadata denotation and exchange. Well-structured data, semi-structured data, and exterior file data without structure are all described in the metadata model. The model provides feasibility and extensibility for constructing a uniform metadata model for a data warehouse.
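The idea of one XML record describing both a structured and an unstructured source can be sketched with the standard library. The element and attribute names below are invented for illustration; the paper's actual metadata schema is not reproduced:

```python
# Sketch of a uniform XML metadata record covering a structured and an
# unstructured source. Element and attribute names are invented for
# illustration; the paper's actual metadata schema is not reproduced.
import xml.etree.ElementTree as ET

catalog = ET.Element("metadata")
# A well-structured source: a relational table with a known schema.
db = ET.SubElement(catalog, "source", type="relational", name="sales_db")
ET.SubElement(db, "schema", table="orders", columns="id,amount,ts")
# An exterior file without internal structure: only format facts recorded.
doc = ET.SubElement(catalog, "source", type="file", name="report.pdf")
ET.SubElement(doc, "format", mime="application/pdf")

xml_text = ET.tostring(catalog, encoding="unicode")
# Round-trip: the record can be parsed back and queried uniformly.
parsed = ET.fromstring(xml_text)
names = [s.get("name") for s in parsed.findall("source")]
```

The uniformity is the point: a consumer iterates over `source` elements without caring whether the underlying data had structure.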
The concept of multilevel security (MLS) is commonly used in the study of data models for secure databases. But there are some limitations in the basic MLS model, such as inference channels. The availability and data integrity of the system are seriously constrained by the model's "No Read Up, No Write Down" property. In order to eliminate the covert channels, polyinstantiation and cover stories are used in the new data model. The read and write rules have been redefined to improve the agility and usability of a system based on the MLS model. Together, these methods make the improved data model more secure, agile, and usable.
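The basic "No Read Up, No Write Down" rules that the paper sets out to relax can be stated as two predicates over ordered security levels. The level names below are the conventional labels, used here for illustration:

```python
# The basic MLS ("No Read Up, No Write Down") rules as two predicates.
# Levels are encoded as ordered integers; the labels are the conventional
# ones, used here for illustration.
UNCLASSIFIED, CONFIDENTIAL, SECRET, TOP_SECRET = range(4)

def can_read(subject_level, object_level):
    """No Read Up: a subject may read only at or below its own level."""
    return subject_level >= object_level

def can_write(subject_level, object_level):
    """No Write Down: a subject may write only at or above its own level."""
    return subject_level <= object_level

# A SECRET-cleared subject cannot read TOP SECRET data...
blocked_read = not can_read(SECRET, TOP_SECRET)
# ...and cannot write into an UNCLASSIFIED object (leakage risk).
blocked_write = not can_write(SECRET, UNCLASSIFIED)
```

It is exactly the rigidity of these two predicates, blocking legitimate high-to-low workflows, that motivates the paper's polyinstantiation and redefined read/write rules.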
Hydrocarbon production from shale has attracted much attention in recent years. When applied to these prolific and hydrocarbon-rich resource plays, our understanding of the complexities of the flow mechanism (the sorption process and flow behavior in complex fracture systems, induced or natural) leaves much to be desired. In this paper, we present and discuss a novel approach to modeling and history matching of hydrocarbon production from a Marcellus shale asset in southwestern Pennsylvania using advanced data mining, pattern recognition, and machine learning technologies. In this new approach, instead of imposing our understanding of the flow mechanism, the impact of multi-stage hydraulic fractures, and the production process on the reservoir model, we allow the production history, well log, completion, and hydraulic fracturing data to guide our model and determine its behavior. The uniqueness of this technology is that it incorporates so-called "hard data" directly into the reservoir model, so that the model can be used to optimize the hydraulic fracture process. "Hard data" refers to field measurements made during the hydraulic fracturing process, such as fluid and proppant type and amount, injection pressure and rate, and proppant concentration. This novel approach contrasts with the current industry focus on the use of "soft data" (non-measured, interpretive data such as frac length, width, height, and conductivity) in reservoir models. The study focuses on a Marcellus shale asset that includes 135 wells with multiple pads, different landing targets, well lengths, and reservoir properties. The full-field history matching process was successfully completed using this data-driven approach, capturing the production behavior with acceptable accuracy for individual wells and for the entire asset.
Atmospheric CO₂ is one of the key parameters for estimating air-sea CO₂ flux. The Orbiting Carbon Observatory-2 (OCO-2) satellite has observed the column-averaged dry-air mole fraction of global atmospheric carbon dioxide (XCO₂) since 2014. In this study, OCO-2 XCO₂ products were compared with in-situ data from the Total Carbon Column Observing Network (TCCON) and the Global Monitoring Division (GMD), and with modeling data from CarbonTracker2019, over the global ocean and land. Results showed that the OCO-2 XCO₂ data are consistent with the TCCON and GMD in-situ XCO₂ data, with mean absolute biases of 0.25×10⁻⁶ and 0.67×10⁻⁶, respectively. Moreover, the OCO-2 XCO₂ data are also consistent with the CarbonTracker2019 modeling XCO₂ data, with mean absolute biases of 0.78×10⁻⁶ over ocean and 1.02×10⁻⁶ over land. The results indicate the high accuracy of the OCO-2 XCO₂ product over the global ocean, which could be applied to estimate the air-sea CO₂ flux.
This is the second of a three-part series of papers presenting the principle and architecture of the CRNM, a trajectory-oriented, carriageway-based road network data model. The first part of the series introduced the general background of building trajectory-oriented road network data models, including motivation, related works, and basic concepts. Building on that, this paper describes the CRNM in detail. First, the notion of a basic roadway entity is proposed and discussed. Second, the carriageway is selected as the basic roadway entity after comparison with other kinds of roadway, and approaches to representing other roadways with carriageways are introduced. Finally, an overall architecture of the CRNM is proposed.
This is the first of a three-part series of papers which introduces the general background of building trajectory-oriented road network data models, including motivation, related works, and basic concepts. The purpose of the series is to develop a trajectory-oriented road network data model, namely the carriageway-based road network data model (CRNM). Part 1 deals with the modeling background. Part 2 proposes the principle and architecture of the CRNM. Part 3 investigates the implementation of the CRNM in a case study. In the present paper, the challenges of managing trajectory data are discussed. Then, developing trajectory-oriented road network data models is proposed as a solution, and existing road network data models are reviewed. Basic representation approaches for a road network are introduced, as well as its constitution.
This is the final paper of a three-part series which mainly discusses the implementation issues of the CRNM. The first two papers in the series introduced the modeling background and methodology, respectively, and an overall architecture of the CRNM was proposed in the preceding paper. On the basis of those discussions, a linear reference method (LRM) for providing spatial references for the location points of a trajectory is developed. A case study is then introduced to illustrate the application of the CRNM to modeling a real-world road network. A comprehensive conclusion is given for the series of papers.
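The core of a linear reference method, turning a distance measured along a carriageway into a coordinate on its polyline, can be sketched as follows. The polyline geometry is an invented illustration, not data from the case study:

```python
# Core of a linear reference method (LRM): convert a distance measured
# along a carriageway's polyline into an (x, y) coordinate. The polyline
# below is an invented illustration, not data from the case study.
import math

def locate(polyline, measure):
    """Return the point at distance `measure` along the polyline."""
    remaining = measure
    for (x1, y1), (x2, y2) in zip(polyline, polyline[1:]):
        seg = math.hypot(x2 - x1, y2 - y1)
        if remaining <= seg:
            t = remaining / seg  # interpolate within this segment
            return (x1 + t * (x2 - x1), y1 + t * (y2 - y1))
        remaining -= seg
    return polyline[-1]  # measure beyond the end: clamp to the last vertex

carriageway = [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)]  # segment lengths 5 and 6
point = locate(carriageway, 7.0)  # 5 m along segment 1, 2 m into segment 2
```

A trajectory's location points can then be stored compactly as (carriageway id, measure) pairs and expanded to coordinates only when needed.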
Target detection is an important application in the field of hyperspectral image processing. In this paper, a spectral-spatial target detection algorithm for hyperspectral data is proposed. The spatial and spectral features are unified based on data field theory and extracted by weighted manifold embedding. The novelties of the proposed method lie in two aspects: one is the way in which the spatial and spectral features are fused into a new feature based on data field theory, and the other is that local information is introduced to describe the decision boundary and explore discriminative features for target detection. The features extracted through data field modeling and manifold embedding were then used for the target detection task. Three standard hyperspectral datasets were considered in the analysis. The effectiveness of the proposed algorithm is demonstrated by higher detection rates at lower False Alarm Rates (FARs) than those achieved by conventional hyperspectral target detectors.
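A common form of data-field potential, which may differ from the exact formulation used in the paper, sums a Gaussian-decaying "mass" contribution from each sample; the fused influence of neighbours at a point is then the summed potential. The impact factor `sigma` and the 1-D toy samples below are illustrative assumptions:

```python
# Sketch of a data-field potential: each sample contributes a Gaussian-
# decaying "mass", and the summed potential at a point fuses the influence
# of its neighbours. sigma and the 1-D toy samples are illustrative
# assumptions; the paper's exact fusion formulation is not reproduced.
import math

def potential(x, samples, sigma=1.0):
    """Data-field potential at x from (position, mass) samples."""
    return sum(
        m * math.exp(-(((x - xi) / sigma) ** 2))
        for xi, m in samples
    )

samples = [(0.0, 1.0), (1.0, 1.0), (5.0, 1.0)]  # (position, mass) pairs
# The potential is higher amid the clustered samples than near the outlier.
dense = potential(0.5, samples)
sparse = potential(5.0, samples)
```

In the spectral-spatial setting, the positions would be pixel locations (or spectral vectors) and the resulting potentials act as fusion weights before the manifold embedding step.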
Multidatabase systems are designed to achieve schema integration and data interoperation among distributed and heterogeneous database systems, but data model heterogeneity and schema heterogeneity make this a challenging task. A multidatabase common data model based on XML, named the XML-based Integration Data Model (XIDM), is first introduced; it is suitable for integrating different types of schemas. An approach to schema mapping based on the XIDM in multidatabase systems is then presented. The mappings include global mappings, dealing with horizontal and vertical partitioning between global schemas and export schemas, and local mappings, processing the transformation between export schemas and local schemas. Finally, the illustration and implementation of schema mappings in a multidatabase prototype, the Panorama system, are also discussed. The implementation results demonstrate that the XIDM is an efficient model for managing multiple heterogeneous data sources and that the schema mapping approaches based on the XIDM behave very well when integrating relational and object-oriented database systems as well as file systems.
Guangzhou is the capital and largest city (land area: 7287 km²) of Guangdong province in South China. The air quality in Guangzhou typically worsens in November due to meteorological conditions unfavorable for pollutant dispersion. During the Guangzhou Asian Games in November 2010, the Guangzhou government carried out a number of emission control measures that significantly improved the air quality. In this paper, we estimate the changes in acute health outcomes related to the air quality improvement during the 2010 Guangzhou Asian Games using a next-generation, fully-integrated assessment system for air quality and health benefits. This advanced system generates air quality data by fusing model and monitoring data instead of using monitoring data alone, which provides more reliable results: the air quality estimates retain the spatial distribution of the model results while calibrating the values with observations. The results show that the mean PM2.5 concentration in November 2010 decreased by 3.5 μg/m³ compared to that in 2009 due to the emission control measures. From the analysis, we estimate that the air quality improvement avoided 106 premature deaths, 1869 hospital admissions, and 20,026 outpatient visits. The overall cost benefit of the improved air quality is estimated to be 165 million CNY, with the avoided premature deaths contributing 90% of this figure. The research demonstrates that BenMAP-CE is capable of assessing the health and cost benefits of air pollution control for sound policy making.
The parametric temporal data model captures a real-world entity in a single tuple, which reduces query language complexity. Such a data model, however, is difficult to implement on top of conventional databases because of its unfixed attribute sizes. XML is a mature technology and can be an elegant solution to this challenge, but representing data in XML raises a question about storage efficiency. The goal of this work is to provide a straightforward answer to that question. To this end, we compare three different storage models for the parametric temporal data model and show that XML is no worse than the other approaches; furthermore, XML outperforms the other storage models under certain conditions. Our simulation results therefore provide a positive indication that the myth about XML does not hold for the parametric temporal data model.
Symbol portrayal is an important function of GIS, and sharing symbolic information across different GIS platforms is necessary for GIS applications and users. This paper discusses the necessity, possibility, and solution techniques of sharing a symbol library across different GIS platforms. The route map is designed as follows: first, set up a general data model for the symbol library; then design a standard exchange format; and finally call on GIS manufacturers to provide interchange tools between their symbol libraries and the standard exchange format. This paper analyzes the general characteristics of GIS symbol libraries, and gives a symbol library model and a draft XML schema of the symbol library exchange format.
Marine information has been increasing quickly, and traditional database technologies have disadvantages in manipulating large amounts of marine information, which relates to position in 3-D and to time. Recently, greater emphasis has been placed on GIS (geographical information system) technology to deal with marine information. GIS has shown great success in terrestrial applications over the last decades, but its use in marine fields has been far more restricted. One of the main reasons is that most GIS systems, and their data models, are designed for land applications; they cannot cope well with the nature of the marine environment and marine information, and this poses a fundamental challenge to traditional GIS and its data structures. This work designed a data model, the raster-based spatio-temporal hierarchical data model (RSHDM), for marine information systems and for knowledge discovery from spatio-temporal data, which bases itself on the nature of marine data and overcomes the shortcomings of current spatio-temporal models when they are used in this field. As an experiment, a marine fishery data warehouse (FDW) for marine fishery management was set up based on the RSHDM. The experiment proved that the RSHDM handles the data well and can easily extract the aggregations that management needs at different levels.
Funding for the ITDC study: supported by the project funded by the International Research Center of Big Data for Sustainable Development Goals [Grant Number CBAS2022GSP07], the Fundamental Research Funds for the Central Universities, the Chongqing Natural Science Foundation [Grant Number CSTB2022NSCQMSX2069], and the Ministry of Education of China [Grant Number 19JZD023].
Funding for the iron ore pellet cooling study: item sponsored by the National Natural Science Foundation of China (51174253).
文摘Cooling process of iron ore pellets in a circular cooler has great impacts on the pellet quality and systematic energy exploitation. However, multi-variables and non-visualization of this gray system is unfavorable to efficient production. Thus, the cooling process of iron ore pellets was optimized using mathematical model and data mining techniques. A mathematical model was established and validated by steady-state production data, and the results show that the calculated values coincide very well with the measured values. Based on the proposed model, effects of important process parameters on gas-pellet temperature profiles within the circular cooler were analyzed to better understand the entire cooling process. Two data mining techniques—Association Rules Induction and Clustering were also applied on the steady-state production data to obtain expertise operating rules and optimized targets. Finally, an optimized control strategy for the circular cooler was proposed and an operation guidance system was developed. The system could realize the visualization of thermal process at steady state and provide operation guidance to optimize the circular cooler.
Abstract: This paper focuses on object-oriented data modelling in computer-aided design (CAD) databases. Starting with a discussion of data modelling requirements for CAD applications, appropriate data modelling features are introduced. A feasible approach to selecting the “best” data model for an application is to analyze the data that have to be stored in the database: a data model is appropriate for a given task if the information of the application environment can be easily mapped to it. Thus, the involved data are analyzed and an object-oriented data model appropriate for CAD applications is derived. Based on a review of object-oriented techniques applied in CAD, object-oriented data modelling in CAD is addressed in detail. Finally, 3D geometrical data models and the implementation of their data model using the object-oriented method are presented.
Abstract: A uniform metadata representation is introduced for heterogeneous databases, multimedia information and other information sources. Some features of metadata are analyzed, and the limitations of existing metadata models are compared with the new one. The metadata model is described in XML, which is well suited to metadata denotation and exchange. Well-structured data, semi-structured data and unstructured external file data are all described in the metadata model. The model provides feasibility and extensibility for constructing a uniform metadata model of a data warehouse.
Abstract: The concept of multilevel security (MLS) is commonly used in the study of data models for secure databases, but the basic MLS model has some limitations, such as inference channels. The availability and data integrity of the system are seriously constrained by the 'No Read Up, No Write Down' property of the basic MLS model. In order to eliminate covert channels, polyinstantiation and cover stories are used in the new data model. The read and write rules have been redefined to improve the agility and usability of a system based on the MLS model. Together, these methods make the improved data model more secure, agile and usable.
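For readers unfamiliar with the 'No Read Up, No Write Down' property that the abstract says constrains the basic model, a minimal sketch of those two access rules might look as follows; the level names and their ordering are illustrative assumptions, and the paper's redefined rules would relax or extend these checks.

```python
# Illustrative linear ordering of security levels (an assumption for this sketch).
LEVELS = {"unclassified": 0, "confidential": 1, "secret": 2, "top_secret": 3}

def can_read(subject_level, object_level):
    """Simple security property: a subject may not read data above its level."""
    return LEVELS[subject_level] >= LEVELS[object_level]

def can_write(subject_level, object_level):
    """Star property: a subject may not write data below its level."""
    return LEVELS[subject_level] <= LEVELS[object_level]
```

The usability cost is visible even in the toy version: a "secret" subject cannot record anything into a "confidential" table, which is exactly the rigidity the improved model targets.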
Funding: RPSEA and the U.S. Department of Energy partially funded this study
Abstract: Hydrocarbon production from shale has attracted much attention in recent years. When applied to these prolific, hydrocarbon-rich resource plays, our understanding of the complexities of the flow mechanism (sorption processes and flow behavior in complex fracture systems, whether induced or natural) leaves much to be desired. In this paper, we present and discuss a novel approach to modeling and history matching of hydrocarbon production from a Marcellus shale asset in southwestern Pennsylvania using advanced data mining, pattern recognition and machine learning technologies. In this new approach, instead of imposing our understanding of the flow mechanism, the impact of multi-stage hydraulic fractures, and the production process on the reservoir model, we allow the production history, well log, completion and hydraulic fracturing data to guide our model and determine its behavior. The uniqueness of this technology is that it incorporates so-called "hard data" directly into the reservoir model, so that the model can be used to optimize the hydraulic fracturing process. The "hard data" refers to field measurements taken during the hydraulic fracturing process, such as fluid and proppant type and amount, injection pressure and rate, and proppant concentration. This novel approach contrasts with the current industry focus on the use of "soft data" (non-measured, interpretive data such as fracture length, width, height and conductivity) in reservoir models. The study focuses on a Marcellus shale asset that includes 135 wells with multiple pads, different landing targets, well lengths and reservoir properties. The full-field history matching process was successfully completed using this data-driven approach, capturing the production behavior with acceptable accuracy for individual wells and for the entire asset.
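A toy illustration of the data-driven idea: letting measured "hard data" determine model behavior reduces, in the simplest imaginable case, to fitting a response from field measurements. The sketch below fits a one-feature least-squares line (say, cumulative production versus proppant mass). The paper's actual approach uses far richer pattern-recognition models over many hard-data inputs, so this is only a schematic stand-in with invented numbers.

```python
def fit_linear(xs, ys):
    """Ordinary least squares for y = a*x + b with a single hard-data feature."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)            # variance term
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))  # covariance term
    a = sxy / sxx
    b = my - a * mx
    return a, b

def predict(model, x):
    """Predicted response (e.g. cumulative production) for a new hard-data value."""
    a, b = model
    return a * x + b
```

Once such a data-driven proxy reproduces history, it can be queried with candidate completion designs to guide fracture optimization, which is the role the full reservoir model plays in the paper.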
Funding: The National Key Research and Development Programme of China under contract No. 2017YFA0603004; the Fund of the Southern Marine Science and Engineering Guangdong Laboratory (Zhanjiang) (Zhanjiang Bay Laboratory) under contract No. ZJW-2019-08; the National Natural Science Foundation of China under contract Nos 41825014, 41676172 and 41676170; the Global Change and Air-Sea Interaction Project of China under contract Nos GASI-02-SCS-YGST2-01, GASI-02-PAC-YGST2-01 and GASI-02-IND-YGST2-01.
Abstract: Atmospheric CO₂ is one of the key parameters for estimating air-sea CO₂ flux. The Orbiting Carbon Observatory-2 (OCO-2) satellite has observed the column-averaged dry-air mole fraction of global atmospheric carbon dioxide (XCO₂) since 2014. In this study, the OCO-2 XCO₂ products were compared with in-situ data from the Total Carbon Column Observing Network (TCCON) and the Global Monitoring Division (GMD), and with modeling data from CarbonTracker2019, over the global ocean and land. Results showed that the OCO-2 XCO₂ data are consistent with the TCCON and GMD in-situ XCO₂ data, with mean absolute biases of 0.25×10⁻⁶ and 0.67×10⁻⁶, respectively. Moreover, the OCO-2 XCO₂ data are also consistent with the CarbonTracker2019 modeling XCO₂ data, with mean absolute biases of 0.78×10⁻⁶ over ocean and 1.02×10⁻⁶ over land. The results indicate the high accuracy of the OCO-2 XCO₂ product over the global ocean, which could be applied to estimate the air-sea CO₂ flux.
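The validation statistic quoted above, the mean absolute bias between co-located retrievals, can be computed as in this small sketch; the sample values are invented, not the study's data.

```python
def mean_absolute_bias(satellite, reference):
    """Mean absolute difference between co-located XCO2 values (same units, e.g. ppm)."""
    pairs = list(zip(satellite, reference))
    return sum(abs(s - r) for s, r in pairs) / len(pairs)
```

Applied per validation site (TCCON or GMD station), this yields the kind of 0.25 to 1.02 ppm figures reported in the abstract.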
Abstract: This is the second of a three-part series of papers which presents the principle and architecture of the CRNM, a trajectory-oriented, carriageway-based road network data model. The first part of the series introduced the general background of building trajectory-oriented road network data models, including motivation, related works, and basic concepts. Building on it, this paper describes the CRNM in detail. First, the notion of a basic roadway entity is proposed and discussed. Second, the carriageway is selected as the basic roadway entity after comparison with other kinds of roadways, and approaches to representing other roadways with carriageways are introduced. Finally, an overall architecture of the CRNM is proposed.
Abstract: This is the first of a three-part series of papers which introduces a general background of building trajectory-oriented road network data models, including motivation, related works, and basic concepts. The purpose of the series is to develop a trajectory-oriented road network data model, namely the carriageway-based road network data model (CRNM). Part 1 deals with the modeling background. Part 2 proposes the principle and architecture of the CRNM. Part 3 investigates the implementation of the CRNM in a case study. In the present paper, the challenges of managing trajectory data are discussed. Then, developing trajectory-oriented road network data models is proposed as a solution, and existing road network data models are reviewed. Basic representation approaches of a road network are introduced, as well as its constitution.
Abstract: This is the final of a three-part series of papers, and it mainly discusses the implementation issues of the CRNM. The first two papers in the series introduced the modeling background and methodology, respectively, and an overall architecture of the CRNM was proposed in the second paper. On the basis of the above discussions, a linear reference method (LRM) for providing spatial references for the location points of a trajectory is developed. A case study is introduced to illustrate the application of the CRNM to modeling a real-world road network. A comprehensive conclusion is given for the series of papers.
Abstract: Target detection is an important application in the field of hyperspectral image processing. In this paper, a spectral-spatial target detection algorithm for hyperspectral data is proposed. The spatial and spectral features are unified based on data field theory and extracted by weighted manifold embedding. The novelty of the proposed method lies in two aspects: the way in which the spatial and spectral features are fused into a new feature based on data field theory, and the introduction of local information to describe the decision boundary and explore discriminative features for target detection. The features extracted by data field modeling and manifold embedding were used for the target detection task. Three standard hyperspectral datasets were considered in the analysis. The effectiveness of the proposed algorithm is demonstrated by higher detection rates at lower False Alarm Rates (FARs) than those achieved by conventional hyperspectral target detectors.
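To make the data-field-based fusion idea concrete, the following sketch computes a Gaussian-decay data-field potential from neighbor distances and a weighted concatenation of spectral and spatial feature vectors. The functional forms and the weight `alpha` are assumptions for illustration, not the paper's exact formulation.

```python
import math

def data_field_potential(neighbor_dists, sigma=1.0):
    """Data-field potential at a pixel: Gaussian-decayed influence summed over
    the distances to its neighbors (closer neighbors contribute more)."""
    return sum(math.exp(-(d ** 2) / (2 * sigma ** 2)) for d in neighbor_dists)

def fuse_features(spectral, spatial, alpha=0.7):
    """Weighted concatenation of spectral and spatial feature vectors into one
    joint feature; alpha balances the two sources (an illustrative choice)."""
    return [alpha * s for s in spectral] + [(1 - alpha) * t for t in spatial]
```

In the paper's pipeline, such fused features would then pass through manifold embedding before the detector compares them against the target signature.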
Abstract: Multidatabase systems are designed to achieve schema integration and data interoperation among distributed and heterogeneous database systems, but data model heterogeneity and schema heterogeneity make this a challenging task. A multidatabase common data model based on XML, named the XML-based Integration Data Model (XIDM), is first introduced; it is suitable for integrating different types of schemas. Then an approach to schema mapping based on the XIDM in multidatabase systems is presented. The mappings include global mappings, dealing with horizontal and vertical partitioning between global schemas and export schemas, and local mappings, processing the transformation between export schemas and local schemas. Finally, the illustration and implementation of schema mappings in a multidatabase prototype, the Panorama system, are discussed. The implementation results demonstrate that the XIDM is an efficient model for managing multiple heterogeneous data sources, and that the schema mapping approaches based on the XIDM behave very well when integrating relational and object-oriented database systems as well as file systems.
基金provided by the US Environmental Protection Agency(No.5-312-0212979-51786L)the Guangzhou EnvironmentalProtection Bureau(No.x2hj B2150020)+3 种基金the project of an integrated modeling and filed observational verification on the deposition of typical industrial point-source mercury emissions in the Pearl River Deltsupported by the funding of the Guangdong Provincial Key Laboratory of Atmospheric Environment and Pollution Control(No.2011A060901011)the project of Atmospheric Haze Collaboration Control Technology Design from the Chinese Academy of Sciences(No.XDB05030400)the National Environmental Protection Public Welfare Industry Targeted Research Foundation of China(No.201409019)
Abstract: Guangzhou is the capital and largest city (land area: 7287 km²) of Guangdong province in South China. The air quality in Guangzhou typically worsens in November due to meteorological conditions unfavorable for pollutant dispersion. During the Guangzhou Asian Games in November 2010, the Guangzhou government carried out a number of emission control measures that significantly improved the air quality. In this paper, we estimate the acute health outcome changes related to the air quality improvement during the 2010 Guangzhou Asian Games using a next-generation, fully integrated assessment system for air quality and health benefits. This advanced system generates air quality data by fusing model and monitoring data instead of using monitoring data alone, which provides more reliable results: the air quality estimates retain the spatial distribution of the model results while calibrating the values with observations. The results show that the mean PM2.5 concentration in November 2010 decreased by 3.5 μg/m³ compared to that in 2009 due to the emission control measures. From the analysis, we estimate that the air quality improvement avoided 106 premature deaths, 1869 hospital admissions, and 20,026 outpatient visits. The overall cost benefit of the improved air quality is estimated at 165 million CNY, with avoided premature deaths contributing 90% of this figure. The research demonstrates that BenMAP-CE is capable of assessing the health and cost benefits of air pollution control for sound policy making.
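Health impact tools in the BenMAP family typically evaluate a log-linear concentration-response function to turn a concentration change into avoided cases; a hedged sketch of that calculation is shown below. The coefficient `beta`, the baseline incidence rate and the population are placeholder values, not the study's inputs.

```python
import math

def avoided_cases(beta, delta_c, baseline_rate, population):
    """Log-linear concentration-response function:
    avoided cases = y0 * pop * (1 - exp(-beta * delta_c)),
    where delta_c is the pollutant reduction (e.g. ug/m3 of PM2.5)."""
    return baseline_rate * population * (1.0 - math.exp(-beta * delta_c))
```

With placeholder inputs (beta = 0.001 per μg/m³, baseline mortality rate 0.006, one million exposed people) a 3.5 μg/m³ PM2.5 reduction yields on the order of twenty avoided cases; summing such terms over grid cells and health endpoints gives totals like those in the abstract.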
基金supported by the National Research Foundation in Korea through contract N-12-NM-IR05
Abstract: The parametric temporal data model captures a real-world entity in a single tuple, which reduces query language complexity. Such a data model, however, is difficult to implement on top of conventional databases because of its unfixed attribute sizes. XML is a mature technology and can be an elegant solution to this challenge, but representing data in XML raises a question about storage efficiency. The goal of this work is to provide a straightforward answer to that question. To this end, we compare three different storage models for the parametric temporal data model and show that XML is no worse than the other approaches; furthermore, XML outperforms the other storage models under certain conditions. Our simulation results therefore provide a positive indication that the myth about XML's inefficiency does not hold for the parametric temporal data model.
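A parametric temporal tuple holds an entity's entire history in one record, which maps naturally onto a nested XML element of variable size. The sketch below encodes an illustrative employee salary history with Python's standard `xml.etree.ElementTree`; the element and attribute names are assumptions for illustration, not the paper's storage schema.

```python
import xml.etree.ElementTree as ET

def salary_history_to_xml(name, periods):
    """Encode one parametric temporal tuple (an employee's full salary history)
    as a single XML element. periods: list of (start, end, salary) triples."""
    emp = ET.Element("employee", name=name)
    for start, end, salary in periods:
        # Each validity interval becomes one child element; the tuple grows
        # with the history, which is exactly the unfixed-size property.
        ET.SubElement(emp, "salary", start=start, end=end).text = str(salary)
    return ET.tostring(emp, encoding="unicode")
```

A fixed-schema relational storage would instead need one row per interval (or padded columns), which is the kind of alternative the paper's storage comparison weighs against XML.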
Funding: Supported by the Spatial Information Engineering Key Laboratory Fund of the Chinese National Surveying and Mapping Bureau (No. 200722)
Abstract: Symbol portrayal is an important function of GIS, and sharing symbolic information across different GIS platforms is necessary for GIS applications and users. This paper discusses the necessity, possibility and solution techniques of sharing a symbol library among different GIS platforms. The roadmap is as follows: first, set up a general data model for the symbol library; then design a standard exchange format; and finally call on GIS manufacturers to provide interchange tools between their symbol libraries and the standard exchange format. This paper analyzes the general characteristics of GIS symbol libraries and gives a symbol library model and a draft XML schema of the symbol library exchange format.
基金supported by the National Key Basic Research and Development Program of China under contract No.2006CB701305the National Natural Science Foundation of China under coutract No.40571129the National High-Technology Program of China under contract Nos 2002AA639400,2003AA604040 and 2003AA637030.
Abstract: Marine information has been increasing quickly, and traditional database technologies have disadvantages in manipulating large amounts of marine information that relate position in 3-D with time. Recently, greater emphasis has been placed on GIS (geographical information systems) to deal with marine information. GIS has shown great success in terrestrial applications over the last decades, but its use in marine fields has been far more restricted. One of the main reasons is that most GIS systems, or their data models, are designed for land applications; they cannot cope well with the nature of the marine environment and marine information, and this poses a fundamental challenge to traditional GIS and its data structures. This work designed a data model, the raster-based spatio-temporal hierarchical data model (RSHDM), for marine information systems and for knowledge discovery from spatio-temporal data. The model bases itself on the nature of marine data and overcomes the shortcomings of current spatio-temporal models when they are used in this field. As an experiment, a marine fishery data warehouse (FDW) for marine fishery management was set up based on the RSHDM. The experiment proved that the RSHDM handles the data well and can easily extract the aggregations that management needs at different levels.
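The hierarchical aggregation that the RSHDM is said to support can be pictured as rolling a raster up one pyramid level by summing fixed-size blocks of cells, so that coarser management levels query pre-aggregated grids. The sketch below is a generic illustration of that operation, not the model's actual implementation; it assumes the raster dimensions are divisible by the block factor.

```python
def aggregate(raster, factor):
    """Roll a 2-D raster up one pyramid level: sum each factor x factor block
    into one coarser cell (e.g. fishery catch totals per larger grid cell)."""
    rows, cols = len(raster), len(raster[0])
    out = [[0.0] * (cols // factor) for _ in range(rows // factor)]
    for i in range(rows):
        for j in range(cols):
            out[i // factor][j // factor] += raster[i][j]
    return out
```

Repeating the operation yields the hierarchy of levels: a 4x4 grid becomes 2x2, then 1x1, with each level answering aggregation queries at its own granularity.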