Clustering, in data mining, is a useful technique for discovering interesting data distributions and patterns in the underlying data, and has many application fields, such as statistical data analysis, pattern recognition, and image processing. We combine sampling techniques with the DBSCAN algorithm to cluster large spatial databases, and develop two sampling-based DBSCAN (SDBSCAN) algorithms. One algorithm introduces sampling inside DBSCAN, and the other applies a sampling procedure outside DBSCAN. Experimental results demonstrate that our algorithms are effective and efficient in clustering large-scale spatial databases.
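The "sampling outside DBSCAN" idea can be illustrated with a minimal sketch. It is not the SDBSCAN algorithm from the abstract, whose details are not given here: it assumes that a uniform random sample is clustered with plain DBSCAN and that each unsampled point is then given the label of its nearest sampled core point if that point lies within `eps`. The names `sampled_dbscan` and `sample_fraction` are hypothetical, and in practice `min_pts` would need to be scaled down to match the sample density.

```python
import math
import random


def region_query(points, idx, eps):
    """Indices of all points within eps of points[idx] (including itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[idx], q) <= eps]


def dbscan(points, eps, min_pts):
    """Plain DBSCAN. Returns (labels, core_flags); label -1 means noise."""
    labels = [None] * len(points)
    core = [False] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        core[i] = True
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                core[j] = True
                queue.extend(j_neighbors)
    return labels, core


def sampled_dbscan(points, eps, min_pts, sample_fraction, seed=0):
    """Cluster a random sample, then label the rest by nearest sampled core point."""
    rng = random.Random(seed)
    n_sample = max(1, int(len(points) * sample_fraction))
    sample_idx = rng.sample(range(len(points)), n_sample)
    sample = [points[i] for i in sample_idx]
    sample_labels, core = dbscan(sample, eps, min_pts)

    labels = [-1] * len(points)
    for k, i in enumerate(sample_idx):
        labels[i] = sample_labels[k]

    core_points = [(sample[k], sample_labels[k]) for k in range(n_sample) if core[k]]
    sampled = set(sample_idx)
    for i, p in enumerate(points):
        if i in sampled:
            continue
        best = min(core_points, key=lambda c: math.dist(p, c[0]), default=None)
        if best is not None and math.dist(p, best[0]) <= eps:
            labels[i] = best[1]
    return labels
```

Only the sample goes through the quadratic neighborhood search, so the expensive step shrinks with `sample_fraction`; the price is that sparse clusters may be missed if too few of their points are sampled.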
In this paper, the constrained K closest pairs query is introduced, which retrieves the K closest pairs satisfying a given spatial constraint from two datasets. For datasets indexed by R-trees in spatial databases, three algorithms are presented for answering this kind of query. Among them, the two-phase Range+Join and Join+Range algorithms change the execution order of the range and closest pairs queries, and the constrained heap-based algorithm uses extended distance functions to prune the search space and minimize the pruning distance. Experimental results show that the constrained heap-based algorithm has better applicability and performance than the two-phase algorithms.
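The Range+Join ordering can be sketched without an index. The example below is a brute-force illustration, not the R-tree algorithm of the abstract, and it assumes that the spatial constraint is an axis-aligned rectangle that both points of a pair must fall inside. The names `in_rect` and `range_then_join` are hypothetical.

```python
import heapq
import math
from itertools import product


def in_rect(p, rect):
    """True if point p = (x, y) lies inside rect = (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = rect
    return xmin <= p[0] <= xmax and ymin <= p[1] <= ymax


def range_then_join(P, Q, rect, k):
    """Range phase first, then the K closest pairs among the survivors."""
    P_in = [p for p in P if in_rect(p, rect)]
    Q_in = [q for q in Q if in_rect(q, rect)]
    pairs = ((math.dist(p, q), p, q) for p, q in product(P_in, Q_in))
    # nsmallest keeps a bounded heap of size k instead of sorting all pairs
    return heapq.nsmallest(k, pairs)
```

Join+Range would instead enumerate close pairs first and discard those that violate the constraint, which tends to pay off only when the constraint region covers most of the data.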
This is a period of information explosion. In spatial information science especially, information can be acquired in many ways, such as artificial satellites, aircraft, laser, and digital photogrammetry. Spatial data sources are usually distributed and heterogeneous. A federated database is the best solution for sharing spatial databases and making them interoperate. In this paper, the concepts of federated database and interoperability are introduced. Three heterogeneous kinds of spatial data (vector, image, and DEM) are used to create an integrated database. A data model for federated spatial databases is given.
Spatial objects have two types of attributes, geometrical and non-geometrical, which belong to two different attribute domains (the geometrical and non-geometrical domains). Although geometrically scattered in the geometrical domain, spatial objects may be similar to each other in the non-geometrical domain. Most existing clustering algorithms group spatial datasets into compact regions in the geometrical domain without considering the non-geometrical domain. However, many application scenarios require clustering results in which a cluster has not only high proximity in the geometrical domain but also high similarity in the non-geometrical domain. This means constraints are imposed on the clustering goal from both domains simultaneously. Such a clustering problem is called dual clustering. As distributed clustering applications become more and more popular, it is necessary to tackle the dual clustering problem in distributed databases. The DCAD algorithm is proposed to solve this problem. DCAD consists of two levels of clustering: local clustering and global clustering. First, clustering is conducted at each local site with a local clustering algorithm, and the features of local clusters are extracted. Second, local features from each site are sent to a central site, where global clustering is obtained based on those features. Experiments on both artificial and real spatial datasets show that DCAD is effective and efficient.
The huge amount of information stored in databases owned by corporations (e.g., retail, financial, telecom) has spurred tremendous interest in knowledge discovery and data mining. Clustering, in data mining, is a useful technique for discovering interesting data distributions and patterns in the underlying data, and has many application fields, such as statistical data analysis, pattern recognition, image processing, and other business applications. Although researchers have worked on clustering algorithms for decades and many algorithms have been developed, there is still no efficient algorithm for clustering very large databases and high-dimensional data. As an outstanding representative of clustering algorithms, DBSCAN performs well in spatial data clustering. However, for large spatial databases, DBSCAN requires a large volume of memory and can incur substantial I/O costs because it operates directly on the entire database. In this paper, several approaches are proposed to scale DBSCAN to large spatial databases. First, a fast DBSCAN algorithm is developed, which considerably speeds up the original DBSCAN algorithm. Then a sampling-based DBSCAN algorithm, a partitioning-based DBSCAN algorithm, and a parallel DBSCAN algorithm are introduced in turn. Following that, a synthetic algorithm that builds on the algorithms above is given. Finally, experimental results are given to demonstrate the effectiveness and efficiency of these algorithms.
Spatial databases store objects with their locations and certain types of attached items. A variety of modern applications have been developed by leveraging the locations and items in spatial objects, such as searching for points of interest, hot topics, or users' attitudes in specified spatial regions. In many scenarios, the high- and low-frequency items in a spatial region are worth noticing, because they represent the majority's interest or eccentric users' opinions. However, existing works have yet to identify such items in an interactive manner, despite the significance of the endeavor in decision-making systems. This study recognizes a novel type of analytical query, called the top/bottom-k fraction query, to discover such items in spatial databases. To achieve fast query response, we propose a multilayered data summary that is spread across main memory and external memory. A memory-based estimation method for top/bottom-k fraction queries is proposed. To maximize the use of main memory space, we design a data summary tuning method that dynamically allocates memory space among different spatial partitions. The proposed approach is evaluated with real-life and synthetic datasets in terms of estimation accuracy. Evaluation results demonstrate the effectiveness of the proposed data summary and the corresponding estimation and tuning algorithms.
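To make the query concrete, the sketch below computes an exact answer by scanning every object, which is the baseline that a data summary would approximate. It assumes that an item's fraction is its occurrence count divided by the total number of item occurrences inside a rectangular query region; the abstract does not spell out this definition, and the name `fraction_query` is hypothetical.

```python
from collections import Counter


def fraction_query(objects, rect, k):
    """Exact top-k and bottom-k item fractions inside rect.

    objects: iterable of ((x, y), [items]); rect: (xmin, ymin, xmax, ymax).
    Returns (top_k, bottom_k), each a list of (item, fraction).
    """
    xmin, ymin, xmax, ymax = rect
    counts = Counter()
    for (x, y), items in objects:
        if xmin <= x <= xmax and ymin <= y <= ymax:
            counts.update(items)
    total = sum(counts.values())
    if total == 0:
        return [], []
    # Most frequent first; ties are broken by item name for determinism
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    fractions = [(item, c / total) for item, c in ranked]
    return fractions[:k], fractions[-k:][::-1]
```

A full scan like this costs time linear in the number of objects per query, which is the cost an in-memory summary is meant to avoid.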
Urban environmental quality research is crucial, as cities become competitive centers concentrating human talent, industrial activity, and financial resources, contributing significantly to national economies. Municipal and government priorities include retaining residents, preventing skilled worker outflow, and meeting the evolving needs of urban populations. The study presents the development and application of a scenario-based spatial analysis tool for assessing urban environmental quality at a detailed spatial scale within the city of Novosibirsk. Using advanced geoinformatics, GIS techniques, and an expert knowledge base, the tool integrates diverse thematic data layers with user-defined scenarios to compute and visualize the Scenario-based Urban Environment Quality Index across 87,905 standardized unit areas. The methodology incorporates comprehensive criteria aligned with existing urban planning frameworks and includes demographic targeting to address the city's heterogeneous population. Validation against expert evaluations demonstrates high accuracy and consistency, while dynamic modeling capabilities facilitate monitoring the effects of planned urban development initiatives. This approach bridges a critical gap in urban planning by providing granular, data-driven insights that reflect residents' real needs and spatial inequalities. The tool greatly benefits municipal authorities by enabling evidence-based prioritization of interventions, fostering inclusive and sustainable urban growth, and enhancing transparency and participatory governance. Its implementation as a no-code/low-code QGIS plugin ensures wide accessibility and practical application in strategic urban development, marking a significant advancement in urban environment quality assessment science and practice.
The definition and theory of the autocorrelation decision tree (ADT) are introduced. In spatial data mining, spatial parallel queries are very expensive operations. A new parallel algorithm based on the autocorrelation decision tree is presented. The new method reduces CPU and I/O time and improves the query efficiency of spatial data. It also offers better control and optimization of dynamic load balancing. An experimental performance comparison shows that the improved algorithm achieves an optimal speedup with the same number of processors, and that nodes are accessed more completely. An individual implementation of intelligent information retrieval for spatial data mining is also presented.
Direction is a common spatial concept that is used in our daily life. It is frequently used as a selection condition in spatial queries. As a result, it is important for spatial databases to provide a mechanism for modeling and processing direction queries and reasoning. Based on the direction relation matrix, an inverted direction relation matrix and the concept of direction predominance are proposed to improve the detection of direction relations between objects. The direction predicates of spatial systems are also extended. These techniques can improve the accuracy of direction queries and reasoning. Experiments show excellent efficiency and performance for direction queries.
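As background for the direction relation matrix mentioned above, the following sketch builds the basic 3x3 matrix only; the inverted matrix and direction predominance proposed in the abstract are not reproduced. It assumes both objects are approximated by axis-aligned bounding rectangles `(xmin, ymin, xmax, ymax)`, and the function names are hypothetical. The reference rectangle's edges split the plane into nine tiles (NW, N, NE / W, same, E / SW, S, SE), and each cell records whether the target overlaps that tile.

```python
def overlaps(lo1, hi1, lo2, hi2):
    """True if the intervals (lo1, hi1) and (lo2, hi2) share interior points."""
    return max(lo1, lo2) < min(hi1, hi2)


def direction_matrix(ref, target):
    """3x3 boolean matrix of the tiles around ref that target overlaps.

    Rows run north to south, columns west to east.
    """
    rx0, ry0, rx1, ry1 = ref
    tx0, ty0, tx1, ty1 = target
    inf = float("inf")
    cols = [(-inf, rx0), (rx0, rx1), (rx1, inf)]  # west, middle, east
    rows = [(ry1, inf), (ry0, ry1), (-inf, ry0)]  # north, middle, south
    return [
        [
            overlaps(tx0, tx1, cx0, cx1) and overlaps(ty0, ty1, cy0, cy1)
            for cx0, cx1 in cols
        ]
        for cy0, cy1 in rows
    ]
```

A target lying strictly to the north-east of the reference sets only the top-right cell, while a target straddling the reference's corner sets several cells at once, which is the ambiguity that refinements such as predominance aim to resolve.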
Mobile devices with global positioning capabilities allow users to retrieve points of interest (POI) in their proximity. Due to the nature of spatial queries, a location-based service (LBS) needs the user's position in order to process requests. On the other hand, revealing exact user locations to an LBS may pinpoint users' identities and breach their privacy. Spatial K-anonymity (SKA) exploits the concept of K-anonymity to protect the identity of users from location-based attacks. However, existing reciprocal methods rely on a specialized data structure. In contrast, a reciprocal algorithm is proposed that uses an existing spatial index on the user locations. An adjusted median splits algorithm is also provided. Experimental results on effectiveness (i.e., the size of the anonymizing spatial region) and efficiency (i.e., construction cost) verify that the proposed methods perform better. Moreover, because they employ general-purpose spatial indices, the proposed methods support conventional spatial queries as well.
With the deployment of modern infrastructure for public transportation, several studies have analyzed movement patterns of people using smart card data and have characterized different areas. In this paper, we propose the "movement purpose hypothesis" that each movement arises from two causes: where the person is and what the person wants to do at a given moment. We formulate this hypothesis as a synthesis model in which two network graphs generate a movement network graph. We then develop two novel embedding models to assess the hypothesis, and demonstrate that the models obtain a vector representation of a geospatial area using movement patterns of people from large-scale smart card data. We conducted an experiment using smart card data for a large network of railroads in the Kansai region of Japan. We obtained a vector representation of each railroad station and each purpose using the developed embedding models. Results show that network embedding methods are suitable for large-scale movement data, and that the developed models perform better than existing embedding methods in multi-label classification of train stations on the purpose-of-use data set. Our proposed models can contribute to the prediction of people flows by discovering underlying representations of geospatial areas from mobility data.
A novel Hilbert curve is introduced for parallel spatial data partitioning, taking into account the large volume of spatial information and the variable length of vector data items. Based on the improved Hilbert curve, an algorithm can be designed to achieve almost-uniform spatial data partitioning among multiple disks in parallel spatial databases. Data imbalance can thus be largely avoided, and search and query efficiency can be enhanced.
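The general approach can be sketched with the standard Hilbert curve; the improved curve described in the abstract is not specified here and is not reproduced. The example assumes each item is a tuple `(x, y, size_bytes)` with integer cell coordinates on a `grid` x `grid` lattice (`grid` a power of two) and positive sizes, and it balances disks by cumulative byte size rather than item count to account for variable-length items. The names `hilbert_index` and `partition` are hypothetical.

```python
def hilbert_index(n, x, y):
    """Position of cell (x, y) along the Hilbert curve on an n x n grid.

    n must be a power of two; this is the standard bitwise conversion.
    """
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has the right orientation
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


def partition(items, n_disks, grid=16):
    """Split items into n_disks contiguous Hilbert-order runs of similar byte size."""
    ordered = sorted(items, key=lambda it: hilbert_index(grid, it[0], it[1]))
    total = sum(it[2] for it in ordered)
    disks = [[] for _ in range(n_disks)]
    running = 0
    for it in ordered:
        # Pick the disk whose byte range contains the midpoint of this item
        disk = min(n_disks - 1, int((running + it[2] / 2) * n_disks / total))
        disks[disk].append(it)
        running += it[2]
    return disks
```

Because the Hilbert order keeps spatially close cells close in the sequence, each disk receives a spatially compact run; assigning items round-robin along the curve would be an alternative when nearby items should be read from different disks in parallel.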
This paper presents a cadastral spatial data storage structure based on a relational database, together with the method and procedure for realizing it. The paper consists of three parts. In the first part, four existing problems in current cadastral management systems are discussed: 1) the security of cadastral spatial data is difficult to ensure; 2) cadastral data are difficult to verify and their integrity is difficult to maintain; 3) cadastral data are difficult to transmit and share; 4) the efficiency of data access is low. In the second part, the feasibility of using a relational database to store spatial data is analyzed and a new cadastral spatial data storage structure is presented. The related table structures and field descriptions are given, and the merits and demerits of this storage structure are analyzed in detail. In the last part, the detailed methods for implementing the new storage structure are given through a real example. Some key techniques involved in the new storage structure are also discussed: 1) the application of database transactions, 2) the application of database triggers, and 3) the application of secure database recovery.
The spatial database management system for China's geological survey extents is a public service system. Its aim is to help the government and the general public use the spatial database conveniently for tasks such as querying, indexing, mapping, and product output. The management system was developed with the MAPGIS6.x SDK and Visual C++, taking into account the contents and structure of the spatial database and the requirements of users. This paper introduces the software structure, the data flow chart, and some key techniques of the software development.
As the informatization of the Resources & Environment Remote Sensing geological survey deepens, several potential problems and deficiencies have emerged: (1) the lack of a uniformly planned running environment; (2) inconsistent methods of data integration; and (3) the drawbacks of differing ways of carrying out data integration. This paper addresses these problems through overall planning and design, and constructs a unified running environment.
To meet the requirements of efficient management and web publishing of marine remote sensing data, a spatial database engine named MRSSDE was independently designed. The logical model, physical model, and optimization method of MRSSDE are discussed in detail. Compared with ArcSDE, the leading spatial database engine product, MRSSDE proved to be more effective.
Throughout the 20th century in China, and especially in its last 50 years, geologists and explorers gathered a great deal of geological information on geological mapping, geophysics, geochemistry, mineral exploration, remote sensing, environmental geology, hydrogeology, engineering geology, oceanic geology, and other fields. All this information has been accumulated and can serve as a decision-making foundation for future geological survey planning. The spatial database of geological survey extents has been established using computer technology. The database covers all kinds of exploration sections and holds about 160 000 records. This paper introduces the data construction, contents, and application system of this database, and aims to let readers know what kinds of geological surveys have been completed, when the explorations were carried out, and how and where this information can be obtained.
In the study of sequence stratigraphy and litho-paleogeography, quantitative analysis, precise calculation, and detailed comparison of large amounts of geological data, such as field profiles, logging records, and seismic curves from different areas, are basic requirements. To obtain a more reliable and precise result, this paper presents a novel method that combines spatial database analysis with single-factor mapping technology to establish the sequence stratigraphical succession and to map the Ordovician litho-paleogeography of the Ordos Basin, one of the largest oil- and gas-bearing basins in the North China Platform. With this method, all of the related basic geological data can be quantitatively analyzed and effectively managed. Various attributes of the basic stratigraphic units and their characters, such as sequence thickness, penecontemporaneous dolostone content, shallow water parget content, and terrigenous material content, can be fully utilized statistically in facies analysis and in mapping. Based on this analysis, single-factor isopach maps are produced quantitatively for each of the Ordovician sequences in the basin, and multiple factors are finally synthesized to reconstruct the litho-paleogeography for each of the sequence intervals. The study shows that the proposed method is quite effective and has a much higher resolution in recognizing litho-paleogeographic subunits than traditional approaches. For example, in one of the Middle Ordovician sequence intervals (SQ19 in the Lower Majiagou Formation) of the Ordos Basin, the authors have successfully developed a mathematical formula to delineate the distribution of various facies units, such as old lands, submarine uplifts, supratidal zones, intertidal zones, and subtidal zones.
Spatial information search in a traditional geo-archives catalog database (TGCD) is text-based, and retrieval results can come only from the text fields of the relational database. The information to be queried must be entered into the relational database as text in advance; otherwise, visitors get no result. So…
Current geographical information systems (GIS) usually do not maintain the semantic and user-defined constraints among the three kinds of consistency constraints (the third being the topology constraint). To address this problem, this research focuses on building an efficient spatial data management system using two constraint violation detection methods. An algorithm for constraint violation detection has been derived to maintain an error-free, up-to-date spatial database. Results indicate that the developed constraint violation detection (CVD) system is more efficient than conventional systems.
Funding (sampling-based DBSCAN abstract): Supported by the Open Research Fund Program of LIESMARS (WKL(00)0302).
Funding (constrained K closest pairs abstract): Supported by the National Natural Science Foundation of China (60073045).
Funding (federated spatial database abstract): Supported by the National Natural Science Foundation under "Outstanding Young Researchers" (49525101).
Funding (DCAD dual clustering abstract): Funded by the National 973 Program of China (No. 2003CB415205), the National Natural Science Foundation of China (No. 40523005, No. 60573183, No. 60373019), and the Open Research Fund Program of LIESMARS (No. WKL(04)0303).
Funding (scaling DBSCAN abstract): This work was supported by the National Natural Science Foundation of China (No. 69743001), the National Doctoral Subject Fou…
Funding (top/bottom-k fraction query abstract): Partly supported by the National Natural Science Foundation of China (Nos. 61602129, 61872106, and 61772157).
Funding (Novosibirsk urban environment quality abstract): Funded by the Ministry of Science and Higher Education of Russia, R&D project number FEFS-2026-0003.
Abstract: The definition and theory of the autocorrelation decision tree (ADT) are introduced. In spatial data mining, spatial parallel queries are very expensive operations. A new parallel algorithm based on the autocorrelation decision tree is presented. The new method reduces CPU and I/O time and improves the query efficiency of spatial data. It also provides better control and optimization of dynamic load balancing. An experimental performance comparison shows that the improved algorithm can obtain an optimal speedup with the same number of processors, and that nodes are accessed more completely. A concrete implementation of intelligent information retrieval for spatial data mining is also presented.
Abstract: Direction is a common spatial concept in daily life and is frequently used as a selection condition in spatial queries. As a result, it is important for spatial databases to provide a mechanism for modeling and processing direction queries and reasoning. Building on the direction relation matrix, an inverted direction relation matrix and the concept of direction predominance are proposed to improve the detection of direction relations between objects. The direction predicates of spatial systems are also extended. These techniques can improve the accuracy of direction queries and reasoning. Experiments show excellent efficiency and performance for direction queries.
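As background for the abstract above, a direction relation matrix is commonly built by extending the bounding box of a reference object into nine tiles (NW, N, NE, W, same, E, SW, S, SE) and recording which tiles the target object intersects. The sketch below assumes both objects are axis-aligned rectangles with positive area. The function name `direction_matrix` and the `(xmin, ymin, xmax, ymax)` layout are hypothetical, and the code does not reproduce the inverted matrix or the direction predominance measure proposed in the paper.

```python
def direction_matrix(ref, target):
    """3x3 boolean direction relation matrix for two rectangles (illustrative).

    ref, target: (xmin, ymin, xmax, ymax) with positive width and height.
    Rows run north to south, columns run west to east, so result[0][2] is the
    NE tile and result[1][1] is the tile covering the reference box itself.
    """
    rx0, ry0, rx1, ry1 = ref
    tx0, ty0, tx1, ty1 = target

    # Which vertical bands (west of, within, east of the reference box)
    # does the target overlap with positive width?
    cols = [tx0 < rx0, tx0 < rx1 and tx1 > rx0, tx1 > rx1]
    # Which horizontal bands (north of, within, south of the reference box)
    # does the target overlap with positive height?
    rows = [ty1 > ry1, ty0 < ry1 and ty1 > ry0, ty0 < ry0]

    # A rectangle intersects a tile exactly when it overlaps both the tile's
    # row band and its column band.
    return [[r and c for c in cols] for r in rows]
```

Calling the function with the arguments swapped gives the matrix as seen from the other object. For two disjoint boxes placed diagonally, one call marks only the NE tile and the swapped call marks only the SW tile.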
Funding: National Natural Science Foundation of China (No. 61070032)
Abstract: Mobile devices with global positioning capabilities allow users to retrieve points of interest (POI) in their proximity. Due to the nature of spatial queries, a location-based service (LBS) needs the user position in order to process requests. On the other hand, revealing exact user locations to the LBS may pinpoint users' identities and breach their privacy. Spatial K-anonymity (SKA) exploits the concept of K-anonymity in order to protect the identity of users from location-based attacks. However, existing reciprocal methods rely on a specialized data structure. In contrast, a reciprocal algorithm is proposed that uses an existing spatial index on the user locations. An adjusted median splits algorithm is also provided. Experimental results on effectiveness (i.e., the size of the anonymizing spatial region) and efficiency (i.e., construction cost) verify that the proposed methods perform better. Moreover, because it employs general-purpose spatial indices, the proposed method supports conventional spatial queries as well.
Abstract: With the deployment of modern infrastructure for public transportation, several studies have analyzed movement patterns of people using smart card data and have characterized different areas. In this paper, we propose the “movement purpose hypothesis” that each movement arises from two causes: where the person is and what the person wants to do at a given moment. We formulate this hypothesis as a synthesis model in which two network graphs generate a movement network graph. We then develop two novel embedding models to assess the hypothesis, and demonstrate that the models obtain a vector representation of a geospatial area using movement patterns of people from large-scale smart card data. We conducted an experiment using smart card data for a large network of railroads in the Kansai region of Japan. We obtained a vector representation of each railroad station and each purpose using the developed embedding models. Results show that network embedding methods are suitable for large-scale movement data, and that the developed models perform better than existing embedding methods in the task of multi-label classification for train stations on the purpose-of-use data set. Our proposed models can contribute to the prediction of people flows by discovering underlying representations of geospatial areas from mobility data.
Funding: Funded by the National 863 Program of China (No. 2005AA113150) and the National Natural Science Foundation of China (No. 40701158).
Abstract: A novel Hilbert curve is introduced for parallel spatial data partitioning, taking into account the huge volume of spatial information and the variable-length nature of vector data items. Based on the improved Hilbert curve, an algorithm is designed to achieve almost-uniform spatial data partitioning among multiple disks in parallel spatial databases. Thus, data imbalance can be largely avoided, and search and query efficiency can be enhanced.
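The general idea behind Hilbert-curve partitioning can be sketched as follows. The abstract does not specify the improved curve, so this sketch uses the standard Hilbert index on an n-by-n grid (n a power of two) and then cuts the curve into contiguous runs of roughly equal total byte size, so that variable-length items do not unbalance the disks. The names `hilbert_index` and `partition_by_size`, the `(x, y, size_bytes)` layout, and the midpoint-based cut rule are assumptions for illustration.

```python
def hilbert_index(n, x, y):
    """Position of grid cell (x, y) along the standard Hilbert curve.

    n is the grid side length and must be a power of two; 0 <= x, y < n.
    """
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has the standard orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


def partition_by_size(items, n, num_disks):
    """Assign items to disks as contiguous runs along the Hilbert curve.

    items: list of (x, y, size_bytes). Each disk receives roughly
    total_bytes / num_disks, so large items count for more than small ones.
    """
    ordered = sorted(items, key=lambda it: hilbert_index(n, it[0], it[1]))
    total = sum(it[2] for it in ordered)
    disks = [[] for _ in range(num_disks)]
    running = 0
    for it in ordered:
        # Place the item by the midpoint of its byte range along the curve.
        mid = running + it[2] / 2
        disk = min(int(mid * num_disks / total), num_disks - 1) if total else 0
        disks[disk].append(it)
        running += it[2]
    return disks
```

Because neighbouring cells on the curve are also neighbours in space, each disk ends up holding a spatially compact run of items, which is why this kind of ordering is a common starting point for declustering schemes.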
Abstract: This paper presents a cadastral spatial data storage structure based on a relational database, together with the method and procedure for realizing it. The paper consists of three parts. In the first part, four existing problems in some cadastral management systems are discussed: 1) the security of cadastral spatial data is difficult to assure; 2) cadastral data are difficult to verify and their integrity is difficult to maintain; 3) cadastral data are difficult to transmit and share; 4) the efficiency of data access is low. In the second part, the feasibility of using a relational database to store spatial data is analyzed and a new cadastral spatial data storage structure is presented. The related table structures and field descriptions are given, and the merits and demerits of this storage structure are analyzed in detail. In the last part, the detailed methods for implementing the new storage structure are given through a real example. Moreover, some key techniques involved in the new storage structure are discussed: 1) the application of database transactions, 2) the application of database triggers, and 3) the application of secure database recovery.
Abstract: The spatial database management system for the extents of China's geological surveys is a public service system. Its aim is to help the government and the general public use the spatial database conveniently for tasks such as querying, indexing, mapping, and product output. The management system has been developed with the MAPGIS6.x SDK and Visual C++, taking into account the contents and structure of the spatial database and the requirements of users. This paper introduces the software structure, the data flow chart, and some key techniques of the software development.
Abstract: As the informatization of Resources & Environment Remote Sensing geological survey work deepens, several potential problems and deficiencies have emerged: (1) the lack of a uniformly planned running environment; (2) inconsistent methods of data integration; and (3) the disadvantages of performing data integration in different ways. This paper addresses these problems through overall planning and design, and constructs a unified running environment.
Funding: Supported by the National 863 High-Tech Program of China (No. 2007AA12Z237) and the Natural Science Foundation of China (No. 40571123).
Abstract: To meet the requirements of efficient management and web publishing of marine remote sensing data, a spatial database engine named MRSSDE was designed independently. The logical model, physical model, and optimization method of MRSSDE are discussed in detail. Compared with ArcSDE, a leading spatial database engine product, MRSSDE proved to be more effective.
Abstract: Throughout the 20th century in China, and especially in its last 50 years, geologists and explorers gathered a great deal of geological information on geological mapping, geophysics, geochemistry, mineral exploration, remote sensing, environmental geology, hydrogeology, engineering geology, oceanic geology, and other fields. All of this information has been accumulated and can serve as a decision-making foundation for future geological survey planning. The spatial database of geological survey extents has been established using computer technology. The database covers all kinds of exploration sections and holds about 160 000 records. This paper introduces the data construction, contents, and application system of this database, and aims to show what kinds of geological surveys were completed, when the explorations were carried out, and how and where this information can be obtained.
Funding: Supported by the National Natural Science Foundation of China for the Innovation Group Project (No. 40621002)
Abstract: In the study of sequence stratigraphy and litho-paleogeography, quantitative analysis, precise calculation, and detailed comparison of large volumes of geological data, such as field profiles, logging records, and seismic curves from different areas, are basic requirements. In order to obtain a more reliable and precise result, this paper presents a novel method that combines spatial database analysis with single-factor mapping technology to establish the sequence stratigraphical succession and to map the Ordovician litho-paleogeography of the Ordos Basin, one of the largest oil- and gas-bearing basins in the North China Platform. By using this method, all of the related basic geological data can be quantitatively analyzed and effectively managed. Various attributes of the basic stratigraphic units and their characters, such as sequence thickness, penecontemporaneous dolostone content, shallow water parget content, and terrigenous material content, can be fully utilized statistically in facies analysis and in mapping. Based on this analysis, this paper carries out quantitative single-factor isopachous mapping for each of the Ordovician sequences in the basin, and finally synthesizes multiple factors to reconstruct the litho-paleogeography for each of the sequence intervals. The study shows that the proposed method is quite effective and has a much higher resolution in recognizing litho-paleogeographic subunits than traditional approaches. For example, in one of the Middle Ordovician sequence intervals (SQ19 in the Lower Majiagou Formation) of the Ordos Basin, the authors have successfully developed a mathematical formula to divide the distribution of various facies units, such as old lands, submarine uplifts, supratidal zones, intertidal zones, and subtidal zones.
Abstract: The method for searching spatial information in a traditional geo-archives catalog database (TGCD) is text-based, and retrieval results can come only from the text fields of the relational database. The information being queried must be entered into the relational database as text in advance; otherwise, visitors will not get any result from it. So.
Abstract: Current geographical information systems (GIS) usually do not maintain two of the three kinds of consistency constraints, namely semantic and user-defined constraints (the third kind being topology constraints). To address this problem, this research focuses on building an efficient spatial data management system using two constraint violation detection methods. An algorithm for constraint violation detection has been derived to maintain an error-free, up-to-date spatial database. Results indicate that the developed constraint violation detection (CVD) system is more efficient than conventional systems.