Abstract: With the accelerating intelligent transformation of energy systems, equipment condition monitoring and production process optimization in thermal power plants face the challenge of integrating multi-source heterogeneous data. To handle the heterogeneity of physical sensor data (temperature, vibration, and pressure generated by boilers, steam turbines, and other key equipment) and of real-time operating condition data from the SCADA system, this paper proposes a multi-source heterogeneous data fusion and analysis platform for thermal power plants based on edge computing and deep learning. Built on a multi-level fusion architecture, the platform adopts a dynamic weight allocation strategy and a 5D digital twin model to enable collaborative analysis of physical sensor data, simulation results, and expert knowledge. The data fusion module combines Kalman filtering, wavelet transforms, and Bayesian estimation to address time series alignment and differences in data dimension. Simulation results show that data fusion accuracy can exceed 98% and that computation delay can be kept within 500 ms. The data analysis module integrates a Dymola simulation model and the AERMOD pollutant diffusion model and supports cascaded analysis of boiler combustion efficiency prediction and flue gas emission monitoring; system response time is under 2 seconds, and data consistency verification accuracy reaches 99.5%.
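The abstract does not specify how the dynamic weights are computed. As a minimal sketch of one common choice, assuming each sensor reading is an independent Gaussian measurement of the same quantity, the Bayesian estimate weights each reading by the inverse of its variance. The function name `fuse_inverse_variance` and the sample numbers are illustrative assumptions, not taken from the paper.

```python
def fuse_inverse_variance(readings, variances):
    """Fuse independent Gaussian measurements of one quantity.

    Assumption (not from the paper): each reading has a known, positive
    noise variance, so the fused estimate is the inverse-variance
    weighted mean.
    """
    if not readings or len(readings) != len(variances):
        raise ValueError("readings and variances must be non-empty and equal in length")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")
    # A noisier sensor (larger variance) gets a smaller weight.
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    fused = sum(w * x for w, x in zip(weights, readings)) / total
    # The fused variance is never larger than the smallest input variance.
    return fused, 1.0 / total


# Hypothetical example: two temperature sensors, the second four times noisier.
estimate, variance = fuse_inverse_variance([10.0, 20.0], [1.0, 4.0])
```

Re-estimating the variances over a sliding window would make the weights change over time, which is one plausible reading of "dynamic" allocation.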
Funding: Supported by the Natural Science Foundation of China (Nos. 62303126, 62362008; author Z.Z; https://www.nsfc.gov.cn/, accessed on 20 December 2024); the Major Scientific and Technological Special Project of Guizhou Province ([2024]014); Guizhou Provincial Science and Technology Projects (No. ZK[2022]General149; author Z.Z; https://kjt.guizhou.gov.cn/, accessed on 20 December 2024); the Open Project of the Key Laboratory of Computing Power Network and Information Security, Ministry of Education (Grant 2023ZD037; author Z.Z; https://www.gzu.edu.cn/, accessed on 20 December 2024); and the Open Research Project of the State Key Laboratory of Industrial Control Technology, Zhejiang University, China (No. ICT2024B25; author Z.Z; https://www.gzu.edu.cn/, accessed on 20 December 2024).
Abstract: With the development of cloud computing and machine learning, users can upload their data to the cloud for machine learning model training. However, a dishonest cloud may infer user data, resulting in data leakage. Previous schemes have achieved secure outsourced computing, but they suffer from low computational accuracy, difficulty handling the heterogeneous distribution of data from multiple sources, and high computational cost, which lead to a poor user experience and expensive cloud computing. To address these problems, we propose a multi-precision, multi-sourced, and multi-key outsourcing neural network training scheme. First, we design a multi-precision functional encryption computation based on Euclidean division. Second, we design an outsourced model training algorithm based on multi-precision functional encryption with multi-sourced heterogeneity. Finally, we conduct experiments on three datasets. The results indicate that our framework achieves an accuracy improvement of 6% to 30%. Additionally, it offers a memory space optimization of 1.0×2^(24) times compared to the previous best approach.
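The abstract gives no details of the encryption scheme itself. The sketch below illustrates only the arithmetic idea behind a Euclidean-division decomposition: a large non-negative integer is split into small base-`base` limbs by repeated division with remainder, and a product is recovered from limb-wise products. No encryption is performed, and the helper names (`split_limbs`, `recombine`, `limb_product`) are hypothetical.

```python
def split_limbs(value, base, count):
    """Split a non-negative integer into `count` limbs via Euclidean division."""
    if value < 0 or base < 2:
        raise ValueError("value must be non-negative and base at least 2")
    limbs = []
    for _ in range(count):
        value, remainder = divmod(value, base)  # value = quotient * base + remainder
        limbs.append(remainder)
    if value != 0:
        raise ValueError("value does not fit in the requested number of limbs")
    return limbs  # least significant limb first


def recombine(limbs, base):
    """Inverse of split_limbs."""
    return sum(limb * base ** i for i, limb in enumerate(limbs))


def limb_product(x_limbs, y_limbs, base):
    """Compute x * y using only products of small limbs."""
    return sum(
        xi * yj * base ** (i + j)
        for i, xi in enumerate(x_limbs)
        for j, yj in enumerate(y_limbs)
    )


# Illustrative values only.
x_limbs = split_limbs(1234567, 256, 3)
y_limbs = split_limbs(7654321, 256, 3)
```

Because each limb stays below `base`, every individual product is bounded by `base ** 2`, which is the property that makes a low-precision primitive reusable for high-precision values.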
Funding: Supported by the National Key Research and Development Program of China (grant number 2019YFE0123600).
Abstract: The power Internet of Things (IoT) is a significant trend in technology and a requirement for national strategic development. With the deepening digital transformation of the power grid, China's power system has initially built a power IoT architecture comprising perception, network, and platform application layers. However, owing to the structural complexity of the power system, the construction of the power IoT continues to face problems such as complex access management of massive heterogeneous equipment, diverse IoT protocol access methods, high concurrency of network communications, and weak data security protection. To address these issues, this study optimizes the existing architecture of the power IoT and designs an integrated management framework for the access of multi-source heterogeneous data in the power IoT, comprising cloud, pipe, edge, and terminal parts. It further reviews and analyzes the key technologies involved in the power IoT, such as unified management of the physical model, high concurrent access, multi-protocol access, multi-source heterogeneous data storage management, and data security control, to provide a more flexible, efficient, secure, and easy-to-use solution for multi-source heterogeneous data access in the power IoT.
Abstract: To construct mediators for data integration systems that integrate structured and semi-structured data, and to facilitate query reformulation and decomposition, the presented system uses the XML processing language (XPL) for the mediator. XPL makes it easy to construct mediators for XML-based data integration and speeds up the work done in the mediator.
Funding: Supported by the China Meteorological Administration Special Public Welfare Research Fund (GYHY201206012, GYHY201406016) and the Climate Change Foundation of the China Meteorological Administration (CCSF201338).
Abstract: This paper analyzes the status of existing resources through extensive research and international cooperation on the basis of four typical global monthly surface temperature datasets: the climate research dataset of the University of East Anglia (CRUTEM3), the dataset of the U.S. National Climatic Data Center (GHCN-V3), the dataset of the U.S. National Aeronautics and Space Administration (GISSTMP), and the Berkeley Earth surface temperature dataset (Berkeley). China's first global monthly temperature dataset over land was developed by integrating these four global temperature datasets with several regional datasets from major countries or regions. The dataset contains stations worldwide with at least 20 years of data: 9,519 for monthly mean temperature, 7,073 for maximum temperature, and 6,587 for minimum temperature. Compared with CRUTEM3 and GHCN-V3, the station density is much higher, particularly for South America, Africa, and Asia. Moreover, data from significantly more stations were available after 1990, which dramatically reduced the uncertainty of the estimated global temperature trend during 1990 to 2011. The integrated dataset can serve as a reliable data source for global climate change research.
Funding: Funded by the National Natural Science Foundation of China, grant number 62172033.
Abstract: Currently, most enterprises have adopted information software and digital equipment and have gradually established digital factories. They conduct enterprise data collection and decision-support activities, generating large volumes of multi-source heterogeneous data across all stages of the product life cycle. However, current data utilization methods remain simplistic, and the goal of leveraging multi-source heterogeneous data to drive manufacturing value has yet to be fully realized. To address this issue, this study first defines the concept and characteristics of multi-source heterogeneous data in intelligent manufacturing, based on an analysis of its relationship with industrial big data. Then, integrating principles from data science, a technological framework for multi-source heterogeneous data is proposed. The key technologies involved in each stage of data processing are investigated, and typical applications of such data in intelligent manufacturing are discussed. Finally, this paper analyzes the challenges and future development directions of multi-source heterogeneous data processing in intelligent manufacturing. The goal is to provide theoretical and technical support for integrating intelligent manufacturing with data science.
Funding: Supported by the National Natural Science Foundation of China (Nos. 61304131 and 61402147); a Grant of the China Scholarship Council (No. 201608130174); the Natural Science Foundation of Hebei Province (Nos. F2016402054 and F2014402075); the Scientific Research Plan Projects of Hebei Education Department (Nos. BJ2014019, ZD2015087 and QN2015046); and the Research Program of Talent Cultivation Project in Hebei Province (No. A2016002023).
Abstract: A heterogeneous wireless sensor network comprises a number of inexpensive, energy-constrained wireless sensor nodes that collect data from the sensing environment and transmit them toward the improved cluster head in a coordinated way. Employing clustering techniques in such networks can balance the energy consumption of member nodes and prolong network lifetime. In classical clustering techniques, clustering and in-cluster data routing are usually treated as independent operations. Although considering these two issues separately simplifies the system design, it often yields a non-optimal lifetime expectancy for wireless sensor networks. This paper proposes an integral framework that treats these two correlated items as an interacting whole. To this end, we formulate the clustering problem using nonlinear programming. The evolution of the clustering process is shown in simulations. Results show that our joint-design proposal reaches a near-optimal match between member nodes and cluster heads.
Abstract: We propose a three-step technique to achieve this purpose. First, we use a collection of XML namespaces organized into a hierarchical structure as a medium for expressing data semantics. Second, we define the format of the resource descriptor for the information source discovery scheme so that Web data sources can be registered and deregistered dynamically. Third, we employ an inverted-index mechanism to identify the subset of information sources that are relevant to a particular user query. We describe the design, architecture, and implementation of our approach, IWDS, and illustrate its use through case examples. Key words: integration; heterogeneity; Web data source; XML namespace. CLC number: TP 311.13. Foundation item: Supported by the National Key Technologies R&D Program of China (2002BA103A04). Biography: WU Wei (1975-), male, Ph.D. candidate; research direction: information integration, distributed computing.
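The third step can be illustrated with a minimal inverted index. This is a sketch under assumptions, not the IWDS implementation: each registered source is represented by a hypothetical descriptor listing the namespace terms it covers, and a query selects every source that shares at least one term with it. The source identifiers and terms below are invented for illustration.

```python
from collections import defaultdict


def build_index(descriptors):
    """Map each term to the set of source ids whose descriptor mentions it."""
    index = defaultdict(set)
    for source_id, terms in descriptors.items():
        for term in terms:
            index[term.lower()].add(source_id)
    return index


def relevant_sources(index, query_terms):
    """Return the sources that match at least one query term."""
    matches = set()
    for term in query_terms:
        # .get avoids creating empty entries for unknown terms.
        matches |= index.get(term.lower(), set())
    return matches


# Hypothetical resource descriptors: source id -> namespace terms.
descriptors = {
    "src-books": ["book", "author", "publisher"],
    "src-papers": ["paper", "author", "journal"],
    "src-weather": ["station", "temperature"],
}
index = build_index(descriptors)
selected = relevant_sources(index, ["Author", "journal"])
```

Deregistering a source then amounts to removing its id from every posting set, which keeps lookup cost proportional to the number of query terms rather than to the number of registered sources.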
Abstract: Cleaning duplicate data remains a major problem despite much prior work, because the volume of data to be processed grows exponentially and scalable, fast algorithms are required. The problem depends on the type and quality of the data and varies with the volume of the data set being handled. In this paper we introduce a novel framework based on an extended fuzzy C-means algorithm that uses a topic ontology. This work aims to improve the OLAP querying process over heterogeneous data warehouses that contain big data sets by improving the integration of query results, eliminating redundancies with the extended classification algorithm, and measuring the loss of information.
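The abstract does not describe the ontology-based extension, so the sketch below shows only the membership update of the standard fuzzy C-means algorithm that it builds on, for one-dimensional points. The function name `memberships` and the sample values are illustrative assumptions.

```python
def memberships(point, centers, m=2.0):
    """Standard fuzzy C-means membership of `point` in each cluster.

    u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1)), where d_i is the distance
    from the point to center i and m > 1 is the fuzzifier.
    """
    if m <= 1.0:
        raise ValueError("fuzzifier m must be greater than 1")
    dists = [abs(point - c) for c in centers]
    if 0.0 in dists:
        # The point coincides with one or more centers: share full membership among them.
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** exponent for d_k in dists) for d_i in dists]


# Illustrative values: a record at 1.0 between cluster centers at 0.0 and 4.0.
u = memberships(1.0, [0.0, 4.0])
```

Unlike hard clustering, each record keeps a graded membership in every cluster, so near-duplicates that sit between two clusters can be flagged for review instead of being assigned to one cluster outright.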
Funding: National Natural Science Foundation of China (No. 42101346); Undergraduate Training Programs for Innovation and Entrepreneurship of Wuhan University (GeoAI Special Project) (No. 202510486196).
Abstract: Rapid urbanization and structural imbalances in Chinese megacities have exacerbated the housing supply-demand mismatch, creating an urgent need for fine-scale diagnostic tools. This study addresses this gap by developing the Housing Contradiction Evaluation Weighted Index (HCEWI) model, making three key contributions to high-resolution housing monitoring. First, we establish a tripartite theoretical framework integrating dynamic population pressure (PPI), housing supply potential (HSI), and functional diversity (HHI). The PPI combines mobile signaling data with principal component analysis to capture real-time commuting patterns, while the HSI introduces a dual-criteria system based on Local Climate Zones (LCZ), weighted by building density and residential function ratio. Second, we develop a spatiotemporal coupling architecture featuring an entropy-weighted dynamic integration mechanism with self-correcting modules, which performs robustly against data noise. Third, our 25-month longitudinal analysis in Shenzhen reveals persistent bipolar clustering patterns, contrasting volatility between peripheral and core areas, and seasonal policy responsiveness. Methodologically, we advance urban diagnostics through monthly monitoring on a 500-meter grid and process-oriented temporal operators that reveal “tentacle-like” spatial restructuring along transit corridors. Our findings provide a replicable framework for precision housing governance and demonstrate the potential of mobile signaling data in implementing China’s “city-specific policy” approach. We further propose targeted intervention strategies, including balance regulation for high-contradiction zones, Transit-Oriented Development (TOD) activation for low-contradiction clusters, and dynamic land conversion mechanisms for transitional areas.
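The abstract names an entropy-weighted integration mechanism without giving its formula. As a sketch, assuming the textbook entropy weight method, each indicator column (for example PPI, HSI, HHI) receives a weight proportional to how much its values vary across grid cells. The function name `entropy_weights` and the sample matrix are illustrative assumptions.

```python
import math


def entropy_weights(matrix):
    """Entropy weight method: rows are grid cells, columns are indicators.

    Assumes non-negative indicator values with a positive column sum.
    """
    n = len(matrix)
    if n < 2:
        raise ValueError("need at least two rows")
    columns = list(zip(*matrix))
    divergences = []
    for col in columns:
        total = sum(col)
        if min(col) < 0 or total <= 0:
            raise ValueError("indicator values must be non-negative with a positive sum")
        entropy = 0.0
        for x in col:
            p = x / total
            if p > 0:  # the limit of p * ln(p) as p -> 0 is 0
                entropy -= p * math.log(p)
        entropy /= math.log(n)  # normalize to the range [0, 1]
        # Clamp to guard against tiny negative values from rounding.
        divergences.append(max(0.0, 1.0 - entropy))
    spread = sum(divergences)
    if spread == 0.0:
        # Every indicator is uniform: fall back to equal weights.
        return [1.0 / len(columns)] * len(columns)
    return [d / spread for d in divergences]


# Illustrative values: the first indicator is constant, the second varies.
weights = entropy_weights([[1.0, 1.0], [1.0, 3.0]])
```

An indicator that is identical in every cell carries no discriminating information and gets a weight near zero. Recomputing the weights each month is one plausible way to make the integration dynamic.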
Funding: Supported by Colfuturo and Ministerio de Tecnologías de la Información y las Comunicaciones de Colombia; CYTED program 520RT0010 [Red GeoLIBERO-Consolidación de una red de geomática libre aplicada a las necesidades de Iberoamérica]; and SIP-IPN 20210677 [Generación de grafos de conocimiento sobre eventos meteorológicos urbanos].
Abstract: Many efforts have been made worldwide on diverse aspects of land administration. However, the notorious heterogeneity of land administration data and systems remains a longstanding obstacle to developing a harmonized vision. Adopting traditional Spatial Data Infrastructures is not enough to overcome this challenge, since the heterogeneity of data sources creates needs related to harmonization, interoperability, sharing, and integration in land administration development. This paper proposes a graph-based representation of knowledge for integrating multiple heterogeneous data sources (tables, shapefiles, geodatabases, and WFS services) belonging to two Colombian agencies within a decentralized land administration scenario. These knowledge graphs are built on an ontology-based knowledge representation using national and international standards for land administration. Our approach aims to prevent data isolation, enable cross-dataset integration, produce machine-processable data, and facilitate the reuse and exploitation of multi-jurisdictional datasets in a single approach. A real case study demonstrates the applicability of the deployed land administration data cycle.