The power Internet of Things(IoT)is a significant trend in technology and a requirement for national strategic development.With the deepening digital transformation of the power grid,China’s power system has initiall...The power Internet of Things(IoT)is a significant trend in technology and a requirement for national strategic development.With the deepening digital transformation of the power grid,China’s power system has initially built a power IoT architecture comprising a perception,network,and platform application layer.However,owing to the structural complexity of the power system,the construction of the power IoT continues to face problems such as complex access management of massive heterogeneous equipment,diverse IoT protocol access methods,high concurrency of network communications,and weak data security protection.To address these issues,this study optimizes the existing architecture of the power IoT and designs an integrated management framework for the access of multi-source heterogeneous data in the power IoT,comprising cloud,pipe,edge,and terminal parts.It further reviews and analyzes the key technologies involved in the power IoT,such as the unified management of the physical model,high concurrent access,multi-protocol access,multi-source heterogeneous data storage management,and data security control,to provide a more flexible,efficient,secure,and easy-to-use solution for multi-source heterogeneous data access in the power IoT.展开更多
In the heterogeneous power internet of things(IoT)environment,data signals are acquired to support different business systems to realize advanced intelligent applications,with massive,multi-source,heterogeneous and ot...In the heterogeneous power internet of things(IoT)environment,data signals are acquired to support different business systems to realize advanced intelligent applications,with massive,multi-source,heterogeneous and other characteristics.Reliable perception of information and efficient transmission of energy in multi-source heterogeneous environments are crucial issues.Compressive sensing(CS),as an effective method of signal compression and transmission,can accurately recover the original signal only by very few sampling.In this paper,we study a new method of multi-source heterogeneous data signal reconstruction of power IoT based on compressive sensing technology.Based on the traditional compressive sensing technology to directly recover multi-source heterogeneous signals,we fully use the interference subspace information to design the measurement matrix,which directly and effectively eliminates the interference while making the measurement.The measure matrix is optimized by minimizing the average cross-coherence of the matrix,and the reconstruction performance of the new method is further improved.Finally,the effectiveness of the new method with different parameter settings under different multi-source heterogeneous data signal cases is verified by using orthogonal matching pursuit(OMP)and sparsity adaptive matching pursuit(SAMP)for considering the actual environment with prior information utilization of signal sparsity and no prior information utilization of signal sparsity.展开更多
Due to the development of cloud computing and machine learning,users can upload their data to the cloud for machine learning model training.However,dishonest clouds may infer user data,resulting in user data leakage.P...Due to the development of cloud computing and machine learning,users can upload their data to the cloud for machine learning model training.However,dishonest clouds may infer user data,resulting in user data leakage.Previous schemes have achieved secure outsourced computing,but they suffer from low computational accuracy,difficult-to-handle heterogeneous distribution of data from multiple sources,and high computational cost,which result in extremely poor user experience and expensive cloud computing costs.To address the above problems,we propose amulti-precision,multi-sourced,andmulti-key outsourcing neural network training scheme.Firstly,we design a multi-precision functional encryption computation based on Euclidean division.Second,we design the outsourcing model training algorithm based on a multi-precision functional encryption with multi-sourced heterogeneity.Finally,we conduct experiments on three datasets.The results indicate that our framework achieves an accuracy improvement of 6%to 30%.Additionally,it offers a memory space optimization of 1.0×2^(24) times compared to the previous best approach.展开更多
Taking the Ming Tombs Forest Farm in Beijing as the research object,this research applied multi-source data fusion and GIS heat-map overlay analysis techniques,systematically collected bird observation point data from...Taking the Ming Tombs Forest Farm in Beijing as the research object,this research applied multi-source data fusion and GIS heat-map overlay analysis techniques,systematically collected bird observation point data from the Global Biodiversity Information Facility(GBIF),population distribution data from the Oak Ridge National Laboratory(ORNL)in the United States,as well as information on the composition of tree species in suitable forest areas for birds and the forest geographical information of the Ming Tombs Forest Farm,which is based on literature research and field investigations.By using GIS technology,spatial processing was carried out on bird observation points and population distribution data to identify suitable bird-watching areas in different seasons.Then,according to the suitability value range,these areas were classified into different grades(from unsuitable to highly suitable).The research findings indicated that there was significant spatial heterogeneity in the bird-watching suitability of the Ming Tombs Forest Farm.The north side of the reservoir was generally a core area with high suitability in all seasons.The deep-aged broad-leaved mixed forests supported the overlapping co-existence of the ecological niches of various bird species,such as the Zosterops simplex and Urocissa erythrorhyncha.In contrast,the shallow forest-edge coniferous pure forests and mixed forests were more suitable for specialized species like Carduelis sinica.The southern urban area and the core area of the mausoleums had relatively low suitability due to ecological fragmentation or human interference.Based on these results,this paper proposed a three-level protection framework of“core area conservation—buffer zone management—isolation zone construction”and a spatio-temporal coordinated human-bird co-existence strategy.It was also suggested that the human-bird co-existence space could be optimized through measures such as constructing sound and light buffer interfaces,restoring ecological corridors,and integrating cultural heritage elements.This research provided an operational technical approach and decision-making support for the scientific planning of bird-watching sites and the coordination of ecological protection and tourism development.展开更多
Low-earth-orbit(LEO)satellite network has become a critical component of the satelliteterrestrial integrated network(STIN)due to its superior signal quality and minimal communication latency.However,the highly dynamic...Low-earth-orbit(LEO)satellite network has become a critical component of the satelliteterrestrial integrated network(STIN)due to its superior signal quality and minimal communication latency.However,the highly dynamic nature of LEO satellites leads to limited and rapidly varying contact time between them and Earth stations(ESs),making it difficult to timely download massive communication and remote sensing data within the limited time window.To address this challenge in heterogeneous satellite networks with coexisting geostationary-earth-orbit(GEO)and LEO satellites,this paper proposes a dynamic collaborative inter-satellite data download strategy to optimize the long-term weighted energy consumption and data downloads within the constraints of on-board power,backlog stability and time-varying contact.Specifically,the Lyapunov optimization theory is applied to transform the long-term stochastic optimization problem,subject to time-varying contact time and on-board power constraints,into multiple deterministic single time slot problems,based on which online distributed algorithms are developed to enable each satellite to independently obtain the transmit power allocation and data processing decisions in closed-form.Finally,the simulation results demonstrate the superiority of the proposed scheme over benchmarks,e.g.,achieving asymptotic optimality of the weighted energy consumption and data downloads,while maintaining stability of the on-board backlog.展开更多
The monitoring signals of bearings from single-source sensor often contain limited information for characterizing various working condition,which may lead to instability and uncertainty of the class-imbalanced intelli...The monitoring signals of bearings from single-source sensor often contain limited information for characterizing various working condition,which may lead to instability and uncertainty of the class-imbalanced intelligent fault diagnosis.On the other hand,the vectorization of multi-source sensor signals may not only generate high-dimensional vectors,leading to increasing computational complexity and overfitting problems,but also lose the structural information and the coupling information.This paper proposes a new method for class-imbalanced fault diagnosis of bearing using support tensor machine(STM)driven by heterogeneous data fusion.The collected sound and vibration signals of bearings are successively decomposed into multiple frequency band components to extract various time-domain and frequency-domain statistical parameters.A third-order hetero-geneous feature tensor is designed based on multisensors,frequency band components,and statistical parameters.STM-based intelligent model is constructed to preserve the structural information of the third-order heterogeneous feature tensor for bearing fault diagnosis.A series of comparative experiments verify the advantages of the proposed method.展开更多
Aerodynamic surrogate modeling mostly relies only on integrated loads data obtained from simulation or experiment,while neglecting and wasting the valuable distributed physical information on the surface.To make full ...Aerodynamic surrogate modeling mostly relies only on integrated loads data obtained from simulation or experiment,while neglecting and wasting the valuable distributed physical information on the surface.To make full use of both integrated and distributed loads,a modeling paradigm,called the heterogeneous data-driven aerodynamic modeling,is presented.The essential concept is to incorporate the physical information of distributed loads as additional constraints within the end-to-end aerodynamic modeling.Towards heterogenous data,a novel and easily applicable physical feature embedding modeling framework is designed.This framework extracts lowdimensional physical features from pressure distribution and then effectively enhances the modeling of the integrated loads via feature embedding.The proposed framework can be coupled with multiple feature extraction methods,and the well-performed generalization capabilities over different airfoils are verified through a transonic case.Compared with traditional direct modeling,the proposed framework can reduce testing errors by almost 50%.Given the same prediction accuracy,it can save more than half of the training samples.Furthermore,the visualization analysis has revealed a significant correlation between the discovered low-dimensional physical features and the heterogeneous aerodynamic loads,which shows the interpretability and credibility of the superior performance offered by the proposed deep learning framework.展开更多
Heterogeneous federated learning(HtFL)has gained significant attention due to its ability to accommodate diverse models and data from distributed combat units.The prototype-based HtFL methods were proposed to reduce t...Heterogeneous federated learning(HtFL)has gained significant attention due to its ability to accommodate diverse models and data from distributed combat units.The prototype-based HtFL methods were proposed to reduce the high communication cost of transmitting model parameters.These methods allow for the sharing of only class representatives between heterogeneous clients while maintaining privacy.However,existing prototype learning approaches fail to take the data distribution of clients into consideration,which results in suboptimal global prototype learning and insufficient client model personalization capabilities.To address these issues,we propose a fair trainable prototype federated learning(FedFTP)algorithm,which employs a fair sampling training prototype(FSTP)mechanism and a hyperbolic space constraints(HSC)mechanism to enhance the fairness and effectiveness of prototype learning on the server in heterogeneous environments.Furthermore,a local prototype stable update(LPSU)mechanism is proposed as a means of maintaining personalization while promoting global consistency,based on contrastive learning.Comprehensive experimental results demonstrate that FedFTP achieves state-of-the-art performance in HtFL scenarios.展开更多
Software defect prediction plays a critical role in software development and quality assurance processes. Effective defect prediction enables testers to accurately prioritize testing efforts and enhance defect detecti...Software defect prediction plays a critical role in software development and quality assurance processes. Effective defect prediction enables testers to accurately prioritize testing efforts and enhance defect detection efficiency. Additionally, this technology provides developers with a means to quickly identify errors, thereby improving software robustness and overall quality. However, current research in software defect prediction often faces challenges, such as relying on a single data source or failing to adequately account for the characteristics of multiple coexisting data sources. This approach may overlook the differences and potential value of various data sources, affecting the accuracy and generalization performance of prediction results. To address this issue, this study proposes a multivariate heterogeneous hybrid deep learning algorithm for defect prediction (DP-MHHDL). Initially, Abstract Syntax Tree (AST), Code Dependency Network (CDN), and code static quality metrics are extracted from source code files and used as inputs to ensure data diversity. Subsequently, for the three types of heterogeneous data, the study employs a graph convolutional network optimization model based on adjacency and spatial topologies, a Convolutional Neural Network-Bidirectional Long Short-Term Memory (CNN-BiLSTM) hybrid neural network model, and a TabNet model to extract data features. These features are then concatenated and processed through a fully connected neural network for defect prediction. Finally, the proposed framework is evaluated using ten promise defect repository projects, and performance is assessed with three metrics: F1, Area under the curve (AUC), and Matthews correlation coefficient (MCC). The experimental results demonstrate that the proposed algorithm outperforms existing methods, offering a novel solution for software defect prediction.展开更多
A significant obstacle in intelligent transportation systems(ITS)is the capacity to predict traffic flow.Recent advancements in deep neural networks have enabled the development of models to represent traffic flow acc...A significant obstacle in intelligent transportation systems(ITS)is the capacity to predict traffic flow.Recent advancements in deep neural networks have enabled the development of models to represent traffic flow accurately.However,accurately predicting traffic flow at the individual road level is extremely difficult due to the complex interplay of spatial and temporal factors.This paper proposes a technique for predicting short-term traffic flow data using an architecture that utilizes convolutional bidirectional long short-term memory(Conv-BiLSTM)with attention mechanisms.Prior studies neglected to include data pertaining to factors such as holidays,weather conditions,and vehicle types,which are interconnected and significantly impact the accuracy of forecast outcomes.In addition,this research incorporates recurring monthly periodic pattern data that significantly enhances the accuracy of forecast outcomes.The experimental findings demonstrate a performance improvement of 21.68%when incorporating the vehicle type feature.展开更多
Long runout landslides involve a massive amount of energy and can be extremely hazardous owing to their long movement distance,high mobility and strong destructive power.Numerical methods have been widely used to pred...Long runout landslides involve a massive amount of energy and can be extremely hazardous owing to their long movement distance,high mobility and strong destructive power.Numerical methods have been widely used to predict the landslide runout but a fundamental problem remained is how to determine the reliable numerical parameters.This study proposes a framework to predict the runout of potential landslides through multi-source data collaboration and numerical analysis of historical landslide events.Specifically,for the historical landslide cases,the landslide-induced seismic signal,geophysical surveys,and possible in-situ drone/phone videos(multi-source data collaboration)can validate the numerical results in terms of landslide dynamics and deposit features and help calibrate the numerical(rheological)parameters.Subsequently,the calibrated numerical parameters can be used to numerically predict the runout of potential landslides in the region with a similar geological setting to the recorded events.Application of the runout prediction approach to the 2020 Jiashanying landslide in Guizhou,China gives reasonable results in comparison to the field observations.The numerical parameters are determined from the multi-source data collaboration analysis of a historical case in the region(2019 Shuicheng landslide).The proposed framework for landslide runout prediction can be of great utility for landslide risk assessment and disaster reduction in mountainous regions worldwide.展开更多
Literature shows that both market data and financial media impact stock prices;however,using only one kind of data may lead to information bias.Therefore,this study uses market data and news to investigate their joint...Literature shows that both market data and financial media impact stock prices;however,using only one kind of data may lead to information bias.Therefore,this study uses market data and news to investigate their joint impact on stock price trends.However,combining these two types of information is difficult because of their completely different characteristics.This study develops a hybrid model called MVL-SVM for stock price trend prediction by integrating multi-view learning with a support vector machine(SVM).It works by simply inputting heterogeneous multi-view data simultaneously,which may reduce information loss.Compared with the ARIMA and classic SVM models based on single-and multi-view data,our hybrid model shows statistically significant advantages.In the robustness test,our model outperforms the others by at least 10%accuracy when the sliding windows of news and market data are set to 1–5 days,which confirms our model’s effectiveness.Finally,trading strategies based on single stock and investment portfolios are constructed separately,and the simulations show that MVL-SVM has better profitability and risk control performance than the benchmarks.展开更多
Multi-source seismic technology is an efficient seismic acquisition method that requires a group of blended seismic data to be separated into single-source seismic data for subsequent processing. The separation of ble...Multi-source seismic technology is an efficient seismic acquisition method that requires a group of blended seismic data to be separated into single-source seismic data for subsequent processing. The separation of blended seismic data is a linear inverse problem. According to the relationship between the shooting number and the simultaneous source number of the acquisition system, this separation of blended seismic data is divided into an easily determined or overdetermined linear inverse problem and an underdetermined linear inverse problem that is difficult to solve. For the latter, this paper presents an optimization method that imposes the sparsity constraint on wavefields to construct the object function of inversion, and the problem is solved by using the iterative thresholding method. For the most extremely underdetermined separation problem with single-shooting and multiple sources, this paper presents a method of pseudo-deblending with random noise filtering. In this method, approximate common shot gathers are received through the pseudo-deblending process, and the random noises that appear when the approximate common shot gathers are sorted into common receiver gathers are eliminated through filtering methods. The separation methods proposed in this paper are applied to three types of numerical simulation data, including pure data without noise, data with random noise, and data with linear regular noise to obtain satisfactory results. The noise suppression effects of these methods are sufficient, particularly with single-shooting blended seismic data, which verifies the effectiveness of the proposed methods.展开更多
Identification of security risk factors for small reservoirs is the basis for implementation of early warning systems.The manner of identification of the factors for small reservoirs is of practical significance when ...Identification of security risk factors for small reservoirs is the basis for implementation of early warning systems.The manner of identification of the factors for small reservoirs is of practical significance when data are incomplete.The existing grey relational models have some disadvantages in measuring the correlation between categorical data sequences.To this end,this paper introduces a new grey relational model to analyze heterogeneous data.In this study,a set of security risk factors for small reservoirs was first constructed based on theoretical analysis,and heterogeneous data of these factors were recorded as sequences.The sequences were regarded as random variables,and the information entropy and conditional entropy between sequences were measured to analyze the relational degree between risk factors.Then,a new grey relational analysis model for heterogeneous data was constructed,and a comprehensive security risk factor identification method was developed.A case study of small reservoirs in Guangxi Zhuang Autonomous Region in China shows that the model constructed in this study is applicable to security risk factor identification for small reservoirs with heterogeneous and sparse data.展开更多
In view of the lack of comprehensive evaluation and analysis from the combination of natural and human multi-dimensional factors,the urban surface temperature patterns of Changsha in 2000,2009 and 2016 are retrieved b...In view of the lack of comprehensive evaluation and analysis from the combination of natural and human multi-dimensional factors,the urban surface temperature patterns of Changsha in 2000,2009 and 2016 are retrieved based on multi-source spatial data(Landsat 5 and Landsat 8 satellite image data,POI spatial big data,digital elevation model,etc.),and 12 natural and human factors closely related to urban thermal environment are quickly obtained.The standard deviation ellipse and spatial principal component analysis(PCA)methods are used to analyze the effect of urban human residential thermal environment and its influencing factors.The results showed that the heat island area increased by 547 km~2 and the maximum surface temperature difference reached 10.1℃during the period 2000–2016.The spatial distribution of urban heat island was mainly concentrated in urban built-up areas,such as industrial and commercial agglomerations and densely populated urban centers.The spatial distribution pattern of heat island is gradually decreasing from the urban center to the suburbs.There were multiple high-temperature centers,such as Wuyi square business circle,Xingsha economic and technological development zone in Changsha County,Wangcheng industrial zone,Yuelu industrial agglomeration,and Tianxin industrial zone.From 2000 to 2016,the main axis of spatial development of heat island remained in the northeast-southwest direction.The center of gravity of heat island shifted 2.7 km to the southwest with the deflection angle of 54.9°in 2000–2009.The center of gravity of heat island shifted to the northeast by 4.8 km with the deflection angle of 60.9°in 2009–2016.On the whole,the change of spatial pattern of thermal environment in Changsha was related to the change of urban construction intensity.Through the PCA method,it was concluded that landscape pattern,urban construction intensity and topographic landforms were the main factors affecting the spatial pattern of urban thermal environment of Changsha.The promotion effect of human factors on the formation of heat island effect was obviously greater than that of natural factors.The temperature would rise by 0.293℃under the synthetic effect of human and natural factors.Due to the complexity of factors influencing the urban thermal environment of human settlements,the utilization of multi-source data could help to reveal the spatial pattern and evolution law of urban thermal environment,deepen the understanding of the causes of urban heat island effect,and clarify the correlation between human and natural factors,so as to provide scientific supports for the improvement of the quality of urban human settlements.展开更多
For reservoirs with complex non-Gaussian geological characteristics,such as carbonate reservoirs or reservoirs with sedimentary facies distribution,it is difficult to implement history matching directly,especially for...For reservoirs with complex non-Gaussian geological characteristics,such as carbonate reservoirs or reservoirs with sedimentary facies distribution,it is difficult to implement history matching directly,especially for the ensemble-based data assimilation methods.In this paper,we propose a multi-source information fused generative adversarial network(MSIGAN)model,which is used for parameterization of the complex geologies.In MSIGAN,various information such as facies distribution,microseismic,and inter-well connectivity,can be integrated to learn the geological features.And two major generative models in deep learning,variational autoencoder(VAE)and generative adversarial network(GAN)are combined in our model.Then the proposed MSIGAN model is integrated into the ensemble smoother with multiple data assimilation(ESMDA)method to conduct history matching.We tested the proposed method on two reservoir models with fluvial facies.The experimental results show that the proposed MSIGAN model can effectively learn the complex geological features,which can promote the accuracy of history matching.展开更多
Urban functional area(UFA)is a core scientific issue affecting urban sustainability.The current knowledge gap is mainly reflected in the lack of multi-scale quantitative interpretation methods from the perspective of ...Urban functional area(UFA)is a core scientific issue affecting urban sustainability.The current knowledge gap is mainly reflected in the lack of multi-scale quantitative interpretation methods from the perspective of human-land interaction.In this paper,based on multi-source big data include 250 m×250 m resolution cell phone data,1.81×105 Points of Interest(POI)data and administrative boundary data,we built a UFA identification method and demonstrated empirically in Shenyang City,China.We argue that the method we built can effectively identify multi-scale multi-type UFAs based on human activity and further reveal the spatial correlation between urban facilities and human activity.The empirical study suggests that the employment functional zones in Shenyang City are more concentrated in central cities than other single functional zones.There are more mix functional areas in the central city areas,while the planned industrial new cities need to develop comprehensive functions in Shenyang.UFAs have scale effects and human-land interaction patterns.We suggest that city decision makers should apply multi-sources big data to measure urban functional service in a more refined manner from a supply-demand perspective.展开更多
Due to the complex nature of multi-source geological data, it is difficult to rebuild every geological structure through a single 3D modeling method. The multi-source data interpretation method put forward in this ana...Due to the complex nature of multi-source geological data, it is difficult to rebuild every geological structure through a single 3D modeling method. The multi-source data interpretation method put forward in this analysis is based on a database-driven pattern and focuses on the discrete and irregular features of geological data. The geological data from a variety of sources covering a range of accuracy, resolution, quantity and quality are classified and integrated according to their reliability and consistency for 3D modeling. The new interpolation-approximation fitting construction algorithm of geological surfaces with the non-uniform rational B-spline(NURBS) technique is then presented. The NURBS technique can retain the balance among the requirements for accuracy, surface continuity and data storage of geological structures. Finally, four alternative 3D modeling approaches are demonstrated with reference to some examples, which are selected according to the data quantity and accuracy specification. The proposed approaches offer flexible modeling patterns for different practical engineering demands.展开更多
In traditional medicine and ethnomedicine,medicinal plants have long been recognized as the basis for materials in therapeutic applications worldwide.In particular,the remarkable curative effect of traditional Chinese...In traditional medicine and ethnomedicine,medicinal plants have long been recognized as the basis for materials in therapeutic applications worldwide.In particular,the remarkable curative effect of traditional Chinese medicine during corona virus disease 2019(COVID-19)pandemic has attracted extensive attention globally.Medicinal plants have,therefore,become increasingly popular among the public.However,with increasing demand for and profit with medicinal plants,commercial fraudulent events such as adulteration or counterfeits sometimes occur,which poses a serious threat to the clinical outcomes and interests of consumers.With rapid advances in artificial intelligence,machine learning can be used to mine information on various medicinal plants to establish an ideal resource database.We herein present a review that mainly introduces common machine learning algorithms and discusses their application in multi-source data analysis of medicinal plants.The combination of machine learning algorithms and multi-source data analysis facilitates a comprehensive analysis and aids in the effective evaluation of the quality of medicinal plants.The findings of this review provide new possibilities for promoting the development and utilization of medicinal plants.展开更多
In order to estimate vehicular queue length at signalized intersections accurately and overcome the shortcomings and restrictions of existing studies especially those based on shockwave theory,a new methodology is pre...In order to estimate vehicular queue length at signalized intersections accurately and overcome the shortcomings and restrictions of existing studies especially those based on shockwave theory,a new methodology is presented for estimating vehicular queue length using data from both point detectors and probe vehicles. The methodology applies the shockwave theory to model queue evolution over time and space. Using probe vehicle locations and times as well as point detector measured traffic states,analytical formulations for calculating the maximum and minimum( residual) queue length are developed. The proposed methodology is verified using ground truth data collected from numerical experiments conducted in Shanghai,China. It is found that the methodology has a mean absolute percentage error of 17. 09%,which is reasonably effective in estimating the queue length at traffic signalized intersections. Limitations of the proposed models and algorithms are also discussed in the paper.展开更多
基金supported by the National Key Research and Development Program of China(grant number 2019YFE0123600)。
文摘The power Internet of Things(IoT)is a significant trend in technology and a requirement for national strategic development.With the deepening digital transformation of the power grid,China’s power system has initially built a power IoT architecture comprising a perception,network,and platform application layer.However,owing to the structural complexity of the power system,the construction of the power IoT continues to face problems such as complex access management of massive heterogeneous equipment,diverse IoT protocol access methods,high concurrency of network communications,and weak data security protection.To address these issues,this study optimizes the existing architecture of the power IoT and designs an integrated management framework for the access of multi-source heterogeneous data in the power IoT,comprising cloud,pipe,edge,and terminal parts.It further reviews and analyzes the key technologies involved in the power IoT,such as the unified management of the physical model,high concurrent access,multi-protocol access,multi-source heterogeneous data storage management,and data security control,to provide a more flexible,efficient,secure,and easy-to-use solution for multi-source heterogeneous data access in the power IoT.
基金supported by National Natural Science Foundation of China(12174350)Science and Technology Project of State Grid Henan Electric Power Company(5217Q0240008).
文摘In the heterogeneous power internet of things(IoT)environment,data signals are acquired to support different business systems to realize advanced intelligent applications,with massive,multi-source,heterogeneous and other characteristics.Reliable perception of information and efficient transmission of energy in multi-source heterogeneous environments are crucial issues.Compressive sensing(CS),as an effective method of signal compression and transmission,can accurately recover the original signal only by very few sampling.In this paper,we study a new method of multi-source heterogeneous data signal reconstruction of power IoT based on compressive sensing technology.Based on the traditional compressive sensing technology to directly recover multi-source heterogeneous signals,we fully use the interference subspace information to design the measurement matrix,which directly and effectively eliminates the interference while making the measurement.The measure matrix is optimized by minimizing the average cross-coherence of the matrix,and the reconstruction performance of the new method is further improved.Finally,the effectiveness of the new method with different parameter settings under different multi-source heterogeneous data signal cases is verified by using orthogonal matching pursuit(OMP)and sparsity adaptive matching pursuit(SAMP)for considering the actual environment with prior information utilization of signal sparsity and no prior information utilization of signal sparsity.
基金supported by Natural Science Foundation of China(Nos.62303126,62362008,author Z.Z,https://www.nsfc.gov.cn/,accessed on 20 December 2024)Major Scientific and Technological Special Project of Guizhou Province([2024]014)+2 种基金Guizhou Provincial Science and Technology Projects(No.ZK[2022]General149) ,author Z.Z,https://kjt.guizhou.gov.cn/,accessed on 20 December 2024)The Open Project of the Key Laboratory of Computing Power Network and Information Security,Ministry of Education under Grant 2023ZD037,author Z.Z,https://www.gzu.edu.cn/,accessed on 20 December 2024)Open Research Project of the State Key Laboratory of Industrial Control Technology,Zhejiang University,China(No.ICT2024B25),author Z.Z,https://www.gzu.edu.cn/,accessed on 20 December 2024).
文摘Due to the development of cloud computing and machine learning,users can upload their data to the cloud for machine learning model training.However,dishonest clouds may infer user data,resulting in user data leakage.Previous schemes have achieved secure outsourced computing,but they suffer from low computational accuracy,difficult-to-handle heterogeneous distribution of data from multiple sources,and high computational cost,which result in extremely poor user experience and expensive cloud computing costs.To address the above problems,we propose amulti-precision,multi-sourced,andmulti-key outsourcing neural network training scheme.Firstly,we design a multi-precision functional encryption computation based on Euclidean division.Second,we design the outsourcing model training algorithm based on a multi-precision functional encryption with multi-sourced heterogeneity.Finally,we conduct experiments on three datasets.The results indicate that our framework achieves an accuracy improvement of 6%to 30%.Additionally,it offers a memory space optimization of 1.0×2^(24) times compared to the previous best approach.
基金Sponsored by Beijing Youth Innovation Talent Support Program for Urban Greening and Landscaping——The 2024 Special Project for Promoting High-Quality Development of Beijing’s Landscaping through Scientific and Technological Innovation(KJCXQT202410).
文摘Taking the Ming Tombs Forest Farm in Beijing as the research object,this research applied multi-source data fusion and GIS heat-map overlay analysis techniques,systematically collected bird observation point data from the Global Biodiversity Information Facility(GBIF),population distribution data from the Oak Ridge National Laboratory(ORNL)in the United States,as well as information on the composition of tree species in suitable forest areas for birds and the forest geographical information of the Ming Tombs Forest Farm,which is based on literature research and field investigations.By using GIS technology,spatial processing was carried out on bird observation points and population distribution data to identify suitable bird-watching areas in different seasons.Then,according to the suitability value range,these areas were classified into different grades(from unsuitable to highly suitable).The research findings indicated that there was significant spatial heterogeneity in the bird-watching suitability of the Ming Tombs Forest Farm.The north side of the reservoir was generally a core area with high suitability in all seasons.The deep-aged broad-leaved mixed forests supported the overlapping co-existence of the ecological niches of various bird species,such as the Zosterops simplex and Urocissa erythrorhyncha.In contrast,the shallow forest-edge coniferous pure forests and mixed forests were more suitable for specialized species like Carduelis sinica.The southern urban area and the core area of the mausoleums had relatively low suitability due to ecological fragmentation or human interference.Based on these results,this paper proposed a three-level protection framework of“core area conservation—buffer zone management—isolation zone construction”and a spatio-temporal coordinated human-bird co-existence strategy.It was also suggested that the human-bird co-existence space could be optimized through measures such as constructing sound and light buffer interfaces,restoring ecological corridors,and integrating cultural heritage elements.This research provided an operational technical approach and decision-making support for the scientific planning of bird-watching sites and the coordination of ecological protection and tourism development.
基金supported by the National Natural Science Foundation of China under Grant 62371098the National Key Laboratory ofWireless Communications Foundation under Grant IFN20230203the National Key Research and Development Program of China under Grant 2021YFB2900404.
文摘Low-earth-orbit(LEO)satellite network has become a critical component of the satelliteterrestrial integrated network(STIN)due to its superior signal quality and minimal communication latency.However,the highly dynamic nature of LEO satellites leads to limited and rapidly varying contact time between them and Earth stations(ESs),making it difficult to timely download massive communication and remote sensing data within the limited time window.To address this challenge in heterogeneous satellite networks with coexisting geostationary-earth-orbit(GEO)and LEO satellites,this paper proposes a dynamic collaborative inter-satellite data download strategy to optimize the long-term weighted energy consumption and data downloads within the constraints of on-board power,backlog stability and time-varying contact.Specifically,the Lyapunov optimization theory is applied to transform the long-term stochastic optimization problem,subject to time-varying contact time and on-board power constraints,into multiple deterministic single time slot problems,based on which online distributed algorithms are developed to enable each satellite to independently obtain the transmit power allocation and data processing decisions in closed-form.Finally,the simulation results demonstrate the superiority of the proposed scheme over benchmarks,e.g.,achieving asymptotic optimality of the weighted energy consumption and data downloads,while maintaining stability of the on-board backlog.
基金supported by the National Natural Science Foundation of China(No.52275104)the Science and Technology Innovation Program of Hunan Province(No.2023RC3097).
文摘The monitoring signals of bearings from single-source sensor often contain limited information for characterizing various working condition,which may lead to instability and uncertainty of the class-imbalanced intelligent fault diagnosis.On the other hand,the vectorization of multi-source sensor signals may not only generate high-dimensional vectors,leading to increasing computational complexity and overfitting problems,but also lose the structural information and the coupling information.This paper proposes a new method for class-imbalanced fault diagnosis of bearing using support tensor machine(STM)driven by heterogeneous data fusion.The collected sound and vibration signals of bearings are successively decomposed into multiple frequency band components to extract various time-domain and frequency-domain statistical parameters.A third-order hetero-geneous feature tensor is designed based on multisensors,frequency band components,and statistical parameters.STM-based intelligent model is constructed to preserve the structural information of the third-order heterogeneous feature tensor for bearing fault diagnosis.A series of comparative experiments verify the advantages of the proposed method.
基金supported by the National Natural Science Foundation of China(Nos.92152301,12072282)。
文摘Aerodynamic surrogate modeling mostly relies only on integrated loads data obtained from simulation or experiment,while neglecting and wasting the valuable distributed physical information on the surface.To make full use of both integrated and distributed loads,a modeling paradigm,called the heterogeneous data-driven aerodynamic modeling,is presented.The essential concept is to incorporate the physical information of distributed loads as additional constraints within the end-to-end aerodynamic modeling.Towards heterogenous data,a novel and easily applicable physical feature embedding modeling framework is designed.This framework extracts lowdimensional physical features from pressure distribution and then effectively enhances the modeling of the integrated loads via feature embedding.The proposed framework can be coupled with multiple feature extraction methods,and the well-performed generalization capabilities over different airfoils are verified through a transonic case.Compared with traditional direct modeling,the proposed framework can reduce testing errors by almost 50%.Given the same prediction accuracy,it can save more than half of the training samples.Furthermore,the visualization analysis has revealed a significant correlation between the discovered low-dimensional physical features and the heterogeneous aerodynamic loads,which shows the interpretability and credibility of the superior performance offered by the proposed deep learning framework.
基金supported by the Natural Science Foundation of Xinjiang Uygur Autonomous Region(No.2022D01B187).
文摘Heterogeneous federated learning(HtFL)has gained significant attention due to its ability to accommodate diverse models and data from distributed combat units.The prototype-based HtFL methods were proposed to reduce the high communication cost of transmitting model parameters.These methods allow for the sharing of only class representatives between heterogeneous clients while maintaining privacy.However,existing prototype learning approaches fail to take the data distribution of clients into consideration,which results in suboptimal global prototype learning and insufficient client model personalization capabilities.To address these issues,we propose a fair trainable prototype federated learning(FedFTP)algorithm,which employs a fair sampling training prototype(FSTP)mechanism and a hyperbolic space constraints(HSC)mechanism to enhance the fairness and effectiveness of prototype learning on the server in heterogeneous environments.Furthermore,a local prototype stable update(LPSU)mechanism is proposed as a means of maintaining personalization while promoting global consistency,based on contrastive learning.Comprehensive experimental results demonstrate that FedFTP achieves state-of-the-art performance in HtFL scenarios.
文摘Software defect prediction plays a critical role in software development and quality assurance processes. Effective defect prediction enables testers to accurately prioritize testing efforts and enhance defect detection efficiency. Additionally, this technology provides developers with a means to quickly identify errors, thereby improving software robustness and overall quality. However, current research in software defect prediction often faces challenges, such as relying on a single data source or failing to adequately account for the characteristics of multiple coexisting data sources. This approach may overlook the differences and potential value of various data sources, affecting the accuracy and generalization performance of prediction results. To address this issue, this study proposes a multivariate heterogeneous hybrid deep learning algorithm for defect prediction (DP-MHHDL). Initially, Abstract Syntax Tree (AST), Code Dependency Network (CDN), and code static quality metrics are extracted from source code files and used as inputs to ensure data diversity. Subsequently, for the three types of heterogeneous data, the study employs a graph convolutional network optimization model based on adjacency and spatial topologies, a Convolutional Neural Network-Bidirectional Long Short-Term Memory (CNN-BiLSTM) hybrid neural network model, and a TabNet model to extract data features. These features are then concatenated and processed through a fully connected neural network for defect prediction. Finally, the proposed framework is evaluated using ten promise defect repository projects, and performance is assessed with three metrics: F1, Area under the curve (AUC), and Matthews correlation coefficient (MCC). The experimental results demonstrate that the proposed algorithm outperforms existing methods, offering a novel solution for software defect prediction.
文摘A significant obstacle in intelligent transportation systems(ITS)is the capacity to predict traffic flow.Recent advancements in deep neural networks have enabled the development of models to represent traffic flow accurately.However,accurately predicting traffic flow at the individual road level is extremely difficult due to the complex interplay of spatial and temporal factors.This paper proposes a technique for predicting short-term traffic flow data using an architecture that utilizes convolutional bidirectional long short-term memory(Conv-BiLSTM)with attention mechanisms.Prior studies neglected to include data pertaining to factors such as holidays,weather conditions,and vehicle types,which are interconnected and significantly impact the accuracy of forecast outcomes.In addition,this research incorporates recurring monthly periodic pattern data that significantly enhances the accuracy of forecast outcomes.The experimental findings demonstrate a performance improvement of 21.68%when incorporating the vehicle type feature.
基金supported by the National Natural Science Foundation of China(41977215)。
文摘Long runout landslides involve a massive amount of energy and can be extremely hazardous owing to their long movement distance,high mobility and strong destructive power.Numerical methods have been widely used to predict the landslide runout but a fundamental problem remained is how to determine the reliable numerical parameters.This study proposes a framework to predict the runout of potential landslides through multi-source data collaboration and numerical analysis of historical landslide events.Specifically,for the historical landslide cases,the landslide-induced seismic signal,geophysical surveys,and possible in-situ drone/phone videos(multi-source data collaboration)can validate the numerical results in terms of landslide dynamics and deposit features and help calibrate the numerical(rheological)parameters.Subsequently,the calibrated numerical parameters can be used to numerically predict the runout of potential landslides in the region with a similar geological setting to the recorded events.Application of the runout prediction approach to the 2020 Jiashanying landslide in Guizhou,China gives reasonable results in comparison to the field observations.The numerical parameters are determined from the multi-source data collaboration analysis of a historical case in the region(2019 Shuicheng landslide).The proposed framework for landslide runout prediction can be of great utility for landslide risk assessment and disaster reduction in mountainous regions worldwide.
基金partly supported by National Natural Science Foundation of China(No.71771204,72231010)the Fundamental Research Funds for the Central Universities(No.E0E48946X2).
文摘Literature shows that both market data and financial media impact stock prices;however,using only one kind of data may lead to information bias.Therefore,this study uses market data and news to investigate their joint impact on stock price trends.However,combining these two types of information is difficult because of their completely different characteristics.This study develops a hybrid model called MVL-SVM for stock price trend prediction by integrating multi-view learning with a support vector machine(SVM).It works by simply inputting heterogeneous multi-view data simultaneously,which may reduce information loss.Compared with the ARIMA and classic SVM models based on single-and multi-view data,our hybrid model shows statistically significant advantages.In the robustness test,our model outperforms the others by at least 10%accuracy when the sliding windows of news and market data are set to 1–5 days,which confirms our model’s effectiveness.Finally,trading strategies based on single stock and investment portfolios are constructed separately,and the simulations show that MVL-SVM has better profitability and risk control performance than the benchmarks.
文摘Multi-source seismic technology is an efficient seismic acquisition method that requires a group of blended seismic data to be separated into single-source seismic data for subsequent processing. The separation of blended seismic data is a linear inverse problem. According to the relationship between the shooting number and the simultaneous source number of the acquisition system, this separation of blended seismic data is divided into an easily determined or overdetermined linear inverse problem and an underdetermined linear inverse problem that is difficult to solve. For the latter, this paper presents an optimization method that imposes the sparsity constraint on wavefields to construct the object function of inversion, and the problem is solved by using the iterative thresholding method. For the most extremely underdetermined separation problem with single-shooting and multiple sources, this paper presents a method of pseudo-deblending with random noise filtering. In this method, approximate common shot gathers are received through the pseudo-deblending process, and the random noises that appear when the approximate common shot gathers are sorted into common receiver gathers are eliminated through filtering methods. The separation methods proposed in this paper are applied to three types of numerical simulation data, including pure data without noise, data with random noise, and data with linear regular noise to obtain satisfactory results. The noise suppression effects of these methods are sufficient, particularly with single-shooting blended seismic data, which verifies the effectiveness of the proposed methods.
基金supported by the National Nature Science Foundation of China(Grant No.71401052)the National Social Science Foundation of China(Grant No.17BGL156)the Key Project of the National Social Science Foundation of China(Grant No.14AZD024)
文摘Identification of security risk factors for small reservoirs is the basis for implementation of early warning systems.The manner of identification of the factors for small reservoirs is of practical significance when data are incomplete.The existing grey relational models have some disadvantages in measuring the correlation between categorical data sequences.To this end,this paper introduces a new grey relational model to analyze heterogeneous data.In this study,a set of security risk factors for small reservoirs was first constructed based on theoretical analysis,and heterogeneous data of these factors were recorded as sequences.The sequences were regarded as random variables,and the information entropy and conditional entropy between sequences were measured to analyze the relational degree between risk factors.Then,a new grey relational analysis model for heterogeneous data was constructed,and a comprehensive security risk factor identification method was developed.A case study of small reservoirs in Guangxi Zhuang Autonomous Region in China shows that the model constructed in this study is applicable to security risk factor identification for small reservoirs with heterogeneous and sparse data.
基金National Social Science Foundation of China,No.15BJY051Open Topic of Hunan Key Laboratory of Land Resources Evaluation and Utilization,No.SYS-ZX-202002Research Project of Appraisement Committee of Social Sciences Research Achievements of Hunan Province,No.XSP18ZDI031。
文摘In view of the lack of comprehensive evaluation and analysis from the combination of natural and human multi-dimensional factors,the urban surface temperature patterns of Changsha in 2000,2009 and 2016 are retrieved based on multi-source spatial data(Landsat 5 and Landsat 8 satellite image data,POI spatial big data,digital elevation model,etc.),and 12 natural and human factors closely related to urban thermal environment are quickly obtained.The standard deviation ellipse and spatial principal component analysis(PCA)methods are used to analyze the effect of urban human residential thermal environment and its influencing factors.The results showed that the heat island area increased by 547 km~2 and the maximum surface temperature difference reached 10.1℃during the period 2000–2016.The spatial distribution of urban heat island was mainly concentrated in urban built-up areas,such as industrial and commercial agglomerations and densely populated urban centers.The spatial distribution pattern of heat island is gradually decreasing from the urban center to the suburbs.There were multiple high-temperature centers,such as Wuyi square business circle,Xingsha economic and technological development zone in Changsha County,Wangcheng industrial zone,Yuelu industrial agglomeration,and Tianxin industrial zone.From 2000 to 2016,the main axis of spatial development of heat island remained in the northeast-southwest direction.The center of gravity of heat island shifted 2.7 km to the southwest with the deflection angle of 54.9°in 2000–2009.The center of gravity of heat island shifted to the northeast by 4.8 km with the deflection angle of 60.9°in 2009–2016.On the whole,the change of spatial pattern of thermal environment in Changsha was related to the change of urban construction intensity.Through the PCA method,it was concluded that landscape pattern,urban construction intensity and topographic landforms were the main factors affecting the spatial pattern of urban thermal environment of Changsha.The promotion effect of human factors on the formation of heat island effect was obviously greater than that of natural factors.The temperature would rise by 0.293℃under the synthetic effect of human and natural factors.Due to the complexity of factors influencing the urban thermal environment of human settlements,the utilization of multi-source data could help to reveal the spatial pattern and evolution law of urban thermal environment,deepen the understanding of the causes of urban heat island effect,and clarify the correlation between human and natural factors,so as to provide scientific supports for the improvement of the quality of urban human settlements.
基金supported by the National Natural Science Foundation of China under Grant 51722406,52074340,and 51874335the Shandong Provincial Natural Science Foundation under Grant JQ201808+5 种基金The Fundamental Research Funds for the Central Universities under Grant 18CX02097Athe Major Scientific and Technological Projects of CNPC under Grant ZD2019-183-008the Science and Technology Support Plan for Youth Innovation of University in Shandong Province under Grant 2019KJH002the National Research Council of Science and Technology Major Project of China under Grant 2016ZX05025001-006111 Project under Grant B08028Sinopec Science and Technology Project under Grant P20050-1
文摘For reservoirs with complex non-Gaussian geological characteristics,such as carbonate reservoirs or reservoirs with sedimentary facies distribution,it is difficult to implement history matching directly,especially for the ensemble-based data assimilation methods.In this paper,we propose a multi-source information fused generative adversarial network(MSIGAN)model,which is used for parameterization of the complex geologies.In MSIGAN,various information such as facies distribution,microseismic,and inter-well connectivity,can be integrated to learn the geological features.And two major generative models in deep learning,variational autoencoder(VAE)and generative adversarial network(GAN)are combined in our model.Then the proposed MSIGAN model is integrated into the ensemble smoother with multiple data assimilation(ESMDA)method to conduct history matching.We tested the proposed method on two reservoir models with fluvial facies.The experimental results show that the proposed MSIGAN model can effectively learn the complex geological features,which can promote the accuracy of history matching.
基金Under the auspices of Natural Science Foundation of China(No.41971166)。
文摘Urban functional area(UFA)is a core scientific issue affecting urban sustainability.The current knowledge gap is mainly reflected in the lack of multi-scale quantitative interpretation methods from the perspective of human-land interaction.In this paper,based on multi-source big data include 250 m×250 m resolution cell phone data,1.81×105 Points of Interest(POI)data and administrative boundary data,we built a UFA identification method and demonstrated empirically in Shenyang City,China.We argue that the method we built can effectively identify multi-scale multi-type UFAs based on human activity and further reveal the spatial correlation between urban facilities and human activity.The empirical study suggests that the employment functional zones in Shenyang City are more concentrated in central cities than other single functional zones.There are more mix functional areas in the central city areas,while the planned industrial new cities need to develop comprehensive functions in Shenyang.UFAs have scale effects and human-land interaction patterns.We suggest that city decision makers should apply multi-sources big data to measure urban functional service in a more refined manner from a supply-demand perspective.
基金Supported by the National Natural Science Foundation of China(No.51379006 and No.51009106)the Program for New Century Excellent Talents in University of Ministry of Education of China(No.NCET-12-0404)the National Basic Research Program of China("973"Program,No.2013CB035903)
文摘Due to the complex nature of multi-source geological data, it is difficult to rebuild every geological structure through a single 3D modeling method. The multi-source data interpretation method put forward in this analysis is based on a database-driven pattern and focuses on the discrete and irregular features of geological data. The geological data from a variety of sources covering a range of accuracy, resolution, quantity and quality are classified and integrated according to their reliability and consistency for 3D modeling. The new interpolation-approximation fitting construction algorithm of geological surfaces with the non-uniform rational B-spline(NURBS) technique is then presented. The NURBS technique can retain the balance among the requirements for accuracy, surface continuity and data storage of geological structures. Finally, four alternative 3D modeling approaches are demonstrated with reference to some examples, which are selected according to the data quantity and accuracy specification. The proposed approaches offer flexible modeling patterns for different practical engineering demands.
基金supported by the National Natural Science Foundation of China(Grant No.:U2202213)the Special Program for the Major Science and Technology Projects of Yunnan Province,China(Grant Nos.:202102AE090051-1-01,and 202202AE090001).
文摘In traditional medicine and ethnomedicine,medicinal plants have long been recognized as the basis for materials in therapeutic applications worldwide.In particular,the remarkable curative effect of traditional Chinese medicine during corona virus disease 2019(COVID-19)pandemic has attracted extensive attention globally.Medicinal plants have,therefore,become increasingly popular among the public.However,with increasing demand for and profit with medicinal plants,commercial fraudulent events such as adulteration or counterfeits sometimes occur,which poses a serious threat to the clinical outcomes and interests of consumers.With rapid advances in artificial intelligence,machine learning can be used to mine information on various medicinal plants to establish an ideal resource database.We herein present a review that mainly introduces common machine learning algorithms and discusses their application in multi-source data analysis of medicinal plants.The combination of machine learning algorithms and multi-source data analysis facilitates a comprehensive analysis and aids in the effective evaluation of the quality of medicinal plants.The findings of this review provide new possibilities for promoting the development and utilization of medicinal plants.
基金Sponsored by the National Natural Science Foundation of China(Grant No.51138003)
文摘In order to estimate vehicular queue length at signalized intersections accurately and overcome the shortcomings and restrictions of existing studies especially those based on shockwave theory,a new methodology is presented for estimating vehicular queue length using data from both point detectors and probe vehicles. The methodology applies the shockwave theory to model queue evolution over time and space. Using probe vehicle locations and times as well as point detector measured traffic states,analytical formulations for calculating the maximum and minimum( residual) queue length are developed. The proposed methodology is verified using ground truth data collected from numerical experiments conducted in Shanghai,China. It is found that the methodology has a mean absolute percentage error of 17. 09%,which is reasonably effective in estimating the queue length at traffic signalized intersections. Limitations of the proposed models and algorithms are also discussed in the paper.