This article explores the characteristics of data resources from the perspective of production factors,analyzes the demand for trustworthy circulation technology,designs a fusion architecture and related solutions,inc...This article explores the characteristics of data resources from the perspective of production factors,analyzes the demand for trustworthy circulation technology,designs a fusion architecture and related solutions,including multi-party data intersection calculation,distributed machine learning,etc.It also compares performance differences,conducts formal verification,points out the value and limitations of architecture innovation,and looks forward to future opportunities.展开更多
Processing large-scale 3-D gravity data is an important topic in geophysics field. Many existing inversion methods lack the competence of processing massive data and practical application capacity. This study proposes...Processing large-scale 3-D gravity data is an important topic in geophysics field. Many existing inversion methods lack the competence of processing massive data and practical application capacity. This study proposes the application of GPU parallel processing technology to the focusing inversion method, aiming at improving the inversion accuracy while speeding up calculation and reducing the memory consumption, thus obtaining the fast and reliable inversion results for large complex model. In this paper, equivalent storage of geometric trellis is used to calculate the sensitivity matrix, and the inversion is based on GPU parallel computing technology. The parallel computing program that is optimized by reducing data transfer, access restrictions and instruction restrictions as well as latency hiding greatly reduces the memory usage, speeds up the calculation, and makes the fast inversion of large models possible. By comparing and analyzing the computing speed of traditional single thread CPU method and CUDA-based GPU parallel technology, the excellent acceleration performance of GPU parallel computing is verified, which provides ideas for practical application of some theoretical inversion methods restricted by computing speed and computer memory. The model test verifies that the focusing inversion method can overcome the problem of severe skin effect and ambiguity of geological body boundary. Moreover, the increase of the model cells and inversion data can more clearly depict the boundary position of the abnormal body and delineate its specific shape.展开更多
The number of satellites,especially those operating in Low-Earth Orbit(LEO),has been exploding in recent years.Additionally,the burgeoning development of Artificial Intelligence(AI)software and hardware has opened up ...The number of satellites,especially those operating in Low-Earth Orbit(LEO),has been exploding in recent years.Additionally,the burgeoning development of Artificial Intelligence(AI)software and hardware has opened up new industrial opportunities in both air and space,with satellite-powered computing emerging as a new computing paradigm:Orbital Edge Computing(OEC).Compared to terrestrial edge computing,the mobility of LEO satellites and their limited communication,computation,and storage resources pose challenges in designing task-specific scheduling algorithms.Previous survey papers have largely focused on terrestrial edge computing or the integration of space and ground technologies,lacking a comprehensive summary of OEC architecture,algorithms,and case studies.This paper conducts a comprehensive survey and analysis of OEC's system architecture,applications,algorithms,and simulation tools,providing a solid background for researchers in the field.By discussing OEC use cases and the challenges faced,potential research directions for future OEC research are proposed.展开更多
Formalizing complex processes and phenomena of a real-world problem may require a large number of variables and constraints,resulting in what is termed a large-scale optimization problem.Nowadays,such large-scale opti...Formalizing complex processes and phenomena of a real-world problem may require a large number of variables and constraints,resulting in what is termed a large-scale optimization problem.Nowadays,such large-scale optimization problems are solved using computing machines,leading to an enormous computational time being required,which may delay deriving timely solutions.Decomposition methods,which partition a large-scale optimization problem into lower-dimensional subproblems,represent a key approach to addressing time-efficiency issues.There has been significant progress in both applied mathematics and emerging artificial intelligence approaches on this front.This work aims at providing an overview of the decomposition methods from both the mathematics and computer science points of view.We also remark on the state-of-the-art developments and recent applications of the decomposition methods,and discuss the future research and development perspectives.展开更多
Robotic computing systems play an important role in enabling intelligent robotic tasks through intelligent algo-rithms and supporting hardware.In recent years,the evolution of robotic algorithms indicates a roadmap fr...Robotic computing systems play an important role in enabling intelligent robotic tasks through intelligent algo-rithms and supporting hardware.In recent years,the evolution of robotic algorithms indicates a roadmap from traditional robotics to hierarchical and end-to-end models.This algorithmic advancement poses a critical challenge in achieving balanced system-wide performance.Therefore,algorithm-hardware co-design has emerged as the primary methodology,which ana-lyzes algorithm behaviors on hardware to identify common computational properties.These properties can motivate algo-rithm optimization to reduce computational complexity and hardware innovation from architecture to circuit for high performance and high energy efficiency.We then reviewed recent works on robotic and embodied AI algorithms and computing hard-ware to demonstrate this algorithm-hardware co-design methodology.In the end,we discuss future research opportunities by answering two questions:(1)how to adapt the computing platforms to the rapid evolution of embodied AI algorithms,and(2)how to transform the potential of emerging hardware innovations into end-to-end inference improvements.展开更多
The rapid adoption of machine learning in sensitive domains,such as healthcare,finance,and government services,has heightened the need for robust,privacy-preserving techniques.Traditional machine learning approaches l...The rapid adoption of machine learning in sensitive domains,such as healthcare,finance,and government services,has heightened the need for robust,privacy-preserving techniques.Traditional machine learning approaches lack built-in privacy mechanisms,exposing sensitive data to risks,which motivates the development of Privacy-Preserving Machine Learning(PPML)methods.Despite significant advances in PPML,a comprehensive and focused exploration of Secure Multi-Party Computing(SMPC)within this context remains underdeveloped.This review aims to bridge this knowledge gap by systematically analyzing the role of SMPC in PPML,offering a structured overviewof current techniques,challenges,and future directions.Using a semi-systematicmapping studymethodology,this paper surveys recent literature spanning SMPC protocols,PPML frameworks,implementation approaches,threat models,and performance metrics.Emphasis is placed on identifying trends,technical limitations,and comparative strengths of leading SMPC-based methods.Our findings reveal thatwhile SMPCoffers strong cryptographic guarantees for privacy,challenges such as computational overhead,communication costs,and scalability persist.The paper also discusses critical vulnerabilities,practical deployment issues,and variations in protocol efficiency across use cases.展开更多
Rack-level loop thermosyphons have been widely adopted as a solution to data centers’growing energy demands.While numerous studies have highlighted the heat transfer performance and energy-saving benefits of this sys...Rack-level loop thermosyphons have been widely adopted as a solution to data centers’growing energy demands.While numerous studies have highlighted the heat transfer performance and energy-saving benefits of this system,its economic feasibility,water usage effectiveness(WUE),and carbon usage effectiveness(CUE)remain underexplored.This study introduces a comprehensive evaluation index designed to assess the applicability of the rack-level loop thermosyphon system across various computing hub nodes.The air wet bulb temperature Ta,w was identified as the most significant factor influencing the variability in the combination of PUE,CUE,and WUE values.The results indicate that the rack-level loop thermosyphon system achieves the highest score in Lanzhou(94.485)and the lowest in Beijing(89.261)based on the comprehensive evaluation index.The overall ranking of cities according to the comprehensive evaluation score is as follows:Gansu hub(Lanzhou)>Inner Mongolia hub(Hohhot)>Ningxia hub(Yinchuan)>Yangtze River Delta hub(Shanghai)>Chengdu Chongqing hub(Chongqing)>Guangdong-Hong Kong-Macao Greater Bay Area hub(Guangzhou)>Guizhou hub(Guiyang)>Beijing-Tianjin-Hebei hub(Beijing).Furthermore,Hohhot,Lanzhou,and Yinchuan consistently rank among the top three cities for comprehensive scores across all load rates,while Guiyang(at a 25%load rate),Guangzhou(at a 50%load rate),and Beijing(at 75%and 100%load rates)exhibited the lowest comprehensive scores.展开更多
With the rapid development of information technology,IoT devices play a huge role in physiological health data detection.The exponential growth of medical data requires us to reasonably allocate storage space for clou...With the rapid development of information technology,IoT devices play a huge role in physiological health data detection.The exponential growth of medical data requires us to reasonably allocate storage space for cloud servers and edge nodes.The storage capacity of edge nodes close to users is limited.We should store hotspot data in edge nodes as much as possible,so as to ensure response timeliness and access hit rate;However,the current scheme cannot guarantee that every sub-message in a complete data stored by the edge node meets the requirements of hot data;How to complete the detection and deletion of redundant data in edge nodes under the premise of protecting user privacy and data dynamic integrity has become a challenging problem.Our paper proposes a redundant data detection method that meets the privacy protection requirements.By scanning the cipher text,it is determined whether each sub-message of the data in the edge node meets the requirements of the hot data.It has the same effect as zero-knowledge proof,and it will not reveal the privacy of users.In addition,for redundant sub-data that does not meet the requirements of hot data,our paper proposes a redundant data deletion scheme that meets the dynamic integrity of the data.We use Content Extraction Signature(CES)to generate the remaining hot data signature after the redundant data is deleted.The feasibility of the scheme is proved through safety analysis and efficiency analysis.展开更多
The Internet of Things(IoT)has revolutionized how we interact with and gather data from our surrounding environment.IoT devices with various sensors and actuators generate vast amounts of data that can be harnessed to...The Internet of Things(IoT)has revolutionized how we interact with and gather data from our surrounding environment.IoT devices with various sensors and actuators generate vast amounts of data that can be harnessed to derive valuable insights.The rapid proliferation of Internet of Things(IoT)devices has ushered in an era of unprecedented data generation and connectivity.These IoT devices,equipped with many sensors and actuators,continuously produce vast volumes of data.However,the conventional approach of transmitting all this data to centralized cloud infrastructures for processing and analysis poses significant challenges.However,transmitting all this data to a centralized cloud infrastructure for processing and analysis can be inefficient and impractical due to bandwidth limitations,network latency,and scalability issues.This paper proposed a Self-Learning Internet Traffic Fuzzy Classifier(SLItFC)for traffic data analysis.The proposed techniques effectively utilize clustering and classification procedures to improve classification accuracy in analyzing network traffic data.SLItFC addresses the intricate task of efficiently managing and analyzing IoT data traffic at the edge.It employs a sophisticated combination of fuzzy clustering and self-learning techniques,allowing it to adapt and improve its classification accuracy over time.This adaptability is a crucial feature,given the dynamic nature of IoT environments where data patterns and traffic characteristics can evolve rapidly.With the implementation of the fuzzy classifier,the accuracy of the clustering process is improvised with the reduction of the computational time.SLItFC can reduce computational time while maintaining high classification accuracy.This efficiency is paramount in edge computing,where resource constraints demand streamlined data processing.Additionally,SLItFC’s performance advantages make it a compelling choice for organizations seeking to harness the potential of IoT data for real-time insights and decision-making.With the Self-Learning process,the SLItFC model monitors the network traffic data acquired from the IoT Devices.The Sugeno fuzzy model is implemented within the edge computing environment for improved classification accuracy.Simulation analysis stated that the proposed SLItFC achieves 94.5%classification accuracy with reduced classification time.展开更多
Background A task assigned to space exploration satellites involves detecting the physical environment within a certain space.However,space detection data are complex and abstract.These data are not conducive for rese...Background A task assigned to space exploration satellites involves detecting the physical environment within a certain space.However,space detection data are complex and abstract.These data are not conducive for researchers'visual perceptions of the evolution and interaction of events in the space environment.Methods A time-series dynamic data sampling method for large-scale space was proposed for sample detection data in space and time,and the corresponding relationships between data location features and other attribute features were established.A tone-mapping method based on statistical histogram equalization was proposed and applied to the final attribute feature data.The visualization process is optimized for rendering by merging materials,reducing the number of patches,and performing other operations.Results The results of sampling,feature extraction,and uniform visualization of the detection data of complex types,long duration spans,and uneven spatial distributions were obtained.The real-time visualization of large-scale spatial structures using augmented reality devices,particularly low-performance devices,was also investigated.Conclusions The proposed visualization system can reconstruct the three-dimensional structure of a large-scale space,express the structure and changes in the spatial environment using augmented reality,and assist in intuitively discovering spatial environmental events and evolutionary rules.展开更多
In this study, we delve into the realm of efficient Big Data Engineering and Extract, Transform, Load (ETL) processes within the healthcare sector, leveraging the robust foundation provided by the MIMIC-III Clinical D...In this study, we delve into the realm of efficient Big Data Engineering and Extract, Transform, Load (ETL) processes within the healthcare sector, leveraging the robust foundation provided by the MIMIC-III Clinical Database. Our investigation entails a comprehensive exploration of various methodologies aimed at enhancing the efficiency of ETL processes, with a primary emphasis on optimizing time and resource utilization. Through meticulous experimentation utilizing a representative dataset, we shed light on the advantages associated with the incorporation of PySpark and Docker containerized applications. Our research illuminates significant advancements in time efficiency, process streamlining, and resource optimization attained through the utilization of PySpark for distributed computing within Big Data Engineering workflows. Additionally, we underscore the strategic integration of Docker containers, delineating their pivotal role in augmenting scalability and reproducibility within the ETL pipeline. This paper encapsulates the pivotal insights gleaned from our experimental journey, accentuating the practical implications and benefits entailed in the adoption of PySpark and Docker. By streamlining Big Data Engineering and ETL processes in the context of clinical big data, our study contributes to the ongoing discourse on optimizing data processing efficiency in healthcare applications. The source code is available on request.展开更多
The current education field is experiencing an innovation driven by big data and cloud technologies,and these advanced technologies play a central role in the construction of smart campuses.Big data technology has a w...The current education field is experiencing an innovation driven by big data and cloud technologies,and these advanced technologies play a central role in the construction of smart campuses.Big data technology has a wide range of applications in student learning behavior analysis,teaching resource management,campus safety monitoring,and decision support,which improves the quality of education and management efficiency.Cloud computing technology supports the integration,distribution,and optimal use of educational resources through cloud resource sharing,virtual classrooms,intelligent campus management systems,and Infrastructure-as-a-Service(IaaS)models,which reduce costs and increase flexibility.This paper comprehensively discusses the practical application of big data and cloud computing technologies in smart campuses,showing how these technologies can contribute to the development of smart campuses,and laying the foundation for the future innovation of education models.展开更多
With the development of Internet technology and human computing, the computing environment has changed dramatically over the last three decades. Cloud computing emerges as a paradigm of Internet computing in which dyn...With the development of Internet technology and human computing, the computing environment has changed dramatically over the last three decades. Cloud computing emerges as a paradigm of Internet computing in which dynamical, scalable and often virtuMized resources are provided as services. With virtualization technology, cloud computing offers diverse services (such as virtual computing, virtual storage, virtual bandwidth, etc.) for the public by means of multi-tenancy mode. Although users are enjoying the capabilities of super-computing and mass storage supplied by cloud computing, cloud security still remains as a hot spot problem, which is in essence the trust management between data owners and storage service providers. In this paper, we propose a data coloring method based on cloud watermarking to recognize and ensure mutual reputations. The experimental results show that the robustness of reverse cloud generator can guarantee users' embedded social reputation identifications. Hence, our work provides a reference solution to the critical problem of cloud security.展开更多
Advanced cloud computing technology provides cost saving and flexibility of services for users.With the explosion of multimedia data,more and more data owners would outsource their personal multimedia data on the clou...Advanced cloud computing technology provides cost saving and flexibility of services for users.With the explosion of multimedia data,more and more data owners would outsource their personal multimedia data on the cloud.In the meantime,some computationally expensive tasks are also undertaken by cloud servers.However,the outsourced multimedia data and its applications may reveal the data owner’s private information because the data owners lose the control of their data.Recently,this thought has aroused new research interest on privacy-preserving reversible data hiding over outsourced multimedia data.In this paper,two reversible data hiding schemes are proposed for encrypted image data in cloud computing:reversible data hiding by homomorphic encryption and reversible data hiding in encrypted domain.The former is that additional bits are extracted after decryption and the latter is that extracted before decryption.Meanwhile,a combined scheme is also designed.This paper proposes the privacy-preserving outsourcing scheme of reversible data hiding over encrypted image data in cloud computing,which not only ensures multimedia data security without relying on the trustworthiness of cloud servers,but also guarantees that reversible data hiding can be operated over encrypted images at the different stages.Theoretical analysis confirms the correctness of the proposed encryption model and justifies the security of the proposed scheme.The computation cost of the proposed scheme is acceptable and adjusts to different security levels.展开更多
The Spectral Statistical Interpolation (SSI) analysis system of NCEP is used to assimilate meteorological data from the Global Positioning Satellite System (GPS/MET) refraction angles with the variational technique. V...The Spectral Statistical Interpolation (SSI) analysis system of NCEP is used to assimilate meteorological data from the Global Positioning Satellite System (GPS/MET) refraction angles with the variational technique. Verified by radiosonde, including GPS/MET observations into the analysis makes an overall improvement to the analysis variables of temperature, winds, and water vapor. However, the variational model with the ray-tracing method is quite expensive for numerical weather prediction and climate research. For example, about 4 000 GPS/MET refraction angles need to be assimilated to produce an ideal global analysis. Just one iteration of minimization will take more than 24 hours CPU time on the NCEP's Cray C90 computer. Although efforts have been taken to reduce the computational cost, it is still prohibitive for operational data assimilation. In this paper, a parallel version of the three-dimensional variational data assimilation model of GPS/MET occultation measurement suitable for massive parallel processors architectures is developed. The divide-and-conquer strategy is used to achieve parallelism and is implemented by message passing. The authors present the principles for the code's design and examine the performance on the state-of-the-art parallel computers in China. The results show that this parallel model scales favorably as the number of processors is increased. With the Memory-IO technique implemented by the author, the wall clock time per iteration used for assimilating 1420 refraction angles is reduced from 45 s to 12 s using 1420 processors. This suggests that the new parallelized code has the potential to be useful in numerical weather prediction (NWP) and climate studies.展开更多
Industrial big data integration and sharing(IBDIS)is of great significance in managing and providing data for big data analysis in manufacturing systems.A novel fog-computing-based IBDIS approach called Fog-IBDIS is p...Industrial big data integration and sharing(IBDIS)is of great significance in managing and providing data for big data analysis in manufacturing systems.A novel fog-computing-based IBDIS approach called Fog-IBDIS is proposed in order to integrate and share industrial big data with high raw data security and low network traffic loads by moving the integration task from the cloud to the edge of networks.First,a task flow graph(TFG)is designed to model the data analysis process.The TFG is composed of several tasks,which are executed by the data owners through the Fog-IBDIS platform in order to protect raw data privacy.Second,the function of Fog-IBDIS to enable data integration and sharing is presented in five modules:TFG management,compilation and running control,the data integration model,the basic algorithm library,and the management component.Finally,a case study is presented to illustrate the implementation of Fog-IBDIS,which ensures raw data security by deploying the analysis tasks executed by the data generators,and eases the network traffic load by greatly reducing the volume of transmitted data.展开更多
Large-scale multi-objective optimization problems(MOPs)that involve a large number of decision variables,have emerged from many real-world applications.While evolutionary algorithms(EAs)have been widely acknowledged a...Large-scale multi-objective optimization problems(MOPs)that involve a large number of decision variables,have emerged from many real-world applications.While evolutionary algorithms(EAs)have been widely acknowledged as a mainstream method for MOPs,most research progress and successful applications of EAs have been restricted to MOPs with small-scale decision variables.More recently,it has been reported that traditional multi-objective EAs(MOEAs)suffer severe deterioration with the increase of decision variables.As a result,and motivated by the emergence of real-world large-scale MOPs,investigation of MOEAs in this aspect has attracted much more attention in the past decade.This paper reviews the progress of evolutionary computation for large-scale multi-objective optimization from two angles.From the key difficulties of the large-scale MOPs,the scalability analysis is discussed by focusing on the performance of existing MOEAs and the challenges induced by the increase of the number of decision variables.From the perspective of methodology,the large-scale MOEAs are categorized into three classes and introduced respectively:divide and conquer based,dimensionality reduction based and enhanced search-based approaches.Several future research directions are also discussed.展开更多
How to effectively reduce the energy consumption of large-scale data centers is a key issue in cloud computing. This paper presents a novel low-power task scheduling algorithm (L3SA) for large-scale cloud data cente...How to effectively reduce the energy consumption of large-scale data centers is a key issue in cloud computing. This paper presents a novel low-power task scheduling algorithm (L3SA) for large-scale cloud data centers. The winner tree is introduced to make the data nodes as the leaf nodes of the tree and the final winner on the purpose of reducing energy consumption is selected. The complexity of large-scale cloud data centers is fully consider, and the task comparson coefficient is defined to make task scheduling strategy more reasonable. Experiments and performance analysis show that the proposed algorithm can effectively improve the node utilization, and reduce the overall power consumption of the cloud data center.展开更多
Efficient and effective data acquisition is of theoretical and practical importance in WSN applications because data measured and collected by WSN is often unreliable, such as those often accompanied by noise and erro...Efficient and effective data acquisition is of theoretical and practical importance in WSN applications because data measured and collected by WSN is often unreliable, such as those often accompanied by noise and error, missing values or inconsistent data. Motivated by fog computing, which focuses on how to effectively offload computation-intensive tasks from resource-constrained devices, this paper proposes a simple but yet effective data acquisition approach with the ability of filtering abnormal data and meeting the real-time requirement. Our method uses a cooperation mechanism by leveraging on both an architectural and algorithmic approach. Firstly, the sensor node with the limited computing resource only accomplishes detecting and marking the suspicious data using a light weight algorithm. Secondly, the cluster head evaluates suspicious data by referring to the data from the other sensor nodes in the same cluster and discard the abnormal data directly. Thirdly, the sink node fills up the discarded data with an approximate value using nearest neighbor data supplement method. Through the architecture, each node only consumes a few computational resources and distributes the heavily computing load to several nodes. Simulation results show that our data acquisition method is effective considering the real-time outlier filtering and the computing overhead.展开更多
Edge-computing-enabled smart greenhouses are a representative application of the Internet of Things(IoT)technology,which can monitor the environmental information in real-time and employ the information to contribute ...Edge-computing-enabled smart greenhouses are a representative application of the Internet of Things(IoT)technology,which can monitor the environmental information in real-time and employ the information to contribute to intelligent decision-making.In the process,anomaly detection for wireless sensor data plays an important role.However,the traditional anomaly detection algorithms originally designed for anomaly detection in static data do not properly consider the inherent characteristics of the data stream produced by wireless sensors such as infiniteness,correlations,and concept drift,which may pose a considerable challenge to anomaly detection based on data stream and lead to low detection accuracy and efficiency.First,the data stream is usually generated quickly,which means that the data stream is infinite and enormous.Hence,any traditional off-line anomaly detection algorithm that attempts to store the whole dataset or to scan the dataset multiple times for anomaly detection will run out of memory space.Second,there exist correlations among different data streams,and traditional algorithms hardly consider these correlations.Third,the underlying data generation process or distribution may change over time.Thus,traditional anomaly detection algorithms with no model update will lose their effects.Considering these issues,a novel method(called DLSHiForest)based on Locality-Sensitive Hashing and the time window technique is proposed to solve these problems while achieving accurate and efficient detection.Comprehensive experiments are executed using a real-world agricultural greenhouse dataset to demonstrate the feasibility of our approach.Experimental results show that our proposal is practical for addressing the challenges of traditional anomaly detection while ensuring accuracy and efficiency.展开更多
文摘This article explores the characteristics of data resources from the perspective of production factors,analyzes the demand for trustworthy circulation technology,designs a fusion architecture and related solutions,including multi-party data intersection calculation,distributed machine learning,etc.It also compares performance differences,conducts formal verification,points out the value and limitations of architecture innovation,and looks forward to future opportunities.
基金Supported by Project of National Natural Science Foundation(No.41874134)
文摘Processing large-scale 3-D gravity data is an important topic in geophysics field. Many existing inversion methods lack the competence of processing massive data and practical application capacity. This study proposes the application of GPU parallel processing technology to the focusing inversion method, aiming at improving the inversion accuracy while speeding up calculation and reducing the memory consumption, thus obtaining the fast and reliable inversion results for large complex model. In this paper, equivalent storage of geometric trellis is used to calculate the sensitivity matrix, and the inversion is based on GPU parallel computing technology. The parallel computing program that is optimized by reducing data transfer, access restrictions and instruction restrictions as well as latency hiding greatly reduces the memory usage, speeds up the calculation, and makes the fast inversion of large models possible. By comparing and analyzing the computing speed of traditional single thread CPU method and CUDA-based GPU parallel technology, the excellent acceleration performance of GPU parallel computing is verified, which provides ideas for practical application of some theoretical inversion methods restricted by computing speed and computer memory. The model test verifies that the focusing inversion method can overcome the problem of severe skin effect and ambiguity of geological body boundary. Moreover, the increase of the model cells and inversion data can more clearly depict the boundary position of the abnormal body and delineate its specific shape.
基金funded by the Hong Kong-Macao-Taiwan Science and Technology Cooperation Project of the Science and Technology Innovation Action Plan in Shanghai,China(23510760200)the Oriental Talent Youth Program of Shanghai,China(No.Y3DFRCZL01)+1 种基金the Outstanding Program of the Youth Innovation Promotion Association of the Chinese Academy of Sciences(No.Y2023080)the Strategic Priority Research Program of the Chinese Academy of Sciences Category A(No.XDA0360404).
文摘The number of satellites,especially those operating in Low-Earth Orbit(LEO),has been exploding in recent years.Additionally,the burgeoning development of Artificial Intelligence(AI)software and hardware has opened up new industrial opportunities in both air and space,with satellite-powered computing emerging as a new computing paradigm:Orbital Edge Computing(OEC).Compared to terrestrial edge computing,the mobility of LEO satellites and their limited communication,computation,and storage resources pose challenges in designing task-specific scheduling algorithms.Previous survey papers have largely focused on terrestrial edge computing or the integration of space and ground technologies,lacking a comprehensive summary of OEC architecture,algorithms,and case studies.This paper conducts a comprehensive survey and analysis of OEC's system architecture,applications,algorithms,and simulation tools,providing a solid background for researchers in the field.By discussing OEC use cases and the challenges faced,potential research directions for future OEC research are proposed.
基金The Australian Research Council(DP200101197,DP230101107).
文摘Formalizing complex processes and phenomena of a real-world problem may require a large number of variables and constraints,resulting in what is termed a large-scale optimization problem.Nowadays,such large-scale optimization problems are solved using computing machines,leading to an enormous computational time being required,which may delay deriving timely solutions.Decomposition methods,which partition a large-scale optimization problem into lower-dimensional subproblems,represent a key approach to addressing time-efficiency issues.There has been significant progress in both applied mathematics and emerging artificial intelligence approaches on this front.This work aims at providing an overview of the decomposition methods from both the mathematics and computer science points of view.We also remark on the state-of-the-art developments and recent applications of the decomposition methods,and discuss the future research and development perspectives.
基金supported in part by NSFC under Grant 62422407in part by RGC under Grant 26204424in part by ACCESS–AI Chip Center for Emerging Smart Systems, sponsored by the Inno HK initiative of the Innovation and Technology Commission of the Hong Kong Special Administrative Region Government
文摘Robotic computing systems play an important role in enabling intelligent robotic tasks through intelligent algo-rithms and supporting hardware.In recent years,the evolution of robotic algorithms indicates a roadmap from traditional robotics to hierarchical and end-to-end models.This algorithmic advancement poses a critical challenge in achieving balanced system-wide performance.Therefore,algorithm-hardware co-design has emerged as the primary methodology,which ana-lyzes algorithm behaviors on hardware to identify common computational properties.These properties can motivate algo-rithm optimization to reduce computational complexity and hardware innovation from architecture to circuit for high performance and high energy efficiency.We then reviewed recent works on robotic and embodied AI algorithms and computing hard-ware to demonstrate this algorithm-hardware co-design methodology.In the end,we discuss future research opportunities by answering two questions:(1)how to adapt the computing platforms to the rapid evolution of embodied AI algorithms,and(2)how to transform the potential of emerging hardware innovations into end-to-end inference improvements.
文摘The rapid adoption of machine learning in sensitive domains,such as healthcare,finance,and government services,has heightened the need for robust,privacy-preserving techniques.Traditional machine learning approaches lack built-in privacy mechanisms,exposing sensitive data to risks,which motivates the development of Privacy-Preserving Machine Learning(PPML)methods.Despite significant advances in PPML,a comprehensive and focused exploration of Secure Multi-Party Computing(SMPC)within this context remains underdeveloped.This review aims to bridge this knowledge gap by systematically analyzing the role of SMPC in PPML,offering a structured overviewof current techniques,challenges,and future directions.Using a semi-systematicmapping studymethodology,this paper surveys recent literature spanning SMPC protocols,PPML frameworks,implementation approaches,threat models,and performance metrics.Emphasis is placed on identifying trends,technical limitations,and comparative strengths of leading SMPC-based methods.Our findings reveal thatwhile SMPCoffers strong cryptographic guarantees for privacy,challenges such as computational overhead,communication costs,and scalability persist.The paper also discusses critical vulnerabilities,practical deployment issues,and variations in protocol efficiency across use cases.
基金supported by the Natural Science Foundation of Hunan Province,China(Grant Nos.2023JJ50178 and 2023JJ50194)the Excellent Youth Project of Hunan Provincial Department of Education(Grant No.23B0542).
文摘Rack-level loop thermosyphons have been widely adopted as a solution to data centers’growing energy demands.While numerous studies have highlighted the heat transfer performance and energy-saving benefits of this system,its economic feasibility,water usage effectiveness(WUE),and carbon usage effectiveness(CUE)remain underexplored.This study introduces a comprehensive evaluation index designed to assess the applicability of the rack-level loop thermosyphon system across various computing hub nodes.The air wet bulb temperature Ta,w was identified as the most significant factor influencing the variability in the combination of PUE,CUE,and WUE values.The results indicate that the rack-level loop thermosyphon system achieves the highest score in Lanzhou(94.485)and the lowest in Beijing(89.261)based on the comprehensive evaluation index.The overall ranking of cities according to the comprehensive evaluation score is as follows:Gansu hub(Lanzhou)>Inner Mongolia hub(Hohhot)>Ningxia hub(Yinchuan)>Yangtze River Delta hub(Shanghai)>Chengdu Chongqing hub(Chongqing)>Guangdong-Hong Kong-Macao Greater Bay Area hub(Guangzhou)>Guizhou hub(Guiyang)>Beijing-Tianjin-Hebei hub(Beijing).Furthermore,Hohhot,Lanzhou,and Yinchuan consistently rank among the top three cities for comprehensive scores across all load rates,while Guiyang(at a 25%load rate),Guangzhou(at a 50%load rate),and Beijing(at 75%and 100%load rates)exhibited the lowest comprehensive scores.
基金sponsored by the National Natural Science Foundation of China under grant number No. 62172353, No. 62302114, No. U20B2046 and No. 62172115Innovation Fund Program of the Engineering Research Center for Integration and Application of Digital Learning Technology of Ministry of Education No.1331007 and No. 1311022+1 种基金Natural Science Foundation of the Jiangsu Higher Education Institutions Grant No. 17KJB520044Six Talent Peaks Project in Jiangsu Province No.XYDXX-108
文摘With the rapid development of information technology,IoT devices play a huge role in physiological health data detection.The exponential growth of medical data requires us to reasonably allocate storage space for cloud servers and edge nodes.The storage capacity of edge nodes close to users is limited.We should store hotspot data in edge nodes as much as possible,so as to ensure response timeliness and access hit rate;However,the current scheme cannot guarantee that every sub-message in a complete data stored by the edge node meets the requirements of hot data;How to complete the detection and deletion of redundant data in edge nodes under the premise of protecting user privacy and data dynamic integrity has become a challenging problem.Our paper proposes a redundant data detection method that meets the privacy protection requirements.By scanning the cipher text,it is determined whether each sub-message of the data in the edge node meets the requirements of the hot data.It has the same effect as zero-knowledge proof,and it will not reveal the privacy of users.In addition,for redundant sub-data that does not meet the requirements of hot data,our paper proposes a redundant data deletion scheme that meets the dynamic integrity of the data.We use Content Extraction Signature(CES)to generate the remaining hot data signature after the redundant data is deleted.The feasibility of the scheme is proved through safety analysis and efficiency analysis.
基金This research is funded by 2023 Henan Province Science and Technology Research Projects:Key Technology of Rapid Urban Flood Forecasting Based onWater Level Feature Analysis and Spatio-Temporal Deep Learning(No.232102320015)Henan Provincial Higher Education Key Research Project Program(Project No.23B520024)a Multi-Sensor-Based Indoor Environmental Parameters Monitoring and Control System.
文摘The Internet of Things(IoT)has revolutionized how we interact with and gather data from our surrounding environment.IoT devices with various sensors and actuators generate vast amounts of data that can be harnessed to derive valuable insights.The rapid proliferation of Internet of Things(IoT)devices has ushered in an era of unprecedented data generation and connectivity.These IoT devices,equipped with many sensors and actuators,continuously produce vast volumes of data.However,the conventional approach of transmitting all this data to centralized cloud infrastructures for processing and analysis poses significant challenges.However,transmitting all this data to a centralized cloud infrastructure for processing and analysis can be inefficient and impractical due to bandwidth limitations,network latency,and scalability issues.This paper proposed a Self-Learning Internet Traffic Fuzzy Classifier(SLItFC)for traffic data analysis.The proposed techniques effectively utilize clustering and classification procedures to improve classification accuracy in analyzing network traffic data.SLItFC addresses the intricate task of efficiently managing and analyzing IoT data traffic at the edge.It employs a sophisticated combination of fuzzy clustering and self-learning techniques,allowing it to adapt and improve its classification accuracy over time.This adaptability is a crucial feature,given the dynamic nature of IoT environments where data patterns and traffic characteristics can evolve rapidly.With the implementation of the fuzzy classifier,the accuracy of the clustering process is improvised with the reduction of the computational time.SLItFC can reduce computational time while maintaining high classification accuracy.This efficiency is paramount in edge computing,where resource constraints demand streamlined data processing.Additionally,SLItFC’s performance advantages make it a compelling choice for organizations seeking to harness the potential of IoT data for real-time insights and decision-making.With the Self-Learning process,the SLItFC model monitors the network traffic data acquired from the IoT Devices.The Sugeno fuzzy model is implemented within the edge computing environment for improved classification accuracy.Simulation analysis stated that the proposed SLItFC achieves 94.5%classification accuracy with reduced classification time.
文摘Background A task assigned to space exploration satellites involves detecting the physical environment within a certain space.However,space detection data are complex and abstract.These data are not conducive for researchers'visual perceptions of the evolution and interaction of events in the space environment.Methods A time-series dynamic data sampling method for large-scale space was proposed for sample detection data in space and time,and the corresponding relationships between data location features and other attribute features were established.A tone-mapping method based on statistical histogram equalization was proposed and applied to the final attribute feature data.The visualization process is optimized for rendering by merging materials,reducing the number of patches,and performing other operations.Results The results of sampling,feature extraction,and uniform visualization of the detection data of complex types,long duration spans,and uneven spatial distributions were obtained.The real-time visualization of large-scale spatial structures using augmented reality devices,particularly low-performance devices,was also investigated.Conclusions The proposed visualization system can reconstruct the three-dimensional structure of a large-scale space,express the structure and changes in the spatial environment using augmented reality,and assist in intuitively discovering spatial environmental events and evolutionary rules.
文摘In this study, we delve into the realm of efficient Big Data Engineering and Extract, Transform, Load (ETL) processes within the healthcare sector, leveraging the robust foundation provided by the MIMIC-III Clinical Database. Our investigation entails a comprehensive exploration of various methodologies aimed at enhancing the efficiency of ETL processes, with a primary emphasis on optimizing time and resource utilization. Through meticulous experimentation utilizing a representative dataset, we shed light on the advantages associated with the incorporation of PySpark and Docker containerized applications. Our research illuminates significant advancements in time efficiency, process streamlining, and resource optimization attained through the utilization of PySpark for distributed computing within Big Data Engineering workflows. Additionally, we underscore the strategic integration of Docker containers, delineating their pivotal role in augmenting scalability and reproducibility within the ETL pipeline. This paper encapsulates the pivotal insights gleaned from our experimental journey, accentuating the practical implications and benefits entailed in the adoption of PySpark and Docker. By streamlining Big Data Engineering and ETL processes in the context of clinical big data, our study contributes to the ongoing discourse on optimizing data processing efficiency in healthcare applications. The source code is available on request.
文摘The current education field is experiencing an innovation driven by big data and cloud technologies,and these advanced technologies play a central role in the construction of smart campuses.Big data technology has a wide range of applications in student learning behavior analysis,teaching resource management,campus safety monitoring,and decision support,which improves the quality of education and management efficiency.Cloud computing technology supports the integration,distribution,and optimal use of educational resources through cloud resource sharing,virtual classrooms,intelligent campus management systems,and Infrastructure-as-a-Service(IaaS)models,which reduce costs and increase flexibility.This paper comprehensively discusses the practical application of big data and cloud computing technologies in smart campuses,showing how these technologies can contribute to the development of smart campuses,and laying the foundation for the future innovation of education models.
基金supported by National Basic Research Program of China (973 Program) (No. 2007CB310800)China Postdoctoral Science Foundation (No. 20090460107 and No. 201003794)
文摘With the development of Internet technology and human computing, the computing environment has changed dramatically over the last three decades. Cloud computing emerges as a paradigm of Internet computing in which dynamical, scalable and often virtuMized resources are provided as services. With virtualization technology, cloud computing offers diverse services (such as virtual computing, virtual storage, virtual bandwidth, etc.) for the public by means of multi-tenancy mode. Although users are enjoying the capabilities of super-computing and mass storage supplied by cloud computing, cloud security still remains as a hot spot problem, which is in essence the trust management between data owners and storage service providers. In this paper, we propose a data coloring method based on cloud watermarking to recognize and ensure mutual reputations. The experimental results show that the robustness of reverse cloud generator can guarantee users' embedded social reputation identifications. Hence, our work provides a reference solution to the critical problem of cloud security.
基金This work was supported by the National Natural Science Foundation of China(No.61702276)the Startup Foundation for Introducing Talent of Nanjing University of Information Science and Technology under Grant 2016r055 and the Priority Academic Program Development(PAPD)of Jiangsu Higher Education Institutions.The authors are grateful for the anonymous reviewers who made constructive comments and improvements.
文摘Advanced cloud computing technology provides cost saving and flexibility of services for users.With the explosion of multimedia data,more and more data owners would outsource their personal multimedia data on the cloud.In the meantime,some computationally expensive tasks are also undertaken by cloud servers.However,the outsourced multimedia data and its applications may reveal the data owner’s private information because the data owners lose the control of their data.Recently,this thought has aroused new research interest on privacy-preserving reversible data hiding over outsourced multimedia data.In this paper,two reversible data hiding schemes are proposed for encrypted image data in cloud computing:reversible data hiding by homomorphic encryption and reversible data hiding in encrypted domain.The former is that additional bits are extracted after decryption and the latter is that extracted before decryption.Meanwhile,a combined scheme is also designed.This paper proposes the privacy-preserving outsourcing scheme of reversible data hiding over encrypted image data in cloud computing,which not only ensures multimedia data security without relying on the trustworthiness of cloud servers,but also guarantees that reversible data hiding can be operated over encrypted images at the different stages.Theoretical analysis confirms the correctness of the proposed encryption model and justifies the security of the proposed scheme.The computation cost of the proposed scheme is acceptable and adjusts to different security levels.
基金supported by the National Natural Science Eoundation of China under Grant No.40221503the China National Key Programme for Development Basic Sciences (Abbreviation:973 Project,Grant No.G1999032801)
文摘The Spectral Statistical Interpolation (SSI) analysis system of NCEP is used to assimilate meteorological data from the Global Positioning Satellite System (GPS/MET) refraction angles with the variational technique. Verified by radiosonde, including GPS/MET observations into the analysis makes an overall improvement to the analysis variables of temperature, winds, and water vapor. However, the variational model with the ray-tracing method is quite expensive for numerical weather prediction and climate research. For example, about 4 000 GPS/MET refraction angles need to be assimilated to produce an ideal global analysis. Just one iteration of minimization will take more than 24 hours CPU time on the NCEP's Cray C90 computer. Although efforts have been taken to reduce the computational cost, it is still prohibitive for operational data assimilation. In this paper, a parallel version of the three-dimensional variational data assimilation model of GPS/MET occultation measurement suitable for massive parallel processors architectures is developed. The divide-and-conquer strategy is used to achieve parallelism and is implemented by message passing. The authors present the principles for the code's design and examine the performance on the state-of-the-art parallel computers in China. The results show that this parallel model scales favorably as the number of processors is increased. With the Memory-IO technique implemented by the author, the wall clock time per iteration used for assimilating 1420 refraction angles is reduced from 45 s to 12 s using 1420 processors. This suggests that the new parallelized code has the potential to be useful in numerical weather prediction (NWP) and climate studies.
基金This work was supported in part by the National Natural Science Foundation of China(51435009)Shanghai Sailing Program(19YF1401500)the Fundamental Research Funds for the Central Universities(2232019D3-34).
文摘Industrial big data integration and sharing(IBDIS)is of great significance in managing and providing data for big data analysis in manufacturing systems.A novel fog-computing-based IBDIS approach called Fog-IBDIS is proposed in order to integrate and share industrial big data with high raw data security and low network traffic loads by moving the integration task from the cloud to the edge of networks.First,a task flow graph(TFG)is designed to model the data analysis process.The TFG is composed of several tasks,which are executed by the data owners through the Fog-IBDIS platform in order to protect raw data privacy.Second,the function of Fog-IBDIS to enable data integration and sharing is presented in five modules:TFG management,compilation and running control,the data integration model,the basic algorithm library,and the management component.Finally,a case study is presented to illustrate the implementation of Fog-IBDIS,which ensures raw data security by deploying the analysis tasks executed by the data generators,and eases the network traffic load by greatly reducing the volume of transmitted data.
基金This work was supported by the Natural Science Foundation of China(Nos.61672478 and 61806090)the National Key Research and Development Program of China(No.2017YFB1003102)+4 种基金the Guangdong Provincial Key Laboratory(No.2020B121201001)the Shenzhen Peacock Plan(No.KQTD2016112514355531)the Guangdong-Hong Kong-Macao Greater Bay Area Center for Brain Science and Brain-inspired Intelligence Fund(No.2019028)the Fellowship of China Postdoctoral Science Foundation(No.2020M671900)the National Leading Youth Talent Support Program of China.
文摘Large-scale multi-objective optimization problems(MOPs)that involve a large number of decision variables,have emerged from many real-world applications.While evolutionary algorithms(EAs)have been widely acknowledged as a mainstream method for MOPs,most research progress and successful applications of EAs have been restricted to MOPs with small-scale decision variables.More recently,it has been reported that traditional multi-objective EAs(MOEAs)suffer severe deterioration with the increase of decision variables.As a result,and motivated by the emergence of real-world large-scale MOPs,investigation of MOEAs in this aspect has attracted much more attention in the past decade.This paper reviews the progress of evolutionary computation for large-scale multi-objective optimization from two angles.From the key difficulties of the large-scale MOPs,the scalability analysis is discussed by focusing on the performance of existing MOEAs and the challenges induced by the increase of the number of decision variables.From the perspective of methodology,the large-scale MOEAs are categorized into three classes and introduced respectively:divide and conquer based,dimensionality reduction based and enhanced search-based approaches.Several future research directions are also discussed.
基金supported by the National Natural Science Foundation of China(6120200461272084)+9 种基金the National Key Basic Research Program of China(973 Program)(2011CB302903)the Specialized Research Fund for the Doctoral Program of Higher Education(2009322312000120113223110003)the China Postdoctoral Science Foundation Funded Project(2011M5000952012T50514)the Natural Science Foundation of Jiangsu Province(BK2011754BK2009426)the Jiangsu Postdoctoral Science Foundation Funded Project(1102103C)the Natural Science Fund of Higher Education of Jiangsu Province(12KJB520007)the Project Funded by the Priority Academic Program Development of Jiangsu Higher Education Institutions(yx002001)
文摘How to effectively reduce the energy consumption of large-scale data centers is a key issue in cloud computing. This paper presents a novel low-power task scheduling algorithm (L3SA) for large-scale cloud data centers. The winner tree is introduced to make the data nodes as the leaf nodes of the tree and the final winner on the purpose of reducing energy consumption is selected. The complexity of large-scale cloud data centers is fully consider, and the task comparson coefficient is defined to make task scheduling strategy more reasonable. Experiments and performance analysis show that the proposed algorithm can effectively improve the node utilization, and reduce the overall power consumption of the cloud data center.
基金supported by National Natural Science Foundation of China, "Research on Accurate and Fair Service Recommendation Approach in Mobile Internet Environment", (No. 61571066)
文摘Efficient and effective data acquisition is of theoretical and practical importance in WSN applications because data measured and collected by WSN is often unreliable, such as those often accompanied by noise and error, missing values or inconsistent data. Motivated by fog computing, which focuses on how to effectively offload computation-intensive tasks from resource-constrained devices, this paper proposes a simple but yet effective data acquisition approach with the ability of filtering abnormal data and meeting the real-time requirement. Our method uses a cooperation mechanism by leveraging on both an architectural and algorithmic approach. Firstly, the sensor node with the limited computing resource only accomplishes detecting and marking the suspicious data using a light weight algorithm. Secondly, the cluster head evaluates suspicious data by referring to the data from the other sensor nodes in the same cluster and discard the abnormal data directly. Thirdly, the sink node fills up the discarded data with an approximate value using nearest neighbor data supplement method. Through the architecture, each node only consumes a few computational resources and distributes the heavily computing load to several nodes. Simulation results show that our data acquisition method is effective considering the real-time outlier filtering and the computing overhead.
基金supported in part by the Fundamental Research Funds for the Central Universities under Grant No.30919011282.
文摘Edge-computing-enabled smart greenhouses are a representative application of the Internet of Things(IoT)technology,which can monitor the environmental information in real-time and employ the information to contribute to intelligent decision-making.In the process,anomaly detection for wireless sensor data plays an important role.However,the traditional anomaly detection algorithms originally designed for anomaly detection in static data do not properly consider the inherent characteristics of the data stream produced by wireless sensors such as infiniteness,correlations,and concept drift,which may pose a considerable challenge to anomaly detection based on data stream and lead to low detection accuracy and efficiency.First,the data stream is usually generated quickly,which means that the data stream is infinite and enormous.Hence,any traditional off-line anomaly detection algorithm that attempts to store the whole dataset or to scan the dataset multiple times for anomaly detection will run out of memory space.Second,there exist correlations among different data streams,and traditional algorithms hardly consider these correlations.Third,the underlying data generation process or distribution may change over time.Thus,traditional anomaly detection algorithms with no model update will lose their effects.Considering these issues,a novel method(called DLSHiForest)based on Locality-Sensitive Hashing and the time window technique is proposed to solve these problems while achieving accurate and efficient detection.Comprehensive experiments are executed using a real-world agricultural greenhouse dataset to demonstrate the feasibility of our approach.Experimental results show that our proposal is practical for addressing the challenges of traditional anomaly detection while ensuring accuracy and efficiency.