The deployment of the Internet of Things (IoT) with smart sensors has facilitated the emergence of fog computing as an important technology for delivering services to smart environments such as campuses, smart cities, and smart transportation systems. Fog computing tackles a range of challenges, including processing, storage, bandwidth, latency, and reliability, by locally distributing secure information through end nodes. Consisting of endpoints, fog nodes, and back-end cloud infrastructure, it provides advanced capabilities beyond traditional cloud computing. In smart environments, particularly within smart city transportation systems, the abundance of devices and nodes poses significant challenges related to power consumption and system reliability. To address the challenges of latency, energy consumption, and fault tolerance in these environments, this paper proposes a latency-aware, fault-tolerant framework for resource scheduling and data management, referred to as the FORD framework, for smart cities in fog environments. This framework is designed to meet the demands of time-sensitive applications, such as those in smart transportation systems. The FORD framework incorporates latency-aware resource scheduling to optimize task execution in smart city environments, leveraging resources from both fog and cloud environments. Through simulation-based executions, tasks are allocated to the nearest available nodes with minimum latency. In the event of execution failure, a fault-tolerant mechanism is employed to ensure the successful completion of tasks. Upon successful execution, data is efficiently stored in the cloud data center, ensuring data integrity and reliability within the smart city ecosystem.
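The allocation rule described in this abstract (pick the available node with the lowest latency, and move on to the next one if execution fails) can be illustrated with a minimal sketch. This is not the FORD implementation: the `Node` type, the `schedule` function, and the `execute` callback are hypothetical names introduced only for illustration, and the latency values are assumed to be known in advance.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Node:
    """Hypothetical fog or cloud node; latency is assumed to be pre-measured."""
    name: str
    latency_ms: float
    available: bool = True


def schedule(task: str,
             nodes: Sequence[Node],
             execute: Callable[[str, Node], bool]) -> Optional[Node]:
    """Try available nodes in order of increasing latency.

    Returns the node that completed the task, or None if every node failed.
    """
    candidates = sorted((n for n in nodes if n.available),
                        key=lambda n: n.latency_ms)
    for node in candidates:
        if execute(task, node):
            return node  # success: stop at the lowest-latency node that worked
        # failure: fall through to the next-lowest-latency node
    return None


# Usage: fog-a is closest but fails, so the task falls back to fog-b.
nodes = [Node("cloud", 80.0), Node("fog-a", 5.0), Node("fog-b", 12.0)]
chosen = schedule("read-traffic-sensor", nodes,
                  execute=lambda task, node: node.name != "fog-a")
```

Sorting once and retrying in order keeps the fallback deterministic; a real scheduler would presumably also re-measure latency and availability between attempts.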
With the rapid development of medical data sharing, issues of privacy and ownership have become prominent, which have limited the scale of data sharing. To address the above challenges, we propose a blockchain-based data-sharing framework to ensure data security and encourage data owners to actively participate in sharing. We introduce a reliable attribute-based searchable encryption scheme that enables fine-grained access control of encrypted data and ensures secure and efficient data sharing. The revenue distribution model is constructed based on the Shapley value to motivate participants. Additionally, by integrating the smart contract technology of blockchain, the search operation and incentive mechanism are automatically executed. Through revenue distribution analysis, the incentive effect and rationality of the proposed scheme are verified. Performance evaluation shows that, compared with traditional data-sharing models, our proposed framework not only meets data security requirements but also incentivizes more participants to actively participate in data sharing.
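To make the Shapley-value revenue split mentioned in this abstract concrete, the following sketch computes exact Shapley values by averaging each participant's marginal contribution over all orderings. The function name `shapley_values` and the revenue table are assumptions made up for illustration; the paper's actual revenue model is not given in the abstract.

```python
from itertools import permutations
from typing import Callable, Dict, FrozenSet, Sequence


def shapley_values(players: Sequence[str],
                   value: Callable[[FrozenSet[str]], float]) -> Dict[str, float]:
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition: FrozenSet[str] = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: t / len(orderings) for p, t in totals.items()}


# Hypothetical revenue earned by each coalition of data owners.
REVENUE = {
    frozenset(): 0.0,
    frozenset({"A"}): 10.0,
    frozenset({"B"}): 20.0,
    frozenset({"A", "B"}): 40.0,
}

shares = shapley_values(["A", "B"], lambda coalition: REVENUE[coalition])
```

The shares always sum to the revenue of the full coalition, which is the property that makes the Shapley value a natural fit for splitting jointly earned revenue. Enumerating all orderings is factorial in the number of players, so larger groups would need sampling-based approximations.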
We propose a Cross-Chain Mapping Blockchain (CCMB) for scalable data management in massive Internet of Things (IoT) networks. Specifically, CCMB aims to improve the scalability of securely storing, tracing, and transmitting IoT behavior and reputation data based on our proposed cross-mapped Behavior Chain (BChain) and Reputation Chain (RChain). To improve off-chain IoT data storage scalability, we show that our lightweight CCMB architecture efficiently utilizes available fog-cloud resources. The scalability of on-chain IoT data tracing is enhanced using our Mapping Smart Contract (MSC) and cross-chain mapping design to perform rapid Reputation-to-Behavior (R2B) traceability queries between BChain and RChain blocks. To maximize off-chain to on-chain throughput, we optimize the CCMB block settings and producers based on a general Poisson Point Process (PPP) network model. The constrained optimization problem is formulated as a Markov Decision Process (MDP) and solved using a dual-network Deep Reinforcement Learning (DRL) algorithm. Simulation results validate CCMB's scalability advantages in storage, traceability, and throughput. In specific massive IoT scenarios, CCMB can reduce the storage footprint by 50% and traceability query time by 90%, while improving system throughput by 55% compared to existing benchmarks.
On October 18, 2017, the 19th National Congress Report called for the implementation of the Healthy China Strategy. The development of biomedical data plays a pivotal role in advancing this strategy. Since the 18th National Congress of the Communist Party of China, China has vigorously promoted the integration and implementation of the Healthy China and Digital China strategies. The National Health Commission has prioritized the development of health and medical big data, issuing policies to promote standardized applications and foster innovation in "Internet+Healthcare." Biomedical data has significantly contributed to precision medicine, personalized health management, drug development, disease diagnosis, public health monitoring, and epidemic prediction capabilities.
Sharing data while protecting privacy in the industrial Internet is a significant challenge. Traditional machine learning methods require a combination of all data for training; however, this approach can be limited by data availability and privacy concerns. Federated learning (FL) has gained considerable attention because it allows for decentralized training on multiple local datasets. However, the training data collected by data providers are often non-independent and identically distributed (non-IID), resulting in poor FL performance. This paper proposes a privacy-preserving approach for sharing non-IID data in the industrial Internet using an FL approach based on blockchain technology. To overcome the problem of non-IID data leading to poor training accuracy, we propose dynamically updating the local model based on the divergence of the global and local models. This approach can significantly improve the accuracy of FL training when there is relatively large dispersion. In addition, we design a dynamic gradient clipping algorithm to alleviate the influence of noise on the model accuracy and to reduce potential privacy leakage caused by sharing model parameters. Finally, we evaluate the performance of the proposed scheme using commonly used open-source image datasets. The simulation results demonstrate that our method can significantly enhance the accuracy while protecting privacy and maintaining efficiency, thereby providing a new solution to data-sharing and privacy-protection challenges in the industrial Internet.
With the development of technology, the connected vehicle has been upgraded from a traditional transport vehicle to an information terminal and energy storage terminal. The data of intelligent connected vehicles (ICVs) is the key to organically maximizing their efficiency. However, in the context of increasingly strict global data security supervision and compliance, numerous problems hinder vehicle data sharing and leave data islands widespread: complex types of connected vehicle data, poor data collaboration between the information technology (IT) domain and the operation technology (OT) domain, different data format standards, lack of shared trust sources, difficulty in ensuring the quality of shared data, lack of data control rights, and difficulty in defining data ownership. This study proposes FADSF (Fuzzy Anonymous Data Share Frame), an automobile data sharing scheme based on blockchain. The data holder publishes the shared data information and forms the corresponding label storage on the blockchain. The data demander browses the data directory information to select, purchase, and verify data assets. Meanwhile, this paper designs a data structure, the Data Discrimination Bloom Filter (DDBF), for filing complaints about illegal data. When the number of data complaints reaches the threshold, the audit traceability contract is triggered to punish the illegal data publisher, aiming to improve the data quality and maintain a good data sharing ecology. The scheme is tested on Ethereum to demonstrate its feasibility, efficiency, and security.
In this paper, a variety of classical convolutional neural networks are trained on two different datasets using transfer learning. We demonstrate that the training dataset has a significant impact on the training results, in addition to the optimization achieved through the model structure. However, the lack of open-source agricultural data, combined with the absence of a comprehensive open-source data sharing platform, remains a substantial obstacle. This issue is closely related to the difficulty and high cost of obtaining high-quality agricultural data, the low level of education of most employees, underdeveloped distributed training systems, and inadequate data security. To address these challenges, this paper proposes constructing an agricultural data sharing platform based on a federated learning (FL) framework, aiming to overcome the shortage of high-quality data for training in the agricultural field.
As an introductory course for the emerging major of big data management and application, "Introduction to Big Data" has not yet formed a curriculum standard and implementation plan that is widely accepted and used. To this end, we discuss some of our explorations and attempts in the construction and teaching process of big data courses for the major of big data management and application from the perspective of course planning, course implementation, and course summary. Interviews with students and feedback from questionnaires indicate that students are highly satisfied with some of the teaching measures and programs currently adopted.
This study aims to investigate the influence of social media on college choice among undergraduates majoring in Big Data Management and Application in China. The study attempts to reveal how information on social media platforms such as Weibo, WeChat, and Zhihu influences the cognition and choice process of prospective students. By employing an online quantitative survey questionnaire, data were collected from the 2022 and 2023 classes of new students majoring in Big Data Management and Application at Guilin University of Electronic Technology. The aim was to evaluate the role of social media in their college choice process and understand the features and information that most attract prospective students. Social media has become a key factor influencing the college choice decision-making of undergraduates majoring in Big Data Management and Application in China. Students tend to obtain school information through social media platforms and use this information as an important reference in their decision-making process. Higher education institutions should strengthen their social media information dissemination, providing accurate, timely, and attractive information. It is also necessary to ensure effective management of social media platforms, maintain a positive reputation for the school on social media, and increase the interest and trust of prospective students. Simultaneously, educational decision-makers should consider incorporating social media analysis into their recruitment strategies to better attract new student enrollment. This study provides a new perspective for understanding higher education choice behavior in the digital age, particularly by revealing the importance of social media in the educational decision-making process. This has important practical and theoretical implications for higher education institutions, policymakers, and social media platform operators.
Because consumers' privacy data sharing has multifaceted and complex effects on the e-commerce platform and its two-sided agents, consumers and sellers, a game-theoretic model in a monopoly e-market is set up to study the equilibrium strategies of the three agents (the platform, the seller on it, and consumers) under privacy data sharing. Equilibrium decisions show that after sharing consumers' privacy data once, the platform can collect more privacy data from consumers. Meanwhile, privacy data sharing pushes the seller to reduce the product price. Moreover, the platform will increase the transaction fee if the privacy data sharing value is high. It is also indicated that privacy data sharing always benefits consumers and the seller. However, the platform's profit decreases if the privacy data sharing value is low and the privacy data sharing level is high. Finally, an extended model considering an incomplete information game among the agents is discussed. The results show that neither the platform nor the seller can obtain a high profit from privacy data sharing. Factors including the seller's possibility to buy privacy data, the privacy data sharing value, and the privacy data sharing level affect the two agents' payoffs. If the platform wishes to benefit from privacy data sharing, it should increase the possibility of the seller buying privacy data or increase the privacy data sharing value.
In Decentralized Machine Learning (DML) systems, system participants contribute their resources to assist others in developing machine learning solutions. Identifying malicious contributions in DML systems is challenging, which has led to the exploration of blockchain technology. Blockchain leverages its transparency and immutability to record the provenance and reliability of training data. However, storing massive datasets or implementing model evaluation processes on smart contracts incurs high computational costs. Additionally, current research on preventing malicious contributions in DML systems primarily focuses on protecting models from being exploited by workers who contribute incorrect or misleading data. However, less attention has been paid to the scenario where malicious requesters intentionally manipulate test data during evaluation to gain an unfair advantage. This paper proposes a transparent and accountable training data sharing method that securely shares data among potentially malicious system participants. First, we introduce a blockchain-based DML system architecture that supports secure training data sharing through the IPFS network. Second, we design a blockchain smart contract to transparently split training datasets into training and test datasets without involving system participants. Under the system, transparent and accountable training data sharing can be achieved with attribute-based proxy re-encryption. We present a security analysis of the system and conduct experiments on the Ethereum and IPFS platforms to show the feasibility and practicality of the system.
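One way to picture a split that no participant can manipulate is to derive it deterministically from public inputs, so that anyone can recompute and audit it. The sketch below is an off-chain Python illustration of that idea only; it is not the smart contract described in the abstract, and `transparent_split`, the seed string, and the sample identifiers are all hypothetical.

```python
import hashlib
from typing import List, Sequence, Tuple


def transparent_split(item_ids: Sequence[str],
                      public_seed: str,
                      test_fraction: float = 0.2) -> Tuple[List[str], List[str]]:
    """Split item IDs into (train, test) using only public, reproducible inputs.

    Items are ranked by SHA-256(public_seed:item_id); the lowest-ranked
    fraction becomes the test set.
    """
    def rank(item_id: str) -> str:
        digest = hashlib.sha256(f"{public_seed}:{item_id}".encode("utf-8"))
        return digest.hexdigest()

    ordered = sorted(item_ids, key=rank)
    n_test = int(len(ordered) * test_fraction)
    return ordered[n_test:], ordered[:n_test]


# Usage: the seed stands in for a value no single party controls,
# for example a block hash (an assumption, not taken from the abstract).
ids = [f"sample-{i}" for i in range(10)]
train, test = transparent_split(ids, public_seed="example-block-hash")
```

Because the ranking depends only on the seed and the item identifiers, a requester cannot hand-pick test items without the mismatch being detectable by anyone who reruns the split.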
In today's highly competitive retail industry, offline stores face increasing pressure on profitability. They hope to improve their ability in shelf management with the help of big data technology. For this, on-shelf availability is an essential indicator of shelf data management and closely relates to customer purchase behavior. RFM (recency, frequency, and monetary) pattern mining is a powerful tool to evaluate the value of customer behavior. However, the existing RFM pattern mining algorithms do not consider the quarterly nature of goods, resulting in unreasonable shelf availability and difficulty in profit-making. To solve this problem, we propose a quarterly RFM mining algorithm for on-shelf products named OS-RFM. Our algorithm mines the high recency, high frequency, and high monetary patterns and considers the period of the on-shelf goods in quarterly units. We conducted experiments using two real datasets for numerical and graphical analysis to prove the algorithm's effectiveness. Compared with the state-of-the-art RFM mining algorithm, our algorithm can identify more patterns and performs well in terms of precision, recall, and F1-score, with the recall rate nearing 100%. Also, the novel algorithm operates with significantly shorter running times and more stable memory usage than existing mining algorithms. Additionally, we analyze the sales trends of products in different quarters and seasonal variations. The analysis assists businesses in maintaining reasonable on-shelf availability and achieving greater profitability.
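The three RFM measures, grouped by quarter, can be computed with a few lines of Python. This sketch only scores individual items per quarter; it is not the OS-RFM pattern mining algorithm, and the record layout `(item, sale_date, amount)` and all names are assumptions chosen for illustration.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, Iterable, NamedTuple, Tuple


class RFM(NamedTuple):
    recency_days: int  # days from the last sale in the quarter to `today`
    frequency: int     # number of sales in the quarter
    monetary: float    # total sales amount in the quarter


def quarter_of(d: date) -> Tuple[int, int]:
    """Return (year, quarter) with quarter in 1..4."""
    return d.year, (d.month - 1) // 3 + 1


def rfm_by_quarter(sales: Iterable[Tuple[str, date, float]],
                   today: date) -> Dict[Tuple[str, Tuple[int, int]], RFM]:
    """Group (item, sale_date, amount) records by item and quarter, then score each group."""
    groups = defaultdict(list)
    for item, sale_date, amount in sales:
        groups[(item, quarter_of(sale_date))].append((sale_date, amount))
    result = {}
    for key, rows in groups.items():
        last_sale = max(d for d, _ in rows)
        result[key] = RFM((today - last_sale).days, len(rows),
                          sum(a for _, a in rows))
    return result


# Usage with made-up sales records.
sales = [("tea", date(2024, 1, 10), 3.0),
         ("tea", date(2024, 2, 1), 4.0),
         ("tea", date(2024, 5, 5), 5.0)]
scores = rfm_by_quarter(sales, today=date(2024, 6, 30))
```

Keying the scores by quarter is what lets a seasonal item look valuable in its own season even if its year-round frequency is low.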
On the basis of the definition of product data management (PDM) and its connotation, the factors that ensure implementation success are analyzed. The definition phase, analysis phase, design phase, build and test phase, and post-production phase during PDM implementation are described. The implementation is divided into ten processes, which consist of the above phases. The relationships between phases and processes are illustrated. Finally, a workflow is proposed to guide implementation at a fixed price.
This paper proposes a useful web-based system for the management and sharing of electron probe micro-analysis (EPMA) data in geology. A new web-based architecture that integrates the management and sharing functions is developed and implemented. Earth scientists can utilize this system to not only manage their data, but also easily communicate and share it with other researchers. Data query methods provide the core functionality of the proposed management and sharing modules. The modules in this system have been developed using cloud GIS technologies, which help achieve real-time spatial area retrieval on a map. The system has been tested by approximately 263 users at Jilin University and Beijing SHRIMP Center. A survey was conducted among these users to estimate the usability of the primary functions of the system, and the assessment result is summarized and presented.
This paper is concerned with the development of web-based product data management (PDM) systems, referred to as WPDM systems. As a tool to integrate information, a traditional PDM system brings companies many benefits, such as improved design productivity and better control over projects. With the maturing of web technologies, the advantages of a WPDM system are obvious. We show these advantages in detail in Part 3. A WPDM system is built on a three-tier application model (back-end, middle layer, and front-end) to provide security and flexibility. The basic designs in each layer are briefly introduced in Part 4. In the future, WPDM will be extended to integrate with other applications to provide a complete web-based engineering environment.
The mining industry faces a number of challenges that promote the adoption of new technologies. Big data, which is driven by the accelerating progress of information and communication technology, is one of the promising technologies that can reshape the entire mining landscape. Despite numerous attempts to apply big data in the mining industry, fundamental problems of big data, especially big data management (BDM), in the mining industry persist. This paper aims to fill the gap by presenting the basics of BDM. This work provides a brief introduction to big data and BDM, and it discusses the challenges encountered by the mining industry to indicate the necessity of implementing big data. It also summarizes data sources in the mining industry and presents the potential benefits of big data to the mining industry. This work also envisions a future in which a global database project is established and big data is used together with other technologies (e.g., automation), supported by government policies and following international standards. This paper also outlines the precautions for the utilization of BDM in the mining industry.
The CifNet network multi-well data management system is developed for the 100MB or 1000MB local network environments used in the Chinese oil industry. The kernel techniques of the CifNet system include: (1) establishing a highly efficient and low-cost network multi-well data management architecture based on the General Logging Curve Theory and the Cif data format; (2) implementing efficient access to and transmission of multi-well data in a client/server (C/S) local network based on the TCP/IP protocol; (3) ensuring the security of multi-well data during storage, access, and use based on Unix operating system security. By using the CifNet system, a researcher in the office or at home can access the curves of any borehole in any working area of any oilfield. The application prospects of the CifNet system are also discussed.
The fast proliferation of edge devices for the Internet of Things (IoT) has led to an explosion in data volume. The generated data is collected and shared using edge-based IoT structures at a considerably high frequency. Thus, the data-sharing privacy exposure issue becomes increasingly serious when IoT devices make malicious requests to steal sensitive information from a cloud storage system through edge nodes. To address this issue, we present evolutionary privacy preservation learning strategies for an edge computing-based IoT data sharing scheme. In particular, we introduce evolutionary game theory and construct a payoff matrix to represent the interaction between IoT devices and edge nodes, where IoT devices and edge nodes are the two parties of the game. IoT devices may make malicious requests to achieve their goal of stealing private data. Accordingly, edge nodes should deny malicious IoT device requests to prevent IoT data from being disclosed. Both parties dynamically adjust their own strategies according to the opponent's strategy and finally maximize their payoffs. Built upon a developed application framework that illustrates the concrete data sharing architecture, a novel algorithm is proposed that can derive the optimal evolutionary learning strategy. Furthermore, we numerically simulate evolutionarily stable strategies, and the final results experimentally verify the correctness of the IoT data sharing privacy preservation scheme. Therefore, the proposed model can effectively defeat malicious invasion and protect sensitive information from leaking when IoT data is shared.
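The way two populations adjust strategies against each other in an evolutionary game can be sketched with two-population replicator dynamics. The payoff numbers below are invented for illustration and are not the payoff matrix from the paper; `replicator`, `DEVICE`, and `EDGE` are hypothetical names. Here `x` is the share of IoT devices sending malicious requests and `y` is the share of edge nodes that deny such requests.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def replicator(device_payoff: Matrix, edge_payoff: Matrix,
               x0: float, y0: float,
               dt: float = 0.01, steps: int = 5000) -> Tuple[float, float]:
    """Euler-integrate two-population replicator dynamics and return (x, y).

    device_payoff[i][j]: device plays i (0 = malicious, 1 = normal)
                         against edge strategy j (0 = deny, 1 = grant).
    edge_payoff[i][j]:   edge plays i (0 = deny, 1 = grant)
                         against device strategy j (0 = malicious, 1 = normal).
    """
    x, y = x0, y0
    for _ in range(steps):
        # Expected payoff of each pure strategy against the current opponent mix.
        f_mal = device_payoff[0][0] * y + device_payoff[0][1] * (1 - y)
        f_norm = device_payoff[1][0] * y + device_payoff[1][1] * (1 - y)
        g_deny = edge_payoff[0][0] * x + edge_payoff[0][1] * (1 - x)
        g_grant = edge_payoff[1][0] * x + edge_payoff[1][1] * (1 - x)
        # A strategy's share grows when it outperforms the alternative.
        x += dt * x * (1 - x) * (f_mal - f_norm)
        y += dt * y * (1 - y) * (g_deny - g_grant)
        x = min(1.0, max(0.0, x))
        y = min(1.0, max(0.0, y))
    return x, y


# Hypothetical payoffs: a denied malicious request costs the device,
# and denying always pays off for the edge node.
DEVICE = [[-2.0, 3.0], [1.0, 1.0]]
EDGE = [[2.0, 1.0], [-3.0, 0.5]]

x_final, y_final = replicator(DEVICE, EDGE, x0=0.5, y0=0.5)
```

With these particular payoffs, denying is always the better choice for edge nodes, so `y` rises toward 1 and malicious requests then stop paying off, driving `x` toward 0. Other payoff choices can produce different stable points or cycles.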
With the development of the Internet of Things (IoT), the massive data sharing between IoT devices improves the Quality of Service (QoS) and user experience in various IoT applications. However, data sharing may cause serious privacy leakages to data providers. To address this problem, in this study, data sharing is realized through model sharing, based on which a secure data sharing mechanism, called BP2P-FL, is proposed using peer-to-peer federated learning with the privacy protection of data providers. In addition, by introducing the blockchain to the data sharing, every training process is recorded to ensure that data providers offer high-quality data. For further privacy protection, differential privacy technology is used to perturb the global data sharing model. The experimental results show that BP2P-FL has high accuracy and feasibility in the data sharing of various IoT applications.
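A common way to perturb shared model parameters for differential privacy is to bound the norm of an update and then add Gaussian noise. The sketch below shows that generic pattern only; it is not the BP2P-FL mechanism, `clip_and_noise` is a hypothetical helper, and calibrating `noise_std` to a specific privacy budget is deliberately left out.

```python
import math
import random
from typing import List, Sequence


def clip_and_noise(update: Sequence[float],
                   clip_norm: float,
                   noise_std: float,
                   rng: random.Random) -> List[float]:
    """Scale the update so its L2 norm is at most clip_norm, then add Gaussian noise."""
    norm = math.sqrt(sum(v * v for v in update))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [v * scale + rng.gauss(0.0, noise_std) for v in update]


# Usage: an update of norm 5 is clipped to norm 1 before noise is added.
rng = random.Random(0)
noisy = clip_and_noise([3.0, 4.0], clip_norm=1.0, noise_std=0.1, rng=rng)
```

Clipping bounds how much any single contribution can move the model, which is what gives the added noise a meaningful scale; larger noise improves privacy but reduces accuracy.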
Data sharing technology in the Internet of Vehicles (IoV) has attracted great research interest with the goal of realizing intelligent transportation and traffic management. Meanwhile, major concerns have been raised about the security and privacy of vehicle data. The mobility and real-time characteristics of vehicle data make data sharing more difficult in IoV. The emergence of blockchain and federated learning brings new directions. In this paper, a data-sharing model that combines blockchain and federated learning is proposed to solve the security and privacy problems of data sharing in IoV. First, we use federated learning to share data instead of exposing actual data and propose an adaptive differential privacy scheme to further balance the privacy and availability of data. Then, we integrate the verification scheme into the consensus process, so that the consensus computation can filter out low-quality models. Experimental data shows that our data-sharing model can better balance the relationship between data availability and privacy, and also has enhanced security.
Funding (FORD fog computing framework paper): Supported by the Deanship of Scientific Research and Graduate Studies at King Khalid University under research grant number (R.G.P.2/93/45).
Funding (blockchain-based medical data-sharing paper): Supported by the Natural Science Foundation of Hebei Province of China (F2021201052).
Funding (CCMB paper): Supported in part by the National Key Research and Development Program of China under Grant 2023YFB3106900, the National Natural Science Foundation of China under Grant 62171113, and the China Scholarship Council under Grant 202406080100.
文摘We propose a Cross-Chain Mapping Blockchain(CCMB)for scalable data management in massive Internet of Things(IoT)networks.Specifically,CCMB aims to improve the scalability of securely storing,tracing,and transmitting IoT behavior and reputation data based on our proposed cross-mapped Behavior Chain(BChain)and Reputation Chain(RChain).To improve off-chain IoT data storage scalability,we show that our lightweight CCMB architecture efficiently utilizes available fog-cloud resources.The scalability of on-chain IoT data tracing is enhanced using our Mapping Smart Contract(MSC)and cross-chain mapping design to perform rapid Reputation-to-Behavior(R2B)traceability queries between BChain and RChain blocks.To maximize off-chain to on-chain throughput,we optimize the CCMB block settings and producers based on a general Poisson Point Process(PPP)network model.The constrained optimization problem is formulated as a Markov Decision Process(MDP),and solved using a dual-network Deep Reinforcement Learning(DRL)algorithm.Simulation results validate CCMB’s scalability advantages in storage,traceability,and throughput.In specific massive IoT scenarios,CCMB can reduce the storage footprint by 50%and traceability query time by 90%,while improving system throughput by 55%compared to existing benchmarks.
Abstract: On October 18, 2017, the 19th National Congress Report called for the implementation of the Healthy China Strategy. The development of biomedical data plays a pivotal role in advancing this strategy. Since the 18th National Congress of the Communist Party of China, China has vigorously promoted the integration and implementation of the Healthy China and Digital China strategies. The National Health Commission has prioritized the development of health and medical big data, issuing policies to promote standardized applications and foster innovation in "Internet+Healthcare." Biomedical data has significantly contributed to precision medicine, personalized health management, drug development, disease diagnosis, public health monitoring, and epidemic prediction capabilities.
Funding: This work was supported by the National Key R&D Program of China under Grant 2023YFB2703802 and the Hunan Province Innovation and Entrepreneurship Training Program for College Students under Grant S202311528073.
Abstract: Sharing data while protecting privacy in the industrial Internet is a significant challenge. Traditional machine learning methods require a combination of all data for training; however, this approach can be limited by data availability and privacy concerns. Federated learning (FL) has gained considerable attention because it allows for decentralized training on multiple local datasets. However, the training data collected by data providers are often non-independent and identically distributed (non-IID), resulting in poor FL performance. This paper proposes a privacy-preserving approach for sharing non-IID data in the industrial Internet using an FL approach based on blockchain technology. To overcome the problem of non-IID data leading to poor training accuracy, we propose dynamically updating the local model based on the divergence of the global and local models. This approach can significantly improve the accuracy of FL training when there is relatively large dispersion. In addition, we design a dynamic gradient clipping algorithm to alleviate the influence of noise on the model accuracy and to reduce potential privacy leakage caused by sharing model parameters. Finally, we evaluate the performance of the proposed scheme using commonly used open-source image datasets. The simulation results demonstrate that our method can significantly enhance the accuracy while protecting privacy and maintaining efficiency, thereby providing a new solution to data-sharing and privacy-protection challenges in the industrial Internet.
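As background for the gradient clipping step named in this abstract, the following is a minimal sketch of conventional clip-then-add-noise processing of a model update. It uses a fixed clipping threshold, not the paper's dynamic rule, which the abstract does not specify; the function names `clip_gradient` and `privatize` and the parameter `noise_multiplier` are illustrative assumptions.

```python
import math
import random


def clip_gradient(grad, max_norm):
    """Scale grad so that its L2 norm is at most max_norm.
    Gradients already within the bound are returned unchanged."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm <= max_norm or norm == 0.0:
        return list(grad)
    scale = max_norm / norm
    return [g * scale for g in grad]


def privatize(grad, max_norm, noise_multiplier, rng=random):
    """Clip the gradient, then add Gaussian noise whose standard
    deviation is noise_multiplier * max_norm (assumed noise model)."""
    clipped = clip_gradient(grad, max_norm)
    sigma = noise_multiplier * max_norm
    return [g + rng.gauss(0.0, sigma) for g in clipped]


clipped = clip_gradient([3.0, 4.0], max_norm=1.0)   # norm 5 is scaled down to norm 1
noisy = privatize([3.0, 4.0], max_norm=1.0, noise_multiplier=0.5)
```

Clipping bounds how much any single update can move the shared model, which in turn bounds how much noise is needed to mask it. A dynamic scheme would adjust `max_norm` during training instead of fixing it in advance.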
Funding: This work was financially supported by the National Key Research and Development Program of China (2022YFB3103200).
Abstract: With the development of technology, the connected vehicle has been upgraded from a traditional transport vehicle to an information terminal and energy storage terminal. The data of ICVs (intelligent connected vehicles) is the key to maximizing their efficiency. However, in the context of increasingly strict global data security supervision and compliance, vehicle data sharing faces numerous problems, and data islands are widespread. These problems include complex types of connected vehicle data, poor data collaboration between the IT (information technology) domain and the OT (operation technology) domain, different data format standards, a lack of shared trust sources, difficulty in ensuring the quality of shared data, a lack of data control rights, and difficulty in defining data ownership. This study proposes FADSF (Fuzzy Anonymous Data Share Frame), an automobile data sharing scheme based on blockchain. The data holder publishes the shared data information and forms the corresponding label storage on the blockchain. The data demander browses the data directory information to select and purchase data assets and verify them. Meanwhile, this paper designs a data structure, the Data Discrimination Bloom Filter (DDBF), for recording complaints about illegal data. When the number of data complaints reaches the threshold, the audit traceability contract is triggered to punish the illegal data publisher, aiming to improve the data quality and maintain a good data sharing ecology. The scheme is tested on Ethereum to demonstrate its feasibility, efficiency, and security.
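The complaint-threshold idea in this abstract can be sketched with a generic counting Bloom filter. This is not the paper's DDBF, whose internal layout the abstract does not describe; the class name `ComplaintCounter`, the table size, the hash construction, and the threshold of 3 are all assumptions made for illustration.

```python
import hashlib


class ComplaintCounter:
    """Counting Bloom filter that tracks complaints per data asset
    without storing the asset identifiers themselves."""

    def __init__(self, size=1024, num_hashes=3, threshold=3):
        self.size = size
        self.num_hashes = num_hashes
        self.threshold = threshold
        self.counters = [0] * size

    def _positions(self, data_id):
        # Derive num_hashes slots from salted SHA-256 digests; a set avoids
        # incrementing the same slot twice for one complaint.
        positions = set()
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{data_id}".encode()).digest()
            positions.add(int.from_bytes(digest[:8], "big") % self.size)
        return positions

    def estimate(self, data_id):
        """Upper bound on the number of complaints recorded for data_id."""
        return min(self.counters[p] for p in self._positions(data_id))

    def complain(self, data_id):
        """Record one complaint; return True once the threshold is reached,
        which is where an audit contract would be triggered."""
        for p in self._positions(data_id):
            self.counters[p] += 1
        return self.estimate(data_id) >= self.threshold


tracker = ComplaintCounter(threshold=3)
results = [tracker.complain("asset-42") for _ in range(3)]
print(results)
```

Because different assets can share slots, the estimate can overcount but never undercount, so a real deployment would confirm the complaint count before applying any penalty.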
Funding: National Key Research and Development Program of China (2021ZD0113704).
Abstract: In this paper, a variety of classical convolutional neural networks are trained on two different datasets using transfer learning. We demonstrate that the training dataset has a significant impact on the training results, in addition to the optimization achieved through the model structure. However, the lack of open-source agricultural data, combined with the absence of a comprehensive open-source data sharing platform, remains a substantial obstacle. This issue is closely related to the difficulty and high cost of obtaining high-quality agricultural data, the low level of education of most employees, underdeveloped distributed training systems, and inadequate data security. To address these challenges, this paper proposes constructing an agricultural data sharing platform based on a federated learning (FL) framework, aiming to overcome the shortage of high-quality data for training in the agricultural field.
Abstract: As an introductory course for the emerging major of big data management and application, "Introduction to Big Data" does not yet have a widely accepted curriculum standard or implementation plan. To this end, we discuss our explorations in the construction and teaching of big data courses for the major of big data management and application from the perspectives of course planning, course implementation, and course summary. Interviews with students and questionnaire feedback indicate that students are highly satisfied with some of the teaching measures and programs currently adopted.
Abstract: This study aims to investigate the influence of social media on college choice among undergraduates majoring in Big Data Management and Application in China. The study attempts to reveal how information on social media platforms such as Weibo, WeChat, and Zhihu influences the cognition and choice process of prospective students. By employing an online quantitative survey questionnaire, data were collected from the 2022 and 2023 classes of new students majoring in Big Data Management and Application at Guilin University of Electronic Technology. The aim was to evaluate the role of social media in their college choice process and understand the features and information that most attract prospective students. Social media has become a key factor influencing the college choice decision-making of undergraduates majoring in Big Data Management and Application in China. Students tend to obtain school information through social media platforms and use this information as an important reference in their decision-making process. Higher education institutions should strengthen their social media information dissemination, providing accurate, timely, and attractive information. It is also necessary to ensure effective management of social media platforms, maintain a positive reputation for the school on social media, and increase the interest and trust of prospective students. Simultaneously, educational decision-makers should consider incorporating social media analysis into their recruitment strategies to better attract new student enrollment. This study provides a new perspective for understanding higher education choice behavior in the digital age, particularly by revealing the importance of social media in the educational decision-making process. This has important practical and theoretical implications for higher education institutions, policymakers, and social media platform operators.
Funding: The National Social Science Foundation of China (No. 17BGL196).
Abstract: Because sharing consumers' privacy data has multifaceted and complex effects on the e-commerce platform and its two-sided agents, consumers and sellers, a game-theoretic model in a monopoly e-market is set up to study the equilibrium strategies of the three agents (the platform, the seller on it, and consumers) under privacy data sharing. Equilibrium decisions show that after sharing consumers' privacy data once, the platform can collect more privacy data from consumers. Meanwhile, privacy data sharing pushes the seller to reduce the product price. Moreover, the platform will increase the transaction fee if the privacy data sharing value is high. It is also indicated that privacy data sharing always benefits consumers and the seller. However, the platform's profit decreases if the privacy data sharing value is low and the privacy data sharing level is high. Finally, an extended model considering an incomplete information game among the agents is discussed. The results show that both the platform and the seller cannot obtain a high profit from privacy data sharing. Factors including the seller's possibility to buy privacy data, the privacy data sharing value, and the privacy data sharing level affect the two agents' payoffs. If the platform wishes to benefit from privacy data sharing, it should increase the possibility of the seller buying privacy data or increase the privacy data sharing value.
Funding: Supported by the MSIT (Ministry of Science and ICT), Korea, under the Special R&D Zone Development Project (R&D) - Development of R&D Innovation Valley support program (2023-DD-RD-0152) supervised by the Innovation Foundation. It was also partially supported by the Ministry of Science and ICT (MSIT), Korea, under the Information Technology Research Center (ITRC) support program (IITP-2024-2020-0-01797) supervised by the Institute for Information & Communications Technology Planning & Evaluation (IITP).
Abstract: In Decentralized Machine Learning (DML) systems, system participants contribute their resources to assist others in developing machine learning solutions. Identifying malicious contributions in DML systems is challenging, which has led to the exploration of blockchain technology. Blockchain leverages its transparency and immutability to record the provenance and reliability of training data. However, storing massive datasets or implementing model evaluation processes on smart contracts incurs high computational costs. Additionally, current research on preventing malicious contributions in DML systems primarily focuses on protecting models from being exploited by workers who contribute incorrect or misleading data. Less attention has been paid to the scenario where malicious requesters intentionally manipulate test data during evaluation to gain an unfair advantage. This paper proposes a transparent and accountable training data sharing method that securely shares data among potentially malicious system participants. First, we introduce a blockchain-based DML system architecture that supports secure training data sharing through the IPFS network. Second, we design a blockchain smart contract to transparently split datasets into training and test datasets without involving system participants. Under the system, transparent and accountable training data sharing can be achieved with attribute-based proxy re-encryption. We present a security analysis of the system and conduct experiments on the Ethereum and IPFS platforms to show the feasibility and practicality of the system.
Funding: Partially supported by the Foundation of State Key Laboratory of Public Big Data (No. PBD2022-01).
Abstract: In today's highly competitive retail industry, offline stores face increasing pressure on profitability. They hope to improve their shelf management with the help of big data technology. For this, on-shelf availability is an essential indicator of shelf data management and closely relates to customer purchase behavior. RFM (recency, frequency, and monetary) pattern mining is a powerful tool to evaluate the value of customer behavior. However, the existing RFM pattern mining algorithms do not consider the quarterly nature of goods, resulting in unreasonable shelf availability and difficulty in profit-making. To solve this problem, we propose a quarterly RFM mining algorithm for on-shelf products named OS-RFM. Our algorithm mines the high recency, high frequency, and high monetary patterns and considers the period of the on-shelf goods in quarterly units. We conducted experiments using two real datasets for numerical and graphical analysis to prove the algorithm's effectiveness. Compared with the state-of-the-art RFM mining algorithm, our algorithm can identify more patterns and performs well in terms of precision, recall, and F1-score, with the recall rate nearing 100%. Also, the novel algorithm operates with significantly shorter running times and more stable memory usage than existing mining algorithms. Additionally, we analyze the sales trends of products in different quarters and seasonal variations. The analysis assists businesses in maintaining reasonable on-shelf availability and achieving greater profitability.
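To make the recency, frequency, and monetary measures concrete, the sketch below computes them per item and per calendar quarter from a small transaction list. It is not the OS-RFM pattern mining algorithm, which the abstract does not detail; the function names, the `(date, item, amount)` record layout, and the sample transactions are assumptions for illustration.

```python
from datetime import date


def quarter_of(day):
    """Return (year, quarter) for a date, with quarters numbered 1 to 4."""
    return (day.year, (day.month - 1) // 3 + 1)


def rfm_by_quarter(transactions, today):
    """Aggregate (date, item, amount) records into per-quarter RFM statistics.
    Recency is the number of days from the last sale in that quarter to today."""
    stats = {}
    for day, item, amount in transactions:
        key = (quarter_of(day), item)
        rec = stats.setdefault(key, {"last": day, "frequency": 0, "monetary": 0.0})
        rec["last"] = max(rec["last"], day)
        rec["frequency"] += 1
        rec["monetary"] += amount
    return {
        key: {
            "recency_days": (today - rec["last"]).days,
            "frequency": rec["frequency"],
            "monetary": rec["monetary"],
        }
        for key, rec in stats.items()
    }


# Hypothetical sales records, not taken from the paper's datasets.
sales = [
    (date(2023, 1, 5), "tea", 4.0),
    (date(2023, 3, 20), "tea", 6.0),
    (date(2023, 4, 2), "tea", 5.0),
]
summary = rfm_by_quarter(sales, today=date(2023, 4, 10))
print(summary)
```

Grouping by quarter before scoring is what lets a seasonal item rank highly in the quarters when it actually sells, instead of being averaged down over the whole year.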
Abstract: On the basis of the definition and connotation of PDM (product data management), the factors that ensure implementation success are analyzed. The definition phase, analysis phase, design phase, build and test phase, and post-production phase of PDM implementation are described. The implementation is divided into ten processes, which consist of the above phases. The relationships between phases and processes are illustrated. Finally, a workflow is proposed to guide implementation at a fixed price.
Funding: National Major Scientific Instruments and Equipment Development Special Funds, China (No. 2016YFF0103303); National Science and Technology Support Program, China (No. 2014BAK02B03).
Abstract: This paper proposes a web-based system for the management and sharing of electron probe micro-analysis (EPMA) data in geology. A new web-based architecture that integrates the management and sharing functions is developed and implemented. Earth scientists can use this system not only to manage their data, but also to communicate and share it easily with other researchers. Data query methods provide the core functionality of the proposed management and sharing modules. The modules in this system have been developed using cloud GIS technologies, which help achieve real-time spatial area retrieval on a map. The system has been tested by approximately 263 users at Jilin University and Beijing SHRIMP Center. A survey was conducted among these users to estimate the usability of the primary functions of the system, and the assessment result is summarized and presented.
Abstract: This paper is concerned with the development of product data management (PDM) systems based on web technologies, referred to as WPDM systems. As a tool to integrate information, a traditional PDM system has many benefits for companies, such as improving design productivity and giving better control over projects. With the maturing of web technologies, the advantages of WPDM systems are obvious. We show these advantages in detail in Part 3. A WPDM system is built on a three-tier application model, consisting of a back-end, a middle layer, and a front-end, to provide security and flexibility. The basic designs in each layer are briefly introduced in Part 4. In the future, WPDM will be extended to integrate with other applications to provide a complete web-based engineering environment.
Abstract: The mining industry faces a number of challenges that promote the adoption of new technologies. Big data, which is driven by the accelerating progress of information and communication technology, is one of the promising technologies that can reshape the entire mining landscape. Despite numerous attempts to apply big data in the mining industry, fundamental problems of big data, especially big data management (BDM), in the mining industry persist. This paper aims to fill the gap by presenting the basics of BDM. This work provides a brief introduction to big data and BDM, and it discusses the challenges encountered by the mining industry to indicate the necessity of implementing big data. It also summarizes data sources in the mining industry and presents the potential benefits of big data to the mining industry. This work also envisions a future in which a global database project is established and big data is used together with other technologies (e.g., automation), supported by government policies and following international standards. This paper also outlines the precautions for the utilization of BDM in the mining industry.
Abstract: The CifNet network multi-well data management system is developed for the 100MB or 1000MB local network environments used in the Chinese oil industry. The core techniques of the CifNet system include: (1) establishing a highly efficient, low-cost network multi-well data management architecture based on the General Logging Curve Theory and the Cif data format; (2) implementing efficient access to and transmission of multi-well data in a C/S local network based on the TCP/IP protocol; (3) ensuring the safety of multi-well data in storage, access, and application based on Unix operating system security. By using the CifNet system, a researcher in the office or at home can access curves of any borehole in any working area of any oilfield. The application prospects of the CifNet system are also discussed.
Funding: Supported in part by the Zhejiang Provincial Natural Science Foundation of China under Grant nos. LZ22F020002 and LY22F020003, the National Natural Science Foundation of China under Grant nos. 61772018 and 62002226, and the key project of Humanities and Social Sciences in Colleges and Universities of Zhejiang Province under Grant no. 2021GH017.
Abstract: The fast proliferation of edge devices for the Internet of Things (IoT) has led to an explosion in data volume. The generated data is collected and shared using edge-based IoT structures at a considerably high frequency. Thus, the privacy exposure risk of data sharing grows increasingly serious when IoT devices make malicious requests to steal sensitive information from a cloud storage system through edge nodes. To address this issue, we present evolutionary privacy preservation learning strategies for an edge computing-based IoT data sharing scheme. In particular, we introduce evolutionary game theory and construct a payoff matrix to represent the interaction between IoT devices and edge nodes, where IoT devices and edge nodes are the two parties of the game. IoT devices may make malicious requests to achieve their goal of stealing private data. Accordingly, edge nodes should deny malicious IoT device requests to prevent IoT data from being disclosed. Both parties dynamically adjust their own strategies according to the opponent's strategy and finally maximize their payoffs. Built upon an application framework developed to illustrate the concrete data sharing architecture, a novel algorithm is proposed that can derive the optimal evolutionary learning strategy. Furthermore, we numerically simulate evolutionarily stable strategies, and the final results experimentally verify the correctness of the IoT data sharing privacy preservation scheme. Therefore, the proposed model can effectively defeat malicious invasion and protect sensitive information from leaking when IoT data is shared.
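The interaction between the two populations described in this abstract can be sketched with standard two-population replicator dynamics. The payoff values below are hypothetical, since the abstract does not list the paper's payoff matrix, and the forward-Euler integration is an illustrative choice, not the paper's algorithm.

```python
# Hypothetical payoffs (not from the paper).
# DEVICE_PAYOFF[device strategy][edge strategy]; EDGE_PAYOFF[edge strategy][device strategy]
DEVICE_PAYOFF = {
    "malicious": {"deny": -3.0, "grant": 2.0},
    "honest": {"deny": 0.0, "grant": 1.0},
}
EDGE_PAYOFF = {
    "deny": {"malicious": 2.0, "honest": -1.0},
    "grant": {"malicious": -4.0, "honest": 1.0},
}


def replicator_step(x, y, dt=0.01):
    """One forward-Euler step of the replicator equations.
    x: share of IoT devices sending malicious requests.
    y: share of edge nodes that deny requests."""
    u_mal = y * DEVICE_PAYOFF["malicious"]["deny"] + (1 - y) * DEVICE_PAYOFF["malicious"]["grant"]
    u_hon = y * DEVICE_PAYOFF["honest"]["deny"] + (1 - y) * DEVICE_PAYOFF["honest"]["grant"]
    u_deny = x * EDGE_PAYOFF["deny"]["malicious"] + (1 - x) * EDGE_PAYOFF["deny"]["honest"]
    u_grant = x * EDGE_PAYOFF["grant"]["malicious"] + (1 - x) * EDGE_PAYOFF["grant"]["honest"]
    # A strategy's share grows when its payoff beats the alternative.
    dx = x * (1 - x) * (u_mal - u_hon)
    dy = y * (1 - y) * (u_deny - u_grant)
    return x + dt * dx, y + dt * dy


def simulate(x, y, steps=1000, dt=0.01):
    """Return the list of (x, y) states, starting with the initial state."""
    trajectory = [(x, y)]
    for _ in range(steps):
        x, y = replicator_step(x, y, dt)
        trajectory.append((x, y))
    return trajectory


trajectory = simulate(0.6, 0.4, steps=500)
```

With these illustrative payoffs the interior rest point is x = y = 0.25, where neither population gains by switching strategy. Different payoff values would move that point or change which strategies survive, which is why the payoff matrix is the central modelling choice.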
Funding: This work is supported by the National Natural Science Foundation of China under Grant Nos. U1905211 and 61702103 and the Natural Science Foundation of Fujian Province under Grant Nos. 2020J01167 and 2020J01169.
Abstract: With the development of the Internet of Things (IoT), massive data sharing between IoT devices improves the Quality of Service (QoS) and user experience in various IoT applications. However, data sharing may cause serious privacy leakage for data providers. To address this problem, in this study, data sharing is realized through model sharing, based on which a secure data sharing mechanism, called BP2P-FL, is proposed using peer-to-peer federated learning with privacy protection for data providers. In addition, by introducing blockchain into the data sharing, every training process is recorded to ensure that data providers offer high-quality data. For further privacy protection, differential privacy is used to perturb the global data sharing model. The experimental results show that BP2P-FL has high accuracy and feasibility in the data sharing of various IoT applications.
Funding: Supported by the Ministry of Education Industry-University Cooperation Collaborative Education Projects of China under Grants 202102119036 and 202102082013.
Abstract: Data sharing technology in the Internet of Vehicles (IoV) has attracted great research interest with the goal of realizing intelligent transportation and traffic management. Meanwhile, major concerns have been raised about the security and privacy of vehicle data. The mobility and real-time characteristics of vehicle data make data sharing more difficult in the IoV. The emergence of blockchain and federated learning brings new directions. In this paper, a data-sharing model that combines blockchain and federated learning is proposed to solve the security and privacy problems of data sharing in the IoV. First, we use federated learning to share data instead of exposing actual data and propose an adaptive differential privacy scheme to further balance the privacy and availability of data. Then, we integrate the verification scheme into the consensus process, so that the consensus computation can filter out low-quality models. Experimental data shows that our data-sharing model can better balance the relationship between data availability and privacy, and also has enhanced security.