The generation of synthetic trajectories has become essential in various fields for analyzing complex movement patterns. However, the use of real-world trajectory data poses significant privacy risks, such as location re-identification and correlation attacks. To address these challenges, privacy-preserving trajectory generation methods are critical for applications relying on sensitive location data. This paper introduces DPIL-Traj, an advanced framework designed to generate synthetic trajectories while achieving a superior balance between data utility and privacy preservation. Firstly, the framework incorporates Differential Privacy Clustering, which anonymizes trajectory data by applying differential privacy techniques that add noise, ensuring the protection of sensitive user information. Secondly, Imitation Learning is used to replicate decision-making behaviors observed in real-world trajectories. By learning from expert trajectories, this component generates synthetic data that closely mimics real-world decision-making processes while optimizing the quality of the generated trajectories. Finally, Markov-based Trajectory Generation is employed to capture and maintain the inherent temporal dynamics of movement patterns. Extensive experiments on the GeoLife trajectory dataset show that DPIL-Traj improves utility performance by an average of 19.85% and privacy performance by an average of 12.51% compared to state-of-the-art approaches. Ablation studies further reveal that DP clustering effectively safeguards privacy, imitation learning enhances utility under noise, and the Markov module strengthens temporal coherence.
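The abstract does not spell out the clustering mechanism, but a common way to realize "DP clustering" is to release cluster centroids under the Laplace mechanism. A minimal sketch, assuming trajectory points are clipped to a bounded region so a per-coordinate sensitivity can be stated (the function name and the naive even budget split are illustrative, not DPIL-Traj's actual design):

```python
import numpy as np

def dp_release_centroids(centroids, epsilon, coord_sensitivity):
    """Perturb k-means-style centroids with Laplace noise before release.

    coord_sensitivity must upper-bound how much one trajectory can shift a
    single centroid coordinate (in a real analysis this requires clipping
    points to a known region and enforcing a minimum cluster size).
    """
    k, d = centroids.shape
    eps_per_coord = epsilon / (k * d)          # naive even budget split
    scale = coord_sensitivity / eps_per_coord
    return centroids + np.random.laplace(scale=scale, size=centroids.shape)

# Example: 3 centroids in 2-D, total budget eps = 1.0
noisy = dp_release_centroids(np.zeros((3, 2)), epsilon=1.0, coord_sensitivity=0.05)
```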
Deep learning's widespread dependence on large datasets raises privacy concerns due to the potential presence of sensitive information. Differential privacy stands out as a crucial method for preserving privacy, garnering significant interest for its ability to offer robust and verifiable privacy safeguards during data training. However, classic differentially private learning injects the same level of noise into the gradients across training iterations, which degrades the trade-off between model utility and privacy guarantees. To address this issue, this paper proposes an adaptive differential privacy mechanism that dynamically adjusts the privacy budget at the layer level as training progresses to resist membership inference attacks. Specifically, an equal privacy budget is initially allocated to each layer. Subsequently, as training advances, the privacy budget for layers closer to the output is reduced (adding more noise), while the budget for layers closer to the input is increased. The adjustment magnitude is determined automatically from the iteration count. This dynamic allocation provides a simple process for adjusting privacy budgets, alleviating the burden on users to tweak parameters and ensuring that privacy-preservation strategies align with training progress. Extensive experiments on five well-known datasets indicate that the proposed method outperforms competing methods in terms of accuracy and resilience against membership inference attacks.
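One way to realize such a schedule is a per-step reallocation that shifts budget from output-side layers to input-side layers in proportion to the iteration count. A hedged sketch, assuming a linear schedule (the abstract does not give the paper's exact rule):

```python
import numpy as np

def layer_budgets(num_layers, eps_step, t, T):
    """Split a fixed per-iteration budget eps_step across layers.

    depth 0 = input layer, depth 1 = output layer; as t/T grows, input-side
    layers gain budget (less noise) and output-side layers lose it.
    """
    base = np.full(num_layers, eps_step / num_layers)
    depth = np.linspace(0.0, 1.0, num_layers)
    eps = base * (1.0 + (t / T) * (0.5 - depth))   # symmetric shift, sum preserved
    return eps * (eps_step / eps.sum())            # renormalize exactly

# Gaussian noise std per layer would then scale as 1 / eps_layer.
print(layer_budgets(num_layers=4, eps_step=1.0, t=50, T=100))
```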
Federated learning effectively alleviates the privacy and security issues raised by the development of artificial intelligence through a distributed training architecture. Existing research has shown that attackers can compromise user privacy and security by stealing model parameters. Therefore, differential privacy is applied in federated learning to further counter such malicious behavior. However, the added noise and the update-clipping mechanism in differential privacy jointly limit the further development of federated learning in privacy protection and performance optimization. We therefore propose an adaptively adjusted differential privacy federated learning method. First, a dynamic adaptive privacy-budget allocation strategy is proposed, which flexibly adjusts the privacy budget within a given range based on each client's data volume and training requirements, thereby alleviating privacy-budget loss and reducing the magnitude of model noise. Second, a longitudinal clipping differential privacy strategy is proposed, which, based on differences in the factors that affect parameter updates, uses sparsification to trim local updates, thereby reducing the impact of the privacy-pruning step on model accuracy. The two strategies work together to ensure user privacy while reducing the effect of differential privacy on model accuracy. To evaluate the effectiveness of our method, we conducted extensive experiments on benchmark datasets, and the results show that the proposed method performs well in terms of both performance and privacy protection.
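A concrete reading of the two strategies, under stated assumptions: the per-client budget is interpolated within a configured range from the client's data volume (the linear rule is an assumption), and local updates are sparsified to their largest-magnitude coordinates before norm clipping (one standard sparse-trimming choice; the paper's criterion may differ):

```python
import numpy as np

def client_budget(n_i, n_min, n_max, eps_min, eps_max):
    """More local data -> larger budget -> less relative noise."""
    frac = (n_i - n_min) / max(n_max - n_min, 1)
    return eps_min + frac * (eps_max - eps_min)

def sparse_clip(update, k_frac=0.1, clip_norm=1.0):
    """Keep the top-k coordinates by magnitude, then clip the L2 norm."""
    flat = update.ravel()
    k = max(1, int(k_frac * flat.size))
    keep = np.argpartition(np.abs(flat), -k)[-k:]
    sparse = np.zeros_like(flat)
    sparse[keep] = flat[keep]
    norm = np.linalg.norm(sparse)
    if norm > clip_norm:
        sparse *= clip_norm / norm
    return sparse.reshape(update.shape)

# Example: eps in [0.5, 2.0] scaled by data volume; trim a 6-dim update
eps_i = client_budget(n_i=800, n_min=100, n_max=1000, eps_min=0.5, eps_max=2.0)
trimmed = sparse_clip(np.array([0.9, -0.1, 0.05, -1.2, 0.3, 0.02]), k_frac=0.34)
```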
The rapid development and widespread adoption of massive open online courses (MOOCs) have had a significant impact on China's education curriculum. However, fake reviews and ratings on these platforms seriously undermine the authenticity of course evaluations and user trust, requiring effective anomaly-detection techniques for screening. The textual characteristics of MOOC reviews, such as varying lengths and diverse emotional tendencies, make text analysis complex, and traditional rule-based analysis methods are often inadequate for such unstructured data. We propose a Differential Privacy-Enabled Text Convolutional Neural Network (DP-TextCNN) framework, aiming to achieve high-precision identification of outliers in MOOC course reviews and ratings while protecting user privacy. The framework leverages the strengths of Convolutional Neural Networks (CNN) in text feature extraction and combines them with differential privacy techniques, balancing data privacy protection against model performance by introducing controlled random noise during the data preprocessing stage. By embedding differential privacy into the model training process, we ensure the privacy security of the framework when handling sensitive data while maintaining high recognition accuracy. Experimental results indicate that the DP-TextCNN framework achieves an accuracy of over 95% in identifying fake reviews on the dataset. This outcome not only verifies the applicability of differential privacy techniques in TextCNN but also underscores their potential for handling sensitive educational data. Additionally, we analyze the specific impact of differential privacy parameters on framework performance, offering theoretical support and empirical analysis for striking an optimal balance between privacy protection and framework efficiency.
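The abstract only says that controlled random noise is introduced at preprocessing time; a minimal sketch of that idea is to clip each review's embedding to a bounded L1 norm and add Laplace noise before the CNN ever sees it (the clipping bound, the per-record local-DP framing, and the function name are assumptions):

```python
import numpy as np

def perturb_review_embeddings(X, epsilon, clip=1.0):
    """Clip each row (one review's embedding) to L1 norm <= clip, then add
    Laplace noise; replacing one record changes the output by at most 2*clip
    in L1, so scale = 2*clip/epsilon calibrates the mechanism per record."""
    norms = np.maximum(np.abs(X).sum(axis=1, keepdims=True), 1e-12)
    Xc = X * np.minimum(1.0, clip / norms)
    return Xc + np.random.laplace(scale=2.0 * clip / epsilon, size=Xc.shape)

# Example: 8 reviews embedded in 16 dimensions, eps = 2.0 per review
noisy = perturb_review_embeddings(np.random.randn(8, 16), epsilon=2.0)
```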
With the ongoing digitalization and intelligence of power systems, there is an increasing reliance on large-scale data-driven intelligent technologies for tasks such as scheduling optimization and load forecasting. Nevertheless, power data often contains sensitive information, making it a critical industry challenge to use this data efficiently while ensuring privacy. Traditional Federated Learning (FL) methods can mitigate data leakage by training models locally instead of transmitting raw data. Despite this, FL still has privacy concerns, especially gradient leakage, which might expose users' sensitive information. Therefore, integrating Differential Privacy (DP) techniques is essential for stronger privacy protection. Even so, the noise from DP may reduce the performance of federated learning models. To address this challenge, this paper presents an explainability-driven power data privacy federated learning framework. It incorporates DP technology and, based on model explainability, adaptively adjusts privacy-budget allocation and model aggregation, thereby balancing privacy protection and model performance. The key innovations of this paper are as follows: (1) We propose an explainability-driven power data privacy federated learning framework. (2) We detail a privacy-budget allocation strategy: assigning budgets per training round by gradient effectiveness and at model granularity by layer importance. (3) We design a weighted aggregation strategy that considers both SHAP values and model accuracy for quality knowledge sharing (see the sketch below). (4) Experiments show the proposed framework outperforms traditional methods in balancing privacy protection and model performance in power load forecasting tasks.
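A hedged sketch of innovation (3): blend a per-client SHAP-based importance score with validation accuracy into normalized aggregation weights. The blend ratio and normalizations are assumptions; the abstract states only that both factors are considered:

```python
import numpy as np

def shap_weighted_aggregate(updates, shap_scores, accuracies, alpha=0.5):
    """Weight each client's update by a mix of its SHAP importance and its
    validation accuracy, then return the weighted average update."""
    s = np.asarray(shap_scores, float); s = s / s.sum()
    a = np.asarray(accuracies, float);  a = a / a.sum()
    w = alpha * s + (1.0 - alpha) * a
    return sum(wi * u for wi, u in zip(w, updates))

# Example: three clients with 4-parameter updates
ups = [np.ones(4), 2 * np.ones(4), 3 * np.ones(4)]
agg = shap_weighted_aggregate(ups, shap_scores=[0.2, 0.5, 0.3],
                              accuracies=[0.90, 0.85, 0.95])
```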
Federated Learning (FL) is a practical solution that leverages distributed data across devices without the need for centralized data storage, enabling multiple participants to jointly train models while preserving data privacy and avoiding direct data sharing. Despite its privacy-preserving advantages, FL remains vulnerable to backdoor attacks, where malicious participants introduce backdoors into local models that are then propagated to the global model through the aggregation process. While existing differential privacy defenses have demonstrated effectiveness against backdoor attacks in FL, they often incur a significant degradation in the performance of the aggregated models on benign tasks. To address this limitation, we propose a novel backdoor defense mechanism based on differential privacy. Our approach first exploits the inherent out-of-distribution characteristics of backdoor samples to identify and exclude malicious model updates that deviate significantly from benign models. By filtering out clearly backdoor-infected models before applying differential privacy, our method reduces the noise level required for differential privacy, thereby enhancing model robustness while preserving performance. Experimental evaluations on the CIFAR10 and FEMNIST datasets demonstrate that our method effectively limits the backdoor accuracy to below 15% across various backdoor scenarios while maintaining high main-task accuracy.
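The filter-then-privatize shape can be sketched as follows: drop updates that sit anomalously far from the coordinate-wise median (a simple out-of-distribution proxy; the paper's detector is not specified in the abstract), then clip and noise only the survivors:

```python
import numpy as np

def filter_then_privatize(updates, z=2.0, clip=1.0, sigma=0.5):
    """Exclude suspected-backdoor updates, then aggregate under Gaussian noise.

    The z-score test on distances to the median is illustrative; because the
    obvious outliers are gone, sigma can be set lower than in a DP-only defense.
    """
    U = np.stack(updates)
    dists = np.linalg.norm(U - np.median(U, axis=0), axis=1)
    kept = U[dists <= dists.mean() + z * dists.std()]
    norms = np.maximum(np.linalg.norm(kept, axis=1, keepdims=True), 1e-12)
    kept = kept * np.minimum(1.0, clip / norms)        # per-update clipping
    agg = kept.mean(axis=0)
    return agg + np.random.normal(scale=sigma * clip / len(kept), size=agg.shape)

# Example: nine benign updates plus one far-off (suspect) update
ups = [np.random.randn(10) * 0.1 for _ in range(9)] + [np.ones(10) * 5.0]
agg = filter_then_privatize(ups)   # the outlier is filtered before noising
```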
Mobile crowdsensing (MCS) has become an effective paradigm to facilitate urban sensing. However, mobile users participating in sensing tasks face the risk of location-privacy leakage when uploading their actual sensing locations. Most location-privacy protection studies in mobile crowdsensing do not consider the temporal correlations between locations, so they are vulnerable to various inference attacks and suffer from low data availability. To address these problems, this paper proposes a dynamic differential location-privacy data publishing framework (DDLP) that protects privacy while publishing locations continuously. Firstly, Markov transition matrices are built for different times of day from historical trajectories, and a protection location set is then generated from the current location at each timestamp. Moreover, the exponential mechanism of differential privacy, with a purpose-designed utility function, perturbs the true location. Finally, experiments on a real-world trajectory dataset show that our method not only provides strong privacy guarantees but also outperforms existing methods in data availability and computational efficiency.
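A hedged sketch of the perturbation step: score each candidate in the protection set by negative distance to the true location (one plausible utility function; the paper designs its own) and sample via the exponential mechanism. The caller must supply a bound on how much the utility can change, e.g. the diameter of the candidate region:

```python
import numpy as np

def exp_mech_location(candidates, true_loc, epsilon, sensitivity):
    """Sample a released location with Pr proportional to
    exp(epsilon * u / (2 * sensitivity)), where u = -dist(candidate, true_loc)."""
    C = np.asarray(candidates, float)
    u = -np.linalg.norm(C - np.asarray(true_loc, float), axis=1)
    logits = epsilon * u / (2.0 * sensitivity)
    p = np.exp(logits - logits.max())
    p /= p.sum()
    return C[np.random.choice(len(C), p=p)]

# Example: four candidate cells around the true position (0.1, 0.2)
cand = [(0, 0), (0, 1), (1, 0), (1, 1)]
released = exp_mech_location(cand, (0.1, 0.2), epsilon=1.0, sensitivity=np.sqrt(2))
```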
As a distributed machine learning method, federated learning (FL) has the advantage of naturally protecting data privacy: it keeps data local and trains local models on that data, which effectively addresses the data-island and privacy-protection problems in artificial intelligence. However, existing research shows that attackers may still steal user information by analyzing the parameters exchanged during federated learning training and the aggregated parameters on the server side. To solve this problem, differential privacy (DP) techniques are widely used for privacy protection in federated learning. However, adding Gaussian noise perturbations to the data degrades model learning performance. To address these issues, this paper proposes a differential privacy federated learning scheme based on adaptive Gaussian noise (DPFL-AGN). To protect data privacy and security during federated learning training, adaptive Gaussian noise is added in the training process to hide the real parameters uploaded by each client. In addition, this paper proposes an adaptive noise-reduction method: as the model converges, the Gaussian noise in the later stages of training is reduced adaptively. A series of simulation experiments on the real MNIST and CIFAR-10 datasets shows that the DPFL-AGN algorithm performs better than the other algorithms.
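One way to read "reduced adaptively as the model converges" is a plateau test on the recent loss: once improvement stalls, shrink the noise multiplier. The window, tolerance, and decay factor below are illustrative, not DPFL-AGN's published schedule:

```python
def next_sigma(sigma, loss_history, decay=0.9, window=3, tol=0.01):
    """Return the noise std for the next round: decay it when the average
    loss over the last `window` rounds has nearly stopped improving."""
    if len(loss_history) >= 2 * window:
        recent = sum(loss_history[-window:]) / window
        earlier = sum(loss_history[-2 * window:-window]) / window
        if earlier - recent < tol * abs(earlier):   # plateau detected
            return sigma * decay
    return sigma

# Example: loss has flattened, so sigma shrinks from 1.0 to 0.9
print(next_sigma(1.0, [0.500, 0.499, 0.498, 0.498, 0.497, 0.497]))
```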
In recent years, research on data collection under local differential privacy (LDP) has expanded its focus from elementary data types to more complex structural data, such as set-valued and graph data. However, our comprehensive review of the existing literature reveals a shortage of studies on key-value data collection, which must simultaneously collect the frequencies of keys and the means of the values associated with each key. Additionally, existing allocations of the privacy budget between key frequencies and per-key value means do not yield an optimal utility trade-off. Recognizing the importance of accurate key-frequency and mean estimation for key-value data collection, this paper presents a novel framework: the Key-Strategy Framework for Key-Value Data Collection under LDP. First, the Key-Strategy Unary Encoding (KS-UE) strategy is proposed within non-interactive frameworks for privacy-budget allocation that achieves precise key frequencies; subsequently, the Key-Strategy Generalized Randomized Response (KS-GRR) strategy is introduced for interactive frameworks to enhance the efficiency of collecting frequent keys through group-and-iteration methods. Both strategies are adapted to scenarios in which users possess either a single key-value pair or multiple pairs. Theoretically, we demonstrate that the variance of KS-UE is lower than that of existing methods. These claims are substantiated through extensive experimental evaluation on real-world datasets, confirming the effectiveness and efficiency of the KS-UE and KS-GRR strategies.
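KS-GRR builds on the standard k-ary generalized randomized response primitive, which can be stated in a few lines (this is the textbook GRR mechanism, not the full grouped, iterative strategy):

```python
import math
import random

def grr_perturb(key, domain, epsilon):
    """Report the true key with p = e^eps / (e^eps + k - 1); otherwise report
    one of the other k-1 keys uniformly at random. Satisfies eps-LDP."""
    k = len(domain)
    p = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    if random.random() < p:
        return key
    return random.choice([x for x in domain if x != key])

# Example: perturb a key from a 5-element domain at eps = 1.0
print(grr_perturb("k2", ["k1", "k2", "k3", "k4", "k5"], epsilon=1.0))
```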
The rapid evolution of artificial intelligence (AI) technologies has significantly propelled the advancement of the Internet of Vehicles (IoV). With AI support, represented by machine learning technology, vehicles gain the capability to make intelligent decisions. As a distributed learning paradigm, federated learning (FL) has emerged as a preferred solution in IoV: compared to traditional centralized machine learning, FL reduces communication overhead and improves privacy protection. Despite these benefits, FL still faces security and privacy concerns, such as poisoning attacks and inference attacks, prompting exploration of blockchain integration to strengthen its security posture. This paper introduces a novel blockchain-enabled federated learning (BCFL) scheme with differential privacy (DP) tailored for IoV. To meet the performance demands of the IoV environment, the proposed methodology integrates a consortium blockchain with Practical Byzantine Fault Tolerance (PBFT) consensus, which offers superior efficiency over conventional public blockchains. In addition, the proposed approach applies the Differentially Private Stochastic Gradient Descent (DP-SGD) algorithm in the local training process of FL for enhanced privacy protection. Experimental results indicate that the blockchain integration elevates the security level of FL, in that the proposed approach effectively safeguards FL against poisoning attacks, while the additional overhead of blockchain integration remains moderate enough to meet the efficiency criteria of IoV. Furthermore, by incorporating DP, the proposed approach is shown to provide an (ε, δ)-privacy guarantee while maintaining an acceptable level of model accuracy, effectively mitigating the threat of inference attacks on private information.
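The local training step is standard DP-SGD (Abadi et al.): clip each example's gradient, sum, add Gaussian noise calibrated to the clipping bound, and average. A minimal NumPy sketch of that single step, with the FL and blockchain wiring omitted:

```python
import numpy as np

def dp_sgd_step(params, per_example_grads, clip_norm, noise_mult, lr):
    """One DP-SGD update over a batch of per-example gradients."""
    G = np.stack(per_example_grads)
    norms = np.maximum(np.linalg.norm(G, axis=1, keepdims=True), 1e-12)
    G = G * np.minimum(1.0, clip_norm / norms)          # per-example clipping
    g = G.sum(axis=0)
    g += np.random.normal(scale=noise_mult * clip_norm, size=g.shape)
    return params - lr * g / len(per_example_grads)

# Example: 4 examples, 3 parameters
p = dp_sgd_step(np.zeros(3), [np.random.randn(3) for _ in range(4)],
                clip_norm=1.0, noise_mult=1.1, lr=0.1)
```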
Differential privacy, owing to its strong privacy-protection capability, has been applied to the random forest algorithm to address its privacy-leakage problem; however, applying differential privacy to random forests directly causes a severe drop in classification accuracy. To balance the tension between privacy protection and model accuracy, an efficient differentially private random forest training algorithm, eDPRF (efficient differential privacy random forest), is proposed. Specifically, the algorithm designs a decision-tree construction method that queries output advantages efficiently by introducing the permute-and-flip mechanism, and further designs corresponding utility functions to output split features and labels accurately, effectively improving the tree model's ability to learn from the data under perturbation. Meanwhile, a privacy-budget allocation strategy is designed based on composition theorems, which raises the query budget of tree nodes by drawing training subsets via sampling without replacement and adjusting internal budgets differentially. Finally, theoretical analysis and experimental evaluation show that, under the same privacy budget, the algorithm achieves better classification accuracy than comparable algorithms.
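Reading the translated "permute-and-flip" as the selection mechanism of McKenna and Sheldon (2020), the node-level selector can be sketched as follows; the utility values would come from the paper's split/label utility functions:

```python
import math
import random

def permute_and_flip(utilities, epsilon, sensitivity):
    """Visit candidates in random order; accept candidate r with probability
    exp(eps * (u_r - u_max) / (2 * sens)). The maximizer is accepted with
    probability 1, so a single pass always returns an index."""
    u_max = max(utilities)
    order = random.sample(range(len(utilities)), len(utilities))
    for r in order:
        if random.random() < math.exp(
                epsilon * (utilities[r] - u_max) / (2 * sensitivity)):
            return r

# Example: pick a split feature from information-gain-style scores
best = permute_and_flip([0.12, 0.40, 0.35], epsilon=1.0, sensitivity=1.0)
```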
The proliferation of Large Language Models (LLMs) across various sectors underscored the urgency of addressing potential privacy breaches. Vulnerabilities such as prompt injection attacks and other adversarial tactics could make these models inadvertently disclose their training data. Such disclosures could compromise personally identifiable information, posing significant privacy risks. In this paper, we proposed a novel multi-faceted approach called Whispered Tuning to address privacy leaks in large language models (LLMs). We integrated a PII redaction model, differential privacy techniques, and an output filter into the LLM fine-tuning process to enhance confidentiality. Additionally, we introduced novel ideas like the Epsilon Dial, an adjustable privacy budget for differentiated training phases per data-handler role. Through empirical validation, including attacks on non-private models, we demonstrated the robustness of our proposed solution, SecureNLP, in safeguarding privacy without compromising utility. This pioneering methodology significantly fortified LLMs against privacy infringements, enabling responsible adoption across sectors.
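The redact-then-filter shape of the pipeline can be illustrated with a toy regex redactor standing in for the learned PII model (the patterns and function names are placeholders; the paper's redactor and output filter are models, not regexes):

```python
import re

EMAIL = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b")
PHONE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")

def redact(text: str) -> str:
    """Replace PII spans with typed placeholders before fine-tuning."""
    return PHONE.sub("[PHONE]", EMAIL.sub("[EMAIL]", text))

def safe_generate(generate, prompt: str) -> str:
    """Scrub the prompt going in and the completion coming out."""
    return redact(generate(redact(prompt)))

# Example with a stub model that leaks an address
print(safe_generate(lambda s: s + " reach me at bob@example.com",
                    "Call 555-123-4567 and"))
```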
This paper investigates a class of constrained distributed zeroth-order optimization (ZOO) problems over time-varying unbalanced graphs while ensuring privacy preservation among individual agents. Recent progress notwithstanding, existing work addresses these concerns separately, and there remains a lack of solutions offering theoretical guarantees for both privacy protection and constrained ZOO over time-varying unbalanced graphs. We hereby propose a novel algorithm, termed the differential privacy (DP) distributed push-sum based zeroth-order constrained optimization algorithm (DP-ZOCOA). Operating over time-varying unbalanced graphs, DP-ZOCOA obviates the need for supplemental sub-optimization computations, thereby reducing overhead in comparison to distributed primal-dual methods. DP-ZOCOA is specifically tailored to constrained ZOO problems over time-varying unbalanced graphs, offering a guarantee of convergence to the optimal solution while robustly preserving privacy. Moreover, we provide rigorous proofs of convergence and privacy for DP-ZOCOA, underscoring its efficacy in attaining optimal convergence without constraints. To enhance its applicability, we incorporate DP-ZOCOA into the federated learning framework and formulate a decentralized zeroth-order constrained federated learning algorithm (ZOCOA-FL) to address challenges stemming from the time-varying imbalance of the communication topology. Finally, the performance and effectiveness of the proposed algorithms are thoroughly evaluated through simulations on distributed least squares (DLS) and decentralized federated learning (DFL) tasks.
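The zeroth-order core is a two-point gradient estimate; adding Laplace noise to the finite difference is one generic way to privatize it (the estimator below is the standard sphere-smoothing form; DP-ZOCOA's push-sum weights and constraint projection are not reproduced):

```python
import numpy as np

def dp_zo_grad(f, x, mu=1e-2, noise_scale=1e-3):
    """Two-point zeroth-order gradient estimate with a Laplace-noised difference:
    g = d * (f(x + mu*u) - f(x - mu*u) + Lap) / (2*mu) * u,  u ~ unit sphere."""
    u = np.random.randn(*x.shape)
    u /= np.linalg.norm(u)
    diff = f(x + mu * u) - f(x - mu * u) + np.random.laplace(scale=noise_scale)
    return x.size * diff / (2.0 * mu) * u

# Example: noisy zeroth-order descent on f(x) = ||x||^2
x = np.ones(5)
for _ in range(100):
    x -= 0.05 * dp_zo_grad(lambda z: z @ z, x)
```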
To realize dynamic statistical publishing and protection of location-based data privacy, this paper proposes a differential privacy publishing algorithm based on adaptive sampling and grid clustering and adjustment. A PID control strategy is combined with the difference in data variation to adjust the data-publishing intervals dynamically. The spatial-temporal correlations of adjacent snapshots are utilized to design the grid clustering and adjustment algorithm, which saves execution time during publishing. The budget-distribution and budget-absorption strategies are improved to form a sliding-window-based differential privacy statistical publishing algorithm, which realizes continuous statistical publishing with privacy protection and improves the accuracy of the published data. Experiments and analysis on large datasets of actual locations show that the proposed privacy-protection algorithm is superior to existing algorithms in the accuracy of adaptive sampling times, the availability of published data, and the execution efficiency of the publishing method.
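A hedged sketch of the adaptive-sampling idea: feed the discrepancy between the last published statistics and the newly observed data into a PID controller and let the control signal shorten or stretch the next publishing interval (the gains and the mapping from signal to interval are illustrative):

```python
class PIDSampler:
    """PID controller on the publish-vs-observed error; a larger error means
    the data is changing fast, so the next interval shrinks."""

    def __init__(self, kp=0.8, ki=0.1, kd=0.2, base_interval=10.0):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.base = base_interval
        self.integral = 0.0
        self.prev = 0.0

    def next_interval(self, error: float) -> float:
        self.integral += error
        signal = (self.kp * error + self.ki * self.integral
                  + self.kd * (error - self.prev))
        self.prev = error
        return max(self.base / (1.0 + abs(signal)), 1.0)  # floor of 1 timestamp

sampler = PIDSampler()
print(sampler.next_interval(error=0.4))   # fast change -> shorter interval
```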
The integration of technologies like artificial intelligence, 6G, and vehicular ad-hoc networks holds great potential to meet the communication demands of the Internet of Vehicles and drive the advancement of vehicle applications. However, these advancements also generate a surge in data-processing requirements, necessitating the offloading of vehicular tasks to edge servers due to the limited computational capacity of vehicles. Despite recent advancements, the robustness and scalability of existing approaches with respect to the number of vehicles, edge servers, and their resources, as well as privacy, remain a concern. In this paper, a lightweight offloading strategy is proposed that leverages ubiquitous connectivity through the Space-Air-Ground Integrated Vehicular Network architecture while ensuring privacy preservation. The Internet of Vehicles (IoV) environment is first modeled as a graph, with vehicles and base stations as nodes and their communication links as edges. Secondly, vehicular applications are offloaded to suitable servers based on latency using an attention-based heterogeneous graph neural network (HetGNN) algorithm. Subsequently, a differentially private stochastic gradient descent training mechanism is employed to preserve privacy during vehicle training and offloading inference. Finally, simulation results demonstrate that the proposed HetGNN method performs well, with an inference time of 0.321 s, which is 42.68%, 63.93%, 30.22%, and 76.04% less than baseline methods such as Deep Deterministic Policy Gradient, Deep Q-Learning, Deep Neural Network, and Genetic Algorithm, respectively.
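The first modeling step can be sketched directly with networkx: typed nodes for vehicles and base stations and latency-weighted edges for communication links (node identifiers and attribute names are illustrative):

```python
import networkx as nx

def build_iov_graph(vehicles, stations, links):
    """links: iterable of (node_a, node_b, latency_ms) tuples."""
    g = nx.Graph()
    g.add_nodes_from(vehicles, kind="vehicle")
    g.add_nodes_from(stations, kind="base_station")
    for a, b, latency in links:
        g.add_edge(a, b, latency=latency)
    return g

g = build_iov_graph(["v1", "v2"], ["bs1"],
                    [("v1", "bs1", 12.5), ("v2", "bs1", 8.3)])
# A latency-aware offloader would then score (vehicle, server) pairs over g.
```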
Histogram publication under centralized differential privacy and local differential privacy has been widely studied. To resolve the difficulty of balancing users' privacy requirements against publication error, a histogram publishing algorithm, OD-HP (histogram publishing based on optimized local hash and dummy points), is proposed under the shuffle differential privacy model. The algorithm encodes and perturbs user data with the Optimized Local Hashing (OLH) mechanism, addressing the large error caused by large value domains. To resist collusion between the shuffler and the collector, dummy data are added to the perturbed data; the shuffler uniformly and randomly shuffles the perturbed and dummy data, and the collector publishes the histogram. Finally, an EM algorithm refines the shuffled data. The privacy and utility of OD-HP are analyzed theoretically, and the scheme is validated on real datasets. Experimental results show that OD-HP effectively reduces publication error while guaranteeing data privacy.
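The client-side OLH step (Wang et al.'s Optimized Local Hashing) can be sketched as follows; the per-user hash is simulated with a seeded `hash`, and the dummy-point and shuffling layers of OD-HP are omitted:

```python
import math
import random

def olh_perturb(value, epsilon, user_seed):
    """Hash the value into a domain of size g = round(e^eps) + 1, then apply
    generalized randomized response over the hashed domain."""
    g = int(round(math.exp(epsilon))) + 1
    hashed = hash((user_seed, value)) % g     # stand-in for a universal hash
    p = math.exp(epsilon) / (math.exp(epsilon) + g - 1)
    if random.random() < p:
        report = hashed
    else:
        report = random.choice([y for y in range(g) if y != hashed])
    return user_seed, report                  # the server needs both

# Example: one user reports item 7 at eps = 1.0
print(olh_perturb(7, epsilon=1.0, user_seed=42))
```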