Funding: Supported in part by the National Social Science Foundation of China under Grant 20BTQ058, and in part by the Natural Science Foundation of Hunan Province under Grant 2023JJ50033.
Abstract: Large-scale neural network-based federated learning (FL) has gained public recognition for its effectiveness in distributed training. Nonetheless, the open system architecture inherent to federated learning systems raises concerns about their vulnerability to attacks. Poisoning attacks have become a major menace to federated learning because they are both stealthy and highly destructive: by altering the local model during routine training, attackers can easily contaminate the global model. Traditional detection and aggregation solutions mitigate certain threats, but they are still insufficient to completely eliminate the influence of attackers. Federated unlearning, which can remove unreliable models while maintaining the accuracy of the global model, has therefore become a solution. Unfortunately, some existing federated unlearning approaches are difficult to apply to large neural network models because of their high computational cost. Hence, we propose SlideFU, an efficient anti-poisoning federated unlearning framework. The primary idea of SlideFU is to employ a sliding window to structure the training process, where all operations are confined within the window. We design a malicious-client detection scheme based on principal component analysis (PCA), which computes trust factors between compressed models at low cost to eliminate unreliable models. After confirming that the global model is under attack, the system activates the federated unlearning process and calibrates the gradients according to the update direction of the calibration gradients. Experiments on two public datasets demonstrate that our scheme can recover a robust model with extremely high efficiency.
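The PCA-based detection step lends itself to a compact illustration. The sketch below is only a plausible reading of the abstract, not SlideFU's actual implementation: it compresses flattened client updates with PCA and treats the mean pairwise cosine similarity in the compressed space as the trust factor; the cosine-based trust definition and the trust_threshold value are assumptions.

```python
# Minimal sketch of PCA-based filtering of client updates (illustrative only;
# the trust-factor definition and threshold are assumptions, not SlideFU's exact design).
import numpy as np
from sklearn.decomposition import PCA

def filter_clients(updates, n_components=8, trust_threshold=0.5):
    """updates: (n_clients, n_params) array of flattened local model updates."""
    n_clients = len(updates)
    pca = PCA(n_components=min(n_components, n_clients))
    compressed = pca.fit_transform(updates)                   # low-dimensional client representations
    normed = compressed / (np.linalg.norm(compressed, axis=1, keepdims=True) + 1e-12)
    sims = normed @ normed.T                                   # pairwise cosine similarities
    trust = (sims.sum(axis=1) - 1.0) / (n_clients - 1)         # mean similarity to the other clients
    keep = trust >= trust_threshold                            # discard low-trust (suspected malicious) clients
    return keep, trust
```

Updates flagged as unreliable would then be excluded from aggregation before the unlearning and gradient-calibration steps described above.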
Funding: Supported by the National Natural Science Foundation of China (62102035) and the National Key Research and Development Program of China (2022ZD0115901).
Abstract: Nowadays, machine learning is widely used in various applications. Training a model requires huge amounts of data, which can pose a threat to user privacy. With growing concern for privacy, the "Right to be Forgotten" has been proposed, meaning that users have the right to request that their personal information be removed from machine learning models. Machine unlearning emerged in response to this need. Implementing machine unlearning is not easy, because simply deleting samples from a database does not make the model "forget" the data. This paper therefore summarises the machine unlearning formulation, process, deletion requests, design requirements and validation, algorithms, applications, and future perspectives, in the hope of helping future researchers in machine unlearning.
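The point that deleting rows from a database does not, by itself, make a trained model forget them can be seen in the simplest exact-unlearning baseline: refit the model on the retained data only. The sketch below is a generic illustration; the model choice and names are assumptions, not drawn from the survey.

```python
# Exact unlearning by retraining: the stale model still encodes the deleted rows
# until a new model is fit on the retained data only. Illustrative sketch.
import numpy as np
from sklearn.linear_model import LogisticRegression

def exact_unlearn(X, y, forget_idx):
    keep = np.setdiff1d(np.arange(len(X)), forget_idx)
    return LogisticRegression(max_iter=1000).fit(X[keep], y[keep])

X, y = np.random.randn(200, 5), np.random.randint(0, 2, 200)
stale = LogisticRegression(max_iter=1000).fit(X, y)    # trained on all 200 rows
fresh = exact_unlearn(X, y, forget_idx=np.arange(10))  # equivalent to never seeing rows 0-9
```

Retraining from scratch is the correctness baseline that approximate unlearning algorithms aim to match at far lower cost.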
Funding: Supported by the National Key Research and Development Program of China (2020YFC2003404), the National Natural Science Foundation of China (Nos. 62072465, 62172155, 62102425, 62102429), the Science and Technology Innovation Program of Hunan Province (Nos. 2022RC3061, 2021RC2071), and the Natural Science Foundation of Hunan Province (No. 2022JJ40564).
Abstract: As an emerging discipline, machine learning has been widely used in artificial intelligence, education, meteorology, and other fields. Training machine learning models requires large amounts of practical data, which inevitably involves user privacy. Moreover, by polluting the training data, a malicious adversary can poison the model and thus compromise model security. Data providers expect the model trainer to prove the confidentiality of the model to them, and the trainer will be required to withdraw data when that trust collapses. Likewise, trainers hope to forget injected data and regain security when crafted poisoned data is discovered after training. We therefore focus on forgetting systems, whose process we call machine unlearning, capable of forgetting specific data entirely and efficiently. In this paper, we present the first comprehensive survey of this realm. We summarize and categorize existing machine unlearning methods based on their characteristics and analyze the relation between machine unlearning and relevant fields (e.g., inference attacks and data poisoning attacks). Finally, we briefly summarize existing research directions.
Abstract: The present work is a reflection on several theories and approaches to learning and their relevance to the training of social agents. The reflection emerges from training practice and from the observation that what it means to learn, and how learning actually occurs, is not at the center of discussion in Latin American universities, which emphasize what to teach rather than how to teach.
Abstract: The aim of the article is to explore the relation among capitalism, creative economy, and the end of rest in Gustavo Vinagre's movie Unlearning to Sleep. The main argument indicates that, in the context of the imperatives within the inhumane temporalities of the 24/7 society, sleep and rest may represent an inevitable and anomalous resistance to the demands of the capitalist order in which the creative economy is immersed and exposed in the movie.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 62177033) and sponsored by the Huawei Innovation Research Program.
Abstract: 1 Introduction. Large Language Models (LLMs) possess massive parameters and are trained on vast datasets, demonstrating exceptional proficiency in various tasks. The remarkable advancements in LLMs also inspire the exploration of leveraging LLMs as recommenders (LLMRec), whose effectiveness stems from the extensive open-world knowledge and reasoning ability of LLMs [1]. LLMRec obtains its recommendation ability through instruction tuning on user interaction data. In many cases, however, it is also crucial for LLMRec to forget specific user data, which is referred to as recommendation unlearning [2], as shown in Fig. 1.
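As a purely illustrative picture of this setting, the sketch below turns user interactions into instruction-tuning samples and removes one user's samples when an unlearning request arrives; the prompt template and field names are assumptions, not taken from any specific LLMRec system.

```python
# Hypothetical sketch of recommendation unlearning at the data level:
# build instruction-tuning samples from interactions, then drop one user's
# samples before the recommender is re-tuned or otherwise corrected.
def build_tuning_samples(interactions):
    samples = []
    for user_id, history, next_item in interactions:
        prompt = (f"The user has interacted with: {', '.join(history)}. "
                  f"Recommend the next item.")
        samples.append({"user_id": user_id, "prompt": prompt, "response": next_item})
    return samples

def unlearn_user(samples, user_id):
    # An unlearning request for `user_id` removes every sample derived from that user.
    return [s for s in samples if s["user_id"] != user_id]
```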
Abstract: The acceleration of global digitalization has been accompanied by an increasingly pronounced loss of data subjects' control over their information. Data security laws have been enacted both in China and abroad, among which the Right to Be Forgotten emphasizes that data subjects have the right to withdraw their data from data users. Machine Unlearning is the machine learning technique that puts the Right to Be Forgotten into practice: it allows the model owner (i.e., the data user) to forget designated training data from an already trained model, so as to satisfy data owners' requests to withdraw their data. Existing methods for verifying unlearning typically assume a reference model that has never used the forgotten data, and complete verification by measuring whether the parameter or output distributions of the unlearned model and the reference model are sufficiently similar. In malicious settings, however, the model owner can easily forge the parameters and output distributions of the unlearned model, and model parameters are generally hard to attribute to specific training data, so the verifier cannot effectively verify whether the target model has forgotten the data. This paper proposes a new publicly verifiable machine unlearning scheme executed between the data owner and the model owner; when the model owner misbehaves, the data owner can generate non-repudiable evidence that any third party can verify. Specifically, the data owner first uses a dynamic universal accumulator to authenticate data authorized for use or to delete data whose authorization has been withdrawn; the model owner then proves, in the publicly verifiable covert model, that model training used the accumulated data or did not use the non-accumulated data; finally, the data owner verifies the validity of the proof and, if unauthorized data use is detected, generates publicly verifiable evidence to hold the model owner accountable for its illegitimate behavior. Experiments evaluate the computational cost of proving and verification under different data volumes, as well as the impact of deleting different data points on model predictions.
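The accumulator component can be pictured with a toy RSA-style construction: authorized records are folded into a public accumulator value, and a membership witness lets anyone check that a record was authorized. The sketch below is a deliberately insecure simplification (tiny modulus, naive hash-to-prime, recompute-on-delete) and not the dynamic universal accumulator used in the paper.

```python
# Toy RSA-style accumulator: illustrative only (insecure parameters, no trapdoor
# operations); a real dynamic universal accumulator also supports efficient
# deletion and non-membership witnesses.
import hashlib
from sympy import nextprime

N = 1009 * 1013   # toy modulus; real schemes use a ~2048-bit RSA modulus
G = 3             # public base

def to_prime(record: bytes) -> int:
    # Map a record to a prime representative via hashing (toy hash-to-prime).
    return nextprime(int(hashlib.sha256(record).hexdigest(), 16) % 10**6 + 2)

def accumulate(records):
    acc = G
    for r in records:
        acc = pow(acc, to_prime(r), N)
    return acc

def witness(records, target):
    # Membership witness: accumulate every authorized record except the target.
    return accumulate([r for r in records if r != target])

def verify(acc, target, wit):
    return pow(wit, to_prime(target), N) == acc

records = [b"record-1", b"record-2", b"record-3"]
acc = accumulate(records)
assert verify(acc, b"record-2", witness(records, b"record-2"))
```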
Abstract: In Natural Language Processing (NLP), backdoor attacks have become a major threat to modern NLP applications, seriously undermining system security and reliability. Although various defense strategies have been proposed for the text domain, existing methods still struggle with complex attack scenarios when the defender neither has access to the poisoned dataset nor participates in the backdoor training process. To address this, we propose NLPShield, a machine-unlearning-based defense against textual backdoor attacks. Using only a small number of clean samples, the method achieves effective defense through two key stages: mislabeled-sample training and clean-neuron pruning. Experiments on the SST-2 and AGNews datasets show that, while maintaining high clean accuracy, NLPShield reduces the attack success rate by 24.83% on average compared with state-of-the-art baseline defenses. This indicates that NLPShield significantly improves defense against a variety of backdoor attacks and effectively mitigates textual backdoor attacks.
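The clean-neuron pruning stage can be sketched under the assumption (an interpretation, not the paper's exact criterion) that hidden units which stay near-silent on the few available clean samples are the ones most likely to serve the backdoor.

```python
# Hedged sketch of clean-neuron pruning: rank hidden units by their mean
# activation on a small clean set and zero out the least-used ones.
# The layer choice, scoring rule, and prune ratio are illustrative assumptions.
import torch

@torch.no_grad()
def prune_dormant_neurons(layer: torch.nn.Linear, clean_hidden: torch.Tensor, ratio: float = 0.1):
    """clean_hidden: (n_clean_samples, in_features) activations feeding `layer`."""
    mean_act = clean_hidden.abs().mean(dim=0)        # per-neuron usage on clean samples
    k = int(ratio * mean_act.numel())
    prune_idx = torch.argsort(mean_act)[:k]          # least-activated ("dormant") neurons
    layer.weight[:, prune_idx] = 0.0                 # cut their contribution to this layer
    return prune_idx
```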