Funding: Supported by the Systematic Major Project of Shuohuang Railway Development Co., Ltd., National Energy Group (Grant Number: SHTL-23-31) and the Beijing Natural Science Foundation (U22B2027).
Abstract: In the realm of Intelligent Railway Transportation Systems, effective multi-party collaboration is crucial due to concerns over privacy and data silos. Vertical Federated Learning (VFL) has emerged as a promising approach to facilitate such collaboration, allowing diverse entities to collectively enhance machine learning models without sharing sensitive training data. However, existing works have highlighted VFL’s susceptibility to privacy inference attacks, in which an honest-but-curious server could reconstruct a client’s raw data from the embeddings the client uploads. This vulnerability poses a significant threat to VFL-based intelligent railway transportation systems. In this paper, we introduce SensFL, a novel privacy-enhancing method to defend against privacy inference attacks in VFL. Specifically, SensFL integrates regularization of the sensitivity of embeddings to the original data into the model training process, effectively limiting the information contained in shared embeddings. By reducing the sensitivity of embeddings to the original data, SensFL can effectively resist reverse privacy attacks and prevent the reconstruction of the original data from the embeddings. Extensive experiments were conducted on four distinct datasets and three different models to demonstrate the efficacy of SensFL. Experimental results show that SensFL can effectively mitigate privacy inference attacks while maintaining the accuracy of the primary learning task. These results underscore SensFL’s potential to advance privacy protection technologies within VFL-based intelligent railway systems, addressing critical security concerns in collaborative learning environments.
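The abstract does not give SensFL's exact regularizer, so the following is only a minimal sketch of one common way to penalize embedding sensitivity: adding the squared Frobenius norm of the Jacobian of the embedding with respect to the input to the task loss. The one-layer `tanh` bottom model, the function names, and the weight `lam` are all assumptions made for illustration.

```python
import numpy as np

def embed(W, x):
    """Hypothetical client bottom model: one tanh layer mapping raw features to an embedding."""
    return np.tanh(W @ x)

def sensitivity(W, x):
    """Squared Frobenius norm of the Jacobian d(embedding)/d(input) at x."""
    h = embed(W, x)
    # For h = tanh(Wx), the Jacobian is diag(1 - h^2) @ W.
    J = (1.0 - h**2)[:, None] * W
    return float(np.sum(J**2))

def regularized_loss(task_loss, W, x, lam=0.1):
    """Task loss plus a sensitivity penalty; lam is an assumed trade-off weight."""
    return task_loss + lam * sensitivity(W, x)

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 6))   # 6 raw features -> 4-dimensional embedding
x = rng.normal(size=6)
print(sensitivity(W, x), regularized_loss(1.0, W, x))
```

A smaller penalty value means the embedding changes less when the raw input changes, which is the intuition behind limiting what a curious server can recover from it.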
Funding: Supported by the Open Research Projects of Zhejiang Lab (No. 2022QA0AB02), the Natural Science Foundation of Sichuan Province (2022NSFSC0913), and the Sichuan Province Selected Funding for Postdoctoral Research Projects (TB2022032).
Abstract: Vertical Federated Learning (VFL), which draws attention because of its ability to evaluate individuals based on features spread across multiple institutions, encounters numerous privacy and security threats. Existing solutions often suffer from centralized architectures and exorbitant costs. To mitigate these issues, we propose SecureVFL, a decentralized multi-party VFL scheme designed to enhance efficiency and trustworthiness while guaranteeing privacy. SecureVFL uses a permissioned blockchain and introduces a novel consensus algorithm, Proof of Feature Sharing (PoFS), to facilitate decentralized, trustworthy, and high-throughput federated training. SecureVFL introduces a verifiable and lightweight three-party Replicated Secret Sharing (RSS) protocol for feature intersection summation among overlapping users. Furthermore, we propose a $\binom{4}{2}$-sharing protocol to achieve federated training in a four-party VFL setting. This protocol involves only addition operations and exhibits robustness. SecureVFL not only enables anonymous interactions among participants but also safeguards their real identities, and provides mechanisms to unmask these identities when malicious activities are performed. We illustrate the proposed mechanism through a case study on VFL across four banks. Finally, our theoretical analysis proves the security of SecureVFL. Experiments demonstrated that SecureVFL outperformed existing multi-party VFL privacy-preserving schemes, such as MP-FedXGB, in terms of both overhead and model performance.
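To make the idea of three-party replicated secret sharing concrete, here is a minimal sketch of the textbook 2-out-of-3 construction, in which addition needs no communication. It is not SecureVFL's protocol: the modulus `P`, the function names, and the simple consistency check are assumptions chosen for illustration.

```python
import secrets

P = 2**61 - 1  # prime modulus (an illustrative choice)

def share(x):
    """Split x into three additive shares; party i holds the pair (s[i], s[(i+1) % 3])."""
    s0 = secrets.randbelow(P)
    s1 = secrets.randbelow(P)
    s2 = (x - s0 - s1) % P
    s = [s0, s1, s2]
    return [(s[i], s[(i + 1) % 3]) for i in range(3)]

def add(a, b):
    """Add two shared values; each party only adds its own pairs component-wise."""
    return [((a[i][0] + b[i][0]) % P, (a[i][1] + b[i][1]) % P) for i in range(3)]

def reconstruct(i, pair_i, j, pair_j):
    """Recover the secret from any two parties, checking that replicated copies agree."""
    known = {}
    for idx, val in ((i, pair_i[0]), ((i + 1) % 3, pair_i[1]),
                     (j, pair_j[0]), ((j + 1) % 3, pair_j[1])):
        if idx in known and known[idx] != val:
            raise ValueError("inconsistent replicated share")
        known[idx] = val
    if len(known) != 3:
        raise ValueError("need two distinct parties")
    return sum(known.values()) % P

total = add(share(5), share(7))
print(reconstruct(0, total[0], 2, total[2]))
```

Because every additive share is held by two parties, a tampered copy shows up as a mismatch during reconstruction, which gives a rough intuition for why replicated sharing lends itself to verification.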
Funding: Supported in part by the ZTE Industry-University-Institute Cooperation Funds under Grant No. 202211FKY00112, the Open Research Projects of Zhejiang Lab under Grant No. 2022QA0AB02, and the Natural Science Foundation of Sichuan Province under Grant No. 2022NSFSC0913.
Abstract: As an important branch of federated learning, vertical federated learning (VFL) enables multiple institutions to train on the same user samples, bringing considerable industry benefits. However, VFL needs to exchange user features among multiple institutions, which raises concerns about privacy leakage. Moreover, existing multi-party VFL privacy-preserving schemes suffer from issues such as poor reliability and high communication overhead. To address these issues, we propose a privacy protection scheme for four-institution VFL, named FVFL. A hierarchical framework is first introduced to support federated training among four institutions. We also design a verifiable replicated secret sharing (RSS) protocol, $\binom{3}{2}$-sharing, and combine it with homomorphic encryption to ensure the reliability of FVFL while protecting the privacy of the features and intermediate results of the four institutions. Our theoretical analysis proves the reliability and security of the proposed FVFL. Extensive experiments verify that the proposed scheme achieves excellent performance with a low communication overhead.
Funding: Funded by the National Natural Science Foundation (61962009 and 62202118), the Top Technology Talent Project from Guizhou Education Department (Qianjiao ji [2022] 073), and the Foundation of Guangxi Key Laboratory of Cryptography and Information Security (GCIS202118).
Abstract: The data in Mobile Edge Computing (MEC) contains tremendous market value, and data sharing can maximize the usefulness of the data. However, certain data is quite sensitive, and sharing it directly may violate privacy. Vertical Federated Learning (VFL) is a secure distributed machine learning framework that completes joint model training by passing encrypted model parameters rather than raw data, so there is no data privacy leakage during the training process. Therefore, VFL can build a bridge between the data demander and the data owner to realize data sharing while protecting data privacy. Typically, VFL requires a third party for key distribution and decryption of training results. In this article, we employ a consortium blockchain instead of the traditional third party and design a VFL architecture based on the consortium blockchain for data sharing in MEC. More specifically, we propose V-Raft, a consensus algorithm based on Verifiable Random Functions (VRFs) that is a variant of Raft. V-Raft is able to elect a leader quickly and stably to assist the data demander and owner in completing data sharing by VFL. Moreover, we apply secret sharing to distribute the private key, avoiding the situation where the training result cannot be decrypted if the leader crashes. Finally, we analyze the performance of V-Raft and carry out simulation experiments; the results show that, compared with Raft, V-Raft has higher efficiency and better scalability.
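The abstract does not say which secret sharing scheme distributes the private key. As an illustration only, the sketch below uses Shamir's (t, n) threshold scheme, under which any `t` nodes can rebuild the key even if the leader crashes. The field modulus `P`, the function names, and the integer encoding of the key are assumptions.

```python
import secrets

P = 2**127 - 1  # Mersenne prime used as the field modulus (illustrative choice)

def split(secret, n, t):
    """Create n shares of `secret`; any t of them suffice to recover it."""
    # Random polynomial of degree t-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(P) for _ in range(t - 1)]

    def f(x):
        acc = 0
        for c in reversed(coeffs):  # Horner evaluation mod P
            acc = (acc * x + c) % P
        return acc

    return [(x, f(x)) for x in range(1, n + 1)]

def recover(shares):
    """Lagrange interpolation at x = 0 over the prime field."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * (-xj)) % P
                den = (den * (xi - xj)) % P
        secret = (secret + yi * num * pow(den, -1, P)) % P
    return secret

key = 123456789  # stand-in for a private key encoded as an integer below P
shares = split(key, n=5, t=3)
print(recover(shares[:3]) == key, recover(shares[2:]) == key)
```

The modular inverse via `pow(den, -1, P)` requires Python 3.8 or later.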
Funding: Key Program of the National Natural Science Foundation of China (No. 2019YFE0190500) and Fundamental Research Funds for the Central Universities of the Ministry of Education of China (No. 2232021D-22).
Abstract: The introduction of blockchain to federated learning (FL) is a promising solution that enables anonymous clients to collaboratively learn a shared prediction model using local data while avoiding the risk posed by a central server. However, current research achieves only a shallow integration of the two technologies. The resulting problems, such as unsuitable consensus, the lack of an incentive mechanism, and the inability to handle vertically partitioned data, make blockchain-based FL exist in name only. This paper puts forward a novel blockchain-based framework for vertical FL with a dedicated consensus and incentive mechanism. Moreover, a real-world example is presented to demonstrate the practicability of our work.
Funding: Supported by two funding sources: the Scientific Innovation 2030 Major Project for New Generation of AI, Ministry of Science and Technology of the People's Republic of China (2020AAA0107300), and the National Natural Science Foundation of China (62133015).
Abstract: In real life, a large amount of data describing the same learning task may be stored in different institutions (called participants), and these data cannot be shared among participants due to privacy protection. The case in which different attributes/features of the same instance are stored in different institutions is called vertically distributed data. The purpose of vertical-federated feature selection (FS) is to jointly reduce the feature dimension of vertically distributed data without sharing local original data, so that the feature subset obtained performs as well as or better than the original feature set. To solve this problem, in this paper, an embedded vertical-federated FS algorithm based on particle swarm optimisation (PSO-EVFFS) is proposed by incorporating evolutionary FS into the SecureBoost framework for the first time. By optimising both the hyper-parameters of the XGBoost model and the feature subsets, PSO-EVFFS can obtain a feature subset that makes the XGBoost model more accurate. At the same time, since different participants only share non-sensitive parameters such as the model loss function, PSO-EVFFS can effectively ensure the privacy of participants' data. Moreover, an ensemble ranking strategy of feature importance based on the XGBoost tree model is developed to effectively remove irrelevant features on each participant. Finally, the proposed algorithm is applied to 10 test datasets and compared with three typical vertical-federated learning frameworks and two variants of the proposed algorithm with different initialisation strategies. Experimental results show that the proposed algorithm can significantly improve the classification performance of selected feature subsets while fully protecting the data privacy of all participants.
Funding: This work was supported by the National Key Research and Development Program of China under Grant 2021YFF0704102.
Abstract: Vertical Federated Learning (VFL) has many applications in the field of smart healthcare with excellent performance. However, current VFL systems primarily focus on privacy protection during model training, while the preparation of training data receives little attention. In real-world applications such as smart healthcare, the process of preparing training data may reveal a participant's intention, which can itself be private information for that participant. To protect the privacy of the model training intention, we describe the idea of Intention-Hiding Vertical Federated Learning (IHVFL) and illustrate a framework to achieve this privacy-preserving goal. First, we construct two secure screening protocols to enhance privacy protection in feature engineering. Second, we implement sample alignment based on a novel private set intersection protocol. Finally, we use the logistic regression algorithm to demonstrate the process of IHVFL. Experiments show that our model achieves better efficiency (less than 5 min) and accuracy (97%) on the Breast Cancer medical dataset while maintaining the intention-hiding goal.
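The paper's private set intersection protocol is described as novel and is not specified in the abstract. For orientation only, the sketch below shows the classic Diffie-Hellman-style PSI idea that is often used for sample alignment: each side blinds hashed IDs with its own secret exponent, and doubly blinded values match exactly when the underlying IDs match. The toy modulus `P`, all function names, and running both parties in one process are assumptions. A real deployment would need a properly chosen group and shuffling of the returned values.

```python
import hashlib
import math
import secrets

P = 2**127 - 1  # toy prime modulus, NOT a secure parameter choice

def hash_to_group(identifier):
    """Map an ID string to a nonzero element of Z_P*."""
    digest = hashlib.sha256(identifier.encode()).digest()
    return int.from_bytes(digest, "big") % (P - 1) + 1

def keygen():
    """Random exponent coprime to P-1, so that blinding is a bijection."""
    while True:
        k = secrets.randbelow(P - 3) + 2
        if math.gcd(k, P - 1) == 1:
            return k

def psi(ids_a, ids_b):
    """Return the IDs of party A that also appear in party B's set."""
    a, b = keygen(), keygen()                               # each party's secret key
    a_once = [pow(hash_to_group(x), a, P) for x in ids_a]   # A -> B
    a_twice = [pow(v, b, P) for v in a_once]                # B -> A, order preserved
    b_once = [pow(hash_to_group(y), b, P) for y in ids_b]   # B -> A
    b_twice = {pow(v, a, P) for v in b_once}                # computed locally by A
    return [x for x, v in zip(ids_a, a_twice) if v in b_twice]

print(psi(["u1", "u2", "u3"], ["u3", "u4", "u1"]))
```

Neither side sends raw or singly hashed IDs, so each party sees only blinded values for records outside the intersection.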
Funding: Supported by the Jiangsu Province Special Funding for the Transformation of Scientific and Technological Achievements (No. BA2022011) and the Jiangsu Province Frontier Technology Research and Development Project (No. BF2024071).
Abstract: Cross-Platform Social Relationship Prediction (CPSRP) aims to utilize users’ data on multiple platforms to enhance the performance of social relationship prediction, thereby promoting socioeconomic development. Because users’ data are highly privacy-sensitive, CPSRP typically introduces various privacy-preserving mechanisms to safeguard users’ confidential information. Although the introduced mechanisms guarantee the security of users’ private information, they tend to degrade the performance of social relationship prediction. Additionally, existing social relationship prediction schemes overlook the interdependencies among items in a user behavior sequence. To this end, we propose a novel privacy-preserving Federated Social Relationship Prediction with Contrastive Learning framework called FSRPCL, which is a multi-task learning framework based on vertical federated learning. Specifically, the users’ rating information is perturbed with a bounded differential privacy technique, and the users’ sequential representations acquired through a Transformer are then applied to social relationship prediction and contrastive learning. Furthermore, each client uploads its weight information to the server, and the server aggregates the weight information and distributes it to each client for updating. Numerous experiments on real-world datasets show that FSRPCL delivers exceptional performance in social relationship prediction and privacy preservation, and effectively minimizes the impact of privacy-preserving technology on social relationship prediction accuracy.
Funding: Partially supported by the National Natural Science Foundation of China (grant number 62272044); the Ministry of Science and Technology of the People’s Republic of China with the STI2030-Major Projects (grant number 2021ZD0201900); the Teli Young Fellow Program from the Beijing Institute of Technology, China; the Grants-in-Aid for Scientific Research (grant number 20H00569) from the Ministry of Education, Culture, Sports, Science and Technology (MEXT), Japan; the JSPS KAKENHI (grant number 20H00569), Japan; the JST Mirai Program (grant number 21473074), Japan; the JST MOONSHOT Program (grant number JPMJMS229B), Japan; and the BIT Research and Innovation Promoting Project (grant number 2023YCXZ014).
Abstract: Cardiovascular diseases are a prominent cause of mortality, emphasizing the need for early prevention and diagnosis. Using artificial intelligence (AI) models, heart sound analysis emerges as a noninvasive and universally applicable approach for assessing cardiovascular health conditions. However, real-world medical data are dispersed across medical institutions, forming “data islands” due to data sharing limitations for security reasons. To this end, federated learning (FL), which can effectively model across multiple institutions, has been extensively employed in the medical field. Additionally, conventional supervised classification methods require fully labeled data classes, e.g., binary classification requires labeling of positive and negative samples. Nevertheless, the process of labeling healthcare data is time-consuming and labor-intensive, leading to the possibility of mislabeling negative samples. In this study, we validate an FL framework with a naive positive-unlabeled (PU) learning strategy. The semi-supervised FL model can directly learn from a limited set of positive samples and an extensive pool of unlabeled samples. Our emphasis is on vertical FL to enhance collaboration across institutions with different medical record feature spaces. Additionally, our contribution extends to feature importance analysis, where we explore 6 methods and provide practical recommendations for detecting abnormal heart sounds. The study demonstrated an accuracy of 84%, comparable to outcomes in supervised learning, thereby advancing the application of FL in abnormal heart sound detection.
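One common reading of a "naive" positive-unlabeled strategy is to treat every unlabeled sample as negative and train an ordinary classifier. The sketch below illustrates that reading on synthetic data with a centralized logistic regression. It is an assumption about the general technique, not the study's vertical-FL pipeline, and the data, function names, and hyper-parameters are all invented for illustration.

```python
import numpy as np

def add_bias(X):
    """Append a constant column so the model learns an intercept."""
    return np.hstack([X, np.ones((len(X), 1))])

def train_logreg(X, s, lr=0.1, steps=500):
    """Plain gradient-descent logistic regression on labels s (1 = labeled positive, 0 = unlabeled)."""
    Xb = add_bias(X)
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        w -= lr * Xb.T @ (p - s) / len(s)
    return w

def predict_proba(w, X):
    return 1.0 / (1.0 + np.exp(-add_bias(X) @ w))

# Synthetic stand-in data: positives cluster around +2, negatives around -2.
rng = np.random.default_rng(0)
pos_labeled = rng.normal(2.0, 1.0, size=(50, 2))    # the only labeled samples
pos_hidden = rng.normal(2.0, 1.0, size=(50, 2))     # positives hidden in the unlabeled pool
neg_hidden = rng.normal(-2.0, 1.0, size=(100, 2))   # true negatives, also unlabeled

X = np.vstack([pos_labeled, pos_hidden, neg_hidden])
s = np.concatenate([np.ones(50), np.zeros(150)])    # naive PU: unlabeled treated as negative
w = train_logreg(X, s)

print(predict_proba(w, pos_hidden).mean(), predict_proba(w, neg_hidden).mean())
```

Even though hidden positives are trained as negatives, they tend to receive higher scores than true negatives, so ranking or thresholding the scores can still separate the two groups.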