Split Learning (SL) has been promoted as a promising collaborative machine learning technique designed to address data privacy and resource efficiency. Specifically, neural networks are divided into client and server subnetworks in order to mitigate the exposure of sensitive data and reduce the overhead on client devices, making SL particularly suitable for resource-constrained devices. Although SL prevents the direct transmission of raw data, it does not entirely eliminate the risk of privacy breaches. In fact, the intermediate data transmitted to the server sub-model may contain patterns or information that could reveal sensitive attributes. Moreover, achieving a balance between model utility and data privacy has emerged as a challenging problem. In this article, we propose a novel defense approach that combines: (i) adversarial learning, and (ii) network channel pruning. In particular, the proposed adversarial learning approach is specifically designed to reduce the risk of private data exposure while maintaining high performance on the utility task. The suggested channel pruning, in turn, enables the model to adaptively adjust and reactivate pruned channels during adversarial training. The integration of these two techniques reduces the informativeness of the intermediate data transmitted by the client sub-model, thereby enhancing its robustness against attribute inference attacks without adding significant computational overhead, making it well suited for IoT devices, mobile platforms, and Internet of Vehicles (IoV) scenarios. The proposed defense approach was evaluated using EfficientNet-B0, a widely adopted compact model, along with three benchmark datasets. The results showcase its superior defense capability against attribute inference attacks compared to existing state-of-the-art methods, demonstrating the effectiveness of the proposed channel pruning-based adversarial training approach in achieving the intended trade-off between utility and privacy within SL frameworks. In fact, the classification accuracy attained by the attackers decreased drastically, by 70%.
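The abstract does not specify how channels at the cut layer are selected for pruning; a minimal sketch of magnitude-based channel pruning applied to the client sub-model's intermediate output might look like the following (the function name and the L1-importance scoring rule are assumptions for illustration, not the paper's exact mechanism):

```python
import numpy as np

def prune_cut_layer(activations, keep_ratio=0.5):
    """Zero out the least-informative channels of the client sub-model's
    intermediate output before it is sent to the server sub-model."""
    # activations: (batch, channels, height, width)
    scores = np.abs(activations).mean(axis=(0, 2, 3))   # per-channel L1 importance
    k = max(1, int(round(keep_ratio * scores.size)))
    keep = np.argsort(scores)[-k:]                      # top-k channels survive
    mask = np.zeros(scores.size)
    mask[keep] = 1.0
    # Pruned channels are masked, not deleted, so later training
    # rounds can reactivate them by updating the mask.
    return activations * mask[None, :, None, None], mask

rng = np.random.default_rng(0)
acts = rng.normal(size=(4, 8, 2, 2))
pruned, mask = prune_cut_layer(acts, keep_ratio=0.25)  # keeps 2 of 8 channels
```

Because the mask multiplies activations rather than removing weights, reactivation during adversarial training amounts to recomputing the mask each round.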
The Internet of Vehicles, or IoV, is expected to lessen pollution, ease traffic, and increase road safety. The interconnectedness of IoV entities, however, raises the possibility of cyberattacks, which can have detrimental effects. IoV systems typically send massive volumes of raw data to central servers, which may raise privacy issues. Additionally, model training on IoV devices with limited resources normally leads to slower training times and reduced service quality. We discuss a privacy-preserving Federated Split Learning with Tiny Machine Learning (TinyML) approach, which operates on IoV edge devices without sharing sensitive raw data. Specifically, we focus on integrating split learning (SL) with federated learning (FL) and TinyML models. FL is a decentralised machine learning (ML) technique that enables numerous edge devices to collectively train a common model while retaining data locally. The article thoroughly discusses the architecture and challenges associated with the increasing prevalence of SL in the IoV domain, coupled with FL and TinyML. The discussion starts with the IoV learning framework, which includes edge computing, FL, SL, and TinyML, and then proceeds to how these technologies might be integrated. We elucidate the operational principles of federated and split learning by examining and addressing their main challenges. We subsequently examine the integration of SL with FL and various applications of TinyML. Finally, we explore the potential integration of FL and SL with TinyML in the IoV domain, referred to as FSL-TM. It is a superior method for preserving privacy, as it conducts model training on individual devices or edge nodes, thereby obviating the necessity for centralised data aggregation, which presents considerable privacy threats. The insights provided aim to help both researchers and practitioners understand the complicated terrain of FL and SL, hence facilitating advancement in this swiftly progressing domain.
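The FL aggregation step described above, in which edge devices collectively train a common model, is typically implemented as FedAvg: the server averages client parameters weighted by local dataset size. A minimal sketch (names are illustrative; the survey does not prescribe a specific aggregation rule):

```python
import numpy as np

def fed_avg(client_weights, client_sizes):
    """FedAvg aggregation: average each parameter tensor across clients,
    weighted by how much local data each client holds."""
    total = sum(client_sizes)
    return [
        sum(w[i] * (n / total) for w, n in zip(client_weights, client_sizes))
        for i in range(len(client_weights[0]))
    ]

# Two clients, each holding one weight matrix and one bias vector.
w_a = [np.ones((2, 2)), np.zeros(2)]
w_b = [np.zeros((2, 2)), np.ones(2)]
avg = fed_avg([w_a, w_b], client_sizes=[1, 3])
# Client A holds 1/4 of the data, so avg[0] is 0.25 everywhere.
```

In an FSL-TM setting, the same averaging would apply only to the client-side (TinyML-sized) sub-model, with the server-side layers trained via split learning.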
The personalized fine-tuning of large language models (LLMs) on edge devices is severely constrained by limited computation resources. Although split federated learning alleviates on-device burdens, its effectiveness diminishes in few-shot reasoning scenarios due to the low data efficiency of conventional supervised fine-tuning, which leads to excessive communication overhead. To address this, we propose Language-Empowered Split Fine-Tuning (LESFT), a framework that integrates split architectures with a contrastive-inspired fine-tuning paradigm. LESFT simultaneously learns from multiple logically equivalent but linguistically diverse reasoning chains, providing richer supervisory signals and improving data efficiency. This process-oriented training allows more effective reasoning adaptation with fewer samples. Extensive experiments demonstrate that LESFT consistently outperforms strong baselines such as SplitLoRA in task accuracy on GSM8K, CommonsenseQA, and AQUA_RAT, with the largest gains observed on Qwen2.5-3B. These results indicate that LESFT can effectively adapt large language models for reasoning tasks under the computational and communication constraints of edge environments.
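The abstract does not give LESFT's exact objective; one plausible reading of "learning from multiple logically equivalent reasoning chains" is averaging the token-level negative log-likelihood over the chains, so each example yields one aggregated supervisory signal. A minimal sketch under that assumption (function name and tensor shapes are hypothetical):

```python
import numpy as np

def multi_chain_nll(chain_logits, chain_targets):
    """Average token-level negative log-likelihood over several logically
    equivalent reasoning chains for the same problem."""
    losses = []
    for logits, targets in zip(chain_logits, chain_targets):
        # logits: (seq_len, vocab); targets: (seq_len,) token ids
        logp = logits - np.log(np.exp(logits).sum(axis=-1, keepdims=True))
        losses.append(-logp[np.arange(len(targets)), targets].mean())
    return float(np.mean(losses))

# Two equivalent chains whose models are already confident in the targets:
chains = [np.array([[5.0, 0.0]]), np.array([[0.0, 5.0]])]
targets = [np.array([0]), np.array([1])]
loss = multi_chain_nll(chains, targets)  # close to zero
```

In a split setting, only the client-side layers would backpropagate this loss locally, with the rest of the gradient exchanged at the cut layer.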
The isolation of healthcare data among worldwide hospitals and institutes creates barriers to fully realizing the promise of data-hungry artificial intelligence (AI) models in renewing medical services. To overcome this, privacy-preserving distributed learning frameworks, represented by swarm learning and federated learning, have been investigated recently, with sensitive healthcare data retained on local premises. However, existing frameworks use a one-size-fits-all mode that tunes one model for all healthcare situations, which can hardly fit the usually diverse disease predictions encountered in practice. This work introduces the idea of ensemble learning into privacy-preserving distributed learning and presents the En-split framework, where the predictions of multiple expert models with specialized diagnostic capabilities are jointly explored. Considering the exacerbation of communication and computation burdens with multiple models during learning, model split is used to partition the targeted models into two parts, with hospitals focusing on building the feature-enriched shallow layers. Meanwhile, dedicated noise is applied to the edge layers for differential privacy protection. Experiments on two public datasets demonstrate En-split's superior performance on accuracy and efficiency compared with existing distributed learning frameworks.
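Adding noise to edge layers for differential privacy typically follows the Gaussian mechanism: clip each sample's cut-layer activation to a fixed L2 norm, then add calibrated Gaussian noise before transmission. The abstract does not detail En-split's mechanism, so the following is only a generic sketch (names and the clipping rule are assumptions):

```python
import numpy as np

def dp_cut_layer(activations, clip_norm=1.0, noise_multiplier=1.0, rng=None):
    """Clip each sample's flattened activation to L2 norm <= clip_norm,
    then add Gaussian noise scaled to that sensitivity bound."""
    if rng is None:
        rng = np.random.default_rng(0)
    flat = activations.reshape(len(activations), -1)
    norms = np.linalg.norm(flat, axis=1, keepdims=True)
    clipped = flat * np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    noisy = clipped + rng.normal(scale=noise_multiplier * clip_norm,
                                 size=clipped.shape)
    return noisy.reshape(activations.shape)

rng = np.random.default_rng(1)
acts = rng.normal(size=(4, 8, 2, 2)) * 10.0   # large-norm activations
protected = dp_cut_layer(acts, clip_norm=1.0, noise_multiplier=1.0, rng=rng)
```

Clipping bounds the per-sample sensitivity, which is what lets the noise scale translate into a formal (epsilon, delta) guarantee under standard accounting.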
Federated multi-task learning (FMTL) has emerged as a promising framework for learning multiple tasks simultaneously with client-aware personalized models. While the majority of studies have focused on dealing with the non-independent and identically distributed (Non-IID) characteristics of client datasets, the issue of task heterogeneity has largely been overlooked. Dealing with task heterogeneity often requires complex models, making it impractical for federated learning in resource-constrained environments. In addition, the varying nature of these heterogeneous tasks introduces inductive biases, leading to interference during aggregation and potentially resulting in biased global models. To address these issues, we propose a hierarchical FMTL framework, referred to as FedBone, to facilitate the construction of large-scale models with improved generalization. FedBone leverages server-client split learning and gradient projection to split the entire model into two components: 1) a large-scale general model (referred to as the general model) on the cloud server, and 2) multiple task-specific models (referred to as client models) on edge clients, accommodating devices with limited compute power. To enhance the robustness of the large-scale general model, we incorporate the conflicting gradient projection technique into FedBone to rectify the skewed gradient direction caused by aggregating gradients from heterogeneous tasks. The proposed FedBone framework is evaluated on three benchmark datasets and one real ophthalmic dataset. The comprehensive experiments demonstrate that FedBone efficiently adapts to the heterogeneous local tasks of each client and outperforms existing federated learning algorithms in various dense prediction and classification tasks while utilizing off-the-shelf computational resources on the client side.
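Conflicting gradient projection follows the PCGrad idea: when two task gradients have a negative inner product, the conflicting component of one is projected away onto the normal plane of the other before aggregation. The sketch below shows the core projection step only; FedBone's exact per-task scheduling is not given in the abstract:

```python
import numpy as np

def project_conflicting(g_i, g_j):
    """If task gradients conflict (negative inner product), remove the
    component of g_i that points against g_j."""
    dot = float(np.dot(g_i, g_j))
    if dot < 0.0:
        g_i = g_i - (dot / float(np.dot(g_j, g_j))) * g_j
    return g_i

g1 = np.array([1.0, 1.0])
g2 = np.array([-1.0, 0.0])          # conflicts with g1 along the first axis
g1_proj = project_conflicting(g1, g2)
# the conflicting component is removed: g1_proj == [0.0, 1.0]
```

After projection, the surviving gradients no longer pull the aggregated update in opposing directions, which is what rectifies the skewed gradient direction described above.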
Funding: This work was supported by a grant (No. CRPG-25-2054) under the Cybersecurity Research and Innovation Pioneers Initiative, provided by the National Cybersecurity Authority (NCA) in the Kingdom of Saudi Arabia.
Funding: This work was supported in part by the National Natural Science Foundation of China (NSFC) under Grant 62276109. The authors extend their appreciation to the Deanship of Scientific Research at King Saud University for funding this work through Research Group Project number ORF-2025-585.
Funding: This work was supported by the National Natural Science Foundation of China (62172155), the National Key Research and Development Program of China (2022YFF1203001), the Science and Technology Innovation Program of Hunan Province (Nos. 2022RC3061, 2023RC3027), the Graduate Research Innovation Project of Hunan Province (XJCX2023157), and the NUDT Scientific Project "Research on Privacy-Enhancing Computing Technologies for Activity Trajectory Data".
Funding: This work was supported by the Beijing Municipal Science and Technology Commission under Grant No. Z221100002722009, the National Natural Science Foundation of China under Grant No. 62202455, the Youth Innovation Promotion Association of the Chinese Academy of Sciences (CAS), the Hunan Provincial Natural Science Foundation of China under Grant No. 2023JJ70034, and the Science Research Foundation of the CAS-Aier Joint Laboratory on Digital Ophthalmology under Grant No. SZYK202201.