Funding: Supported by the Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) [No. 2022-0-01047, Development of statistical analysis algorithm and module using homomorphic encryption based on real number operation, 100%].
Abstract: In this study, we investigate privacy-preserving ID3 decision tree (PPID3) training and inference based on fully homomorphic encryption (FHE), a setting that has not been actively explored because of the high computational cost of managing the numerous child nodes in an ID3 tree. We propose HEaaN-ID3, a novel approach that realizes PPID3 using the Cheon-Kim-Kim-Song (CKKS) scheme. HEaaN-ID3 is the first FHE-based ID3 framework that completes both training and inference without any intermediate decryption, which is especially valuable when decryption keys are inaccessible or a single-cloud security domain is assumed. To improve computational efficiency, we adopt a modified Gini impurity (MGI) score instead of entropy to evaluate information gain, thereby avoiding costly inverse operations. In addition, we fully leverage the Single Instruction Multiple Data (SIMD) property of CKKS to parallelize computations at multiple tree nodes. Unlike previous approaches that require decryption at each node or rely on two-party secure computation, our method enables a fully non-interactive training and inference pipeline in the encrypted domain. We validated the proposed scheme on UCI datasets with both numerical and nominal features, demonstrating inference accuracy comparable to plaintext implementations in Scikit-Learn. Moreover, experiments show that HEaaN-ID3 significantly reduces training and inference time per node relative to earlier FHE-based approaches.
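The abstract does not define the MGI score, so the following is only an illustrative plaintext sketch of why a Gini-style score can be friendlier to homomorphic evaluation than entropy. Entropy needs a logarithm and a division, while Gini impurity multiplied by the squared node size is a polynomial in the class counts. The functions `unnormalized_gini`, `split_score`, and `best_attribute` and the toy counts are assumptions made for this example; they are not taken from HEaaN-ID3 and may differ from the authors' actual score.

```python
import math


def entropy(counts):
    """Shannon entropy of a class-count vector; needs division and log."""
    n = sum(counts)
    return -sum((c / n) * math.log2(c / n) for c in counts if c > 0)


def gini(counts):
    """Standard Gini impurity; needs a division by the node size n."""
    n = sum(counts)
    return 1.0 - sum((c / n) ** 2 for c in counts)


def unnormalized_gini(counts):
    """Hypothetical division-free variant: equals n^2 * gini(counts).

    Only additions and multiplications are used, which are the
    operations a scheme such as CKKS evaluates natively.
    """
    n = sum(counts)
    return n * n - sum(c * c for c in counts)


def split_score(children):
    """Sum the division-free score over the child nodes of one split."""
    return sum(unnormalized_gini(child) for child in children)


def best_attribute(splits):
    """Pick the attribute whose split has the lowest score (plaintext)."""
    return min(splits, key=lambda name: split_score(splits[name]))


# Toy class counts per child node for two candidate attributes.
splits = {
    "A": [[4, 0], [0, 4]],  # both children are pure
    "B": [[2, 2], [2, 2]],  # both children are evenly mixed
}
chosen = best_attribute(splits)
```

Here `chosen` is the attribute with pure children. In an encrypted setting the final comparison between scores would itself have to be evaluated homomorphically, which this sketch does not attempt.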