SM9 was established in 2016 as an official Chinese identity-based cryptographic (IBC) standard, and became an ISO standard in 2021. It is well known that IBC is suitable for Internet of Things (IoT) applications, since centralized processing of client data (e.g. in an IoT cloud) is often done by gateways. However, due to the limited computation resources inside IoT devices, the performance of SM9 becomes a bottleneck in practical usage. Existing SM9 implementations are often CPU-based, with relatively high latency and low throughput. Consequently, a pivotal challenge for SM9 in large-scale applications is how to reduce latency while maximizing throughput for numerous concurrent inputs. After a systematic analysis of the SM9 algorithms, we apply optimization techniques including precomputation, resource caching, and parallelization to reduce the overhead of SM9. In this work, we introduce the first practical implementation of SM9 and its underlying SM9_P256 curve on GPU. Our GPU implementation combines multiple algorithms and low-level optimizations tailored for the GPU's single-instruction, multiple-thread (SIMT) architecture in order to achieve high throughput for SM9. Based on these, we propose GAPS, a high-performance Cryptography as a Service (CaaS) for SM9. GAPS adopts a heterogeneous computing architecture that flexibly schedules inputs across two implementation platforms: a CPU for the low-latency processing of sporadic inputs, and a GPU for the high-throughput processing of batch inputs. According to our benchmark, GAPS takes only a few milliseconds to process a single SM9 request in idle mode. Moreover, when operating in its batch processing mode, GAPS can generate 2,038,071 private keys, 248,239 signatures, or 238,001 ciphertexts per second. The results show that GAPS scales seamlessly across inputs of different sizes, preliminarily demonstrating the efficacy of our solution.
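The CPU/GPU split described above can be illustrated with a minimal scheduling sketch. It is not the GAPS scheduler: the abstract does not say how inputs are assigned to a platform, so the size cutoff `BATCH_THRESHOLD`, the function `dispatch`, and the placeholder backends `fake_cpu` and `fake_gpu_batch` are all hypothetical names introduced only for illustration, and no SM9 computation is performed.

```python
from typing import Callable, List, Sequence

# Hypothetical cutoff: the source does not state how the platform is chosen.
BATCH_THRESHOLD = 64


def dispatch(
    requests: Sequence[bytes],
    cpu_process: Callable[[bytes], bytes],
    gpu_process_batch: Callable[[Sequence[bytes]], List[bytes]],
    threshold: int = BATCH_THRESHOLD,
) -> List[bytes]:
    """Route a group of pending requests to the CPU path or the GPU path."""
    if len(requests) < threshold:
        # Sporadic inputs: process one by one on the low-latency path.
        return [cpu_process(r) for r in requests]
    # Batch inputs: hand the whole group to the throughput-oriented path.
    return gpu_process_batch(requests)


# Placeholder backends (not SM9): they only tag each input with the path taken.
def fake_cpu(req: bytes) -> bytes:
    return b"cpu:" + req


def fake_gpu_batch(reqs: Sequence[bytes]) -> List[bytes]:
    return [b"gpu:" + r for r in reqs]
```

Under these assumptions, `dispatch([b"id-1"], fake_cpu, fake_gpu_batch)` takes the per-request path, while a group of 64 or more requests is passed to the batch backend in a single call. A real service would more plausibly also weigh queueing delay and GPU occupancy, which this sketch ignores.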
Funding: Supported by the National Natural Science Foundation of China (Nos. 62172411, 62172404, 61972094, and 62202458).