
Research Article
Parallel Implementation and Optimization of SM4 Based on CUDA
@INPROCEEDINGS{10.1007/978-3-030-80851-8_7,
    author={Jun Li and Wenbo Xie and Lingchen Li and Xiaonian Wu},
    title={Parallel Implementation and Optimization of SM4 Based on CUDA},
    proceedings={Applied Cryptography in Computer and Communications. First EAI International Conference, AC3 2021, Virtual Event, May 15-16, 2021, Proceedings},
    proceedings_a={AC3},
    year={2021},
    month={7},
    keywords={Block cipher, SM4, Parallel computing, CUDA, GPU},
    doi={10.1007/978-3-030-80851-8_7}
}
- Jun Li
- Wenbo Xie
- Lingchen Li
- Xiaonian Wu
Year: 2021
Conference: AC3
Publisher: Springer
DOI: 10.1007/978-3-030-80851-8_7
Abstract
SM4 is the 128-bit block cipher used in China's WAPI standard, offering strong security and flexibility. This paper presents a fast implementation of SM4. Based on the characteristics of CUDA (Compute Unified Device Architecture), a CPU-GPU (Central Processing Unit-Graphics Processing Unit) scheme for SM4 is proposed that exploits the cipher's structural properties. The scheme is further improved by introducing page-locked memory and CUDA streams. The results show that the optimized parallel implementation of SM4 on the GPU achieves a speed-up ratio of 89 and a throughput of up to 31.41 Gbps.
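The two optimizations named in the abstract, page-locked memory and CUDA streams, can be illustrated with a minimal sketch. This is not the authors' implementation: the kernel `process_blocks` is a hypothetical placeholder that XORs each 16-byte block with a constant instead of running the SM4 rounds, and the function name `run_streamed`, the even chunking scheme, and the 256-thread block size are all assumptions made for illustration. Only the CUDA runtime calls themselves are real API.

```cuda
#include <cstddef>
#include <cstdint>
#include <vector>
#include <cuda_runtime.h>

// Bail out of the enclosing function with -1 on any CUDA error (sketch-level handling).
#define CUDA_TRY(call) do { if ((call) != cudaSuccess) return -1; } while (0)

// Placeholder kernel, NOT SM4: one thread per 128-bit block (four 32-bit words).
// A real kernel would run the SM4 round function on these four words instead.
__global__ void process_blocks(const uint32_t* in, uint32_t* out, size_t n_blocks) {
    size_t i = (size_t)blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n_blocks) {
        for (int w = 0; w < 4; ++w)
            out[4 * i + w] = in[4 * i + w] ^ 0xA5A5A5A5u;
    }
}

// Hypothetical driver: returns the number of mismatching words (0 on success), -1 on error.
long run_streamed(size_t n_blocks, int n_streams) {
    if (n_blocks == 0 || n_streams <= 0) return -1;
    const size_t words = 4 * n_blocks;
    const size_t bytes = words * sizeof(uint32_t);
    uint32_t *h_in = nullptr, *h_out = nullptr, *d_in = nullptr, *d_out = nullptr;

    // Page-locked (pinned) host buffers let cudaMemcpyAsync overlap with kernel execution.
    CUDA_TRY(cudaMallocHost((void**)&h_in, bytes));
    CUDA_TRY(cudaMallocHost((void**)&h_out, bytes));
    CUDA_TRY(cudaMalloc((void**)&d_in, bytes));
    CUDA_TRY(cudaMalloc((void**)&d_out, bytes));
    for (size_t k = 0; k < words; ++k) h_in[k] = (uint32_t)k;

    std::vector<cudaStream_t> streams(n_streams);
    for (auto& s : streams) CUDA_TRY(cudaStreamCreate(&s));

    // Split the input into one chunk per stream; each stream copies in, computes, copies out.
    const size_t chunk = (n_blocks + n_streams - 1) / n_streams;
    for (int s = 0; s < n_streams; ++s) {
        const size_t begin = (size_t)s * chunk;
        if (begin >= n_blocks) break;
        const size_t count = (n_blocks - begin < chunk) ? n_blocks - begin : chunk;
        const size_t off = 4 * begin, chunk_bytes = count * 16;
        CUDA_TRY(cudaMemcpyAsync(d_in + off, h_in + off, chunk_bytes,
                                 cudaMemcpyHostToDevice, streams[s]));
        const unsigned grid = (unsigned)((count + 255) / 256);
        process_blocks<<<grid, 256, 0, streams[s]>>>(d_in + off, d_out + off, count);
        CUDA_TRY(cudaGetLastError());
        CUDA_TRY(cudaMemcpyAsync(h_out + off, d_out + off, chunk_bytes,
                                 cudaMemcpyDeviceToHost, streams[s]));
    }
    for (auto& s : streams) CUDA_TRY(cudaStreamSynchronize(s));

    // Host-side check against the placeholder transform.
    long mismatches = 0;
    for (size_t k = 0; k < words; ++k)
        if (h_out[k] != (h_in[k] ^ 0xA5A5A5A5u)) ++mismatches;

    for (auto& s : streams) cudaStreamDestroy(s);
    cudaFree(d_in); cudaFree(d_out);
    cudaFreeHost(h_in); cudaFreeHost(h_out);
    return mismatches;
}
```

Calling `run_streamed(1 << 16, 4)` from a `main` compiled with `nvcc` should report zero mismatches on a CUDA-capable device. Because blocks are processed independently (as in ECB or CTR mode), the transfers of one chunk can overlap with the computation of another, which is presumably the effect the stream-based optimization targets.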