Security and Privacy in Communication Networks. 17th EAI International Conference, SecureComm 2021, Virtual Event, September 6–9, 2021, Proceedings, Part I

Research Article

TESLAC: Accelerating Lattice-Based Cryptography with AI Accelerator

  • @INPROCEEDINGS{10.1007/978-3-030-90019-9_13,
        author={Lipeng Wan and Fangyu Zheng and Jingqiang Lin},
        title={TESLAC: Accelerating Lattice-Based Cryptography with AI Accelerator},
        proceedings={Security and Privacy in Communication Networks. 17th EAI International Conference, SecureComm 2021, Virtual Event, September 6--9, 2021, Proceedings, Part I},
        proceedings_a={SECURECOMM},
        year={2021},
        month={11},
        keywords={Lattice-based cryptosystems, Polynomial multiplication over rings, AI accelerator, Tensor Core, LAC},
        doi={10.1007/978-3-030-90019-9_13}
    }
    
  • Lipeng Wan
    Fangyu Zheng
    Jingqiang Lin
    Year: 2021
    TESLAC: Accelerating Lattice-Based Cryptography with AI Accelerator
    SECURECOMM
    Springer
    DOI: 10.1007/978-3-030-90019-9_13
Lipeng Wan (1), Fangyu Zheng (1), Jingqiang Lin (2)
  • 1: Chinese Academy of Sciences
  • 2: University of Science and Technology of China

Abstract

In this paper, we exploit an AI accelerator to implement cryptographic algorithms. To the best of our knowledge, this is the first attempt to implement quantum-safe Lattice-Based Cryptography (LBC) with an AI accelerator. AI accelerators, however, are designed for machine learning workloads (e.g., convolution operations) and cannot directly deliver their computing power to cryptographic computation. Noting that polynomial multiplication over rings is a time-consuming computation in LBC, we use a straightforward approach to adapt the AI accelerator to polynomial multiplication over rings. Additional non-trivial optimizations, such as using low-latency shared memory and coalescing memory accesses, minimize the overhead of this transformation. Based on NVIDIA's AI accelerator, Tensor Core, we implement a prototype system named TESLAC and conduct a comprehensive set of experiments to evaluate its performance. The experimental results show that TESLAC can reach tens of millions of operations per second, a speedup of two orders of magnitude over the AVX2-accelerated reference implementation. In particular, with some additional techniques, TESLAC can also be scaled to other LBC schemes with larger moduli.
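
The abstract does not spell out how polynomial multiplication is adapted to the accelerator. One common way to expose such a product to matrix hardware is to rewrite it as a matrix-vector product. The following is a minimal Python sketch of that idea, not the TESLAC implementation: it assumes a ring of the form Z_q[x]/(x^n + 1), and the function names, the tiny degree n = 4, and the modulus q = 251 are illustrative choices made here.

```python
def negacyclic_matrix(a, q):
    """Build the n x n matrix whose column j holds a(x) * x^j mod (x^n + 1, q)."""
    n = len(a)
    m = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            k = i - j
            # Terms that wrap past x^(n-1) pick up a minus sign because x^n = -1.
            m[i][j] = a[k] % q if k >= 0 else (-a[k + n]) % q
    return m


def polymul_matrix(a, b, q):
    """Multiply a(x) * b(x) in Z_q[x]/(x^n + 1) as a matrix-vector product."""
    m = negacyclic_matrix(a, q)
    n = len(a)
    return [sum(m[i][j] * b[j] for j in range(n)) % q for i in range(n)]


def polymul_schoolbook(a, b, q):
    """Reference schoolbook multiplication in the same ring, for comparison."""
    n = len(a)
    c = [0] * n
    for i in range(n):
        for j in range(n):
            k = i + j
            if k < n:
                c[k] = (c[k] + a[i] * b[j]) % q
            else:
                c[k - n] = (c[k - n] - a[i] * b[j]) % q
    return c


if __name__ == "__main__":
    q = 251  # illustrative small modulus (assumption)
    a = [1, 2, 3, 4]
    b = [5, 6, 7, 8]
    print(polymul_matrix(a, b, q))
    print(polymul_schoolbook(a, b, q))
```

Stacking several right-hand polynomials as columns turns the matrix-vector product into a matrix-matrix product, which is the shape of work that matrix multiply-accumulate units such as Tensor Cores are built for. Building the matrix costs extra memory traffic, which is presumably the kind of transformation overhead the abstract's optimizations target.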