Exploring Parallel Implementation of SPHINCS+ using Advanced Vector Extensions (AVX) Sets

Yaoyun Zhou1, Kavin Rajasekaran1, Qian Wang2
1University of California, Merced, 2University of California, Merced


Abstract

SPHINCS+ is a hash-based signature scheme and the only non-lattice-based digital signature algorithm selected by the National Institute of Standards and Technology (NIST) for post-quantum standardization. Its security rests on well-studied hash functions, which gives it an advantage over other schemes by reducing the risk of compromise. However, SPHINCS+ comprises a complex multi-tree structure that requires thousands of hash function calls, resulting in significant performance drawbacks compared to other schemes. In this work, we investigate and profile different parallel hash function implementations for SPHINCS+ using vector instruction sets such as AVX2 and AVX512. Our analysis reveals that a 16-lane parallel approach, which combines AVX512 and AVX2 instructions, performs best during the signing phase. Conversely, we observe a performance degradation in the AVX512-based 8-lane parallel implementation of SHA-512. We analyze the underlying causes of this performance discrepancy, which we attribute to factors such as CPU bandwidth congestion and the overhead of instruction cache handling in the AVX512 implementation.