In this paper, we present an efficient algorithm for large-scale leakage optimization under sign-off timing constraints using multiple threshold voltage (multi-Vt) assignment. The algorithm addresses several practical considerations, including the synergistic propagation of swaps across all sign-off timing corners, iterative application of block-level and interface logic model (ILM)-level swap lists, and mitigation of hold-timing violations. It has been deployed successfully for performance-per-watt improvement on several SoC designs containing both CPU and GPU cores. On average, it achieves a ~10% improvement in leakage compared to current state-of-the-art vendor solutions.
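To make the idea of timing-constrained multi-Vt assignment concrete, the following is a minimal sketch and not the algorithm presented in this paper. It assumes a single timing path whose setup slack is known at every corner, and it greedily swaps cells to a higher-Vt (lower-leakage, slower) variant only when the slack stays non-negative at all corners. The names `Cell`, `greedy_vt_swap`, the corner labels, and all numbers are hypothetical and chosen purely for illustration; a real flow would query a sign-off timer rather than subtract fixed delay costs.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    """Hypothetical candidate for a swap to a higher-Vt variant."""
    name: str
    leak_saving: float   # leakage saved by the swap (arbitrary units)
    delay_cost: dict     # corner name -> added delay in picoseconds


def greedy_vt_swap(cells, slack):
    """Accept swaps in order of leakage saved per picosecond of worst-corner
    delay, as long as setup slack stays non-negative at EVERY corner.

    `slack` maps corner name -> available setup slack (ps) on the one path
    that all candidate cells are assumed to share (a toy simplification).
    """
    remaining = dict(slack)  # work on a copy so the caller's dict is untouched
    accepted = []

    def benefit(cell):
        # Guard against a zero delay cost to avoid division by zero.
        return cell.leak_saving / max(max(cell.delay_cost.values()), 1)

    for cell in sorted(cells, key=benefit, reverse=True):
        # A swap is legal only if no corner goes negative.
        if all(remaining[c] - cell.delay_cost[c] >= 0 for c in remaining):
            for c in remaining:
                remaining[c] -= cell.delay_cost[c]
            accepted.append(cell.name)

    return accepted, remaining


cells = [
    Cell("u1", leak_saving=5.0, delay_cost={"ss": 20, "ff": 10}),
    Cell("u2", leak_saving=3.0, delay_cost={"ss": 50, "ff": 20}),
    Cell("u3", leak_saving=4.0, delay_cost={"ss": 30, "ff": 10}),
]
accepted, remaining = greedy_vt_swap(cells, {"ss": 60, "ff": 100})
print(accepted, remaining)
```

In this toy setup the slow corner ("ss") is the binding constraint: the swap for `u2` is rejected because it would drive the "ss" slack negative, even though the "ff" corner has ample margin. This illustrates why a swap has to be checked against all sign-off corners at once rather than one corner at a time.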