Don't prematurely run keccak_f in squeeze

When squeezing a multiple of the rate bytes (e.g. in the Kyber XOF), we
were running the Keccak permutation one more time than necessary.

Before:
Did 18900 Kyber generate + decap operations in 2001506us (9442.9 ops/sec)
Did 32000 Kyber parse + encap operations in 2041500us (15674.7 ops/sec)

After:
Did 19796 Kyber generate + decap operations in 2017501us (9812.1 ops/sec) [+3.9%]
Did 34000 Kyber parse + encap operations in 2032085us (16731.6 ops/sec) [+6.7%]

Change-Id: I69787536508c4eadcc37a2f752c3678c60906c38
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/64007
Reviewed-by: Adam Langley <agl@google.com>
Auto-Submit: David Benjamin <davidben@google.com>
Commit-Queue: Adam Langley <agl@google.com>
Commit-Queue: David Benjamin <davidben@google.com>

diff --git a/crypto/keccak/keccak.c b/crypto/keccak/keccak.c
index e482404..7ab8edc 100644
--- a/crypto/keccak/keccak.c
+++ b/crypto/keccak/keccak.c
@@ -240,6 +240,11 @@
   // because we require |uint8_t| to be a character type.
   const uint8_t *state_bytes = (const uint8_t *)ctx->state;
   while (out_len) {
+    if (ctx->squeeze_offset == ctx->rate_bytes) {
+      keccak_f(ctx->state);
+      ctx->squeeze_offset = 0;
+    }
+
     size_t remaining = ctx->rate_bytes - ctx->squeeze_offset;
     size_t todo = out_len;
     if (todo > remaining) {
@@ -249,9 +254,5 @@
     out += todo;
     out_len -= todo;
     ctx->squeeze_offset += todo;
-    if (ctx->squeeze_offset == ctx->rate_bytes) {
-      keccak_f(ctx->state);
-      ctx->squeeze_offset = 0;
-    }
   }
 }
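
Illustration only, not part of the patch: a minimal sketch of why moving the check to the top of the loop saves a permutation. Everything here is hypothetical: the struct, the toy_* names, the 8-byte rate, and the placeholder permutation (which is not Keccak-f) stand in for the real context and keccak_f. The sketch also assumes the state is ready to be read when squeezing starts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_RATE_BYTES 8  // hypothetical rate, chosen small for the example

// Toy stand-in for a sponge context; not the struct used in keccak.c.
struct toy_sponge {
  uint8_t state[TOY_RATE_BYTES];
  size_t squeeze_offset;
  unsigned permutations;  // number of toy_permute calls so far
};

// Placeholder permutation. NOT Keccak-f; it only changes the state so that
// successive output blocks differ, and counts how often it runs.
void toy_permute(struct toy_sponge *ctx) {
  for (size_t i = 0; i < TOY_RATE_BYTES; i++) {
    ctx->state[i] = (uint8_t)(ctx->state[i] * 31u + i + 1u);
  }
  ctx->permutations++;
}

void toy_init(struct toy_sponge *ctx) { memset(ctx, 0, sizeof(*ctx)); }

// Old ordering: permute as soon as the current block is used up, even if
// the caller never asks for more output.
void toy_squeeze_eager(struct toy_sponge *ctx, uint8_t *out, size_t out_len) {
  assert(ctx->squeeze_offset <= TOY_RATE_BYTES);
  while (out_len) {
    size_t remaining = TOY_RATE_BYTES - ctx->squeeze_offset;
    size_t todo = out_len < remaining ? out_len : remaining;
    memcpy(out, &ctx->state[ctx->squeeze_offset], todo);
    out += todo;
    out_len -= todo;
    ctx->squeeze_offset += todo;
    if (ctx->squeeze_offset == TOY_RATE_BYTES) {
      toy_permute(ctx);
      ctx->squeeze_offset = 0;
    }
  }
}

// New ordering: permute only when more output is actually requested and the
// current block has been fully consumed.
void toy_squeeze_lazy(struct toy_sponge *ctx, uint8_t *out, size_t out_len) {
  assert(ctx->squeeze_offset <= TOY_RATE_BYTES);
  while (out_len) {
    if (ctx->squeeze_offset == TOY_RATE_BYTES) {
      toy_permute(ctx);
      ctx->squeeze_offset = 0;
    }
    size_t remaining = TOY_RATE_BYTES - ctx->squeeze_offset;
    size_t todo = out_len < remaining ? out_len : remaining;
    memcpy(out, &ctx->state[ctx->squeeze_offset], todo);
    out += todo;
    out_len -= todo;
    ctx->squeeze_offset += todo;
  }
}
```

In this toy model, squeezing exactly two rate-sized blocks costs two permutations with the old ordering and one with the new ordering, while the output bytes are identical. If the caller later squeezes more, the deferred permutation runs at that point, so the output stream is unchanged.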