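Don't prematurely run keccak_f in squeeze

When the requested output ended exactly on a rate boundary (e.g. in the
Kyber XOF), squeeze ran the Keccak permutation once more than necessary:
it called keccak_f as soon as a block was exhausted, even if no further
output was ever requested. Run keccak_f at the top of the loop instead,
so that it only happens when more bytes are actually needed.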
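
The difference can be sketched with a toy sponge. This is only an
illustration, not the BoringSSL code: toy_ctx, toy_keccak_f, the 8-byte
rate and the helper functions are hypothetical, and the "permutation"
is a stand-in that just counts its calls.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Hypothetical toy sponge. Field names mirror the patch, but nothing here
// is real Keccak: the "permutation" only scrambles bytes and counts calls.
#define TOY_RATE_BYTES 8

struct toy_ctx {
  uint8_t state[TOY_RATE_BYTES];
  size_t squeeze_offset;
  unsigned permutations;  // Number of toy_keccak_f calls so far.
};

static void toy_keccak_f(struct toy_ctx *ctx) {
  for (size_t i = 0; i < TOY_RATE_BYTES; i++) {
    ctx->state[i] = (uint8_t)(ctx->state[i] * 31 + i + 1);
  }
  ctx->permutations++;
}

// Old behaviour: permute as soon as the current block is used up.
static void toy_squeeze_eager(struct toy_ctx *ctx, uint8_t *out,
                              size_t out_len) {
  while (out_len) {
    size_t remaining = TOY_RATE_BYTES - ctx->squeeze_offset;
    size_t todo = out_len < remaining ? out_len : remaining;
    memcpy(out, &ctx->state[ctx->squeeze_offset], todo);
    out += todo;
    out_len -= todo;
    ctx->squeeze_offset += todo;
    if (ctx->squeeze_offset == TOY_RATE_BYTES) {
      toy_keccak_f(ctx);
      ctx->squeeze_offset = 0;
    }
  }
}

// New behaviour: permute only when another byte is actually requested.
static void toy_squeeze_lazy(struct toy_ctx *ctx, uint8_t *out,
                             size_t out_len) {
  while (out_len) {
    if (ctx->squeeze_offset == TOY_RATE_BYTES) {
      toy_keccak_f(ctx);
      ctx->squeeze_offset = 0;
    }
    size_t remaining = TOY_RATE_BYTES - ctx->squeeze_offset;
    size_t todo = out_len < remaining ? out_len : remaining;
    memcpy(out, &ctx->state[ctx->squeeze_offset], todo);
    out += todo;
    out_len -= todo;
    ctx->squeeze_offset += todo;
  }
}

// Squeezes |out_len| bytes from a fresh context and returns how many times
// the toy permutation ran.
unsigned toy_count_permutations(int lazy, size_t out_len) {
  uint8_t out[64];
  assert(out_len <= sizeof(out));
  struct toy_ctx ctx;
  memset(&ctx, 0, sizeof(ctx));
  if (lazy) {
    toy_squeeze_lazy(&ctx, out, out_len);
  } else {
    toy_squeeze_eager(&ctx, out, out_len);
  }
  return ctx.permutations;
}

// Returns 1 if both variants produce identical bytes across two consecutive
// squeeze calls of |first| and |second| bytes.
int toy_outputs_match(size_t first, size_t second) {
  uint8_t a[64], b[64];
  assert(first + second <= sizeof(a));
  struct toy_ctx eager, lazy;
  memset(&eager, 0, sizeof(eager));
  memset(&lazy, 0, sizeof(lazy));
  toy_squeeze_eager(&eager, a, first);
  toy_squeeze_eager(&eager, a + first, second);
  toy_squeeze_lazy(&lazy, b, first);
  toy_squeeze_lazy(&lazy, b + first, second);
  return memcmp(a, b, first + second) == 0;
}
```

In this sketch, squeezing 16 bytes (two full blocks) costs two
permutations in the eager variant and one in the lazy variant, while
both produce the same output bytes.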

Before:
Did 18900 Kyber generate + decap operations in 2001506us (9442.9 ops/sec)
Did 32000 Kyber parse + encap operations in 2041500us (15674.7 ops/sec)

After:
Did 19796 Kyber generate + decap operations in 2017501us (9812.1 ops/sec) [+3.9%]
Did 34000 Kyber parse + encap operations in 2032085us (16731.6 ops/sec) [+6.7%]

Change-Id: I69787536508c4eadcc37a2f752c3678c60906c38
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/64007
Reviewed-by: Adam Langley <agl@google.com>
Auto-Submit: David Benjamin <davidben@google.com>
Commit-Queue: Adam Langley <agl@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
diff --git a/crypto/keccak/keccak.c b/crypto/keccak/keccak.c
index e482404..7ab8edc 100644
--- a/crypto/keccak/keccak.c
+++ b/crypto/keccak/keccak.c
@@ -240,6 +240,11 @@
   // because we require |uint8_t| to be a character type.
   const uint8_t *state_bytes = (const uint8_t *)ctx->state;
   while (out_len) {
+    if (ctx->squeeze_offset == ctx->rate_bytes) {
+      keccak_f(ctx->state);
+      ctx->squeeze_offset = 0;
+    }
+
     size_t remaining = ctx->rate_bytes - ctx->squeeze_offset;
     size_t todo = out_len;
     if (todo > remaining) {
@@ -249,9 +254,5 @@
     out += todo;
     out_len -= todo;
     ctx->squeeze_offset += todo;
-    if (ctx->squeeze_offset == ctx->rate_bytes) {
-      keccak_f(ctx->state);
-      ctx->squeeze_offset = 0;
-    }
   }
 }