Squeeze a block at a time when computing the matrix in Kyber

We were currently squeezing 3 bytes at a time. And because kyber.c and
keccak.c are separate files, we've got an optimization barrier, at least
in non-LTO builds, at a very inconvenient place.

Instead, extract the full 168 bytes (SHAKE-128 rate) at a time. This is
conveniently a multiple of three, so we don't need to worry about
partial coefficients. We're still doing a copy due to the Keccak
abstraction, but it should extend cleanly to either LTO or a different
abstraction to read from the state directly. Even without that, it's a
significant perf win.

Benchmarks on an M1:

Before:
Did 87390 Kyber generate + decap operations in 4001590us (21838.8 ops/sec)
Did 119000 Kyber parse + encap operations in 4009460us (29679.8 ops/sec)

After:
Did 96747 Kyber generate + decap operations in 4003687us (24164.5 ops/sec) [+10.6%]
Did 152000 Kyber parse + encap operations in 4018360us (37826.4 ops/sec) [+27.4%]

Change-Id: Iada393edf0c634b7410a34374fb90242a392a9d3
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/59265
Commit-Queue: Adam Langley <agl@google.com>
Reviewed-by: Adam Langley <agl@google.com>
Auto-Submit: David Benjamin <davidben@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
1 file changed
tree: ac9e839e0bc970005b65547cc0a63749edcb8685
  1. .github/
  2. cmake/
  3. crypto/
  4. decrepit/
  5. fuzz/
  6. include/
  7. rust/
  8. ssl/
  9. third_party/
  10. tool/
  11. util/
  12. .clang-format
  13. .gitignore
  14. API-CONVENTIONS.md
  15. BREAKING-CHANGES.md
  16. BUILDING.md
  17. CMakeLists.txt
  18. codereview.settings
  19. CONTRIBUTING.md
  20. FUZZING.md
  21. go.mod
  22. go.sum
  23. INCORPORATING.md
  24. LICENSE
  25. PORTING.md
  26. README.md
  27. SANDBOXING.md
  28. sources.cmake
  29. STYLE.md
README.md

BoringSSL

BoringSSL is a fork of OpenSSL that is designed to meet Google's needs.

Although BoringSSL is an open source project, it is not intended for general use, as OpenSSL is. We don't recommend that third parties depend upon it. Doing so is likely to be frustrating because there are no guarantees of API or ABI stability.

Programs ship their own copies of BoringSSL when they use it and we update everything as needed when deciding to make API changes. This allows us to mostly avoid compromises in the name of compatibility. It works for us, but it may not work for you.

BoringSSL arose because Google used OpenSSL for many years in various ways and, over time, built up a large number of patches that were maintained while tracking upstream OpenSSL. As Google's product portfolio became more complex, more copies of OpenSSL sprung up and the effort involved in maintaining all these patches in multiple places was growing steadily.

Currently BoringSSL is the SSL library in Chrome/Chromium, Android (but it's not part of the NDK) and a number of other apps/programs.

Project links:

There are other files in this directory which might be helpful: