Remove p > q normalization in RSA keys

RSA CRT is tiny bit messier when p < q.
https://boringssl-review.googlesource.com/25263 solved this by
normalizing to p > q. The cost was we sometimes had to compute a new
iqmp.

Modular inversion is expensive. We did it only once per key, but it's
still a performance cliff in per-key costs. When later work moves
freeze_private_key into RSA private key parsing, it will be a
performance cliff in the private key parser.

Instead, just handle p < q in the CRT function. The only difference is
needing one extra reduction before the modular subtraction. Even using
the fully general mod_montgomery function (as opposed to checking p < q,
or using bn_reduce_once when num_bits(p) == num_bits(q)) was not
measurable.

In doing so, I noticed we didn't actually have tests that exercise the
reduction step. I added one to evp_tests.txt, but it is only meaningful
when blinding is disabled. (Another cost of blinding.) When blinding is
enabled, the answers mod p and q are randomized and we hit this case
with about 1.8% probability. See comment in evp_test.txt.

I kept the optimization where we store iqmp in Montgomery form, not
because the optimization matters, but because we need to store a
corrected, fixed-width version of the value anyway, so we may as well
store it in a more convenient form.

M1 Max
Before:
Did 9048 RSA 2048 signing operations in 5033403us (1797.6 ops/sec)
Did 1500 RSA 4096 signing operations in 5009288us (299.4 ops/sec)
After:
Did 9116 RSA 2048 signing operations in 5053802us (1803.8 ops/sec) [+0.3%]
Did 1500 RSA 4096 signing operations in 5008283us (299.5 ops/sec) [+0.0%]

Intel(R) Xeon(R) Gold 6154 CPU @ 3.00GHz
Before:
Did 9282 RSA 2048 signing operations in 5019395us (1849.2 ops/sec)
Did 1302 RSA 4096 signing operations in 5055011us (257.6 ops/sec)
After:
Did 9240 RSA 2048 signing operations in 5024845us (1838.9 ops/sec) [-0.6%]
Did 1302 RSA 4096 signing operations in 5046157us (258.0 ops/sec) [+0.2%]

Bug: 316
Change-Id: Icb90c7d5f5188f9b69a6d7bcc63db13d92ec26d5
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/60705
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: Adam Langley <agl@google.com>
5 files changed
tree: 198f0a8f877f338f7be4b58fda6b41dd507f7fe0
  1. .github/
  2. cmake/
  3. crypto/
  4. decrepit/
  5. fuzz/
  6. include/
  7. rust/
  8. ssl/
  9. third_party/
  10. tool/
  11. util/
  12. .clang-format
  13. .gitignore
  14. API-CONVENTIONS.md
  15. BREAKING-CHANGES.md
  16. BUILDING.md
  17. CMakeLists.txt
  18. codereview.settings
  19. CONTRIBUTING.md
  20. FUZZING.md
  21. go.mod
  22. go.sum
  23. INCORPORATING.md
  24. LICENSE
  25. PORTING.md
  26. README.md
  27. SANDBOXING.md
  28. sources.cmake
  29. STYLE.md
README.md

BoringSSL

BoringSSL is a fork of OpenSSL that is designed to meet Google's needs.

Although BoringSSL is an open source project, it is not intended for general use, as OpenSSL is. We don't recommend that third parties depend upon it. Doing so is likely to be frustrating because there are no guarantees of API or ABI stability.

Programs ship their own copies of BoringSSL when they use it and we update everything as needed when deciding to make API changes. This allows us to mostly avoid compromises in the name of compatibility. It works for us, but it may not work for you.

BoringSSL arose because Google used OpenSSL for many years in various ways and, over time, built up a large number of patches that were maintained while tracking upstream OpenSSL. As Google's product portfolio became more complex, more copies of OpenSSL sprung up and the effort involved in maintaining all these patches in multiple places was growing steadily.

Currently BoringSSL is the SSL library in Chrome/Chromium, Android (but it's not part of the NDK) and a number of other apps/programs.

Project links:

There are other files in this directory which might be helpful: