Make bn_mod_lshift_consttime faster

bn_mod_lshift_consttime currently calls bn_mod_lshift1_consttime in a
loop, but between needing a temporary value and having to guard against
some complications in our fixed-width BIGNUM convention, it's actually
picking up a lot of overhead.

This function is currently called to setup Montgomery contexts with
secret moduli (RSA primes). The setup operation is not
performance-sensitive in our benchmarks, because it is amortized away in
RSA private key signing. However, as part of reducing thread contention
with the RSA object, I'm planning to make RSA creation, which we do
benchmark, eagerly fill in the Montgomery context.

We do benchmark RSA parsing, so adding a slow Montgomery setup would
show up in benchmarks. This distinction is mostly artificial. Work done
on creation and work done on first use is still work done once per RSA
key. However, work done on key creation may slow server startup, while
work deferred to first use is amortized but less predictable.

Either way, from this CL, and especially the one to follow it, we have
plenty of low-hanging fruit in this function. As a bonus, this should
help single-use RSA private keys, but that's not something we currently
benchmark.

Modulus sizes below chosen based on:

- Common curve sizes (moot because we use a variable-time setup anyway)

- Common RSA modulus sizes (also variable-time setup)

- Half of common RSA modulus sizes (the secret primes involved)

Of these, only the third category matters. The others can use the
division-based path where it's faster anyway. However, by the end of
this patch series, they'll get a bit closer, so I benchmarked them all
to compare. (Though division still wins in the end.)

Benchmarks on an M1 Max:

Before:
Did 528000 256-bit mont (constime) operations in 2000993us (263869.0 ops/sec)
Did 312000 384-bit mont (constime) operations in 2001281us (155900.1 ops/sec)
Did 246000 512-bit mont (constime) operations in 2001521us (122906.5 ops/sec)
Did 191000 521-bit mont (constime) operations in 2006336us (95198.4 ops/sec)
Did 98000 1024-bit mont (constime) operations in 2001438us (48964.8 ops/sec)
Did 55000 1536-bit mont (constime) operations in 2025306us (27156.4 ops/sec)
Did 35000 2048-bit mont (constime) operations in 2022714us (17303.5 ops/sec)
Did 17640 3072-bit mont (constime) operations in 2028352us (8696.7 ops/sec)
Did 10290 4096-bit mont (constime) operations in 2065529us (4981.8 ops/sec)

After:
Did 712000 256-bit mont (constime) operations in 2000454us (355919.2 ops/sec) [+34.9%]
Did 440000 384-bit mont (constime) operations in 2001121us (219876.8 ops/sec) [+41.0%]
Did 259000 512-bit mont (constime) operations in 2003709us (129260.3 ops/sec) [+5.2%]
Did 212000 521-bit mont (constime) operations in 2007033us (105628.6 ops/sec) [+11.0%]
Did 107000 1024-bit mont (constime) operations in 2018551us (53008.3 ops/sec) [+8.3%]
Did 57000 1536-bit mont (constime) operations in 2001027us (28485.4 ops/sec) [+4.9%]
Did 37000 2048-bit mont (constime) operations in 2039631us (18140.5 ops/sec) [+4.8%]
Did 20000 3072-bit mont (constime) operations in 2041163us (9798.3 ops/sec) [+12.7%]
Did 11760 4096-bit mont (constime) operations in 2007195us (5858.9 ops/sec) [+17.6%]

Bug: 316
Change-Id: I06f4a065fdecc1aec3160fe32a41e200538d1ee3
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/60685
Auto-Submit: David Benjamin <davidben@google.com>
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: Adam Langley <agl@google.com>
1 file changed
tree: f3e9a1725c08f756b0d24c2bbf401921d13697ec
  1. .github/
  2. cmake/
  3. crypto/
  4. decrepit/
  5. fuzz/
  6. include/
  7. rust/
  8. ssl/
  9. third_party/
  10. tool/
  11. util/
  12. .clang-format
  13. .gitignore
  14. API-CONVENTIONS.md
  15. BREAKING-CHANGES.md
  16. BUILDING.md
  17. CMakeLists.txt
  18. codereview.settings
  19. CONTRIBUTING.md
  20. FUZZING.md
  21. go.mod
  22. go.sum
  23. INCORPORATING.md
  24. LICENSE
  25. PORTING.md
  26. README.md
  27. SANDBOXING.md
  28. sources.cmake
  29. STYLE.md
README.md

BoringSSL

BoringSSL is a fork of OpenSSL that is designed to meet Google's needs.

Although BoringSSL is an open source project, it is not intended for general use, as OpenSSL is. We don't recommend that third parties depend upon it. Doing so is likely to be frustrating because there are no guarantees of API or ABI stability.

Programs ship their own copies of BoringSSL when they use it and we update everything as needed when deciding to make API changes. This allows us to mostly avoid compromises in the name of compatibility. It works for us, but it may not work for you.

BoringSSL arose because Google used OpenSSL for many years in various ways and, over time, built up a large number of patches that were maintained while tracking upstream OpenSSL. As Google's product portfolio became more complex, more copies of OpenSSL sprung up and the effort involved in maintaining all these patches in multiple places was growing steadily.

Currently BoringSSL is the SSL library in Chrome/Chromium, Android (but it's not part of the NDK) and a number of other apps/programs.

Project links:

There are other files in this directory which might be helpful: