Fix a bug when BN_mod_sqrt is called on very annoying primes

https://boringssl-review.googlesource.com/c/boringssl/+/70813 introduced
a bug in the error-handling at the BN_usub call. (It was missing a `!`.)

The bug is triggered when computing BN_mod_sqrt over a prime p such
that:

- p is 1 mod 8
- The least quadratic non-residue is 22 or higher

Then BN_mod_sqrt might fail with some probability (dependent on how
far below p is from a power of two).

The trivial fix is to add the `!` back but, while I'm here, switch to
BN_rand_range_ex as suggested by Theo Buehler.

This has no impact on any cryptography. Modular square roots are used to
decode compressed coordinates. There, the only annoying prime is P-224.
P-224 which is merely *annoying* (1 mod 8) and not *very annoying* (last
quadratic non-residue is big). It can only impact callers who were
calling into us to compute some math function for some reason.

(Were one to construct an arbitrary elliptic curve that hit this, the
failure mode would just be that we might reject a public key that we
should have accepted.)

What's happening here is that modular square roots easy when p is
3 mod 4 or 5 mod 8, but they're very hard when 1 mod 8. If 1 mod 8, we
must use Tonelli/Shanks. Part of Tonelli/Shanks requires that we find
some quadratic non-residue y. BN_mod_sqrt tries y = 2 up to y = 21
before giving up and trying random values. The bug was in that random
value case.

Thanks to Alex Gaynor for reporting an input that tripped this! It
seems that input was picked as the first prime with a high enough least
quadratic non-residue. I added some tests also with another modulus,
0x2e3a719, which I believe was the 1 mod 8 prime under 100,000,000 with
the largest least quadratic non-residue. Also it happened that 0x7deb1
was just under a power of two, so 0x2e3a719
trips the bug more reliably.

(It's unfortunate that this is public API and that rust-openssl binds
it. This function is only in libcrypto for EC compressed coordinates,
but it's not the right function for it anyway. EC code tries to avoid
heap-allocated BIGNUMs and should be specialized. Were it not for P-224,
we would only need the 3 mod 4 case. With P-224, we can at least know
ahead of time that y = 11 and not bother looking for a quadratic
non-residue. Ah well.)

Change-Id: I6fe97ffe239a03557ed7ff5417bed23f318e8ff4
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/82788
Reviewed-by: Lily Chen <chlily@google.com>
Auto-Submit: David Benjamin <davidben@google.com>
Commit-Queue: Lily Chen <chlily@google.com>
diff --git a/crypto/fipsmodule/bn/sqrt.cc.inc b/crypto/fipsmodule/bn/sqrt.cc.inc
index 56d89e0..9b76788 100644
--- a/crypto/fipsmodule/bn/sqrt.cc.inc
+++ b/crypto/fipsmodule/bn/sqrt.cc.inc
@@ -192,20 +192,9 @@
         goto end;
       }
     } else {
-      if (!BN_pseudo_rand(y, BN_num_bits(p), 0, 0)) {
+      if (!BN_rand_range_ex(y, 22, p)) {
         goto end;
       }
-      if (BN_ucmp(y, p) >= 0) {
-        if (BN_usub(y, y, p)) {
-          goto end;
-        }
-      }
-      // now 0 <= y < |p|
-      if (BN_is_zero(y)) {
-        if (!BN_set_word(y, i)) {
-          goto end;
-        }
-      }
     }
 
     r = bn_jacobi(y, q, ctx);  // here 'q' is |p|
diff --git a/crypto/fipsmodule/bn/test/mod_sqrt_tests.txt b/crypto/fipsmodule/bn/test/mod_sqrt_tests.txt
index 64c752b..dc69d5d 100644
--- a/crypto/fipsmodule/bn/test/mod_sqrt_tests.txt
+++ b/crypto/fipsmodule/bn/test/mod_sqrt_tests.txt
@@ -323,6 +323,38 @@
 A = 2eee37cf06228a387788188e650bc6d8a2ff402931443f69156a29155eca07dcb45f3aac238d92943c0c25c896098716baa433f25bd696a142f5a69d5d937e81
 P = 9df9d6cc20b8540411af4e5357ef2b0353cb1f2ab5ffc3e246b41c32f71e951f
 
+# This prime is 1 mod 8, so we need Tonelli/Shanks. The least quadratic
+# non-residue is 37, so we give up on small numbers to find a non-residue.
+ModSqrt = 2e42c
+A = 2
+P = 7deb1
+
+ModSqrt = 2
+A = 4
+P = 7deb1
+
+# This prime is 1 mod 8, so we need Tonelli/Shanks. The least quadratic
+# non-residue is 67, so we give up on small numbers to find a non-residue.
+ModSqrt = fb0fab
+A = 2
+P = 2e3a719
+
+ModSqrt = 01d6df74
+A = 3
+P = 2e3a719
+
+ModSqrt = 2
+A = 4
+P = 2e3a719
+
+ModSqrt = 6d151e
+A = 5
+P = 2e3a719
+
+ModSqrt = 2573aa
+A = 12345
+P = 2e3a719
+
 
 # NotModSquare tests.
 #