Enable ADX in x86_64-mont*.pl.

We have an SDE bot now and can more easily test things. We also enabled
ADX in rsaz-avx2.pl which does not work without x86_64-mont*.pl enabled.
rsa-avx2.pl's ADX code only turns itself off so that the faster ADX code
can be used... but we disable it.

Verified, after reverting the fix, the test vectors we imported combined
with Intel SDE catches CVE-2016-7055, so we do indeed have test
coverage.

Thanks to Alexey Ivanov for pointing out the discrepancy.

Skylake numbers:

Before:
Did 7296 RSA 2048 signing operations in 10038191us (726.8 ops/sec)
Did 209000 RSA 2048 verify operations in 10030629us (20836.2 ops/sec)
Did 1080 RSA 4096 signing operations in 10072221us (107.2 ops/sec)
Did 60836 RSA 4096 verify operations in 10053929us (6051.0 ops/sec)

ADX consistently off:
Did 9360 RSA 2048 signing operations in 10025823us (933.6 ops/sec)
Did 220000 RSA 2048 verify operations in 10024339us (21946.6 ops/sec)
Did 1048 RSA 4096 signing operations in 10006782us (104.7 ops/sec)
Did 61936 RSA 4096 verify operations in 10088011us (6139.6 ops/sec)

After (ADX consistently on):
Did 10444 RSA 2048 signing operations in 10006781us (1043.7 ops/sec)
Did 323000 RSA 2048 verify operations in 10012192us (32260.7 ops/sec)
Did 1610 RSA 4096 signing operations in 10044930us (160.3 ops/sec)
Did 96000 RSA 4096 verify operations in 10075606us (9528.0 ops/sec)

Change-Id: Icbbd4f06dde60d1a42a691c511b34c47b9a2da5f
Reviewed-on: https://boringssl-review.googlesource.com/18965
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
diff --git a/crypto/fipsmodule/bn/asm/rsaz-avx2.pl b/crypto/fipsmodule/bn/asm/rsaz-avx2.pl
index 14a28dc..437c0f7 100755
--- a/crypto/fipsmodule/bn/asm/rsaz-avx2.pl
+++ b/crypto/fipsmodule/bn/asm/rsaz-avx2.pl
@@ -82,8 +82,6 @@
 # In upstream, this is controlled by shelling out to the compiler to check
 # versions, but BoringSSL is intended to be used with pre-generated perlasm
 # output, so this isn't useful anyway.
-#
-# TODO(davidben): Enable these after testing. $avx goes up to 2 and $addx to 1.
 $avx = 2;
 $addx = 1;
 
diff --git a/crypto/fipsmodule/bn/asm/x86_64-mont.pl b/crypto/fipsmodule/bn/asm/x86_64-mont.pl
index 286e0a0..c27f65d 100755
--- a/crypto/fipsmodule/bn/asm/x86_64-mont.pl
+++ b/crypto/fipsmodule/bn/asm/x86_64-mont.pl
@@ -56,9 +56,7 @@
 # In upstream, this is controlled by shelling out to the compiler to check
 # versions, but BoringSSL is intended to be used with pre-generated perlasm
 # output, so this isn't useful anyway.
-#
-# TODO(davidben): Enable this option after testing. $addx goes up to 1.
-$addx = 0;
+$addx = 1;
 
 # int bn_mul_mont(
 $rp="%rdi";	# BN_ULONG *rp,
diff --git a/crypto/fipsmodule/bn/asm/x86_64-mont5.pl b/crypto/fipsmodule/bn/asm/x86_64-mont5.pl
index db09624..73abb71 100755
--- a/crypto/fipsmodule/bn/asm/x86_64-mont5.pl
+++ b/crypto/fipsmodule/bn/asm/x86_64-mont5.pl
@@ -41,9 +41,7 @@
 # In upstream, this is controlled by shelling out to the compiler to check
 # versions, but BoringSSL is intended to be used with pre-generated perlasm
 # output, so this isn't useful anyway.
-#
-# TODO(davidben): Enable this after testing. $addx goes up to 1.
-$addx = 0;
+$addx = 1;
 
 # int bn_mul_mont_gather5(
 $rp="%rdi";	# BN_ULONG *rp,