tree 6f6d0fd84b6b4072b6a6d96983c1d494d62c1ac5
parent c7bec85f2b8e4daa28391c3805c87a3fc2ad1582
author Eric Biggers <ebiggers@google.com> 1742395070 -0700
committer Boringssl LUCI CQ <boringssl-scoped@luci-project-accounts.iam.gserviceaccount.com> 1742422941 -0700

Clean up aes-gcm-avx512-x86_64.pl to assume 512-bit vectors

aes-gcm-avx512-x86_64.pl (originally aes-gcm-avx10-x86_64.pl) was
designed to support multiple maximum vector lengths, while still
utilizing AVX512 / AVX10 features such as the increased number of vector
registers.  However, the support for multiple maximum vector lengths
turned out not to be useful.  Support for maximum vector lengths other
than 512 bits was just removed from the AVX10 specification, which
leaves "avoiding downclocking" as the only remaining use case for
limiting AVX512 / AVX10 code to 256-bit vectors.  But the bad 512-bit
downclocking has gone away in new CPUs, and the separate VAES+AVX2 code,
which I ended up having to write anyway (for CPUs that support VAES but
not AVX512), provides nearly as good 256-bit support.

Therefore, clean up aes-gcm-avx512-x86_64.pl so that it is no longer
written in terms of a generic vector length and just assumes 512-bit
vectors.

This results in some minor changes to the generated assembly:

- The labels in gcm_init_vpclmulqdq_avx512 and
  gcm_ghash_vpclmulqdq_avx512 no longer have the suffixes that were used
  to differentiate between VL=32 and VL=64.
- gcm_init_vpclmulqdq_avx512 is now in a slightly different place in the
  file, since (like the AVX2 equivalent) it's now generated at the
  top level instead of via a Perl function that gets called later on.
- The inc_2blocks label (only used for VL=32) has been removed.
- The code no longer goes out of its way to avoid using immediates of
  4*VL, which is now always 256.  This was an optimization for VL=32
  which shortened some instructions by 3 bytes by keeping immediates in
  the range [-128, 127].  With VL=64 this optimization is not possible,
  so we might as well just write the "obvious" code instead.
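
For illustration only, the immediate-size reasoning in the last item can
be sketched as below.  This is a simplified model, not code from the
Perl script: it assumes the usual x86-64 rule that add/sub with an
immediate in [-128, 127] takes a 1-byte immediate and anything else a
4-byte one, and that the old code flipped "add 4*VL" into
"sub -4*VL" to get into that range.  The helper names imm_bytes and
best_add are hypothetical.

```python
def imm_bytes(imm):
    """Immediate size in bytes for an x86-64 add/sub reg, imm (assumed model)."""
    # Sign-extended 8-bit immediates cover [-128, 127]; otherwise imm32 is used.
    return 1 if -128 <= imm <= 127 else 4


def best_add(delta):
    """Return (mnemonic, immediate, immediate size) for adding delta.

    Uses 'sub -delta' instead of 'add delta' only when that makes the
    immediate encoding shorter.
    """
    add_size = imm_bytes(delta)
    sub_size = imm_bytes(-delta)
    if sub_size < add_size:
        return ("sub", -delta, sub_size)
    return ("add", delta, add_size)


for vl in (32, 64):
    op, imm, size = best_add(4 * vl)
    saved = imm_bytes(4 * vl) - size
    print(f"VL={vl}: {op} ${imm} uses a {size}-byte immediate, saving {saved} bytes")
```

Under this model, VL=32 gives 4*VL = 128, which only fits in one byte
when negated, while VL=64 gives 256, which needs four bytes either way.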

Change-Id: I44027d4a81f7d9bdfd4c27e410de2d0158b10325
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/77848
Reviewed-by: David Benjamin <davidben@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
