Add VAES and VPCLMULQDQ accelerated AES-GCM

Add an AES-GCM implementation for x86_64 that uses VAES, VPCLMULQDQ, and
either AVX10 or a compatible AVX512 feature set.  The assembly code is
based on the code I wrote for the Linux kernel
(https://git.kernel.org/linus/b06affb1cb580e13).  Some substantial
changes were needed for BoringSSL integration; see the file comment.
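
For readers unfamiliar with the feature bits involved, here is a minimal
sketch of decoding them from raw CPUID leaf 7 (subleaf 0) register values.
The helper name and the exact set of required AVX512 features are
assumptions for illustration, not the dispatch logic of this patch.  The
sketch covers only the AVX512 path (AVX10 is enumerated separately), and
real detection must also confirm that the OS saves the wider register
state.

```python
# Bit positions in CPUID leaf 7, subleaf 0.
EBX_AVX512F = 1 << 16
EBX_AVX512BW = 1 << 30
EBX_AVX512VL = 1 << 31
ECX_VAES = 1 << 9
ECX_VPCLMULQDQ = 1 << 10


def has_vaes_avx512_features(ebx, ecx):
    """Hypothetical helper: True if the raw CPUID.7.0 EBX/ECX values
    advertise every feature bit in the assumed requirement set."""
    need_ebx = EBX_AVX512F | EBX_AVX512BW | EBX_AVX512VL
    need_ecx = ECX_VAES | ECX_VPCLMULQDQ
    return (ebx & need_ebx) == need_ebx and (ecx & need_ecx) == need_ecx


# Example: a CPU reporting all five bits passes; one lacking AVX512VL does not.
full_ebx = EBX_AVX512F | EBX_AVX512BW | EBX_AVX512VL
full_ecx = ECX_VAES | ECX_VPCLMULQDQ
print(has_vaes_avx512_features(full_ebx, full_ecx))
print(has_vaes_avx512_features(EBX_AVX512F | EBX_AVX512BW, full_ecx))
```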

The following tables compare the performance of AES-256-GCM before and
after this patch, and also versus the alternative patch from Cloudflare
(https://boringssl-review.googlesource.com/c/boringssl/+/65987/3).  Each
table shows throughput in MB/s, with one row per implementation and one
column per message length in bytes.  All benchmarks were done using
EVP_AEAD_CTX_seal() and EVP_AEAD_CTX_open() with an associated data
length of 16 bytes.

AMD Zen 5, Granite Ridge (encryption):

               | 16384 |  4096 |  4095 |  1420 |   512 |   500 |
    -----------+-------+-------+-------+-------+-------+-------+
    This patch | 26358 | 21295 | 17402 | 10672 |  7798 |  4840 |
    Cloudflare | 22363 | 18330 | 17008 | 10979 |  7070 |  5870 |
    Existing   |  7194 |  6743 |  6465 |  5404 |  4075 |  3563 |

               |   300 |   200 |    64 |    63 |    16 |
    -----------+-------+-------+-------+-------+-------+
    This patch |  3248 |  2557 |  1359 |   937 |   537 |
    Cloudflare |  3624 |  2770 |  1293 |  1028 |   517 |
    Existing   |  2938 |  2271 |  1266 |   959 |   528 |

AMD Zen 5, Granite Ridge (decryption):

               | 16384 |  4096 |  4095 |  1420 |   512 |   500 |
    -----------+-------+-------+-------+-------+-------+-------+
    This patch | 27214 | 22298 | 18824 | 11401 |  8496 |  5399 |
    Cloudflare | 22629 | 19257 | 17792 | 11575 |  7807 |  6031 |
    Existing   |  7122 |  6805 |  6228 |  4922 |  4604 |  3565 |

               |   300 |   200 |    64 |    63 |    16 |
    -----------+-------+-------+-------+-------+-------+
    This patch |  3637 |  2497 |  1483 |   952 |   589 |
    Cloudflare |  3714 |  2847 |  1437 |  1030 |   567 |
    Existing   |  3012 |  2354 |  1514 |   880 |   632 |

AMD Zen 4, Genoa (encryption):

               | 16384 |  4096 |  4095 |  1420 |   512 |   500 |
    -----------+-------+-------+-------+-------+-------+-------+
    This patch | 10093 |  8907 |  7614 |  5399 |  4247 |  2719 |
    Cloudflare |  9174 |  8073 |  7521 |  5414 |  3786 |  3111 |
    Existing   |  4239 |  3964 |  3800 |  3186 |  2398 |  2069 |

               |   300 |   200 |    64 |    63 |    16 |
    -----------+-------+-------+-------+-------+-------+
    This patch |  1940 |  1553 |   851 |   581 |   343 |
    Cloudflare |  2023 |  1619 |   775 |   619 |   311 |
    Existing   |  1735 |  1334 |   775 |   573 |   317 |

AMD Zen 4, Genoa (decryption):

               | 16384 |  4096 |  4095 |  1420 |   512 |   500 |
    -----------+-------+-------+-------+-------+-------+-------+
    This patch | 10108 |  8922 |  7879 |  5526 |  4250 |  2872 |
    Cloudflare |  9441 |  8347 |  7723 |  5366 |  3902 |  3067 |
    Existing   |  4249 |  3999 |  3810 |  3101 |  2535 |  2026 |

               |   300 |   200 |    64 |    63 |    16 |
    -----------+-------+-------+-------+-------+-------+
    This patch |  2031 |  1536 |   868 |   568 |   346 |
    Cloudflare |  1933 |  1579 |   765 |   569 |   300 |
    Existing   |  1723 |  1381 |   806 |   516 |   345 |

Intel Emerald Rapids (encryption):

               | 16384 |  4096 |  4095 |  1420 |   512 |   500 |
    -----------+-------+-------+-------+-------+-------+-------+
    This patch | 13974 | 11827 | 10166 |  6601 |  4904 |  3334 |
    Cloudflare | 12735 | 10752 |  9966 |  6709 |  4524 |  3647 |
    Existing   |  5237 |  4831 |  4639 |  3747 |  2816 |  2409 |

               |   300 |   200 |    64 |    63 |    16 |
    -----------+-------+-------+-------+-------+-------+
    This patch |  2251 |  1763 |   915 |   649 |   363 |
    Cloudflare |  2329 |  1850 |   855 |   676 |   342 |
    Existing   |  1971 |  1502 |   808 |   626 |   359 |

Intel Emerald Rapids (decryption):

               | 16384 |  4096 |  4095 |  1420 |   512 |   500 |
    -----------+-------+-------+-------+-------+-------+-------+
    This patch | 14239 | 12180 | 10370 |  6692 |  5305 |  3344 |
    Cloudflare | 13348 | 11485 | 10460 |  6736 |  5229 |  3641 |
    Existing   |  5306 |  4958 |  4702 |  3767 |  3071 |  2432 |

               |   300 |   200 |    64 |    63 |    16 |
    -----------+-------+-------+-------+-------+-------+
    This patch |  2197 |  2077 |  1040 |   628 |   390 |
    Cloudflare |  2186 |  1911 |   938 |   615 |   370 |
    Existing   |  2024 |  1727 |   999 |   599 |   421 |

Intel Sapphire Rapids (encryption):

               | 16384 |  4096 |  4095 |  1420 |   512 |   500 |
    -----------+-------+-------+-------+-------+-------+-------+
    This patch | 12726 | 10618 |  9248 |  6012 |  4466 |  2986 |
    Cloudflare | 11059 |  9794 |  9071 |  6052 |  4089 |  3306 |
    Existing   |  4761 |  4397 |  4222 |  3408 |  2560 |  2188 |

               |   300 |   200 |    64 |    63 |    16 |
    -----------+-------+-------+-------+-------+-------+
    This patch |  2051 |  1612 |   838 |   579 |   351 |
    Cloudflare |  2110 |  1686 |   775 |   622 |   311 |
    Existing   |  1792 |  1369 |   733 |   567 |   324 |

Intel Sapphire Rapids (decryption):

               | 16384 |  4096 |  4095 |  1420 |   512 |   500 |
    -----------+-------+-------+-------+-------+-------+-------+
    This patch | 12951 | 11100 |  9447 |  6067 |  4862 |  3030 |
    Cloudflare | 12165 | 10421 |  9506 |  6126 |  4767 |  3321 |
    Existing   |  4807 |  4507 |  4275 |  3400 |  2791 |  2216 |

               |   300 |   200 |    64 |    63 |    16 |
    -----------+-------+-------+-------+-------+-------+
    This patch |  2003 |  1894 |   950 |   572 |   357 |
    Cloudflare |  1999 |  1741 |   857 |   559 |   328 |
    Existing   |  1831 |  1571 |   838 |   539 |   382 |
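
The tables above can also be read as speedup ratios.  The following is a
small sketch of that arithmetic, using two columns copied from the
"AMD Zen 5, Granite Ridge (encryption)" table; the dictionary layout and
the function name are illustrative assumptions.

```python
# Throughput in MB/s, copied from the "AMD Zen 5, Granite Ridge
# (encryption)" table for the 16384-byte and 16-byte message lengths.
ZEN5_ENCRYPT = {
    16384: {"This patch": 26358, "Cloudflare": 22363, "Existing": 7194},
    16: {"This patch": 537, "Cloudflare": 517, "Existing": 528},
}


def speedup(table, msg_len, new="This patch", baseline="Existing"):
    """Return the throughput ratio new/baseline for one message length."""
    row = table[msg_len]
    return row[new] / row[baseline]


for msg_len in ZEN5_ENCRYPT:
    vs_existing = speedup(ZEN5_ENCRYPT, msg_len)
    vs_cloudflare = speedup(ZEN5_ENCRYPT, msg_len, baseline="Cloudflare")
    print(f"{msg_len:>5} bytes: {vs_existing:.2f}x vs. existing, "
          f"{vs_cloudflare:.2f}x vs. Cloudflare")
```

On this sample, the gain over the existing code is roughly 3.7x at 16384
bytes and close to parity at 16 bytes.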

Change-Id: I5b0833d2ffe8fd273cb38a26cd104c52c3532ceb
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/70187
Reviewed-by: David Benjamin <davidben@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
16 files changed
tree: 36a66515533d0fd6cc8a1dffd8e7d95664bad6a1
README.md

BoringSSL

BoringSSL is a fork of OpenSSL that is designed to meet Google's needs.

Although BoringSSL is an open source project, it is not intended for general use, as OpenSSL is. We don't recommend that third parties depend upon it. Doing so is likely to be frustrating because there are no guarantees of API or ABI stability.

Programs ship their own copies of BoringSSL when they use it and we update everything as needed when deciding to make API changes. This allows us to mostly avoid compromises in the name of compatibility. It works for us, but it may not work for you.

BoringSSL arose because Google used OpenSSL for many years in various ways and, over time, built up a large number of patches that were maintained while tracking upstream OpenSSL. As Google's product portfolio became more complex, more copies of OpenSSL sprang up and the effort involved in maintaining all these patches in multiple places was growing steadily.

Currently BoringSSL is the SSL library in Chrome/Chromium, Android (but it's not part of the NDK) and a number of other apps/programs.

Project links:

To file a security issue, use the Chromium process and mention in the report that this is for BoringSSL. You can ignore the parts of the process that are specific to Chromium/Chrome.

There are other files in this directory which might be helpful: