)]}'
{
  "commit": "b6f2b407f0994c15d455437eccf62d4fb30c904f",
  "tree": "d076228be0d9c182418efe1871b80633c7bc80d0",
  "parents": [
    "312a2c08dee696d52d5c34ba9250efc76a93f076"
  ],
  "author": {
    "name": "David Benjamin",
    "email": "davidben@google.com",
    "time": "Tue Jan 27 14:04:01 2026 -0500"
  },
  "committer": {
    "name": "Boringssl LUCI CQ",
    "email": "boringssl-scoped@luci-project-accounts.iam.gserviceaccount.com",
    "time": "Wed Jan 28 10:11:29 2026 -0800"
  },
  "message": "AES-GCM: optimize ARMv8 kernel and add EOR3 support\n\nOriginally written by Jamison Collins \u003cjdcollin@google.com\u003e in\ncl/828730455. The assembly parts of this CL were reviewed there.\ngo/aes-gcm-eor3-benchmarks has some (internal) benchmarks. I\u0027ve also\nadded some on the hardware I had on hand here.\n\nJamison\u0027s notes:\n\nThis change optimizes the existing ARMv8 AES-GCM assembly by reducing\nNEON register pressure and introduces a new kernel variant leveraging\nthe EOR3 instruction. This CL improves performance by up to 12% on\nConan, 28% on Athena, and 41% on Bondi Beach.\n\nCore Assembly Optimizations: Significant instruction rescheduling was\nimplemented in aesv8-gcm-armv8.pl to reduce false dependencies and\nbetter utilize execution ports. To address NEON bottlenecks, operations\nwere shifted away from NEON registers where possible.\n\nCounter Management: A key optimization is the reworked counter\nmanagement. Previously, counters were incremented, reversed, and moved\nfrom GPR to NEON registers within the hot loop for every block. The new\napproach precalculates these values for the subsequent iteration and\nstores them on the stack. Inside the loop, they are pulled in via NEON\nloads. As values are passed via memory it\u0027s necessary to calculate the\ncounters one iteration ahead to avoid expensive LD-ST forwarding.\n\nEOR3 Support: Support for the EOR3 instruction (part of the SHA3\nextension) has been piped through the library to further optimize\nAES-GCM on capable hardware.\n\nDetection: SHA3 capability detection was added to Linux (via hwcap),\nApple (via sysctl), Fuchsia, and system register reading for\nbaremetal/FreeBSD.\n\nDispatch: Added gcm_arm64_aes_eor3 to the gcm_impl_t enum.\nCRYPTO_gcm128_init_aes_key now selects this implementation when\ngcm_sha3_capable() is true, enabling the optimized _eor3 encrypt/decrypt\nfunctions.\n\nKernel: The Perl script now generates a second kernel variant utilizing\nEOR3 to merge XOR operations during GHASH accumulation and final text\nrendering. Both versions are stored under different names in\naesv8-gcm-armv8-linux.S.\n\nChanges I made when extracting this to the upstream repo:\n\n- Rebased to main and reran pregenerate to pick up new symbols to prefix\n\n- Pulled the arch_extension machinery into its own CL\n\n- Went ahead and added Windows feature dispatch, since MS has added\n  those constants now\n\n- Switched armv8_feature_parsing.h to C++ inline functions, to avoid\n  potential theoretical ODR issues with static symbols in headers. Since\n  these are now plain C++, I made them C++-named.\n\n- Added armv8_feature_parsing.h to the build.\n\nBenchmarks:\n\nApple M1 Pro (has EOR3):\n\nBenchmark                                                       Time             CPU      Time Old      Time New       CPU Old       CPU New\n--------------------------------------------------------------------------------------------------------------------------------------------\nBM_SpeedAEAD/seal_aes_128_gcm/InputSize:16                   +0.0519         +0.0519            42            44            42            44\nBM_SpeedAEAD/seal_aes_128_gcm/InputSize:256                  -0.0356         -0.0297            76            74            76            73\nBM_SpeedAEAD/seal_aes_128_gcm/InputSize:1350                 -0.1067         -0.1064           255           228           255           228\nBM_SpeedAEAD/seal_aes_128_gcm/InputSize:8192                 -0.1451         -0.1458          1253          1071          1249          1067\nBM_SpeedAEAD/seal_aes_128_gcm/InputSize:16384                -0.1427         -0.1446          2458          2108          2453          2099\nBM_SpeedAEAD/open_aes_128_gcm/InputSize:16                   +0.0697         +0.0725            45            48            45            48\nBM_SpeedAEAD/open_aes_128_gcm/InputSize:256                  -0.0254         -0.0301            79            77            79            77\nBM_SpeedAEAD/open_aes_128_gcm/InputSize:1350                 -0.1406         -0.1458           266           229           265           226\nBM_SpeedAEAD/open_aes_128_gcm/InputSize:8192                 -0.2051         -0.2061          1296          1030          1295          1028\nBM_SpeedAEAD/open_aes_128_gcm/InputSize:16384                -0.2122         -0.2114          2547          2007          2541          2004\n\nPixel 5A (does not have EOR3, just the base AES and PMULL extensions):\n\nBenchmark                                                       Time             CPU      Time Old      Time New       CPU Old       CPU New\n--------------------------------------------------------------------------------------------------------------------------------------------\nBM_SpeedAEAD/seal_aes_128_gcm/InputSize:16                   +0.0128         +0.0125           116           118           116           118\nBM_SpeedAEAD/seal_aes_128_gcm/InputSize:256                  -0.0420         -0.0422           213           204           213           204\nBM_SpeedAEAD/seal_aes_128_gcm/InputSize:1350                 -0.0767         -0.0781           740           683           739           681\nBM_SpeedAEAD/seal_aes_128_gcm/InputSize:8192                 -0.1032         -0.1032          3829          3434          3822          3427\nBM_SpeedAEAD/seal_aes_128_gcm/InputSize:16384                -0.1077         -0.1081          7568          6753          7553          6737\nBM_SpeedAEAD/open_aes_128_gcm/InputSize:16                   +0.0000         -0.0002           122           122           122           122\nBM_SpeedAEAD/open_aes_128_gcm/InputSize:256                  +0.0125         +0.0122           211           214           211           213\nBM_SpeedAEAD/open_aes_128_gcm/InputSize:1350                 -0.0161         -0.0165           697           685           695           684\nBM_SpeedAEAD/open_aes_128_gcm/InputSize:8192                 -0.0183         -0.0183          3546          3482          3539          3475\nBM_SpeedAEAD/open_aes_128_gcm/InputSize:16384                -0.0184         -0.0176          6985          6857          6968          6845\n\nChange-Id: Ied74d3493f174f6e8aeaa300816b39d72f2be042\nReviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/87988\nCommit-Queue: David Benjamin \u003cdavidben@google.com\u003e\nReviewed-by: Lily Chen \u003cchlily@google.com\u003e\n",
  "tree_diff": [
    {
      "type": "modify",
      "old_id": "92cbfff10347cc44521fc34b1fbb611253955f48",
      "old_mode": 33188,
      "old_path": "build.json",
      "new_id": "45fb7c46ea3ff657aa9e9a18527311271d83dcd1",
      "new_mode": 33188,
      "new_path": "build.json"
    },
    {
      "type": "add",
      "old_id": "0000000000000000000000000000000000000000",
      "old_mode": 0,
      "old_path": "/dev/null",
      "new_id": "5ebb8764cca35db2b4802a6ec109c9bb1a66b20f",
      "new_mode": 33188,
      "new_path": "crypto/armv8_feature_parsing.h"
    },
    {
      "type": "modify",
      "old_id": "c42ced4eda3f0904d9831ca0c6da290db01f142c",
      "old_mode": 33188,
      "old_path": "crypto/cpu_aarch64_apple.cc",
      "new_id": "c25b5ac04ddc82b30f12a52a211a8620eb8f43a4",
      "new_mode": 33188,
      "new_path": "crypto/cpu_aarch64_apple.cc"
    },
    {
      "type": "modify",
      "old_id": "865dfeaf3b16cc7e5a48bd154f26b2824c990962",
      "old_mode": 33188,
      "old_path": "crypto/cpu_aarch64_fuchsia.cc",
      "new_id": "88db3b2df7f47dcb7e04186baa8a7a0e7dfcc537",
      "new_mode": 33188,
      "new_path": "crypto/cpu_aarch64_fuchsia.cc"
    },
    {
      "type": "modify",
      "old_id": "f203f11cf5ccd7feae665885498a946eefb3c168",
      "old_mode": 33188,
      "old_path": "crypto/cpu_aarch64_linux.cc",
      "new_id": "527c7f03e2ad9f0f4cb866ec1a469adaac6cd20b",
      "new_mode": 33188,
      "new_path": "crypto/cpu_aarch64_linux.cc"
    },
    {
      "type": "modify",
      "old_id": "64b16335cfdbf3acba998250888ec4eb9c8fa8c0",
      "old_mode": 33188,
      "old_path": "crypto/cpu_aarch64_openbsd.cc",
      "new_id": "235c14ae2b9eca7fac84538eb536a805c6de1912",
      "new_mode": 33188,
      "new_path": "crypto/cpu_aarch64_openbsd.cc"
    },
    {
      "type": "modify",
      "old_id": "9734cfc57d7d702e4a8cb9462fd6f2f48e3ff39d",
      "old_mode": 33188,
      "old_path": "crypto/cpu_aarch64_sysreg.cc",
      "new_id": "0360a3d72f20c616ac439a02897578b0693a5da9",
      "new_mode": 33188,
      "new_path": "crypto/cpu_aarch64_sysreg.cc"
    },
    {
      "type": "modify",
      "old_id": "6cdb00f29a6f7463aad7587408dcc4ca737a917a",
      "old_mode": 33188,
      "old_path": "crypto/cpu_aarch64_win.cc",
      "new_id": "4d51bbdaafa7e23309ed2a060a11676b15b34b87",
      "new_mode": 33188,
      "new_path": "crypto/cpu_aarch64_win.cc"
    },
    {
      "type": "modify",
      "old_id": "d2b587b334351af97eb504cfc93737274fc97de1",
      "old_mode": 33188,
      "old_path": "crypto/fipsmodule/aes/asm/aesv8-gcm-armv8.pl",
      "new_id": "23084e6050cc6bf1199684a54075c819dfc4826d",
      "new_mode": 33188,
      "new_path": "crypto/fipsmodule/aes/asm/aesv8-gcm-armv8.pl"
    },
    {
      "type": "modify",
      "old_id": "b0e940a68102c234fe9ab525964f4da171b43ab2",
      "old_mode": 33188,
      "old_path": "crypto/fipsmodule/aes/gcm.cc.inc",
      "new_id": "2dd0e24a8f3667b1421f9a73ea1ed394a9c67fd3",
      "new_mode": 33188,
      "new_path": "crypto/fipsmodule/aes/gcm.cc.inc"
    },
    {
      "type": "modify",
      "old_id": "5a03254353d4ba5a6062a8872f76c87b3faefac7",
      "old_mode": 33188,
      "old_path": "crypto/fipsmodule/aes/gcm_test.cc",
      "new_id": "baa4b934d44d9996ab82716ab2d4e49d76580c52",
      "new_mode": 33188,
      "new_path": "crypto/fipsmodule/aes/gcm_test.cc"
    },
    {
      "type": "modify",
      "old_id": "9b42a6acad73ab8ba4822992b2e976b6c0295636",
      "old_mode": 33188,
      "old_path": "crypto/fipsmodule/aes/internal.h",
      "new_id": "4723afdfeea8cbee8972bbcc90ca68e34ded557a",
      "new_mode": 33188,
      "new_path": "crypto/fipsmodule/aes/internal.h"
    },
    {
      "type": "modify",
      "old_id": "287bb6db7e8211ad1df3e9db20ec38851937206d",
      "old_mode": 33188,
      "old_path": "crypto/internal.h",
      "new_id": "2cdb760bdf08fe10246d09bcfa03a8fb7c1e8e65",
      "new_mode": 33188,
      "new_path": "crypto/internal.h"
    },
    {
      "type": "modify",
      "old_id": "a0b8e7d870610fb7b6a527e80742b8cb2a3917d8",
      "old_mode": 33188,
      "old_path": "gen/bcm/aesv8-gcm-armv8-apple.S",
      "new_id": "6a76daab33732a96c4eda42725cb32f07cb507e2",
      "new_mode": 33188,
      "new_path": "gen/bcm/aesv8-gcm-armv8-apple.S"
    },
    {
      "type": "modify",
      "old_id": "fc07f9a9bef82fd75ec9b81c1226c13623c9bdf6",
      "old_mode": 33188,
      "old_path": "gen/bcm/aesv8-gcm-armv8-linux.S",
      "new_id": "9d99ec31bcf12d2dd3f2f7d6274e9dee984887f9",
      "new_mode": 33188,
      "new_path": "gen/bcm/aesv8-gcm-armv8-linux.S"
    },
    {
      "type": "modify",
      "old_id": "ce27fd3b54297c8231887d2516c0f2191a35fa80",
      "old_mode": 33188,
      "old_path": "gen/bcm/aesv8-gcm-armv8-win.S",
      "new_id": "a5a2b42bed86c9263e554f9654220c6bd1f9f10d",
      "new_mode": 33188,
      "new_path": "gen/bcm/aesv8-gcm-armv8-win.S"
    },
    {
      "type": "modify",
      "old_id": "0f3aab20991fa44f2387e42795a80bcb7c582646",
      "old_mode": 33188,
      "old_path": "gen/boringssl_prefix_symbols_internal_x86_64_win_asm.inc",
      "new_id": "ddd5e5a8fa9de7c765e700fb6d7426ef35e1db43",
      "new_mode": 33188,
      "new_path": "gen/boringssl_prefix_symbols_internal_x86_64_win_asm.inc"
    },
    {
      "type": "modify",
      "old_id": "79ad39943718be6f96575d022ce7fe3648d26e77",
      "old_mode": 33188,
      "old_path": "gen/boringssl_prefix_symbols_internal_x86_win_asm.inc",
      "new_id": "55b655d9a60fa86b17db62752bcbc07357a59a28",
      "new_mode": 33188,
      "new_path": "gen/boringssl_prefix_symbols_internal_x86_win_asm.inc"
    },
    {
      "type": "modify",
      "old_id": "68b393c9e74dbc57f4e9df96297ec0465bb18a88",
      "old_mode": 33188,
      "old_path": "gen/sources.bzl",
      "new_id": "c558749bd501108728d866720c7e656e6857674a",
      "new_mode": 33188,
      "new_path": "gen/sources.bzl"
    },
    {
      "type": "modify",
      "old_id": "76e4935052e8ee89c25659ab5f7785b8900c0928",
      "old_mode": 33188,
      "old_path": "gen/sources.cmake",
      "new_id": "28b63d786158d24de5de93571aec0875d1fd4880",
      "new_mode": 33188,
      "new_path": "gen/sources.cmake"
    },
    {
      "type": "modify",
      "old_id": "1b7aa00f0de21ec8c1d66dc52b95e878e2461d99",
      "old_mode": 33188,
      "old_path": "gen/sources.gni",
      "new_id": "eb8306084d13db0520a18537c88629de528da318",
      "new_mode": 33188,
      "new_path": "gen/sources.gni"
    },
    {
      "type": "modify",
      "old_id": "ce8185a8baebb04f30a52008ff62e32488be65d2",
      "old_mode": 33188,
      "old_path": "gen/sources.json",
      "new_id": "6c08a9cf431394123a06171207b4d8c8da83f65d",
      "new_mode": 33188,
      "new_path": "gen/sources.json"
    },
    {
      "type": "modify",
      "old_id": "c7b695abf6ee253280e3cb197ed39b9fbf5a340d",
      "old_mode": 33188,
      "old_path": "gen/sources.mk",
      "new_id": "d0b18b23331b65df256fb7d36daf466b590fda5b",
      "new_mode": 33188,
      "new_path": "gen/sources.mk"
    },
    {
      "type": "modify",
      "old_id": "f0d81420e89f119668145788986971afe25b6426",
      "old_mode": 33188,
      "old_path": "include/openssl/prefix_symbols_internal_S.h",
      "new_id": "31d21ac340dc4389faa039a744f28333f5f77399",
      "new_mode": 33188,
      "new_path": "include/openssl/prefix_symbols_internal_S.h"
    },
    {
      "type": "modify",
      "old_id": "2332027d7bb282f791728130e77597eac8fab7b4",
      "old_mode": 33188,
      "old_path": "include/openssl/prefix_symbols_internal_c.h",
      "new_id": "04ad11d8285289a7174d8559088e7f5211d5eb92",
      "new_mode": 33188,
      "new_path": "include/openssl/prefix_symbols_internal_c.h"
    }
  ]
}
