Mark the CPU capability helpers as const, not just pure

If we have code like this, the compiler will not currently dedup the
capability check:

  void foo() {
    if (CRYPTO_is_AVX2_capable()) {
      foo_avx2();
    } else {
      foo_nohw();
    }
  }

  foo();
  foo();
  foo();

This is because a pure function may still read global state, and the
compiler cannot prove that foo_avx2() leaves the output of
CRYPTO_is_AVX2_capable() unchanged, so it has to repeat the check after
each call. We'd really like that to turn into:

  if (CRYPTO_is_AVX2_capable()) {
    foo_avx2();
    foo_avx2();
    foo_avx2();
  } else {
    foo_nohw();
    foo_nohw();
    foo_nohw();
  }

Strictly speaking, these functions are not const because they inspect a
global variable, and a test might modify that variable through
OPENSSL_get_armcap_pointer_for_test(). However, that internal, test-only
function is already documented as requiring such changes to be made
before any other BoringSSL function is called. When that rule is heeded,
const is fine.
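
As a rough, self-contained sketch of the reasoning above (not the
BoringSSL code): every example_* name, the EXAMPLE_ATTR_CONST macro, and
the choice of capability bit below are hypothetical stand-ins. The
capability word is assumed to be written only before the first
capability query, which is the condition under which const is
acceptable.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical macro mirroring the shape of OPENSSL_ATTR_CONST.
#if defined(__GNUC__) || defined(__clang__)
#define EXAMPLE_ATTR_CONST __attribute__((const))
#else
#define EXAMPLE_ATTR_CONST
#endif

// Hypothetical capability word and feature bit. ASSUMPTION: the word is only
// written before the first capability query and never changes afterwards.
#define EXAMPLE_AVX2_BIT (1u << 5)
static uint32_t example_cap_word = EXAMPLE_AVX2_BIT;

// Counters that stand in for the real work of each implementation.
static int example_avx2_calls = 0;
static int example_nohw_calls = 0;

// Hypothetical test-only override. The assert encodes the rule that it must
// run before any other function in this sketch has been called.
static void example_set_caps_for_test(uint32_t caps) {
  assert(example_avx2_calls == 0 && example_nohw_calls == 0);
  example_cap_word = caps;
}

// Marked const: the compiler may assume that repeated calls return the same
// value, even when other function calls happen in between.
EXAMPLE_ATTR_CONST static int example_is_avx2_capable(void) {
  return (example_cap_word & EXAMPLE_AVX2_BIT) != 0;
}

static void example_foo_avx2(void) { example_avx2_calls++; }
static void example_foo_nohw(void) { example_nohw_calls++; }

static void example_foo(void) {
  if (example_is_avx2_capable()) {
    example_foo_avx2();
  } else {
    example_foo_nohw();
  }
}

// Three back-to-back calls. With const, an optimizing compiler is permitted
// to evaluate the capability check once for all three; whether it actually
// does so depends on the compiler and optimization level.
static void example_foo_three_times(void) {
  example_foo();
  example_foo();
  example_foo();
}
```

Calling example_foo_three_times() takes the same branch for all three
calls either way. The attribute only changes what the optimizer is
allowed to assume, not the observable result, as long as the ordering
rule for the test-only override is respected.
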
Bug: 42290548
Change-Id: I1737fd00d443e8854294dcc8446b7b0aa38ffc76
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/70828
Reviewed-by: Bob Beck <bbe@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
diff --git a/crypto/internal.h b/crypto/internal.h
index 5ca29ae..8944a99 100644
--- a/crypto/internal.h
+++ b/crypto/internal.h
@@ -274,9 +274,9 @@
#endif
#if defined(__GNUC__) || defined(__clang__)
-#define OPENSSL_ATTR_PURE __attribute__((pure))
+#define OPENSSL_ATTR_CONST __attribute__((const))
#else
-#define OPENSSL_ATTR_PURE
+#define OPENSSL_ATTR_CONST
#endif
#if defined(BORINGSSL_MALLOC_FAILURE_TESTING)
@@ -1404,9 +1404,9 @@
extern uint32_t OPENSSL_ia32cap_P[4];
// OPENSSL_get_ia32cap initializes the library if needed and returns the |idx|th
-// entry of |OPENSSL_ia32cap_P|. It is marked as a pure function so duplicate
+// entry of |OPENSSL_ia32cap_P|. It is marked as a const function so duplicate
// calls can be merged by the compiler, at least when indices match.
-OPENSSL_ATTR_PURE uint32_t OPENSSL_get_ia32cap(int idx);
+OPENSSL_ATTR_CONST uint32_t OPENSSL_get_ia32cap(int idx);
// See Intel manual, volume 2A, table 3-11.
@@ -1615,9 +1615,9 @@
extern uint32_t OPENSSL_armcap_P;
// OPENSSL_get_armcap initializes the library if needed and returns ARM CPU
-// capabilities. It is marked as a pure function so duplicate calls can be
-// merged by the compiler, at least when indices match.
-OPENSSL_ATTR_PURE uint32_t OPENSSL_get_armcap(void);
+// capabilities. It is marked as a const function so duplicate calls can be
+// merged by the compiler.
+OPENSSL_ATTR_CONST uint32_t OPENSSL_get_armcap(void);
// We do not detect any features at runtime on several 32-bit Arm platforms.
// Apple platforms and OpenBSD require NEON and moved to 64-bit to pick up Armv8