Use sdallocx, if available, when deallocating.

Providing a size hint to the allocator is substantially faster,
especially as we already know/need the size for OPENSSL_cleanse.

We provide a weak symbol that falls back to free when a malloc with
sdallocx is not statically linked with BoringSSL.

Alternatives considered:
* Use dlsym():  This is prone to fail on statically linked binaries
  without symbols.  Additionally, the extra indirection adds call
  overhead above and beyond the linker resolved technique we're using.
* Use CMake rules to identify whether sdallocx is available:  Once the
  library is built, we may link against a variety of malloc
  implementations (not all of which may have sdallocx), so we need to
  have a fallback when the symbol is unavailable.

Change-Id: I3a78e88fac5b6e5d4712aa0347d2ba6b43046e07
Reviewed-on: https://boringssl-review.googlesource.com/31784
Reviewed-by: Chris Kennelly <ckennelly@google.com>
Reviewed-by: Adam Langley <agl@google.com>
diff --git a/crypto/mem.c b/crypto/mem.c
index 5d45baa..a06061b 100644
--- a/crypto/mem.c
+++ b/crypto/mem.c
@@ -71,6 +71,25 @@
 
 #define OPENSSL_MALLOC_PREFIX 8
 
+#if defined(__GNUC__) || defined(__clang__)
+// sdallocx is a sized |free| function. By passing the size (which we happen to
+// always know in BoringSSL), the malloc implementation can save work. We cannot
+// depend on |sdallocx| being available so we declare a wrapper that falls back
+// to |free| as a weak symbol.
+//
+// This will always be safe, but will only be overridden if the malloc
+// implementation is statically linked with BoringSSL. So, if |sdallocx| is
+// provided in, say, libc.so, we still won't use it because that's dynamically
+// linked. This isn't an ideal result, but its helps in some cases.
+void sdallocx(void *ptr, size_t size, int flags);
+
+__attribute((weak, noinline))
+#else
+static
+#endif
+void sdallocx(void *ptr, size_t size, int flags) {
+  free(ptr);
+}
 
 void *OPENSSL_malloc(size_t size) {
   void *ptr = malloc(size + OPENSSL_MALLOC_PREFIX);
@@ -92,7 +111,7 @@
 
   size_t size = *(size_t *)ptr;
   OPENSSL_cleanse(ptr, size + OPENSSL_MALLOC_PREFIX);
-  free(ptr);
+  sdallocx(ptr, size + OPENSSL_MALLOC_PREFIX, 0 /* flags */);
 }
 
 void *OPENSSL_realloc(void *orig_ptr, size_t new_size) {