Add a CFI tester to CHECK_ABI.

This uses the x86 trap flag and libunwind to test that CFI works at each
instruction. For now, it just uses the system libunwind found via
pkg-config and disables unwind tests if it is unavailable. We'll probably
want to stick a copy into //third_party and perhaps try the LLVM one later.
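
As a rough illustration of the single-stepping mechanism (a minimal sketch,
not the code in this change), the snippet below sets the trap flag around a
few instructions and counts the resulting SIGTRAPs. The names
CountSingleSteps and CountTrap are hypothetical, and the sketch assumes
x86-64 Linux with GCC or Clang inline assembly. It returns -1 elsewhere.

```cpp
#include <assert.h>
#include <signal.h>
#include <string.h>

// Number of SIGTRAPs seen so far. Only written from the signal handler.
static volatile sig_atomic_t g_trap_count = 0;

static void CountTrap(int /*sig*/) { g_trap_count = g_trap_count + 1; }

// CountSingleSteps (hypothetical name) runs a few instructions with the trap
// flag set and returns how many SIGTRAPs were delivered, or -1 where this
// sketch does not apply or the handler could not be installed.
static long CountSingleSteps() {
#if defined(__linux__) && defined(__x86_64__)
  struct sigaction action, old_action;
  memset(&action, 0, sizeof(action));
  sigemptyset(&action.sa_mask);
  action.sa_handler = CountTrap;
  if (sigaction(SIGTRAP, &action, &old_action) != 0) {
    return -1;
  }

  g_trap_count = 0;
  __asm__ volatile(
      "leaq -128(%%rsp), %%rsp\n\t"  // Step over the red zone before pushing.
      "pushfq\n\t"
      "orq $0x100, (%%rsp)\n\t"      // Set TF (bit 8 of RFLAGS).
      "popfq\n\t"                    // Single-stepping starts after this.
      "nop\n\t"
      "nop\n\t"
      "nop\n\t"
      "pushfq\n\t"
      "andq $-0x101, (%%rsp)\n\t"    // Clear TF again.
      "popfq\n\t"
      "leaq 128(%%rsp), %%rsp\n\t"   // Restore the original stack pointer.
      :
      :
      : "cc", "memory");

  long steps = g_trap_count;
  // Put back whatever SIGTRAP disposition was installed before.
  sigaction(SIGTRAP, &old_action, nullptr);
  return steps;
#else
  return -1;
#endif
}
```

The real tester does the interesting work inside the handler: instead of
bumping a counter, it unwinds with libunwind from the interrupted
instruction and compares the recovered registers against the saved caller
state.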

This tester caught two bugs in P-256 CFI annotations already:
I47b5f9798b3bcee1748e537b21c173d312a14b42 and
I9f576d868850312d6c14d1386f8fbfa85021b347.

An earlier design used PTRACE_SINGLESTEP with libunwind's remote
unwinding features. ptrace is a mess around stop signals (see group-stop
discussion in ptrace(2)) and the trap flag design is 10x faster, so I
went with the trap flag. The question of which is more future-proof is
complex:

- There are two libunwinds with the same API,
  https://www.nongnu.org/libunwind/ and LLVM's. This currently uses the
  system nongnu.org one for convenience. In future, LLVM's should be
  easier to bundle (less complex build) and appears to even support
  Windows, but I haven't tested this. Moreover, setting the trap flag
  keeps the test single-process, which is less complex on Windows. That
  suggests the trap flag design and switching to LLVM later. However...

- Not all architectures have a trap flag settable by userspace. As far
  as I can tell, ARMv8's PSTATE.SS can only be set from the kernel. If
  we stick with nongnu.org libunwind, we can use PTRACE_SINGLESTEP and
  remote unwinding. Or we implement it for LLVM. Another thought is for
  the ptracer to bounce SIGTRAP back into the process, to share the
  local unwinding code.

- ARMv7 has no trap flag at all and PTRACE_SINGLESTEP fails. Debuggers
  single-step by injecting breakpoints instead. However, ARMv8's trap
  flag seems to work in both AArch32 and AArch64 modes, so we may be
  able to condition it on a 64-bit kernel.

Sadly, neither strategy works with Intel SDE. Adding flags to cpucap
vectors as we do with ARM would help, but it would not emulate CPUs
newer than the host CPU. For now, I've just had SDE tests disable these.

Annoyingly, CMake does not allow object libraries to have dependencies,
so make test_support a proper static library. Rename the target to
test_support_lib to avoid
https://gitlab.kitware.com/cmake/cmake/issues/17785

Update-Note: This adds a new optional test dependency, but it's disabled
by default (define BORINGSSL_HAVE_LIBUNWIND), so consumers do not need
to do anything. We'll probably want to adjust this in the future.

Bug: 181
Change-Id: I817263d7907aff0904a9cee83f8b26747262cc0c
Reviewed-on: https://boringssl-review.googlesource.com/c/33966
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: Adam Langley <agl@google.com>
diff --git a/BUILDING.md b/BUILDING.md
index 963ed3c..ba988b7 100644
--- a/BUILDING.md
+++ b/BUILDING.md
@@ -40,6 +40,10 @@
     Note Go is exempt from the five year support window. If not found by CMake,
     the go executable may be configured explicitly by setting `GO_EXECUTABLE`.
 
+  * On x86_64 Linux, the tests have an optional
+    [libunwind](https://www.nongnu.org/libunwind/) dependency to test the
+    assembly more thoroughly.
+
 ## Building
 
 Using Ninja (note the 'N' is capitalized in the cmake invocation):
diff --git a/CMakeLists.txt b/CMakeLists.txt
index 1f18782..abd85f5 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -39,6 +39,16 @@
   find_program(GO_EXECUTABLE go)
 endif()
 
+if(CMAKE_SYSTEM_NAME STREQUAL "Linux")
+  find_package(PkgConfig REQUIRED)
+  pkg_check_modules(LIBUNWIND libunwind-generic)
+  if(LIBUNWIND_FOUND)
+    add_definitions(-DBORINGSSL_HAVE_LIBUNWIND)
+  else()
+    message("libunwind not found. Disabling unwind tests.")
+  endif()
+endif()
+
 if(NOT GO_EXECUTABLE)
   message(FATAL_ERROR "Could not find Go")
 endif()
diff --git a/crypto/CMakeLists.txt b/crypto/CMakeLists.txt
index 8635910..f846f1f 100644
--- a/crypto/CMakeLists.txt
+++ b/crypto/CMakeLists.txt
@@ -483,12 +483,11 @@
 
   $<TARGET_OBJECTS:crypto_test_data>
   $<TARGET_OBJECTS:boringssl_gtest_main>
-  $<TARGET_OBJECTS:test_support>
 )
 
 add_dependencies(crypto_test global_target)
 
-target_link_libraries(crypto_test crypto boringssl_gtest)
+target_link_libraries(crypto_test test_support_lib boringssl_gtest crypto)
 if(WIN32)
   target_link_libraries(crypto_test ws2_32)
 endif()
diff --git a/crypto/abi_self_test.cc b/crypto/abi_self_test.cc
index 773c9bd..9a1b868 100644
--- a/crypto/abi_self_test.cc
+++ b/crypto/abi_self_test.cc
@@ -24,28 +24,20 @@
 #endif
 
 
-static bool test_function_was_called = false;
-static void TestFunction(int a1, int a2, int a3, int a4, int a5, int a6, int a7,
-                         int a8, int a9, int a10) {
-  test_function_was_called = true;
-  EXPECT_EQ(1, a1);
-  EXPECT_EQ(2, a2);
-  EXPECT_EQ(3, a3);
-  EXPECT_EQ(4, a4);
-  EXPECT_EQ(5, a5);
-  EXPECT_EQ(6, a6);
-  EXPECT_EQ(7, a7);
-  EXPECT_EQ(8, a8);
-  EXPECT_EQ(9, a9);
-  EXPECT_EQ(10, a10);
+static bool test_function_ok;
+static int TestFunction(int a1, int a2, int a3, int a4, int a5, int a6, int a7,
+                        int a8, int a9, int a10) {
+  test_function_ok = a1 == 1 && a2 == 2 && a3 == 3 && a4 == 4 && a5 == 5 &&
+                     a6 == 6 && a7 == 7 && a8 == 8 && a9 == 9 && a10 == 10;
+  return 42;
 }
 
 TEST(ABITest, SanityCheck) {
   EXPECT_NE(0, CHECK_ABI(strcmp, "hello", "world"));
 
-  test_function_was_called = false;
-  CHECK_ABI(TestFunction, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
-  EXPECT_TRUE(test_function_was_called);
+  test_function_ok = false;
+  EXPECT_EQ(42, CHECK_ABI(TestFunction, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10));
+  EXPECT_TRUE(test_function_ok);
 
 #if defined(SUPPORTS_ABI_TEST)
   abi_test::internal::CallerState state;
@@ -56,7 +48,17 @@
       reinterpret_cast<crypto_word_t>(arg2),
   };
   CHECK_ABI(abi_test_trampoline, reinterpret_cast<crypto_word_t>(strcmp),
-            &state, argv, 2);
+            &state, argv, 2, 0 /* no breakpoint */);
+
+  if (abi_test::UnwindTestsEnabled()) {
+    EXPECT_NONFATAL_FAILURE(CHECK_ABI(abi_test_bad_unwind_wrong_register),
+                            "was not recovered unwinding");
+    EXPECT_NONFATAL_FAILURE(CHECK_ABI(abi_test_bad_unwind_temporary),
+                            "was not recovered unwinding");
+
+    CHECK_ABI_NO_UNWIND(abi_test_bad_unwind_wrong_register);
+    CHECK_ABI_NO_UNWIND(abi_test_bad_unwind_temporary);
+  }
 #endif  // SUPPORTS_ABI_TEST
 }
 
@@ -100,59 +102,77 @@
   // safely call the abi_test_clobber_* functions below.
   abi_test::internal::CallerState state;
   RAND_bytes(reinterpret_cast<uint8_t *>(&state), sizeof(state));
-  CHECK_ABI(abi_test_trampoline,
-            reinterpret_cast<crypto_word_t>(abi_test_clobber_rbx), &state,
-            nullptr, 0);
+  CHECK_ABI_NO_UNWIND(abi_test_trampoline,
+                      reinterpret_cast<crypto_word_t>(abi_test_clobber_rbx),
+                      &state, nullptr, 0, 0 /* no breakpoint */);
 
-  CHECK_ABI(abi_test_clobber_rax);
-  EXPECT_NONFATAL_FAILURE(CHECK_ABI(abi_test_clobber_rbx), "");
-  CHECK_ABI(abi_test_clobber_rcx);
-  CHECK_ABI(abi_test_clobber_rdx);
+  CHECK_ABI_NO_UNWIND(abi_test_clobber_rax);
+  EXPECT_NONFATAL_FAILURE(CHECK_ABI_NO_UNWIND(abi_test_clobber_rbx),
+                          "rbx was not restored after return");
+  CHECK_ABI_NO_UNWIND(abi_test_clobber_rcx);
+  CHECK_ABI_NO_UNWIND(abi_test_clobber_rdx);
 #if defined(OPENSSL_WINDOWS)
-  EXPECT_NONFATAL_FAILURE(CHECK_ABI(abi_test_clobber_rdi), "");
-  EXPECT_NONFATAL_FAILURE(CHECK_ABI(abi_test_clobber_rsi), "");
+  EXPECT_NONFATAL_FAILURE(CHECK_ABI_NO_UNWIND(abi_test_clobber_rdi),
+                          "rdi was not restored after return");
+  EXPECT_NONFATAL_FAILURE(CHECK_ABI_NO_UNWIND(abi_test_clobber_rsi),
+                          "rsi was not restored after return");
 #else
-  CHECK_ABI(abi_test_clobber_rdi);
-  CHECK_ABI(abi_test_clobber_rsi);
+  CHECK_ABI_NO_UNWIND(abi_test_clobber_rdi);
+  CHECK_ABI_NO_UNWIND(abi_test_clobber_rsi);
 #endif
-  EXPECT_NONFATAL_FAILURE(CHECK_ABI(abi_test_clobber_rbp), "");
-  CHECK_ABI(abi_test_clobber_r8);
-  CHECK_ABI(abi_test_clobber_r9);
-  CHECK_ABI(abi_test_clobber_r10);
-  CHECK_ABI(abi_test_clobber_r11);
-  EXPECT_NONFATAL_FAILURE(CHECK_ABI(abi_test_clobber_r12), "");
-  EXPECT_NONFATAL_FAILURE(CHECK_ABI(abi_test_clobber_r13), "");
-  EXPECT_NONFATAL_FAILURE(CHECK_ABI(abi_test_clobber_r14), "");
-  EXPECT_NONFATAL_FAILURE(CHECK_ABI(abi_test_clobber_r15), "");
+  EXPECT_NONFATAL_FAILURE(CHECK_ABI_NO_UNWIND(abi_test_clobber_rbp),
+                          "rbp was not restored after return");
+  CHECK_ABI_NO_UNWIND(abi_test_clobber_r8);
+  CHECK_ABI_NO_UNWIND(abi_test_clobber_r9);
+  CHECK_ABI_NO_UNWIND(abi_test_clobber_r10);
+  CHECK_ABI_NO_UNWIND(abi_test_clobber_r11);
+  EXPECT_NONFATAL_FAILURE(CHECK_ABI_NO_UNWIND(abi_test_clobber_r12),
+                          "r12 was not restored after return");
+  EXPECT_NONFATAL_FAILURE(CHECK_ABI_NO_UNWIND(abi_test_clobber_r13),
+                          "r13 was not restored after return");
+  EXPECT_NONFATAL_FAILURE(CHECK_ABI_NO_UNWIND(abi_test_clobber_r14),
+                          "r14 was not restored after return");
+  EXPECT_NONFATAL_FAILURE(CHECK_ABI_NO_UNWIND(abi_test_clobber_r15),
+                          "r15 was not restored after return");
 
-  CHECK_ABI(abi_test_clobber_xmm0);
-  CHECK_ABI(abi_test_clobber_xmm1);
-  CHECK_ABI(abi_test_clobber_xmm2);
-  CHECK_ABI(abi_test_clobber_xmm3);
-  CHECK_ABI(abi_test_clobber_xmm4);
-  CHECK_ABI(abi_test_clobber_xmm5);
+  CHECK_ABI_NO_UNWIND(abi_test_clobber_xmm0);
+  CHECK_ABI_NO_UNWIND(abi_test_clobber_xmm1);
+  CHECK_ABI_NO_UNWIND(abi_test_clobber_xmm2);
+  CHECK_ABI_NO_UNWIND(abi_test_clobber_xmm3);
+  CHECK_ABI_NO_UNWIND(abi_test_clobber_xmm4);
+  CHECK_ABI_NO_UNWIND(abi_test_clobber_xmm5);
 #if defined(OPENSSL_WINDOWS)
-  EXPECT_NONFATAL_FAILURE(CHECK_ABI(abi_test_clobber_xmm6), "");
-  EXPECT_NONFATAL_FAILURE(CHECK_ABI(abi_test_clobber_xmm7), "");
-  EXPECT_NONFATAL_FAILURE(CHECK_ABI(abi_test_clobber_xmm8), "");
-  EXPECT_NONFATAL_FAILURE(CHECK_ABI(abi_test_clobber_xmm9), "");
-  EXPECT_NONFATAL_FAILURE(CHECK_ABI(abi_test_clobber_xmm10), "");
-  EXPECT_NONFATAL_FAILURE(CHECK_ABI(abi_test_clobber_xmm11), "");
-  EXPECT_NONFATAL_FAILURE(CHECK_ABI(abi_test_clobber_xmm12), "");
-  EXPECT_NONFATAL_FAILURE(CHECK_ABI(abi_test_clobber_xmm13), "");
-  EXPECT_NONFATAL_FAILURE(CHECK_ABI(abi_test_clobber_xmm14), "");
-  EXPECT_NONFATAL_FAILURE(CHECK_ABI(abi_test_clobber_xmm15), "");
+  EXPECT_NONFATAL_FAILURE(CHECK_ABI_NO_UNWIND(abi_test_clobber_xmm6),
+                          "xmm6 was not restored after return");
+  EXPECT_NONFATAL_FAILURE(CHECK_ABI_NO_UNWIND(abi_test_clobber_xmm7),
+                          "xmm7 was not restored after return");
+  EXPECT_NONFATAL_FAILURE(CHECK_ABI_NO_UNWIND(abi_test_clobber_xmm8),
+                          "xmm8 was not restored after return");
+  EXPECT_NONFATAL_FAILURE(CHECK_ABI_NO_UNWIND(abi_test_clobber_xmm9),
+                          "xmm9 was not restored after return");
+  EXPECT_NONFATAL_FAILURE(CHECK_ABI_NO_UNWIND(abi_test_clobber_xmm10),
+                          "xmm10 was not restored after return");
+  EXPECT_NONFATAL_FAILURE(CHECK_ABI_NO_UNWIND(abi_test_clobber_xmm11),
+                          "xmm11 was not restored after return");
+  EXPECT_NONFATAL_FAILURE(CHECK_ABI_NO_UNWIND(abi_test_clobber_xmm12),
+                          "xmm12 was not restored after return");
+  EXPECT_NONFATAL_FAILURE(CHECK_ABI_NO_UNWIND(abi_test_clobber_xmm13),
+                          "xmm13 was not restored after return");
+  EXPECT_NONFATAL_FAILURE(CHECK_ABI_NO_UNWIND(abi_test_clobber_xmm14),
+                          "xmm14 was not restored after return");
+  EXPECT_NONFATAL_FAILURE(CHECK_ABI_NO_UNWIND(abi_test_clobber_xmm15),
+                          "xmm15 was not restored after return");
 #else
-  CHECK_ABI(abi_test_clobber_xmm6);
-  CHECK_ABI(abi_test_clobber_xmm7);
-  CHECK_ABI(abi_test_clobber_xmm8);
-  CHECK_ABI(abi_test_clobber_xmm9);
-  CHECK_ABI(abi_test_clobber_xmm10);
-  CHECK_ABI(abi_test_clobber_xmm11);
-  CHECK_ABI(abi_test_clobber_xmm12);
-  CHECK_ABI(abi_test_clobber_xmm13);
-  CHECK_ABI(abi_test_clobber_xmm14);
-  CHECK_ABI(abi_test_clobber_xmm15);
+  CHECK_ABI_NO_UNWIND(abi_test_clobber_xmm6);
+  CHECK_ABI_NO_UNWIND(abi_test_clobber_xmm7);
+  CHECK_ABI_NO_UNWIND(abi_test_clobber_xmm8);
+  CHECK_ABI_NO_UNWIND(abi_test_clobber_xmm9);
+  CHECK_ABI_NO_UNWIND(abi_test_clobber_xmm10);
+  CHECK_ABI_NO_UNWIND(abi_test_clobber_xmm11);
+  CHECK_ABI_NO_UNWIND(abi_test_clobber_xmm12);
+  CHECK_ABI_NO_UNWIND(abi_test_clobber_xmm13);
+  CHECK_ABI_NO_UNWIND(abi_test_clobber_xmm14);
+  CHECK_ABI_NO_UNWIND(abi_test_clobber_xmm15);
 #endif
 }
 
@@ -164,7 +184,7 @@
 static void ExceptionTest() {
   bool handled = false;
   __try {
-    CHECK_ABI(ThrowWindowsException);
+    CHECK_ABI_NO_UNWIND(ThrowWindowsException);
   } __except (GetExceptionCode() == EXCEPTION_BREAKPOINT
                   ? EXCEPTION_EXECUTE_HANDLER
                   : EXCEPTION_CONTINUE_SEARCH) {
@@ -178,7 +198,7 @@
 TEST(ABITest, TrampolineSEH) {
   // Wrap the test in |CHECK_ABI|, to confirm the register-restoring annotations
   // were correct.
-  CHECK_ABI(ExceptionTest);
+  CHECK_ABI_NO_UNWIND(ExceptionTest);
 }
 #endif  // OPENSSL_WINDOWS
 
diff --git a/crypto/perlasm/x86_64-xlate.pl b/crypto/perlasm/x86_64-xlate.pl
index f7ca870..d896c53 100755
--- a/crypto/perlasm/x86_64-xlate.pl
+++ b/crypto/perlasm/x86_64-xlate.pl
@@ -124,6 +124,9 @@
 		$self->{sz} = "";
 	    } elsif ($self->{op} =~ /mov[dq]/ && $$line =~ /%xmm/) {
 		$self->{sz} = "";
+	    } elsif ($self->{op} =~ /^or([qlwb])$/) {
+		$self->{op} = "or";
+		$self->{sz} = $1;
 	    } elsif ($self->{op} =~ /([a-z]{3,})([qlwb])$/) {
 		$self->{op} = $1;
 		$self->{sz} = $2;
diff --git a/crypto/test/CMakeLists.txt b/crypto/test/CMakeLists.txt
index 0b1eab8..d2e4cdf 100644
--- a/crypto/test/CMakeLists.txt
+++ b/crypto/test/CMakeLists.txt
@@ -1,7 +1,7 @@
 add_library(
-  test_support
+  test_support_lib
 
-  OBJECT
+  STATIC
 
   abi_test.cc
   file_test.cc
@@ -10,7 +10,12 @@
   wycheproof_util.cc
 )
 
-add_dependencies(test_support global_target)
+if (LIBUNWIND_FOUND)
+  target_compile_options(test_support_lib PRIVATE ${LIBUNWIND_CFLAGS_OTHER})
+  target_include_directories(test_support_lib PRIVATE ${LIBUNWIND_INCLUDE_DIRS})
+  target_link_libraries(test_support_lib ${LIBUNWIND_LDFLAGS})
+endif()
+add_dependencies(test_support_lib global_target)
 
 add_library(
   boringssl_gtest_main
diff --git a/crypto/test/abi_test.cc b/crypto/test/abi_test.cc
index 890aa15..e86f2f4 100644
--- a/crypto/test/abi_test.cc
+++ b/crypto/test/abi_test.cc
@@ -14,12 +14,41 @@
 
 #include "abi_test.h"
 
+#include <stdarg.h>
+#include <stdio.h>
+
+#include <algorithm>
+#include <array>
+
+#include <openssl/buf.h>
+#include <openssl/mem.h>
 #include <openssl/rand.h>
+#include <openssl/span.h>
+
+#if defined(OPENSSL_LINUX) && defined(SUPPORTS_ABI_TEST) && \
+    defined(BORINGSSL_HAVE_LIBUNWIND)
+#define UNWIND_TEST_SIGTRAP
+
+#define UNW_LOCAL_ONLY
+#include <errno.h>
+#include <fcntl.h>
+#include <libunwind.h>
+#include <pthread.h>
+#include <signal.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/stat.h>
+#include <sys/types.h>
+#include <unistd.h>
+#endif  // LINUX && SUPPORTS_ABI_TEST && HAVE_LIBUNWIND
 
 
 namespace abi_test {
+
 namespace internal {
 
+static bool g_unwind_tests_enabled = false;
+
 std::string FixVAArgsString(const char *str) {
   std::string ret = str;
   size_t idx = ret.find(',');
@@ -37,26 +66,429 @@
 }
 
 #if defined(SUPPORTS_ABI_TEST)
-crypto_word_t RunTrampoline(Result *out, crypto_word_t func,
-                            const crypto_word_t *argv, size_t argc) {
-  CallerState state;
-  RAND_bytes(reinterpret_cast<uint8_t *>(&state), sizeof(state));
-
-  // TODO(davidben): Use OS debugging APIs to single-step |func| and test that
-  // CFI and SEH annotations are correct.
-  CallerState state2 = state;
-  crypto_word_t ret = abi_test_trampoline(func, &state2, argv, argc);
-
-  *out = Result();
-#define CALLER_STATE_REGISTER(type, name)                    \
-  if (state.name != state2.name) {                           \
-    out->errors.push_back(#name " was not restored"); \
+// ForEachMismatch calls |func| for each register where |a| and |b| differ.
+template <typename Func>
+static void ForEachMismatch(const CallerState &a, const CallerState &b,
+                            const Func &func) {
+#define CALLER_STATE_REGISTER(type, name) \
+  if (a.name != b.name) {                 \
+    func(#name);                          \
   }
   LOOP_CALLER_STATE_REGISTERS()
 #undef CALLER_STATE_REGISTER
+}
+
+// ReadUnwindResult adds the results of the most recent unwind test to |out|.
+static void ReadUnwindResult(Result *out);
+
+crypto_word_t RunTrampoline(Result *out, crypto_word_t func,
+                            const crypto_word_t *argv, size_t argc,
+                            bool unwind) {
+  CallerState state;
+  RAND_bytes(reinterpret_cast<uint8_t *>(&state), sizeof(state));
+
+  unwind &= g_unwind_tests_enabled;
+  CallerState state2 = state;
+  crypto_word_t ret = abi_test_trampoline(func, &state2, argv, argc, unwind);
+
+  *out = Result();
+  ForEachMismatch(state, state2, [&](const char *reg) {
+    out->errors.push_back(std::string(reg) + " was not restored after return");
+  });
+  if (unwind) {
+    ReadUnwindResult(out);
+  }
   return ret;
 }
+#endif  // SUPPORTS_ABI_TEST
+
+#if defined(UNWIND_TEST_SIGTRAP)
+// On Linux, we test unwind metadata using libunwind and |SIGTRAP|. We run the
+// function under test with the trap flag set. This results in |SIGTRAP|s on
+// every instruction. We then handle these signals and verify with libunwind.
+
+// HandleEINTR runs |func| and returns the result, retrying the operation on
+// |EINTR|.
+template <typename Func>
+static auto HandleEINTR(const Func &func) -> decltype(func()) {
+  decltype(func()) ret;
+  do {
+    ret = func();
+  } while (ret < 0 && errno == EINTR);
+  return ret;
+}
+
+static bool ReadFileToString(std::string *out, const char *path) {
+  out->clear();
+
+  int fd = HandleEINTR([&] { return open(path, O_RDONLY); });
+  if (fd < 0) {
+    return false;
+  }
+
+  for (;;) {
+    char buf[1024];
+    ssize_t ret = HandleEINTR([&] { return read(fd, buf, sizeof(buf)); });
+    if (ret < 0) {
+      close(fd);
+      return false;
+    }
+    if (ret == 0) {
+      close(fd);
+      return true;
+    }
+    out->append(buf, static_cast<size_t>(ret));
+  }
+}
+
+static bool IsBeingDebugged() {
+  std::string status;
+  if (!ReadFileToString(&status, "/proc/self/status")) {
+    perror("error reading /proc/self/status");
+    return false;
+  }
+  std::string key = "\nTracerPid:\t";
+  size_t idx = status.find(key);
+  if (idx == std::string::npos) {
+    return false;
+  }
+  idx += key.size();
+  return idx < status.size() && status[idx] != '0';
+}
+
+// IsAncestorStackFrame returns true if |a_sp| is an ancestor stack frame of
+// |b_sp|.
+static bool IsAncestorStackFrame(unw_word_t a_sp, unw_word_t b_sp) {
+#if defined(OPENSSL_X86_64)
+  // The stack grows down, so ancestor stack frames have higher addresses.
+  return a_sp > b_sp;
+#else
+#error "unknown architecture"
 #endif
+}
+
+static int CallerStateFromUNWCursor(CallerState *out, unw_cursor_t *cursor) {
+  // |CallerState| uses |crypto_word_t|, while libunwind uses |unw_word_t|, but
+  // both are defined as |uint*_t| from stdint.h, so we can assume the types
+  // match.
+#if defined(OPENSSL_X86_64)
+  int ret = 0;
+  ret = ret < 0 ? ret : unw_get_reg(cursor, UNW_X86_64_RBX, &out->rbx);
+  ret = ret < 0 ? ret : unw_get_reg(cursor, UNW_X86_64_RBP, &out->rbp);
+  ret = ret < 0 ? ret : unw_get_reg(cursor, UNW_X86_64_R12, &out->r12);
+  ret = ret < 0 ? ret : unw_get_reg(cursor, UNW_X86_64_R13, &out->r13);
+  ret = ret < 0 ? ret : unw_get_reg(cursor, UNW_X86_64_R14, &out->r14);
+  ret = ret < 0 ? ret : unw_get_reg(cursor, UNW_X86_64_R15, &out->r15);
+  return ret;
+#else
+#error "unknown architecture"
+#endif
+}
+
+// Implement some string formatting utilities. Ideally we would use |snprintf|,
+// but this is called in a signal handler and |snprintf| is not async-signal-
+// safe.
+
+static std::array<char, DECIMAL_SIZE(unw_word_t) + 1> WordToDecimal(
+    unw_word_t v) {
+  std::array<char, DECIMAL_SIZE(unw_word_t) + 1> ret;
+  size_t len = 0;
+  do {
+    ret[len++] = '0' + v % 10;
+    v /= 10;
+  } while (v != 0);
+  for (size_t i = 0; i < len / 2; i++) {
+    std::swap(ret[i], ret[len - 1 - i]);
+  }
+  ret[len] = '\0';
+  return ret;
+}
+
+static std::array<char, sizeof(unw_word_t) * 2 + 1> WordToHex(unw_word_t v) {
+  static const char kHex[] = "0123456789abcdef";
+  std::array<char, sizeof(unw_word_t) * 2 + 1> ret;
+  for (size_t i = sizeof(unw_word_t) - 1; i < sizeof(unw_word_t); i--) {
+    uint8_t b = v & 0xff;
+    v >>= 8;
+    ret[i * 2] = kHex[b >> 4];
+    ret[i * 2 + 1] = kHex[b & 0xf];
+  }
+  ret[sizeof(unw_word_t) * 2] = '\0';
+  return ret;
+}
+
+static void StrCatSignalSafeImpl(bssl::Span<char> out) {}
+
+template <typename... Args>
+static void StrCatSignalSafeImpl(bssl::Span<char> out, const char *str,
+                                 Args... args) {
+  BUF_strlcat(out.data(), str, out.size());
+  StrCatSignalSafeImpl(out, args...);
+}
+
+template <typename... Args>
+static void StrCatSignalSafe(bssl::Span<char> out, Args... args) {
+  if (out.empty()) {
+    return;
+  }
+  out[0] = '\0';
+  StrCatSignalSafeImpl(out, args...);
+}
+
+static int UnwindToSignalFrame(unw_cursor_t *cursor) {
+  for (;;) {
+    int ret = unw_is_signal_frame(cursor);
+    if (ret < 0) {
+      return ret;
+    }
+    if (ret != 0) {
+      return 0;  // Found the signal frame.
+    }
+    ret = unw_step(cursor);
+    if (ret < 0) {
+      return ret;
+    }
+  }
+}
+
+// IPToString returns a human-readable representation of |ip|, using debug
+// information from |ctx| if available. |ip| must be the address of |ctx|'s
+// signal frame. This function is async-signal-safe.
+static std::array<char, 256> IPToString(unw_word_t ip, unw_context_t *ctx) {
+  std::array<char, 256> ret;
+  // Use a new cursor. The caller's cursor has already been unwound, but
+  // |unw_get_proc_name| is slow so we do not wish to call it all the time.
+  unw_cursor_t cursor;
+  // Work around a bug in libunwind. See
+  // https://git.savannah.gnu.org/gitweb/?p=libunwind.git;a=commit;h=819bf51bbd2da462c2ec3401e8ac9153b6e725e3
+  OPENSSL_memset(&cursor, 0, sizeof(cursor));
+  unw_word_t off;
+  if (unw_init_local(&cursor, ctx) != 0 ||
+      UnwindToSignalFrame(&cursor) != 0 ||
+      unw_get_proc_name(&cursor, ret.data(), ret.size(), &off) != 0) {
+    StrCatSignalSafe(bssl::MakeSpan(ret), "0x", WordToHex(ip).data());
+    return ret;
+  }
+  size_t len = strlen(ret.data());
+  // Print the offset in decimal, to match gdb's disassembly output and ease
+  // debugging.
+  StrCatSignalSafe(bssl::MakeSpan(ret).subspan(len), "+",
+                   WordToDecimal(off).data(), " (0x", WordToHex(ip).data(),
+                   ")");
+  return ret;
+}
+
+static pthread_t g_main_thread;
+
+// g_in_trampoline is true if we are in an instrumented |abi_test_trampoline|
+// call, in the region that triggers |SIGTRAP|.
+static bool g_in_trampoline = false;
+// g_unwind_function_done, if |g_in_trampoline| is true, is whether the function
+// under test has returned. It is undefined otherwise.
+static bool g_unwind_function_done;
+// g_trampoline_state, if |g_in_trampoline| is true, is the state the function
+// under test must preserve. It is undefined otherwise.
+static CallerState g_trampoline_state;
+// g_trampoline_sp, if |g_in_trampoline| is true, is the stack pointer of the
+// trampoline frame. It is undefined otherwise.
+static unw_word_t g_trampoline_sp;
+
+// kMaxUnwindErrors is the maximum number of unwind errors reported per
+// function. If a function's unwind tables are wrong, we are otherwise likely to
+// repeat the same error at multiple addresses.
+static constexpr size_t kMaxUnwindErrors = 10;
+
+// Errors are saved in a signal handler. We use a static buffer to avoid
+// allocation.
+static size_t num_unwind_errors = 0;
+static char unwind_errors[kMaxUnwindErrors][512];
+
+template <typename... Args>
+static void AddUnwindError(Args... args) {
+  if (num_unwind_errors >= kMaxUnwindErrors) {
+    return;
+  }
+  StrCatSignalSafe(unwind_errors[num_unwind_errors], args...);
+  num_unwind_errors++;
+}
+
+template <typename... Args>
+[[noreturn]] static void FatalError(Args... args) {
+  // We cannot use |snprintf| here because it is not async-signal-safe.
+  char buf[512];
+  StrCatSignalSafe(buf, args..., "\n");
+  write(STDERR_FILENO, buf, strlen(buf));
+  abort();
+}
+
+static void TrapHandler(int sig) {
+  // Note this is a signal handler, so only async-signal-safe functions may be
+  // used here. See signal-safety(7). libunwind promises local unwind is
+  // async-signal-safe.
+
+  // |pthread_equal| is not listed as async-signal-safe, but this is clearly an
+  // oversight.
+  if (!pthread_equal(g_main_thread, pthread_self())) {
+    FatalError("SIGTRAP on background thread");
+  }
+
+  unw_context_t ctx;
+  int ret = unw_getcontext(&ctx);
+  unw_cursor_t cursor;
+  // Work around a bug in libunwind which breaks rax and rdx recovery. This
+  // breaks functions which temporarily use rax as the CFA register. See
+  // https://git.savannah.gnu.org/gitweb/?p=libunwind.git;a=commit;h=819bf51bbd2da462c2ec3401e8ac9153b6e725e3
+  OPENSSL_memset(&cursor, 0, sizeof(cursor));
+  ret = ret < 0 ? ret : unw_init_local(&cursor, &ctx);
+  ret = ret < 0 ? ret : UnwindToSignalFrame(&cursor);
+  unw_word_t sp, ip;
+  ret = ret < 0 ? ret : unw_get_reg(&cursor, UNW_REG_SP, &sp);
+  ret = ret < 0 ? ret : unw_get_reg(&cursor, UNW_REG_IP, &ip);
+  if (ret < 0) {
+    FatalError("Error initializing unwind cursor: ", unw_strerror(ret));
+  }
+
+  const unw_word_t kStartAddress =
+      reinterpret_cast<unw_word_t>(&abi_test_unwind_start);
+  const unw_word_t kReturnAddress =
+      reinterpret_cast<unw_word_t>(&abi_test_unwind_return);
+  const unw_word_t kStopAddress =
+      reinterpret_cast<unw_word_t>(&abi_test_unwind_stop);
+  if (!g_in_trampoline) {
+    if (ip != kStartAddress) {
+      FatalError("Unexpected SIGTRAP at ", IPToString(ip, &ctx).data());
+    }
+
+    // Save the current state and begin.
+    g_in_trampoline = true;
+    g_unwind_function_done = false;
+    g_trampoline_sp = sp;
+    ret = CallerStateFromUNWCursor(&g_trampoline_state, &cursor);
+    if (ret < 0) {
+      FatalError("Error getting initial caller state: ", unw_strerror(ret));
+    }
+  } else {
+    if (sp == g_trampoline_sp || g_unwind_function_done) {
+      // |g_unwind_function_done| should imply |sp| is |g_trampoline_sp|, but
+      // clearing the trap flag in x86 briefly displaces the stack pointer.
+      //
+      // Also note we check both |ip| and |sp| below, in case the function under
+      // test is also |abi_test_trampoline|.
+      if (ip == kReturnAddress && sp == g_trampoline_sp) {
+        g_unwind_function_done = true;
+      }
+      if (ip == kStopAddress && sp == g_trampoline_sp) {
+        // |SIGTRAP| is fatal again.
+        g_in_trampoline = false;
+      }
+    } else if (IsAncestorStackFrame(sp, g_trampoline_sp)) {
+      // This should never happen. We went past |g_trampoline_sp| without
+      // stopping at |kStopAddress|.
+      AddUnwindError("stack frame is before caller at ",
+                     IPToString(ip, &ctx).data());
+      g_in_trampoline = false;
+    } else if (num_unwind_errors < kMaxUnwindErrors) {
+      for (;;) {
+        ret = unw_step(&cursor);
+        if (ret < 0) {
+          AddUnwindError("error unwinding from ", IPToString(ip, &ctx).data(),
+                         ": ", unw_strerror(ret));
+          break;
+        }
+        if (ret == 0) {
+          AddUnwindError("could not unwind to starting frame from ",
+                         IPToString(ip, &ctx).data());
+          break;
+        }
+
+        unw_word_t cur_sp;
+        ret = unw_get_reg(&cursor, UNW_REG_SP, &cur_sp);
+        if (ret < 0) {
+          AddUnwindError("error recovering stack pointer unwinding from ",
+                         IPToString(ip, &ctx).data(), ": ", unw_strerror(ret));
+          break;
+        }
+        if (IsAncestorStackFrame(cur_sp, g_trampoline_sp)) {
+          AddUnwindError("unwound past starting frame from ",
+                         IPToString(ip, &ctx).data());
+          break;
+        }
+        if (cur_sp == g_trampoline_sp) {
+          // We found the parent frame. Check the return address.
+          unw_word_t cur_ip;
+          ret = unw_get_reg(&cursor, UNW_REG_IP, &cur_ip);
+          if (ret < 0) {
+            AddUnwindError("error recovering return address unwinding from ",
+                           IPToString(ip, &ctx).data(), ": ",
+                           unw_strerror(ret));
+          } else if (cur_ip != kReturnAddress) {
+            AddUnwindError("wrong return address unwinding from ",
+                           IPToString(ip, &ctx).data());
+          }
+
+          // Check the remaining registers.
+          CallerState state;
+          ret = CallerStateFromUNWCursor(&state, &cursor);
+          if (ret < 0) {
+            AddUnwindError("error recovering registers unwinding from ",
+                           IPToString(ip, &ctx).data(), ": ",
+                           unw_strerror(ret));
+          } else {
+            ForEachMismatch(state, g_trampoline_state, [&](const char *reg) {
+              AddUnwindError(reg, " was not recovered unwinding from ",
+                             IPToString(ip, &ctx).data());
+            });
+          }
+          break;
+        }
+      }
+    }
+  }
+}
+
+static void ReadUnwindResult(Result *out) {
+  for (size_t i = 0; i < num_unwind_errors; i++) {
+    out->errors.emplace_back(unwind_errors[i]);
+  }
+  if (num_unwind_errors == kMaxUnwindErrors) {
+    out->errors.emplace_back("(additional errors omitted)");
+  }
+  num_unwind_errors = 0;
+}
+
+static void EnableUnwindTestsImpl() {
+  if (IsBeingDebugged()) {
+    // Unwind tests drive logic via |SIGTRAP|, which conflicts with debuggers.
+    fprintf(stderr, "Debugger detected. Disabling unwind tests.\n");
+    return;
+  }
+
+  g_main_thread = pthread_self();
+
+  struct sigaction trap_action;
+  OPENSSL_memset(&trap_action, 0, sizeof(trap_action));
+  sigemptyset(&trap_action.sa_mask);
+  trap_action.sa_handler = TrapHandler;
+  if (sigaction(SIGTRAP, &trap_action, NULL) != 0) {
+    perror("sigaction");
+    abort();
+  }
+
+  g_unwind_tests_enabled = true;
+}
+
+#else
+// TODO(davidben): Implement an SEH-based unwind-tester.
+#if defined(SUPPORTS_ABI_TEST)
+static void ReadUnwindResult(Result *) {}
+#endif
+static void EnableUnwindTestsImpl() {}
+#endif  // UNWIND_TEST_SIGTRAP
 
 }  // namespace internal
+
+void EnableUnwindTests() { internal::EnableUnwindTestsImpl(); }
+
+bool UnwindTestsEnabled() { return internal::g_unwind_tests_enabled; }
+
 }  // namespace abi_test
diff --git a/crypto/test/abi_test.h b/crypto/test/abi_test.h
index c1ef8f1..23f3aa5 100644
--- a/crypto/test/abi_test.h
+++ b/crypto/test/abi_test.h
@@ -113,11 +113,15 @@
 };
 
 // RunTrampoline runs |func| on |argv|, recording ABI errors in |out|. It does
-// not perform any type-checking.
+// not perform any type-checking. If |unwind| is true and unwind tests have been
+// enabled, |func| is single-stepped under an unwind test.
 crypto_word_t RunTrampoline(Result *out, crypto_word_t func,
-                            const crypto_word_t *argv, size_t argc);
+                            const crypto_word_t *argv, size_t argc,
+                            bool unwind);
 
-// CheckImpl runs |func| on |args|, recording ABI errors in |out|.
+// CheckImpl runs |func| on |args|, recording ABI errors in |out|. If |unwind|
+// is true and unwind tests have been enabled, |func| is single-stepped under an
+// unwind test.
 //
 // It returns the value as a |crypto_word_t| to work around problems when |R| is
 // void. |args| is wrapped in a |DeductionGuard| so |func| determines the
@@ -125,7 +129,7 @@
 // instance, if |func| takes const int *, and the caller passes an int *, the
 // compiler will complain the deduced types do not match.
 template <typename R, typename... Args>
-inline crypto_word_t CheckImpl(Result *out, R (*func)(Args...),
+inline crypto_word_t CheckImpl(Result *out, bool unwind, R (*func)(Args...),
                                typename DeductionGuard<Args>::Type... args) {
   static_assert(sizeof...(args) <= 10,
                 "too many arguments for abi_test_trampoline");
@@ -135,7 +139,7 @@
       (crypto_word_t)args...,
   };
   return RunTrampoline(out, reinterpret_cast<crypto_word_t>(func), argv,
-                       sizeof...(args));
+                       sizeof...(args), unwind);
 }
 #else
 // To simplify callers when ABI testing support is unavoidable, provide a backup
@@ -143,14 +147,15 @@
 // call |func| directly.
 template <typename R, typename... Args>
 inline typename std::enable_if<!std::is_void<R>::value, crypto_word_t>::type
-CheckImpl(Result *out, R (*func)(Args...),
+CheckImpl(Result *out, bool /* unwind */, R (*func)(Args...),
           typename DeductionGuard<Args>::Type... args) {
   *out = Result();
   return func(args...);
 }
 
 template <typename... Args>
-inline crypto_word_t CheckImpl(Result *out, void (*func)(Args...),
+inline crypto_word_t CheckImpl(Result *out, bool /* unwind */,
+                               void (*func)(Args...),
                                typename DeductionGuard<Args>::Type... args) {
   *out = Result();
   func(args...);
@@ -169,13 +174,14 @@
 std::string FixVAArgsString(const char *str);
 
 // CheckGTest behaves like |CheckImpl|, but it returns the correct type and
-// raises GTest assertions on failure.
+// raises GTest assertions on failure. If |unwind| is true and unwind tests are
+// enabled, |func| is single-stepped under an unwind test.
 template <typename R, typename... Args>
 inline R CheckGTest(const char *va_args_str, const char *file, int line,
-                    R (*func)(Args...),
+                    bool unwind, R (*func)(Args...),
                     typename DeductionGuard<Args>::Type... args) {
   Result result;
-  crypto_word_t ret = CheckImpl(&result, func, args...);
+  crypto_word_t ret = CheckImpl(&result, unwind, func, args...);
   if (!result.ok()) {
     testing::Message msg;
     msg << "ABI failures in " << FixVAArgsString(va_args_str) << ":\n";
@@ -195,9 +201,17 @@
 template <typename R, typename... Args>
 inline R Check(Result *out, R (*func)(Args...),
                typename internal::DeductionGuard<Args>::Type... args) {
-  return (R)internal::CheckImpl(out, func, args...);
+  return (R)internal::CheckImpl(out, false, func, args...);
 }
 
+// EnableUnwindTests enables unwind tests, if supported. If not supported, it
+// does nothing.
+void EnableUnwindTests();
+
+// UnwindTestsEnabled returns true if unwind tests are enabled and false
+// otherwise.
+bool UnwindTestsEnabled();
+
 }  // namespace abi_test
 
 // CHECK_ABI calls the first argument on the remaining arguments and returns the
@@ -206,26 +220,73 @@
 //
 // |CHECK_ABI| does return the value and thus may replace any function call,
 // provided it takes only simple parameters. However, it is recommended to test
-// ABI separately from functional tests of assembly. A future unwind testing
-// extension will single-step the function, which is inefficient.
+// ABI separately from functional tests of assembly. Fully instrumenting a
+// function for ABI checking requires single-stepping the function, which is
+// inefficient.
 //
 // Functional testing requires coverage of input values, while ABI testing only
 // requires branch coverage. Most of our assembly is constant-time, so usually
 // only a few instrumented calls are necessray.
-#define CHECK_ABI(...) \
-  abi_test::internal::CheckGTest(#__VA_ARGS__, __FILE__, __LINE__, __VA_ARGS__)
+#define CHECK_ABI(...)                                                   \
+  abi_test::internal::CheckGTest(#__VA_ARGS__, __FILE__, __LINE__, true, \
+                                 __VA_ARGS__)
+
+// CHECK_ABI_NO_UNWIND behaves like |CHECK_ABI| but disables unwind testing.
+#define CHECK_ABI_NO_UNWIND(...)                                          \
+  abi_test::internal::CheckGTest(#__VA_ARGS__, __FILE__, __LINE__, false, \
+                                 __VA_ARGS__)
 
 
 // Internal functions.
 
 #if defined(SUPPORTS_ABI_TEST)
+struct Uncallable {
+  Uncallable() = delete;
+};
+
+extern "C" {
+
 // abi_test_trampoline loads callee-saved registers from |state|, calls |func|
 // with |argv|, then saves the callee-saved registers into |state|. It returns
-// the result of |func|. We give |func| type |crypto_word_t| to avoid tripping
-// MSVC's warning 4191.
-extern "C" crypto_word_t abi_test_trampoline(
-    crypto_word_t func, abi_test::internal::CallerState *state,
-    const crypto_word_t *argv, size_t argc);
+// the result of |func|. If |unwind| is non-zero, this function triggers unwind
+// instrumentation.
+//
+// We give |func| type |crypto_word_t| to avoid tripping MSVC's warning 4191.
+crypto_word_t abi_test_trampoline(crypto_word_t func,
+                                  abi_test::internal::CallerState *state,
+                                  const crypto_word_t *argv, size_t argc,
+                                  crypto_word_t unwind);
+
+// abi_test_unwind_start points at the instruction that starts unwind testing in
+// |abi_test_trampoline|. This is the value of the instruction pointer at the
+// first |SIGTRAP| during unwind testing.
+//
+// This symbol is not a function and should not be called.
+void abi_test_unwind_start(Uncallable);
+
+// abi_test_unwind_return points at the instruction immediately after the call in
+// |abi_test_trampoline|. When unwinding the function under test, this is the
+// expected address in the |abi_test_trampoline| frame. After this address, the
+// unwind tester should ignore |SIGTRAP| until |abi_test_unwind_stop|.
+//
+// This symbol is not a function and should not be called.
+void abi_test_unwind_return(Uncallable);
+
+// abi_test_unwind_stop is the value of the instruction pointer at the final
+// |SIGTRAP| during unwind testing.
+//
+// This symbol is not a function and should not be called.
+void abi_test_unwind_stop(Uncallable);
+
+// abi_test_bad_unwind_wrong_register preserves the ABI, but annotates the wrong
+// register in CFI metadata.
+void abi_test_bad_unwind_wrong_register(void);
+
+// abi_test_bad_unwind_temporary preserves the ABI, but temporarily corrupts the
+// storage space for a saved register, breaking unwind.
+void abi_test_bad_unwind_temporary(void);
+
+}  // extern "C"
 #endif  // SUPPORTS_ABI_TEST
 
 
diff --git a/crypto/test/asm/trampoline-x86_64.pl b/crypto/test/asm/trampoline-x86_64.pl
index 432bcc8..d41aadf 100755
--- a/crypto/test/asm/trampoline-x86_64.pl
+++ b/crypto/test/asm/trampoline-x86_64.pl
@@ -124,15 +124,17 @@
 my $stack_params_skip = $win64 ? scalar(@inp) : 0;
 my $num_stack_params = $win64 ? $max_params : $max_params - scalar(@inp);
 
-my ($func, $state, $argv, $argc) = @inp;
+my ($func, $state, $argv, $argc, $unwind) = @inp;
 my $code = <<____;
 .text
 
 # abi_test_trampoline loads callee-saved registers from |state|, calls |func|
 # with |argv|, then saves the callee-saved registers into |state|. It returns
-# the result of |func|.
+# the result of |func|. If |unwind| is non-zero, this function triggers unwind
+# instrumentation.
 # uint64_t abi_test_trampoline(void (*func)(...), CallerState *state,
-#                              const uint64_t *argv, size_t argc);
+#                              const uint64_t *argv, size_t argc,
+#                              uint64_t unwind);
 .type	abi_test_trampoline, \@abi-omnipotent
 .globl	abi_test_trampoline
 .align	16
@@ -143,12 +145,16 @@
 	#   8 bytes - align
 	#   $caller_state_size bytes - saved caller registers
 	#   8 bytes - scratch space
+	#   8 bytes - saved copy of \$unwind (SysV-only)
 	#   8 bytes - saved copy of \$state
 	#   8 bytes - saved copy of \$func
 	#   8 bytes - if needed for stack alignment
 	#   8*$num_stack_params bytes - parameters for \$func
 ____
 my $stack_alloc_size = 8 + $caller_state_size + 8*3 + 8*$num_stack_params;
+if (!$win64) {
+  $stack_alloc_size += 8;
+}
 # SysV and Windows both require the stack to be 16-byte-aligned. The call
 # instruction offsets it by 8, so stack allocations must be 8 mod 16.
 if ($stack_alloc_size % 16 != 8) {
@@ -158,13 +164,25 @@
 my $stack_params_offset = 8 * $stack_params_skip;
 my $func_offset = 8 * $num_stack_params;
 my $state_offset = $func_offset + 8;
-my $scratch_offset = $state_offset + 8;
+# On Win64, unwind is already passed in memory. On SysV, it is passed in a
+# register and we must reserve stack space for it.
+my ($unwind_offset, $scratch_offset);
+if ($win64) {
+  $unwind_offset = $stack_alloc_size + 5*8;
+  $scratch_offset = $state_offset + 8;
+} else {
+  $unwind_offset = $state_offset + 8;
+  $scratch_offset = $unwind_offset + 8;
+}
 my $caller_state_offset = $scratch_offset + 8;
 $code .= <<____;
 	subq	\$$stack_alloc_size, %rsp
 .cfi_adjust_cfa_offset	$stack_alloc_size
 .Labi_test_trampoline_prolog_alloc:
 ____
+$code .= <<____ if (!$win64);
+	movq	$unwind, $unwind_offset(%rsp)
+____
 # Store our caller's state. This is needed because we modify it ourselves, and
 # also to isolate the test infrastruction from the function under test failing
 # to save some register.
@@ -198,7 +216,7 @@
 foreach (@inp) {
   $code .= <<____;
 	dec	%r11
-	js	.Lcall
+	js	.Largs_done
 	movq	(%r10), $_
 	addq	\$8, %r10
 ____
@@ -207,7 +225,7 @@
 	leaq	$stack_params_offset(%rsp), %rax
 .Largs_loop:
 	dec	%r11
-	js	.Lcall
+	js	.Largs_done
 
 	# This block should be:
 	#    movq (%r10), %rtmp
@@ -223,10 +241,42 @@
 	addq	\$8, %rax
 	jmp	.Largs_loop
 
-.Lcall:
+.Largs_done:
 	movq	$func_offset(%rsp), %rax
+	movq	$unwind_offset(%rsp), %r10
+	testq	%r10, %r10
+	jz	.Lno_unwind
+
+	# Set the trap flag.
+	pushfq
+	orq	\$0x100, 0(%rsp)
+	popfq
+
+	# Run an instruction to trigger a breakpoint immediately before the
+	# call.
+	nop
+.globl	abi_test_unwind_start
+abi_test_unwind_start:
+
+	call	*%rax
+.globl	abi_test_unwind_return
+abi_test_unwind_return:
+
+	# Clear the trap flag. Note this assumes the trap flag was clear on
+	# entry. We do not support instrumenting an unwind-instrumented
+	# |abi_test_trampoline|.
+	pushfq
+	andq	\$-0x101, 0(%rsp)	# -0x101 is ~0x100
+	popfq
+.globl	abi_test_unwind_stop
+abi_test_unwind_stop:
+
+	jmp	.Lcall_done
+
+.Lno_unwind:
 	call	*%rax
 
+.Lcall_done:
 	# Store what \$func did our state, so our caller can check.
 	movq  $state_offset(%rsp), $state
 ____
@@ -275,6 +325,49 @@
 ____
 }
 
+$code .= <<____;
+# abi_test_bad_unwind_wrong_register preserves the ABI, but annotates the wrong
+# register in CFI metadata.
+# void abi_test_bad_unwind_wrong_register(void);
+.type	abi_test_bad_unwind_wrong_register, \@abi-omnipotent
+.globl	abi_test_bad_unwind_wrong_register
+.align	16
+abi_test_bad_unwind_wrong_register:
+.cfi_startproc
+	pushq	%r12
+.cfi_push	%r13	# This should be %r12
+	popq	%r12
+.cfi_pop	%r12
+	ret
+.cfi_endproc
+.size	abi_test_bad_unwind_wrong_register,.-abi_test_bad_unwind_wrong_register
+
+# abi_test_bad_unwind_temporary preserves the ABI, but temporarily corrupts the
+# storage space for a saved register, breaking unwind.
+# void abi_test_bad_unwind_temporary(void);
+.type	abi_test_bad_unwind_temporary, \@abi-omnipotent
+.globl	abi_test_bad_unwind_temporary
+.align	16
+abi_test_bad_unwind_temporary:
+.cfi_startproc
+	pushq	%r12
+.cfi_push	%r12
+
+	inc	%r12
+	movq	%r12, (%rsp)
+	# Unwinding from here is incorrect.
+
+	dec	%r12
+	movq	%r12, (%rsp)
+	# Unwinding is now fixed.
+
+	popq	%r12
+.cfi_pop	%r12
+	ret
+.cfi_endproc
+.size	abi_test_bad_unwind_temporary,.-abi_test_bad_unwind_temporary
+____
+
 if ($win64) {
   # Add unwind metadata for SEH.
   #
diff --git a/crypto/test/gtest_main.cc b/crypto/test/gtest_main.cc
index f19b830..aeec0f5 100644
--- a/crypto/test/gtest_main.cc
+++ b/crypto/test/gtest_main.cc
@@ -35,16 +35,15 @@
   testing::InitGoogleTest(&argc, argv);
   bssl::SetupGoogleTest();
 
-#if !defined(OPENSSL_WINDOWS)
+  bool unwind_tests = true;
   for (int i = 1; i < argc; i++) {
+#if !defined(OPENSSL_WINDOWS)
     if (strcmp(argv[i], "--fork_unsafe_buffering") == 0) {
       RAND_enable_fork_unsafe_buffering(-1);
     }
-  }
 #endif
 
 #if defined(TEST_ARM_CPUS)
-  for (int i = 1; i < argc; i++) {
     if (strncmp(argv[i], "--cpu=", 6) == 0) {
       const char *cpu = argv[i] + 6;
       uint32_t armcap;
@@ -69,9 +68,17 @@
       printf("Simulating CPU '%s'\n", cpu);
       *armcap_ptr = armcap;
     }
-  }
 #endif  // TEST_ARM_CPUS
 
+    if (strcmp(argv[i], "--no_unwind_tests") == 0) {
+      unwind_tests = false;
+    }
+  }
+
+  if (unwind_tests) {
+    abi_test::EnableUnwindTests();
+  }
+
   // Run the entire test suite under an ABI check. This is less effective than
   // testing the individual assembly functions, but will catch issues with
   // rarely-used registers.
diff --git a/decrepit/CMakeLists.txt b/decrepit/CMakeLists.txt
index 501252c..0829926 100644
--- a/decrepit/CMakeLists.txt
+++ b/decrepit/CMakeLists.txt
@@ -35,12 +35,12 @@
   ripemd/ripemd_test.cc
 
   $<TARGET_OBJECTS:boringssl_gtest_main>
-  $<TARGET_OBJECTS:test_support>
 )
 
 add_dependencies(decrepit_test global_target)
 
-target_link_libraries(decrepit_test crypto decrepit boringssl_gtest)
+target_link_libraries(decrepit_test test_support_lib boringssl_gtest decrepit
+                      crypto)
 if(WIN32)
   target_link_libraries(decrepit_test ws2_32)
 endif()
diff --git a/fipstools/CMakeLists.txt b/fipstools/CMakeLists.txt
index 779fcd1..58c9a60 100644
--- a/fipstools/CMakeLists.txt
+++ b/fipstools/CMakeLists.txt
@@ -25,8 +25,6 @@
     cavp_tlskdf_test.cc
 
     cavp_test_util.cc
-
-    $<TARGET_OBJECTS:test_support>
   )
 
   add_dependencies(cavp global_target)
@@ -35,11 +33,10 @@
     test_fips
 
     test_fips.c
-    $<TARGET_OBJECTS:test_support>
   )
 
   add_dependencies(test_fips global_target)
 
-  target_link_libraries(cavp crypto)
-  target_link_libraries(test_fips crypto)
+  target_link_libraries(cavp test_support_lib crypto)
+  target_link_libraries(test_fips test_support_lib crypto)
 endif()
diff --git a/ssl/CMakeLists.txt b/ssl/CMakeLists.txt
index d6c1294..dc89dca 100644
--- a/ssl/CMakeLists.txt
+++ b/ssl/CMakeLists.txt
@@ -52,12 +52,11 @@
   ssl_test.cc
 
   $<TARGET_OBJECTS:boringssl_gtest_main>
-  $<TARGET_OBJECTS:test_support>
 )
 
 add_dependencies(ssl_test global_target)
 
-target_link_libraries(ssl_test ssl crypto boringssl_gtest)
+target_link_libraries(ssl_test test_support_lib boringssl_gtest ssl crypto)
 if(WIN32)
   target_link_libraries(ssl_test ws2_32)
 endif()
diff --git a/ssl/test/CMakeLists.txt b/ssl/test/CMakeLists.txt
index d86464c..ebc16f1 100644
--- a/ssl/test/CMakeLists.txt
+++ b/ssl/test/CMakeLists.txt
@@ -10,13 +10,11 @@
   settings_writer.cc
   test_config.cc
   test_state.cc
-
-  $<TARGET_OBJECTS:test_support>
 )
 
 add_dependencies(bssl_shim global_target)
 
-target_link_libraries(bssl_shim ssl crypto)
+target_link_libraries(bssl_shim test_support_lib ssl crypto)
 
 if(UNIX AND NOT APPLE AND NOT ANDROID)
   add_executable(
@@ -29,13 +27,11 @@
     settings_writer.cc
     test_config.cc
     test_state.cc
-
-    $<TARGET_OBJECTS:test_support>
   )
 
   add_dependencies(handshaker global_target)
 
-  target_link_libraries(handshaker ssl crypto)
+  target_link_libraries(handshaker test_support_lib ssl crypto)
 else()
   # Declare a dummy target for run_tests to depend on.
   add_custom_target(handshaker)
diff --git a/util/all_tests.go b/util/all_tests.go
index 55e1921..fbff48c 100644
--- a/util/all_tests.go
+++ b/util/all_tests.go
@@ -146,9 +146,14 @@
 
 func runTestOnce(test test, mallocNumToFail int64) (passed bool, err error) {
 	prog := path.Join(*buildDir, test.args[0])
-	args := test.args[1:]
+	args := append([]string{}, test.args[1:]...)
 	if *simulateARMCPUs && test.cpu != "" {
-		args = append([]string{"--cpu=" + test.cpu}, args...)
+		args = append(args, "--cpu="+test.cpu)
+	}
+	if *useSDE {
+		// SDE is neither compatible with the unwind tester nor automatically
+		// detected.
+		args = append(args, "--no_unwind_tests")
 	}
 	var cmd *exec.Cmd
 	if *useValgrind {