Assume hyper-threading-like vulnerabilities are always present.

It's not clear that CPUID will always report the correct value for the
HTT bit (leaf 1, EDX bit 28), especially in hyper-threading
environments. It also isn't clear that
the assumptions made about AMD processors are correct and will remain
correct. It also seems likely that a code path that is
security-sensitive w.r.t. SMT is also security-sensitive w.r.t. other
processor (mis)features. Finally, it isn't clear that all
dynamic analysis (fuzzing, SDE, etc.) is done separately for the cross
product of all CPU feature combinations and the value of this bit.

With all that in mind, force the bit on so that code sensitive to it
always chooses the more conservative path.

I only found one place that's sensitive to this bit, though I didn't
look too hard:

```
aes_nohw_cbc_encrypt:
    [...]
    leaq	OPENSSL_ia32cap_P(%rip),%r10
    mov	(%r10), %r10d
    [...]
    bt	\$28,%r10d
    jc	.Lcbc_slow_prologue
```
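
For readers less familiar with the assembly, the check above can be
sketched roughly in C. This is only an illustration: `HTT_BIT`,
`force_htt_bit` and `takes_slow_path` are hypothetical names that do not
exist in the tree, and the sketch assumes the first 32-bit word of
OPENSSL_ia32cap_P holds the (adjusted) EDX value from CPUID leaf 1.

```c
#include <stdint.h>

// Hypothetical name for bit 28 of EDX from CPUID leaf 1 (the HTT bit).
#define HTT_BIT (1u << 28)

// Hypothetical helper: sets the HTT bit unconditionally, whatever CPUID
// reported. All other bits are left untouched.
static uint32_t force_htt_bit(uint32_t edx) {
  return edx | HTT_BIT;
}

// Hypothetical C rendering of the test done by
// `bt $28,%r10d; jc .Lcbc_slow_prologue`: the slow prologue is taken
// whenever bit 28 of the first capability word is set.
static int takes_slow_path(uint32_t cap_word0) {
  return (cap_word0 & HTT_BIT) != 0;
}
```

Under those assumptions, `takes_slow_path(force_htt_bit(x))` is nonzero
for every `x`, so the branch to `.Lcbc_slow_prologue` is always taken
once the bit is forced.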

I didn't verify that the code in the HTT-enabled paths is any better
than the code in the HTT-disabled paths.

Change-Id: Ifd643e6a1301e5ca2174b84c344eb933d49e0067
Reviewed-on: https://boringssl-review.googlesource.com/c/33404
Reviewed-by: David Benjamin <davidben@google.com>
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: Adam Langley <agl@google.com>
diff --git a/crypto/cpu-intel.c b/crypto/cpu-intel.c
index 20cfbe8..5c21f4a 100644
--- a/crypto/cpu-intel.c
+++ b/crypto/cpu-intel.c
@@ -173,29 +173,11 @@
     extended_features[1] = ecx;
   }
 
-  // Determine the number of cores sharing an L1 data cache to adjust the
-  // hyper-threading bit.
-  uint32_t cores_per_cache = 0;
-  if (is_amd) {
-    // AMD CPUs never share an L1 data cache between threads but do set the HTT
-    // bit on multi-core CPUs.
-    cores_per_cache = 1;
-  } else if (num_ids >= 4) {
-    // TODO(davidben): The Intel manual says this CPUID leaf enumerates all
-    // caches using ECX and doesn't say which is first. Does this matter?
-    OPENSSL_cpuid(&eax, &ebx, &ecx, &edx, 4);
-    cores_per_cache = 1 + ((eax >> 14) & 0xfff);
-  }
-
   OPENSSL_cpuid(&eax, &ebx, &ecx, &edx, 1);
 
-  // Adjust the hyper-threading bit.
-  if (edx & (1u << 28)) {
-    uint32_t num_logical_cores = (ebx >> 16) & 0xff;
-    if (cores_per_cache == 1 || num_logical_cores <= 1) {
-      edx &= ~(1u << 28);
-    }
-  }
+  // Force the hyper-threading bit so that the more conservative path is always
+  // chosen.
+  edx |= 1u << 28;
 
   // Reserved bit #20 was historically repurposed to control the in-memory
   // representation of RC4 state. Always set it to zero.