rand: new-style locking and support rdrand.

Pure /dev/urandom, no buffering (previous behaviour):
Did 2320000 RNG (16 bytes) operations in 3000082us (773312.2 ops/sec): 12.4 MB/s
Did 209000 RNG (256 bytes) operations in 3011984us (69389.5 ops/sec): 17.8 MB/s
Did 6851 RNG (8192 bytes) operations in 3052027us (2244.7 ops/sec): 18.4 MB/s

Pure rdrand speed:
Did 34930500 RNG (16 bytes) operations in 3000021us (11643418.5 ops/sec): 186.3 MB/s
Did 2444000 RNG (256 bytes) operations in 3000164us (814622.1 ops/sec): 208.5 MB/s
Did 80000 RNG (8192 bytes) operations in 3020968us (26481.6 ops/sec): 216.9 MB/s

rdrand + ChaCha (as in this change):
Did 19498000 RNG (16 bytes) operations in 3000086us (6499147.0 ops/sec): 104.0 MB/s
Did 1964000 RNG (256 bytes) operations in 3000566us (654543.2 ops/sec): 167.6 MB/s
Did 62000 RNG (8192 bytes) operations in 3034090us (20434.5 ops/sec): 167.4 MB/s

Change-Id: Ie17045650cfe75858e4498ac28dbc4dcf8338376
Reviewed-on: https://boringssl-review.googlesource.com/4328
Reviewed-by: Adam Langley <agl@google.com>
diff --git a/crypto/internal.h b/crypto/internal.h
index 6a8d5b2..9c5d487 100644
--- a/crypto/internal.h
+++ b/crypto/internal.h
@@ -434,6 +434,7 @@
  * stored. */
 typedef enum {
   OPENSSL_THREAD_LOCAL_ERR = 0,
+  OPENSSL_THREAD_LOCAL_RAND,
   OPENSSL_THREAD_LOCAL_TEST,
   NUM_OPENSSL_THREAD_LOCALS,
 } thread_local_data_t;