rand: new-style locking and support rdrand.

Pure /dev/urandom, no buffering (previous behaviour):
Did 2320000 RNG (16 bytes) operations in 3000082us (773312.2 ops/sec): 12.4 MB/s
Did 209000 RNG (256 bytes) operations in 3011984us (69389.5 ops/sec): 17.8 MB/s
Did 6851 RNG (8192 bytes) operations in 3052027us (2244.7 ops/sec): 18.4 MB/s

Pure rdrand speed:
Did 34930500 RNG (16 bytes) operations in 3000021us (11643418.5 ops/sec): 186.3 MB/s
Did 2444000 RNG (256 bytes) operations in 3000164us (814622.1 ops/sec): 208.5 MB/s
Did 80000 RNG (8192 bytes) operations in 3020968us (26481.6 ops/sec): 216.9 MB/s

rdrand + ChaCha (as in this change):
Did 19498000 RNG (16 bytes) operations in 3000086us (6499147.0 ops/sec): 104.0 MB/s
Did 1964000 RNG (256 bytes) operations in 3000566us (654543.2 ops/sec): 167.6 MB/s
Did 62000 RNG (8192 bytes) operations in 3034090us (20434.5 ops/sec): 167.4 MB/s

Change-Id: Ie17045650cfe75858e4498ac28dbc4dcf8338376
Reviewed-on: https://boringssl-review.googlesource.com/4328
Reviewed-by: Adam Langley <agl@google.com>
diff --git a/include/openssl/rand.h b/include/openssl/rand.h
index 8e3bc30..0b2ead8 100644
--- a/include/openssl/rand.h
+++ b/include/openssl/rand.h
@@ -25,8 +25,7 @@
 /* Random number generation. */
 
 
-/* RAND_bytes writes |len| bytes of random data to |buf|. It returns one on
- * success and zero on otherwise. */
+/* RAND_bytes writes |len| bytes of random data to |buf| and returns one. */
 OPENSSL_EXPORT int RAND_bytes(uint8_t *buf, size_t len);
 
 /* RAND_cleanup frees any resources used by the RNG. This is not safe if other