Add 64-bit, P-256 implementation.

This is taken from upstream, although it originally came from us. This
will only take effect on 64-bit systems (x86-64 and aarch64).

Before:

Did 1496 ECDH P-256 operations in 1038743us (1440.2 ops/sec)
Did 2783 ECDSA P-256 signing operations in 1081006us (2574.5 ops/sec)
Did 2400 ECDSA P-256 verify operations in 1059508us (2265.2 ops/sec)

After:

Did 4147 ECDH P-256 operations in 1061723us (3905.9 ops/sec)
Did 9372 ECDSA P-256 signing operations in 1040589us (9006.4 ops/sec)
Did 4114 ECDSA P-256 verify operations in 1063478us (3868.4 ops/sec)

Change-Id: I11fabb03239cc3a7c4a97325ed4e4c97421f91a9
9 files changed