Don't require alignment in ChaCha20 on ARM.
By copying the input and output data via an aligned buffer, the
alignment requirements for the NEON ChaCha implementation on ARM can be
eliminated. This reduces the speed when aligned buffers are used, but
updating the GCC version used to generate the ASM more than makes up
for that.
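The copy-through-an-aligned-buffer approach described above can be
sketched roughly as follows. This is an illustration only, not the code
in this change: aligned_kernel is a hypothetical stand-in (a byte-wise
XOR) for the alignment-sensitive NEON routine, and the function name,
the 256-byte buffer and the 16-byte alignment are all assumptions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an alignment-sensitive kernel. It XORs every
 * byte with a fixed value; the real NEON ChaCha code is not reproduced. */
static void aligned_kernel(uint8_t *out, const uint8_t *in, size_t len) {
  for (size_t i = 0; i < len; i++) {
    out[i] = in[i] ^ 0x5a;
  }
}

/* Processes |in_len| bytes from |in| into |out|. Neither pointer needs any
 * particular alignment because the kernel only ever touches |buf|. */
static void process_via_aligned_buffer(uint8_t *out, const uint8_t *in,
                                       size_t in_len) {
  /* Assumed bounce-buffer size and alignment, chosen for illustration. */
  _Alignas(16) uint8_t buf[256];

  while (in_len > 0) {
    size_t todo = in_len < sizeof(buf) ? in_len : sizeof(buf);

    memcpy(buf, in, todo);          /* unaligned input -> aligned buffer */
    aligned_kernel(buf, buf, todo); /* kernel sees aligned memory only */
    memcpy(out, buf, todo);         /* aligned buffer -> unaligned output */

    in += todo;
    out += todo;
    in_len -= todo;
  }
}
```

A caller can pass pointers at any offset, for example
process_via_aligned_buffer(dst + 1, src + 1, len). The two extra memcpy
calls per chunk are the cost that aligned callers pay under this scheme.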
On a Snapdragon 801 (OnePlus One) the aligned speed was 214.6 MB/s and
the unaligned speed was 112.1 MB/s. Now both are 218.4 MB/s. A Nexus 7
also shows a slight speedup.
Change-Id: I68321ba56767fa5354b31a1491a539b299236e9a
Reviewed-on: https://boringssl-review.googlesource.com/3132
Reviewed-by: Adam Langley <agl@google.com>
3 files changed