)]}' { "commit": "5fcd47d137f9b556edc7a392035dc2d2f43282ca", "tree": "eb5a036c2e46952c56b3c4383d880b60e20b9000", "parents": [ "825bec8c8865e314bfc918c8ad352f154fdc4ba8" ], "author": { "name": "Ilya Tokar", "email": "tokarip@google.com", "time": "Mon May 22 16:06:49 2023 -0400" }, "committer": { "name": "Boringssl LUCI CQ", "email": "boringssl-scoped@luci-project-accounts.iam.gserviceaccount.com", "time": "Wed May 24 19:12:59 2023 +0000" }, "message": "Add prefetch to aes_hw_ctr32_encrypt_blocks\n\nSimilar idea to https://boringssl-review.googlesource.com/c/boringssl/+/55466\n\nResults are pretty close to the current state, AMD (rome):\nBM_Encrypt/64/0 344ns ± 3% 343ns ± 1% ~ (p\u003d0.728 n\u003d20+19)\nBM_Encrypt/64/1 394ns ± 2% 394ns ± 3% ~ (p\u003d0.919 n\u003d18+20)\nBM_Encrypt/64/8 391ns ± 1% 390ns ± 2% ~ (p\u003d0.165 n\u003d17+19)\nBM_Encrypt/64/64 342ns ± 3% 341ns ± 2% ~ (p\u003d0.686 n\u003d19+19)\nBM_Encrypt/64/97 393ns ± 1% 394ns ± 3% ~ (p\u003d0.639 n\u003d17+19)\nBM_Encrypt/512/0 437ns ± 2% 437ns ± 1% ~ (p\u003d0.819 n\u003d20+19)\nBM_Encrypt/512/1 566ns ± 1% 551ns ± 3% -2.65% (p\u003d0.000 n\u003d18+18)\nBM_Encrypt/512/8 563ns ± 2% 555ns ± 4% -1.48% (p\u003d0.003 n\u003d18+20)\nBM_Encrypt/512/64 434ns ± 3% 439ns ± 3% +1.03% (p\u003d0.008 n\u003d19+20)\nBM_Encrypt/512/97 565ns ± 2% 555ns ± 4% -1.88% (p\u003d0.001 n\u003d18+20)\nBM_Encrypt/4k/0 1.03µs ± 2% 0.99µs ± 2% -4.29% (p\u003d0.000 n\u003d20+20)\nBM_Encrypt/4k/1 1.18µs ± 3% 1.11µs ± 3% -5.66% (p\u003d0.000 n\u003d20+20)\nBM_Encrypt/4k/8 1.17µs ± 3% 1.11µs ± 2% -5.51% (p\u003d0.000 n\u003d20+20)\nBM_Encrypt/4k/64 1.03µs ± 1% 0.99µs ± 1% -4.08% (p\u003d0.000 n\u003d19+19)\nBM_Encrypt/4k/97 1.17µs ± 3% 1.11µs ± 2% -5.65% (p\u003d0.000 n\u003d20+19)\nBM_Encrypt/32k/0 5.26µs ± 1% 5.19µs ± 2% -1.29% (p\u003d0.000 n\u003d19+20)\nBM_Encrypt/32k/1 5.49µs ± 2% 5.38µs ± 1% -2.01% (p\u003d0.000 n\u003d20+20)\nBM_Encrypt/32k/8 5.45µs ± 2% 5.34µs ± 1% -2.12% (p\u003d0.000 n\u003d20+19)\nBM_Encrypt/32k/64 5.28µs ± 1% 5.19µs ± 1% -1.66% (p\u003d0.000 n\u003d19+20)\nBM_Encrypt/32k/97 5.49µs ± 1% 5.38µs ± 1% -2.02% (p\u003d0.000 n\u003d20+17)\nBM_Encrypt/256k/0 38.9µs ± 1% 38.5µs ± 2% -1.09% (p\u003d0.000 n\u003d20+20)\nBM_Encrypt/256k/1 40.3µs ± 2% 39.6µs ± 1% -1.74% (p\u003d0.000 n\u003d20+20)\nBM_Encrypt/256k/8 39.7µs ± 2% 39.0µs ± 1% -1.82% (p\u003d0.000 n\u003d19+18)\nBM_Encrypt/256k/64 38.9µs ± 1% 38.4µs ± 1% -1.35% (p\u003d0.000 n\u003d20+18)\nBM_Encrypt/256k/97 40.1µs ± 1% 39.6µs ± 1% -1.32% (p\u003d0.000 n\u003d20+20)\nBM_Encrypt/1M/0 154µs ± 1% 153µs ± 1% -0.62% (p\u003d0.001 n\u003d17+18)\nBM_Encrypt/1M/1 160µs ± 2% 158µs ± 1% -1.44% (p\u003d0.000 n\u003d19+20)\nBM_Encrypt/1M/8 158µs ± 1% 155µs ± 1% -1.62% (p\u003d0.000 n\u003d20+19)\nBM_Encrypt/1M/64 155µs ± 2% 153µs ± 1% -1.48% (p\u003d0.000 n\u003d20+20)\nBM_Encrypt/1M/97 160µs ± 1% 158µs ± 2% -1.46% (p\u003d0.000 n\u003d20+20)\nBM_EncryptCord/1/0 310ns ± 3% 307ns ± 4% ~ (p\u003d0.101 n\u003d19+20)\n\nIntel (skylake):\n\nBM_Encrypt/64/0 326ns ± 5% 325ns ± 4% ~ (p\u003d0.817 n\u003d16+17)\nBM_Encrypt/64/1 368ns ± 2% 387ns ±13% ~ (p\u003d0.845 n\u003d17+20)\nBM_Encrypt/64/8 385ns ±14% 365ns ± 3% -5.12% (p\u003d0.013 n\u003d20+18)\nBM_Encrypt/64/64 325ns ± 4% 325ns ± 6% ~ (p\u003d0.621 n\u003d18+16)\nBM_Encrypt/64/97 367ns ± 3% 366ns ± 3% ~ (p\u003d0.963 n\u003d18+18)\nBM_Encrypt/512/0 504ns ± 4% 456ns ± 3% -9.52% (p\u003d0.000 n\u003d17+20)\nBM_Encrypt/512/1 568ns ± 2% 528ns ± 4% -7.09% (p\u003d0.000 n\u003d15+17)\nBM_Encrypt/512/8 580ns ± 3% 541ns ± 4% -6.66% (p\u003d0.000 n\u003d20+17)\nBM_Encrypt/512/64 500ns ± 3% 454ns ± 4% -9.26% (p\u003d0.000 n\u003d17+17)\nBM_Encrypt/512/97 564ns ± 2% 526ns ± 4% -6.82% (p\u003d0.000 n\u003d18+17)\nBM_Encrypt/4k/0 1.26µs ± 2% 1.23µs ± 5% -2.77% (p\u003d0.000 n\u003d19+18)\nBM_Encrypt/4k/1 1.33µs ± 2% 1.28µs ± 3% -4.34% (p\u003d0.000 n\u003d18+18)\nBM_Encrypt/4k/8 1.35µs ± 3% 1.29µs ± 3% -4.31% (p\u003d0.000 n\u003d19+17)\nBM_Encrypt/4k/64 1.27µs ± 3% 1.23µs ± 4% -3.32% (p\u003d0.000 n\u003d18+18)\nBM_Encrypt/4k/97 1.34µs ± 3% 1.29µs ± 3% -3.98% (p\u003d0.000 n\u003d18+16)\nBM_Encrypt/32k/0 8.24µs ± 4% 7.99µs ± 5% -3.00% (p\u003d0.001 n\u003d17+16)\nBM_Encrypt/32k/1 8.23µs ± 2% 7.99µs ± 5% -2.95% (p\u003d0.000 n\u003d17+16)\nBM_Encrypt/32k/8 8.64µs ±15% 8.05µs ± 5% -6.92% (p\u003d0.000 n\u003d20+18)\nBM_Encrypt/32k/64 8.14µs ± 3% 7.96µs ± 3% -2.23% (p\u003d0.000 n\u003d18+17)\nBM_Encrypt/32k/97 8.72µs ±14% 8.01µs ± 4% -8.20% (p\u003d0.000 n\u003d20+17)\nBM_Encrypt/256k/0 63.2µs ± 4% 61.7µs ± 3% -2.35% (p\u003d0.003 n\u003d19+18)\nBM_Encrypt/256k/1 63.5µs ± 4% 61.8µs ± 3% -2.75% (p\u003d0.000 n\u003d17+19)\nBM_Encrypt/256k/8 63.6µs ± 9% 61.0µs ± 1% -4.08% (p\u003d0.000 n\u003d18+16)\nBM_Encrypt/256k/64 63.1µs ± 3% 61.5µs ± 5% -2.60% (p\u003d0.001 n\u003d18+16)\nBM_Encrypt/256k/97 65.6µs ±16% 61.6µs ± 4% -6.09% (p\u003d0.000 n\u003d19+17)\nBM_Encrypt/1M/0 253µs ± 5% 246µs ± 5% -2.88% (p\u003d0.001 n\u003d19+19)\nBM_Encrypt/1M/1 253µs ± 6% 244µs ± 1% -3.71% (p\u003d0.000 n\u003d16+17)\nBM_Encrypt/1M/8 254µs ± 5% 244µs ± 3% -4.15% (p\u003d0.000 n\u003d18+18)\nBM_Encrypt/1M/64 253µs ± 4% 245µs ± 4% -3.10% (p\u003d0.000 n\u003d19+19)\nBM_Encrypt/1M/97 267µs ±14% 246µs ± 4% -8.13% (p\u003d0.000 n\u003d20+18)\n\nBut on AMD with prefetchers disabled and large enough data size,\nto force cache misses this gives \u003e2x improvement:\nBM_Encrypt/64/0 342ns ± 1% 336ns ± 1% -1.63% (p\u003d0.000 n\u003d19+19)\nBM_Encrypt/64/1 485ns ± 2% 484ns ± 2% ~ (p\u003d0.396 n\u003d19+20)\nBM_Encrypt/64/8 490ns ± 1% 488ns ± 2% ~ (p\u003d0.098 n\u003d18+19)\nBM_Encrypt/64/64 340ns ± 2% 335ns ± 1% -1.50% (p\u003d0.000 n\u003d19+19)\nBM_Encrypt/64/97 483ns ± 1% 483ns ± 1% ~ (p\u003d0.912 n\u003d16+20)\nBM_Encrypt/512/0 566ns ± 3% 521ns ± 2% -7.99% (p\u003d0.000 n\u003d18+20)\nBM_Encrypt/512/1 744ns ± 2% 667ns ± 1% -10.31% (p\u003d0.000 n\u003d20+20)\nBM_Encrypt/512/8 745ns ± 1% 666ns ± 1% -10.53% (p\u003d0.000 n\u003d18+20)\nBM_Encrypt/512/64 566ns ± 3% 520ns ± 2% -8.05% (p\u003d0.000 n\u003d17+19)\nBM_Encrypt/512/97 740ns ± 1% 666ns ± 1% -9.92% (p\u003d0.000 n\u003d18+19)\nBM_Encrypt/4k/0 2.50µs ± 1% 1.35µs ± 1% -45.82% (p\u003d0.000 n\u003d19+19)\nBM_Encrypt/4k/1 2.65µs ± 3% 1.50µs ± 1% -43.50% (p\u003d0.000 n\u003d19+19)\nBM_Encrypt/4k/8 2.66µs ± 1% 1.49µs ± 1% -43.71% (p\u003d0.000 n\u003d19+19)\nBM_Encrypt/4k/64 2.47µs ± 4% 1.36µs ± 1% -45.05% (p\u003d0.000 n\u003d20+20)\nBM_Encrypt/4k/97 2.66µs ± 1% 1.50µs ± 2% -43.54% (p\u003d0.000 n\u003d18+19)\nBM_Encrypt/32k/0 18.0µs ± 1% 8.0µs ± 1% -55.38% (p\u003d0.000 n\u003d18+19)\nBM_Encrypt/32k/1 18.2µs ± 1% 8.2µs ± 1% -54.91% (p\u003d0.000 n\u003d14+20)\nBM_Encrypt/32k/8 18.2µs ± 1% 8.2µs ± 1% -54.93% (p\u003d0.000 n\u003d19+18)\nBM_Encrypt/32k/64 18.0µs ± 1% 8.0µs ± 1% -55.35% (p\u003d0.000 n\u003d16+20)\nBM_Encrypt/32k/97 18.1µs ± 3% 8.2µs ± 1% -54.84% (p\u003d0.000 n\u003d20+19)\nBM_Encrypt/256k/0 148µs ± 1% 63µs ± 1% -57.59% (p\u003d0.000 n\u003d18+19)\nBM_Encrypt/256k/1 150µs ± 1% 63µs ± 1% -57.78% (p\u003d0.000 n\u003d16+20)\nBM_Encrypt/256k/8 147µs ± 5% 63µs ± 1% -56.95% (p\u003d0.000 n\u003d20+20)\nBM_Encrypt/256k/64 148µs ± 2% 63µs ± 1% -57.40% (p\u003d0.000 n\u003d18+20)\nBM_Encrypt/256k/97 146µs ± 4% 63µs ± 1% -56.82% (p\u003d0.000 n\u003d20+19)\nBM_Encrypt/1M/0 595µs ± 1% 254µs ± 1% -57.33% (p\u003d0.000 n\u003d19+20)\nBM_Encrypt/1M/1 590µs ± 4% 255µs ± 1% -56.78% (p\u003d0.000 n\u003d20+20)\nBM_Encrypt/1M/8 593µs ± 2% 254µs ± 1% -57.10% (p\u003d0.000 n\u003d18+19)\nBM_Encrypt/1M/64 595µs ± 1% 254µs ± 1% -57.34% (p\u003d0.000 n\u003d16+19)\nBM_Encrypt/1M/97 589µs ± 4% 255µs ± 1% -56.74% (p\u003d0.000 n\u003d20+20)\n\nChange-Id: I13c783ad261093009b2aa5ff56ce569f45ed3300\nReviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/60027\nCommit-Queue: David Benjamin \u003cdavidben@google.com\u003e\nReviewed-by: David Benjamin \u003cdavidben@google.com\u003e\n", "tree_diff": [ { "type": "modify", "old_id": "9a90946b84d212dd8aaaba623f88182cf2fb03b4", "old_mode": 33188, "old_path": "crypto/fipsmodule/aes/asm/aesni-x86_64.pl", "new_id": "215611fd70830fe3b6a5ae5a01b2399c26f654c8", "new_mode": 33188, "new_path": "crypto/fipsmodule/aes/asm/aesni-x86_64.pl" } ] }