Switch to new fiat pipeline.

This new version makes it much easier to tell which code is handwritten
and which is verified. For some reason, it also is *dramatically* faster
for 32-bit x86 GCC. Clang x86_64, however, does take a small hit.
Benchmarks below.

x86, GCC 7.3.0, OPENSSL_SMALL
(For some reason, GCC used to be really bad at compiling the 32-bit
curve25519 code. The new one fixes this. I'm not sure what changed.)
Before:
Did 17135 Ed25519 key generation operations in 10026402us (1709.0 ops/sec)
Did 17170 Ed25519 signing operations in 10074192us (1704.4 ops/sec)
Did 9180 Ed25519 verify operations in 10034025us (914.9 ops/sec)
Did 17271 Curve25519 base-point multiplication operations in 10050837us (1718.4 ops/sec)
Did 10605 Curve25519 arbitrary point multiplication operations in 10047714us (1055.5 ops/sec)
Did 7800 ECDH P-256 operations in 10018331us (778.6 ops/sec)
Did 24308 ECDSA P-256 signing operations in 10019241us (2426.1 ops/sec)
Did 9191 ECDSA P-256 verify operations in 10081639us (911.7 ops/sec)
After:
Did 99873 Ed25519 key generation operations in 10021810us (9965.6 ops/sec) [+483.1%]
Did 99960 Ed25519 signing operations in 10052236us (9944.1 ops/sec) [+483.4%]
Did 53676 Ed25519 verify operations in 10009078us (5362.7 ops/sec) [+486.2%]
Did 102000 Curve25519 base-point multiplication operations in 10039764us (10159.6 ops/sec) [+491.2%]
Did 60802 Curve25519 arbitrary point multiplication operations in 10056897us (6045.8 ops/sec) [+472.8%]
Did 7900 ECDH P-256 operations in 10054509us (785.7 ops/sec) [+0.9%]
Did 24926 ECDSA P-256 signing operations in 10050919us (2480.0 ops/sec) [+2.2%]
Did 9494 ECDSA P-256 verify operations in 10064659us (943.3 ops/sec) [+3.5%]

x86, Clang 8.0.0 trunk 349417, OPENSSL_SMALL
Before:
Did 82750 Ed25519 key generation operations in 10051177us (8232.9 ops/sec)
Did 82400 Ed25519 signing operations in 10035806us (8210.6 ops/sec)
Did 41511 Ed25519 verify operations in 10048919us (4130.9 ops/sec)
Did 83300 Curve25519 base-point multiplication operations in 10044283us (8293.3 ops/sec)
Did 49700 Curve25519 arbitrary point multiplication operations in 10007005us (4966.5 ops/sec)
Did 14039 ECDH P-256 operations in 10093929us (1390.8 ops/sec)
Did 40950 ECDSA P-256 signing operations in 10006757us (4092.2 ops/sec)
Did 16068 ECDSA P-256 verify operations in 10095996us (1591.5 ops/sec)
After:
Did 80476 Ed25519 key generation operations in 10048648us (8008.6 ops/sec) [-2.7%]
Did 79050 Ed25519 signing operations in 10049180us (7866.3 ops/sec) [-4.2%]
Did 40501 Ed25519 verify operations in 10048347us (4030.6 ops/sec) [-2.4%]
Did 81300 Curve25519 base-point multiplication operations in 10017480us (8115.8 ops/sec) [-2.1%]
Did 48278 Curve25519 arbitrary point multiplication operations in 10092500us (4783.6 ops/sec) [-3.7%]
Did 15402 ECDH P-256 operations in 10096705us (1525.4 ops/sec) [+9.7%]
Did 44200 ECDSA P-256 signing operations in 10037715us (4403.4 ops/sec) [+7.6%]
Did 17000 ECDSA P-256 verify operations in 10008813us (1698.5 ops/sec) [+6.7%]

x86_64, GCC 7.3.0
(Note these P-256 numbers are not affected by this change. Included to
get a sense of noise.)
Before:
Did 557000 Ed25519 key generation operations in 10011721us (55634.8 ops/sec)
Did 550000 Ed25519 signing operations in 10016449us (54909.7 ops/sec)
Did 190000 Ed25519 verify operations in 10014565us (18972.4 ops/sec)
Did 587000 Curve25519 base-point multiplication operations in 10015402us (58609.7 ops/sec)
Did 230000 Curve25519 arbitrary point multiplication operations in 10023827us (22945.3 ops/sec)
Did 179000 ECDH P-256 operations in 10016294us (17870.9 ops/sec)
Did 557000 ECDSA P-256 signing operations in 10014158us (55621.3 ops/sec)
Did 198000 ECDSA P-256 verify operations in 10036694us (19727.6 ops/sec)
After:
Did 569000 Ed25519 key generation operations in 10004965us (56871.8 ops/sec) [+2.2%]
Did 563000 Ed25519 signing operations in 10000064us (56299.6 ops/sec) [+2.5%]
Did 196000 Ed25519 verify operations in 10025650us (19549.9 ops/sec) [+3.0%]
Did 596000 Curve25519 base-point multiplication operations in 10008666us (59548.4 ops/sec) [+1.6%]
Did 229000 Curve25519 arbitrary point multiplication operations in 10028921us (22834.0 ops/sec) [-0.5%]
Did 182910 ECDH P-256 operations in 10014905us (18263.8 ops/sec) [+2.2%]
Did 562000 ECDSA P-256 signing operations in 10011944us (56133.0 ops/sec) [+0.9%]
Did 202000 ECDSA P-256 verify operations in 10046901us (20105.7 ops/sec) [+1.9%]

x86_64, GCC 7.3.0, OPENSSL_SMALL
Before:
Did 350000 Ed25519 key generation operations in 10002540us (34991.1 ops/sec)
Did 344000 Ed25519 signing operations in 10010420us (34364.2 ops/sec)
Did 197000 Ed25519 verify operations in 10030593us (19639.9 ops/sec)
Did 362000 Curve25519 base-point multiplication operations in 10004615us (36183.3 ops/sec)
Did 235000 Curve25519 arbitrary point multiplication operations in 10025951us (23439.2 ops/sec)
Did 32032 ECDH P-256 operations in 10056486us (3185.2 ops/sec)
Did 96354 ECDSA P-256 signing operations in 10007297us (9628.4 ops/sec)
Did 37774 ECDSA P-256 verify operations in 10044892us (3760.5 ops/sec)
After:
Did 343000 Ed25519 key generation operations in 10025108us (34214.1 ops/sec) [-2.2%]
Did 340000 Ed25519 signing operations in 10014870us (33949.5 ops/sec) [-1.2%]
Did 192000 Ed25519 verify operations in 10025082us (19152.0 ops/sec) [-2.5%]
Did 355000 Curve25519 base-point multiplication operations in 10013220us (35453.1 ops/sec) [-2.0%]
Did 231000 Curve25519 arbitrary point multiplication operations in 10010775us (23075.1 ops/sec) [-1.6%]
Did 31540 ECDH P-256 operations in 10009664us (3151.0 ops/sec) [-1.1%]
Did 99012 ECDSA P-256 signing operations in 10090296us (9812.6 ops/sec) [+1.9%]
Did 37695 ECDSA P-256 verify operations in 10092859us (3734.8 ops/sec) [-0.7%]

x86_64, Clang 8.0.0 trunk 349417
(Note these P-256 numbers are not affected by this change. Included to
get a sense of noise.)
Before:
Did 600000 Ed25519 key generation operations in 10000278us (59998.3 ops/sec)
Did 595000 Ed25519 signing operations in 10010375us (59438.3 ops/sec)
Did 184000 Ed25519 verify operations in 10013984us (18374.3 ops/sec)
Did 636000 Curve25519 base-point multiplication operations in 10005250us (63566.6 ops/sec)
Did 229000 Curve25519 arbitrary point multiplication operations in 10006059us (22886.1 ops/sec)
Did 179250 ECDH P-256 operations in 10026354us (17877.9 ops/sec)
Did 547000 ECDSA P-256 signing operations in 10017585us (54604.0 ops/sec)
Did 197000 ECDSA P-256 verify operations in 10013020us (19674.4 ops/sec)
After:
Did 560000 Ed25519 key generation operations in 10009295us (55948.0 ops/sec) [-6.8%]
Did 548000 Ed25519 signing operations in 10007912us (54756.7 ops/sec) [-7.9%]
Did 170000 Ed25519 verify operations in 10056948us (16903.7 ops/sec) [-8.0%]
Did 592000 Curve25519 base-point multiplication operations in 10016818us (59100.6 ops/sec) [-7.0%]
Did 214000 Curve25519 arbitrary point multiplication operations in 10043918us (21306.4 ops/sec) [-6.9%]
Did 180000 ECDH P-256 operations in 10026019us (17953.3 ops/sec) [+0.4%]
Did 550000 ECDSA P-256 signing operations in 10004943us (54972.8 ops/sec) [+0.7%]
Did 198000 ECDSA P-256 verify operations in 10021714us (19757.1 ops/sec) [+0.4%]

x86_64, Clang 8.0.0 trunk 349417, OPENSSL_SMALL
Before:
Did 326000 Ed25519 key generation operations in 10003266us (32589.4 ops/sec)
Did 322000 Ed25519 signing operations in 10026783us (32114.0 ops/sec)
Did 181000 Ed25519 verify operations in 10015635us (18071.7 ops/sec)
Did 335000 Curve25519 base-point multiplication operations in 10000359us (33498.8 ops/sec)
Did 224000 Curve25519 arbitrary point multiplication operations in 10027245us (22339.1 ops/sec)
Did 68552 ECDH P-256 operations in 10018900us (6842.3 ops/sec)
Did 184000 ECDSA P-256 signing operations in 10014516us (18373.3 ops/sec)
Did 76020 ECDSA P-256 verify operations in 10016891us (7589.2 ops/sec)
After:
Did 310000 Ed25519 key generation operations in 10022086us (30931.7 ops/sec) [-5.1%]
Did 308000 Ed25519 signing operations in 10007543us (30776.8 ops/sec) [-4.2%]
Did 173000 Ed25519 verify operations in 10005829us (17289.9 ops/sec) [-4.3%]
Did 321000 Curve25519 base-point multiplication operations in 10027058us (32013.4 ops/sec) [-4.4%]
Did 212000 Curve25519 arbitrary point multiplication operations in 10015203us (21167.8 ops/sec) [-5.2%]
Did 64059 ECDH P-256 operations in 10042781us (6378.6 ops/sec) [-6.8%]
Did 170000 ECDSA P-256 signing operations in 10030896us (16947.6 ops/sec) [-7.8%]
Did 72176 ECDSA P-256 verify operations in 10075369us (7163.6 ops/sec) [-5.6%]

Bug: 254
Change-Id: Ib04c773f01b542bcb8611cceb582466bfa6f6d52
Reviewed-on: https://boringssl-review.googlesource.com/c/34306
Commit-Queue: Adam Langley <agl@google.com>
Reviewed-by: Adam Langley <agl@google.com>
diff --git a/third_party/fiat/METADATA b/third_party/fiat/METADATA
index 6cd1612..0e4012f 100644
--- a/third_party/fiat/METADATA
+++ b/third_party/fiat/METADATA
@@ -6,8 +6,8 @@
     type: GIT
     value: "https://github.com/mit-plv/fiat-crypto"
   }
-  version: "6c4d4afb26de639718fcac39094353ca7feec365"
-  last_upgrade_date { year: 2017 month: 11 day: 3 }
+  version: "4441785fb44b88bb6943ddbf639d872c8c903281"
+  last_upgrade_date { year: 2019 month: 1 day: 16 }
 
   local_modifications: "Fiat-generated code has been integrated into existing BoringSSL code"
 }
diff --git a/third_party/fiat/curve25519.c b/third_party/fiat/curve25519.c
index b64956e..c5fa5da 100644
--- a/third_party/fiat/curve25519.c
+++ b/third_party/fiat/curve25519.c
@@ -45,8 +45,14 @@
 // Various pre-computed constants.
 #include "./curve25519_tables.h"
 
+#if defined(BORINGSSL_CURVE25519_64BIT)
+#include "./curve25519_64.c"
+#else
+#include "./curve25519_32.c"
+#endif  // BORINGSSL_CURVE25519_64BIT
 
-// Low-level intrinsic operations (hand-written).
+
+// Low-level intrinsic operations
 
 static uint64_t load_3(const uint8_t *in) {
   uint64_t result;
@@ -65,706 +71,111 @@
   return result;
 }
 
-#if defined(BORINGSSL_CURVE25519_64BIT)
-static uint64_t load_8(const uint8_t *in) {
-  uint64_t result;
-  result = (uint64_t)in[0];
-  result |= ((uint64_t)in[1]) << 8;
-  result |= ((uint64_t)in[2]) << 16;
-  result |= ((uint64_t)in[3]) << 24;
-  result |= ((uint64_t)in[4]) << 32;
-  result |= ((uint64_t)in[5]) << 40;
-  result |= ((uint64_t)in[6]) << 48;
-  result |= ((uint64_t)in[7]) << 56;
-  return result;
-}
-
-static uint8_t /*bool*/ addcarryx_u51(uint8_t /*bool*/ c, uint64_t a,
-                                      uint64_t b, uint64_t *low) {
-  // This function extracts 51 bits of result and 1 bit of carry (52 total), so
-  // a 64-bit intermediate is sufficient.
-  uint64_t x = a + b + c;
-  *low = x & ((UINT64_C(1) << 51) - 1);
-  return (x >> 51) & 1;
-}
-
-static uint8_t /*bool*/ subborrow_u51(uint8_t /*bool*/ c, uint64_t a,
-                                      uint64_t b, uint64_t *low) {
-  // This function extracts 51 bits of result and 1 bit of borrow (52 total), so
-  // a 64-bit intermediate is sufficient.
-  uint64_t x = a - b - c;
-  *low = x & ((UINT64_C(1) << 51) - 1);
-  return x >> 63;
-}
-
-static uint64_t cmovznz64(uint64_t t, uint64_t z, uint64_t nz) {
-  t = -!!t; // all set if nonzero, 0 if 0
-  return (t&nz) | ((~t)&z);
-}
-
-#else
-
-static uint8_t /*bool*/ addcarryx_u25(uint8_t /*bool*/ c, uint32_t a,
-                                      uint32_t b, uint32_t *low) {
-  // This function extracts 25 bits of result and 1 bit of carry (26 total), so
-  // a 32-bit intermediate is sufficient.
- uint32_t x = a + b + c; - *low = x & ((1 << 25) - 1); - return (x >> 25) & 1; -} - -static uint8_t /*bool*/ addcarryx_u26(uint8_t /*bool*/ c, uint32_t a, - uint32_t b, uint32_t *low) { - // This function extracts 26 bits of result and 1 bit of carry (27 total), so - // a 32-bit intermediate is sufficient. - uint32_t x = a + b + c; - *low = x & ((1 << 26) - 1); - return (x >> 26) & 1; -} - -static uint8_t /*bool*/ subborrow_u25(uint8_t /*bool*/ c, uint32_t a, - uint32_t b, uint32_t *low) { - // This function extracts 25 bits of result and 1 bit of borrow (26 total), so - // a 32-bit intermediate is sufficient. - uint32_t x = a - b - c; - *low = x & ((1 << 25) - 1); - return x >> 31; -} - -static uint8_t /*bool*/ subborrow_u26(uint8_t /*bool*/ c, uint32_t a, - uint32_t b, uint32_t *low) { - // This function extracts 26 bits of result and 1 bit of borrow (27 total), so - // a 32-bit intermediate is sufficient. - uint32_t x = a - b - c; - *low = x & ((1 << 26) - 1); - return x >> 31; -} - -static uint32_t cmovznz32(uint32_t t, uint32_t z, uint32_t nz) { - t = -!!t; // all set if nonzero, 0 if 0 - return (t&nz) | ((~t)&z); -} - -#endif - // Field operations. #if defined(BORINGSSL_CURVE25519_64BIT) -#define assert_fe(f) do { \ - for (unsigned _assert_fe_i = 0; _assert_fe_i< 5; _assert_fe_i++) { \ - assert(f[_assert_fe_i] < 1.125*(UINT64_C(1)<<51)); \ - } \ -} while (0) +typedef uint64_t fe_limb_t; +#define FE_NUM_LIMBS 5 -#define assert_fe_loose(f) do { \ - for (unsigned _assert_fe_i = 0; _assert_fe_i< 5; _assert_fe_i++) { \ - assert(f[_assert_fe_i] < 3.375*(UINT64_C(1)<<51)); \ - } \ -} while (0) - -#define assert_fe_frozen(f) do { \ - for (unsigned _assert_fe_i = 0; _assert_fe_i< 5; _assert_fe_i++) { \ - assert(f[_assert_fe_i] < (UINT64_C(1)<<51)); \ - } \ -} while (0) - -static void fe_frombytes_impl(uint64_t h[5], const uint8_t s[32]) { - // Ignores top bit of s. 
- uint64_t a0 = load_8(s); - uint64_t a1 = load_8(s+8); - uint64_t a2 = load_8(s+16); - uint64_t a3 = load_8(s+24); - // Use 51 bits, 64-51 = 13 left. - h[0] = a0 & ((UINT64_C(1) << 51) - 1); - // (64-51) + 38 = 13 + 38 = 51 - h[1] = (a0 >> 51) | ((a1 & ((UINT64_C(1) << 38) - 1)) << 13); - // (64-38) + 25 = 26 + 25 = 51 - h[2] = (a1 >> 38) | ((a2 & ((UINT64_C(1) << 25) - 1)) << 26); - // (64-25) + 12 = 39 + 12 = 51 - h[3] = (a2 >> 25) | ((a3 & ((UINT64_C(1) << 12) - 1)) << 39); - // (64-12) = 52, ignore top bit - h[4] = (a3 >> 12) & ((UINT64_C(1) << 51) - 1); - assert_fe(h); -} - -static void fe_frombytes(fe *h, const uint8_t s[32]) { - fe_frombytes_impl(h->v, s); -} - -static void fe_freeze(uint64_t out[5], const uint64_t in1[5]) { - { const uint64_t x7 = in1[4]; - { const uint64_t x8 = in1[3]; - { const uint64_t x6 = in1[2]; - { const uint64_t x4 = in1[1]; - { const uint64_t x2 = in1[0]; - { uint64_t x10; uint8_t/*bool*/ x11 = subborrow_u51(0x0, x2, 0x7ffffffffffed, &x10); - { uint64_t x13; uint8_t/*bool*/ x14 = subborrow_u51(x11, x4, 0x7ffffffffffff, &x13); - { uint64_t x16; uint8_t/*bool*/ x17 = subborrow_u51(x14, x6, 0x7ffffffffffff, &x16); - { uint64_t x19; uint8_t/*bool*/ x20 = subborrow_u51(x17, x8, 0x7ffffffffffff, &x19); - { uint64_t x22; uint8_t/*bool*/ x23 = subborrow_u51(x20, x7, 0x7ffffffffffff, &x22); - { uint64_t x24 = cmovznz64(x23, 0x0, 0xffffffffffffffffL); - { uint64_t x25 = (x24 & 0x7ffffffffffed); - { uint64_t x27; uint8_t/*bool*/ x28 = addcarryx_u51(0x0, x10, x25, &x27); - { uint64_t x29 = (x24 & 0x7ffffffffffff); - { uint64_t x31; uint8_t/*bool*/ x32 = addcarryx_u51(x28, x13, x29, &x31); - { uint64_t x33 = (x24 & 0x7ffffffffffff); - { uint64_t x35; uint8_t/*bool*/ x36 = addcarryx_u51(x32, x16, x33, &x35); - { uint64_t x37 = (x24 & 0x7ffffffffffff); - { uint64_t x39; uint8_t/*bool*/ x40 = addcarryx_u51(x36, x19, x37, &x39); - { uint64_t x41 = (x24 & 0x7ffffffffffff); - { uint64_t x43; addcarryx_u51(x40, x22, x41, &x43); - out[0] = x27; - 
out[1] = x31; - out[2] = x35; - out[3] = x39; - out[4] = x43; - }}}}}}}}}}}}}}}}}}}}} -} - -static void fe_tobytes(uint8_t s[32], const fe *f) { - assert_fe(f->v); - uint64_t h[5]; - fe_freeze(h, f->v); - assert_fe_frozen(h); - - s[0] = h[0] >> 0; - s[1] = h[0] >> 8; - s[2] = h[0] >> 16; - s[3] = h[0] >> 24; - s[4] = h[0] >> 32; - s[5] = h[0] >> 40; - s[6] = (h[0] >> 48) | (h[1] << 3); - s[7] = h[1] >> 5; - s[8] = h[1] >> 13; - s[9] = h[1] >> 21; - s[10] = h[1] >> 29; - s[11] = h[1] >> 37; - s[12] = (h[1] >> 45) | (h[2] << 6); - s[13] = h[2] >> 2; - s[14] = h[2] >> 10; - s[15] = h[2] >> 18; - s[16] = h[2] >> 26; - s[17] = h[2] >> 34; - s[18] = h[2] >> 42; - s[19] = (h[2] >> 50) | (h[3] << 1); - s[20] = h[3] >> 7; - s[21] = h[3] >> 15; - s[22] = h[3] >> 23; - s[23] = h[3] >> 31; - s[24] = h[3] >> 39; - s[25] = (h[3] >> 47) | (h[4] << 4); - s[26] = h[4] >> 4; - s[27] = h[4] >> 12; - s[28] = h[4] >> 20; - s[29] = h[4] >> 28; - s[30] = h[4] >> 36; - s[31] = h[4] >> 44; -} - -// h = 0 -static void fe_0(fe *h) { - OPENSSL_memset(h, 0, sizeof(fe)); -} - -static void fe_loose_0(fe_loose *h) { - OPENSSL_memset(h, 0, sizeof(fe_loose)); -} - -// h = 1 -static void fe_1(fe *h) { - OPENSSL_memset(h, 0, sizeof(fe)); - h->v[0] = 1; -} - -static void fe_loose_1(fe_loose *h) { - OPENSSL_memset(h, 0, sizeof(fe_loose)); - h->v[0] = 1; -} - -static void fe_add_impl(uint64_t out[5], const uint64_t in1[5], const uint64_t in2[5]) { - { const uint64_t x10 = in1[4]; - { const uint64_t x11 = in1[3]; - { const uint64_t x9 = in1[2]; - { const uint64_t x7 = in1[1]; - { const uint64_t x5 = in1[0]; - { const uint64_t x18 = in2[4]; - { const uint64_t x19 = in2[3]; - { const uint64_t x17 = in2[2]; - { const uint64_t x15 = in2[1]; - { const uint64_t x13 = in2[0]; - out[0] = (x5 + x13); - out[1] = (x7 + x15); - out[2] = (x9 + x17); - out[3] = (x11 + x19); - out[4] = (x10 + x18); - }}}}}}}}}} -} - -// h = f + g -// Can overlap h with f or g. 
-static void fe_add(fe_loose *h, const fe *f, const fe *g) { - assert_fe(f->v); - assert_fe(g->v); - fe_add_impl(h->v, f->v, g->v); - assert_fe_loose(h->v); -} - -static void fe_sub_impl(uint64_t out[5], const uint64_t in1[5], const uint64_t in2[5]) { - { const uint64_t x10 = in1[4]; - { const uint64_t x11 = in1[3]; - { const uint64_t x9 = in1[2]; - { const uint64_t x7 = in1[1]; - { const uint64_t x5 = in1[0]; - { const uint64_t x18 = in2[4]; - { const uint64_t x19 = in2[3]; - { const uint64_t x17 = in2[2]; - { const uint64_t x15 = in2[1]; - { const uint64_t x13 = in2[0]; - out[0] = ((0xfffffffffffda + x5) - x13); - out[1] = ((0xffffffffffffe + x7) - x15); - out[2] = ((0xffffffffffffe + x9) - x17); - out[3] = ((0xffffffffffffe + x11) - x19); - out[4] = ((0xffffffffffffe + x10) - x18); - }}}}}}}}}} -} - -// h = f - g -// Can overlap h with f or g. -static void fe_sub(fe_loose *h, const fe *f, const fe *g) { - assert_fe(f->v); - assert_fe(g->v); - fe_sub_impl(h->v, f->v, g->v); - assert_fe_loose(h->v); -} - -static void fe_carry_impl(uint64_t out[5], const uint64_t in1[5]) { - { const uint64_t x7 = in1[4]; - { const uint64_t x8 = in1[3]; - { const uint64_t x6 = in1[2]; - { const uint64_t x4 = in1[1]; - { const uint64_t x2 = in1[0]; - { uint64_t x9 = (x2 >> 0x33); - { uint64_t x10 = (x2 & 0x7ffffffffffff); - { uint64_t x11 = (x9 + x4); - { uint64_t x12 = (x11 >> 0x33); - { uint64_t x13 = (x11 & 0x7ffffffffffff); - { uint64_t x14 = (x12 + x6); - { uint64_t x15 = (x14 >> 0x33); - { uint64_t x16 = (x14 & 0x7ffffffffffff); - { uint64_t x17 = (x15 + x8); - { uint64_t x18 = (x17 >> 0x33); - { uint64_t x19 = (x17 & 0x7ffffffffffff); - { uint64_t x20 = (x18 + x7); - { uint64_t x21 = (x20 >> 0x33); - { uint64_t x22 = (x20 & 0x7ffffffffffff); - { uint64_t x23 = (x10 + (0x13 * x21)); - { uint64_t x24 = (x23 >> 0x33); - { uint64_t x25 = (x23 & 0x7ffffffffffff); - { uint64_t x26 = (x24 + x13); - { uint64_t x27 = (x26 >> 0x33); - { uint64_t x28 = (x26 & 0x7ffffffffffff); - out[0] = 
x25; - out[1] = x28; - out[2] = (x27 + x16); - out[3] = x19; - out[4] = x22; - }}}}}}}}}}}}}}}}}}}}}}}}} -} - -static void fe_carry(fe *h, const fe_loose* f) { - assert_fe_loose(f->v); - fe_carry_impl(h->v, f->v); - assert_fe(h->v); -} - -static void fe_mul_impl(uint64_t out[5], const uint64_t in1[5], const uint64_t in2[5]) { - assert_fe_loose(in1); - assert_fe_loose(in2); - { const uint64_t x10 = in1[4]; - { const uint64_t x11 = in1[3]; - { const uint64_t x9 = in1[2]; - { const uint64_t x7 = in1[1]; - { const uint64_t x5 = in1[0]; - { const uint64_t x18 = in2[4]; - { const uint64_t x19 = in2[3]; - { const uint64_t x17 = in2[2]; - { const uint64_t x15 = in2[1]; - { const uint64_t x13 = in2[0]; - { uint128_t x20 = ((uint128_t)x5 * x13); - { uint128_t x21 = (((uint128_t)x5 * x15) + ((uint128_t)x7 * x13)); - { uint128_t x22 = ((((uint128_t)x5 * x17) + ((uint128_t)x9 * x13)) + ((uint128_t)x7 * x15)); - { uint128_t x23 = (((((uint128_t)x5 * x19) + ((uint128_t)x11 * x13)) + ((uint128_t)x7 * x17)) + ((uint128_t)x9 * x15)); - { uint128_t x24 = ((((((uint128_t)x5 * x18) + ((uint128_t)x10 * x13)) + ((uint128_t)x11 * x15)) + ((uint128_t)x7 * x19)) + ((uint128_t)x9 * x17)); - { uint64_t x25 = (x10 * 0x13); - { uint64_t x26 = (x7 * 0x13); - { uint64_t x27 = (x9 * 0x13); - { uint64_t x28 = (x11 * 0x13); - { uint128_t x29 = ((((x20 + ((uint128_t)x25 * x15)) + ((uint128_t)x26 * x18)) + ((uint128_t)x27 * x19)) + ((uint128_t)x28 * x17)); - { uint128_t x30 = (((x21 + ((uint128_t)x25 * x17)) + ((uint128_t)x27 * x18)) + ((uint128_t)x28 * x19)); - { uint128_t x31 = ((x22 + ((uint128_t)x25 * x19)) + ((uint128_t)x28 * x18)); - { uint128_t x32 = (x23 + ((uint128_t)x25 * x18)); - { uint64_t x33 = (uint64_t) (x29 >> 0x33); - { uint64_t x34 = ((uint64_t)x29 & 0x7ffffffffffff); - { uint128_t x35 = (x33 + x30); - { uint64_t x36 = (uint64_t) (x35 >> 0x33); - { uint64_t x37 = ((uint64_t)x35 & 0x7ffffffffffff); - { uint128_t x38 = (x36 + x31); - { uint64_t x39 = (uint64_t) (x38 >> 0x33); - { 
uint64_t x40 = ((uint64_t)x38 & 0x7ffffffffffff); - { uint128_t x41 = (x39 + x32); - { uint64_t x42 = (uint64_t) (x41 >> 0x33); - { uint64_t x43 = ((uint64_t)x41 & 0x7ffffffffffff); - { uint128_t x44 = (x42 + x24); - { uint64_t x45 = (uint64_t) (x44 >> 0x33); - { uint64_t x46 = ((uint64_t)x44 & 0x7ffffffffffff); - { uint64_t x47 = (x34 + (0x13 * x45)); - { uint64_t x48 = (x47 >> 0x33); - { uint64_t x49 = (x47 & 0x7ffffffffffff); - { uint64_t x50 = (x48 + x37); - { uint64_t x51 = (x50 >> 0x33); - { uint64_t x52 = (x50 & 0x7ffffffffffff); - out[0] = x49; - out[1] = x52; - out[2] = (x51 + x40); - out[3] = x43; - out[4] = x46; - }}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}} - assert_fe(out); -} - -static void fe_mul_ltt(fe_loose *h, const fe *f, const fe *g) { - fe_mul_impl(h->v, f->v, g->v); -} - -static void fe_mul_llt(fe_loose *h, const fe_loose *f, const fe *g) { - fe_mul_impl(h->v, f->v, g->v); -} - -static void fe_mul_ttt(fe *h, const fe *f, const fe *g) { - fe_mul_impl(h->v, f->v, g->v); -} - -static void fe_mul_tlt(fe *h, const fe_loose *f, const fe *g) { - fe_mul_impl(h->v, f->v, g->v); -} - -static void fe_mul_ttl(fe *h, const fe *f, const fe_loose *g) { - fe_mul_impl(h->v, f->v, g->v); -} - -static void fe_mul_tll(fe *h, const fe_loose *f, const fe_loose *g) { - fe_mul_impl(h->v, f->v, g->v); -} - -static void fe_sqr_impl(uint64_t out[5], const uint64_t in1[5]) { - assert_fe_loose(in1); - { const uint64_t x7 = in1[4]; - { const uint64_t x8 = in1[3]; - { const uint64_t x6 = in1[2]; - { const uint64_t x4 = in1[1]; - { const uint64_t x2 = in1[0]; - { uint64_t x9 = (x2 * 0x2); - { uint64_t x10 = (x4 * 0x2); - { uint64_t x11 = ((x6 * 0x2) * 0x13); - { uint64_t x12 = (x7 * 0x13); - { uint64_t x13 = (x12 * 0x2); - { uint128_t x14 = ((((uint128_t)x2 * x2) + ((uint128_t)x13 * x4)) + ((uint128_t)x11 * x8)); - { uint128_t x15 = ((((uint128_t)x9 * x4) + ((uint128_t)x13 * x6)) + ((uint128_t)x8 * (x8 * 0x13))); - { uint128_t x16 = ((((uint128_t)x9 * x6) + ((uint128_t)x4 * 
x4)) + ((uint128_t)x13 * x8)); - { uint128_t x17 = ((((uint128_t)x9 * x8) + ((uint128_t)x10 * x6)) + ((uint128_t)x7 * x12)); - { uint128_t x18 = ((((uint128_t)x9 * x7) + ((uint128_t)x10 * x8)) + ((uint128_t)x6 * x6)); - { uint64_t x19 = (uint64_t) (x14 >> 0x33); - { uint64_t x20 = ((uint64_t)x14 & 0x7ffffffffffff); - { uint128_t x21 = (x19 + x15); - { uint64_t x22 = (uint64_t) (x21 >> 0x33); - { uint64_t x23 = ((uint64_t)x21 & 0x7ffffffffffff); - { uint128_t x24 = (x22 + x16); - { uint64_t x25 = (uint64_t) (x24 >> 0x33); - { uint64_t x26 = ((uint64_t)x24 & 0x7ffffffffffff); - { uint128_t x27 = (x25 + x17); - { uint64_t x28 = (uint64_t) (x27 >> 0x33); - { uint64_t x29 = ((uint64_t)x27 & 0x7ffffffffffff); - { uint128_t x30 = (x28 + x18); - { uint64_t x31 = (uint64_t) (x30 >> 0x33); - { uint64_t x32 = ((uint64_t)x30 & 0x7ffffffffffff); - { uint64_t x33 = (x20 + (0x13 * x31)); - { uint64_t x34 = (x33 >> 0x33); - { uint64_t x35 = (x33 & 0x7ffffffffffff); - { uint64_t x36 = (x34 + x23); - { uint64_t x37 = (x36 >> 0x33); - { uint64_t x38 = (x36 & 0x7ffffffffffff); - out[0] = x35; - out[1] = x38; - out[2] = (x37 + x26); - out[3] = x29; - out[4] = x32; - }}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}} - assert_fe(out); -} - -static void fe_sq_tl(fe *h, const fe_loose *f) { - fe_sqr_impl(h->v, f->v); -} - -static void fe_sq_tt(fe *h, const fe *f) { - fe_sqr_impl(h->v, f->v); -} - -// Replace (f,g) with (g,f) if b == 1; -// replace (f,g) with (f,g) if b == 0. +// assert_fe asserts that |f| satisfies bounds: // -// Preconditions: b in {0,1}. -static void fe_cswap(fe *f, fe *g, uint64_t b) { - b = 0-b; - for (unsigned i = 0; i < 5; i++) { - uint64_t x = f->v[i] ^ g->v[i]; - x &= b; - f->v[i] ^= x; - g->v[i] ^= x; - } -} - -// NOTE: based on fiat-crypto fe_mul, edited for in2=121666, 0, 0.. 
-static void fe_mul_121666_impl(uint64_t out[5], const uint64_t in1[5]) { - { const uint64_t x10 = in1[4]; - { const uint64_t x11 = in1[3]; - { const uint64_t x9 = in1[2]; - { const uint64_t x7 = in1[1]; - { const uint64_t x5 = in1[0]; - { const uint64_t x18 = 0; - { const uint64_t x19 = 0; - { const uint64_t x17 = 0; - { const uint64_t x15 = 0; - { const uint64_t x13 = 121666; - { uint128_t x20 = ((uint128_t)x5 * x13); - { uint128_t x21 = (((uint128_t)x5 * x15) + ((uint128_t)x7 * x13)); - { uint128_t x22 = ((((uint128_t)x5 * x17) + ((uint128_t)x9 * x13)) + ((uint128_t)x7 * x15)); - { uint128_t x23 = (((((uint128_t)x5 * x19) + ((uint128_t)x11 * x13)) + ((uint128_t)x7 * x17)) + ((uint128_t)x9 * x15)); - { uint128_t x24 = ((((((uint128_t)x5 * x18) + ((uint128_t)x10 * x13)) + ((uint128_t)x11 * x15)) + ((uint128_t)x7 * x19)) + ((uint128_t)x9 * x17)); - { uint64_t x25 = (x10 * 0x13); - { uint64_t x26 = (x7 * 0x13); - { uint64_t x27 = (x9 * 0x13); - { uint64_t x28 = (x11 * 0x13); - { uint128_t x29 = ((((x20 + ((uint128_t)x25 * x15)) + ((uint128_t)x26 * x18)) + ((uint128_t)x27 * x19)) + ((uint128_t)x28 * x17)); - { uint128_t x30 = (((x21 + ((uint128_t)x25 * x17)) + ((uint128_t)x27 * x18)) + ((uint128_t)x28 * x19)); - { uint128_t x31 = ((x22 + ((uint128_t)x25 * x19)) + ((uint128_t)x28 * x18)); - { uint128_t x32 = (x23 + ((uint128_t)x25 * x18)); - { uint64_t x33 = (uint64_t) (x29 >> 0x33); - { uint64_t x34 = ((uint64_t)x29 & 0x7ffffffffffff); - { uint128_t x35 = (x33 + x30); - { uint64_t x36 = (uint64_t) (x35 >> 0x33); - { uint64_t x37 = ((uint64_t)x35 & 0x7ffffffffffff); - { uint128_t x38 = (x36 + x31); - { uint64_t x39 = (uint64_t) (x38 >> 0x33); - { uint64_t x40 = ((uint64_t)x38 & 0x7ffffffffffff); - { uint128_t x41 = (x39 + x32); - { uint64_t x42 = (uint64_t) (x41 >> 0x33); - { uint64_t x43 = ((uint64_t)x41 & 0x7ffffffffffff); - { uint128_t x44 = (x42 + x24); - { uint64_t x45 = (uint64_t) (x44 >> 0x33); - { uint64_t x46 = ((uint64_t)x44 & 0x7ffffffffffff); - { uint64_t 
x47 = (x34 + (0x13 * x45)); - { uint64_t x48 = (x47 >> 0x33); - { uint64_t x49 = (x47 & 0x7ffffffffffff); - { uint64_t x50 = (x48 + x37); - { uint64_t x51 = (x50 >> 0x33); - { uint64_t x52 = (x50 & 0x7ffffffffffff); - out[0] = x49; - out[1] = x52; - out[2] = (x51 + x40); - out[3] = x43; - out[4] = x46; - }}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}} -} - -static void fe_mul121666(fe *h, const fe_loose *f) { - assert_fe_loose(f->v); - fe_mul_121666_impl(h->v, f->v); - assert_fe(h->v); -} - -// Adapted from Fiat-synthesized |fe_sub_impl| with |out| = 0. -static void fe_neg_impl(uint64_t out[5], const uint64_t in2[5]) { - { const uint64_t x10 = 0; - { const uint64_t x11 = 0; - { const uint64_t x9 = 0; - { const uint64_t x7 = 0; - { const uint64_t x5 = 0; - { const uint64_t x18 = in2[4]; - { const uint64_t x19 = in2[3]; - { const uint64_t x17 = in2[2]; - { const uint64_t x15 = in2[1]; - { const uint64_t x13 = in2[0]; - out[0] = ((0xfffffffffffda + x5) - x13); - out[1] = ((0xffffffffffffe + x7) - x15); - out[2] = ((0xffffffffffffe + x9) - x17); - out[3] = ((0xffffffffffffe + x11) - x19); - out[4] = ((0xffffffffffffe + x10) - x18); - }}}}}}}}}} -} - -// h = -f -static void fe_neg(fe_loose *h, const fe *f) { - assert_fe(f->v); - fe_neg_impl(h->v, f->v); - assert_fe_loose(h->v); -} - -// Replace (f,g) with (g,g) if b == 1; -// replace (f,g) with (f,g) if b == 0. +// [[0x0 ~> 0x8cccccccccccc], +// [0x0 ~> 0x8cccccccccccc], +// [0x0 ~> 0x8cccccccccccc], +// [0x0 ~> 0x8cccccccccccc], +// [0x0 ~> 0x8cccccccccccc]] // -// Preconditions: b in {0,1}. -static void fe_cmov(fe_loose *f, const fe_loose *g, uint64_t b) { - b = 0-b; - for (unsigned i = 0; i < 5; i++) { - uint64_t x = f->v[i] ^ g->v[i]; - x &= b; - f->v[i] ^= x; - } -} +// See comments in curve25519_64.c for which functions use these bounds for +// inputs or outputs. 
+#define assert_fe(f) \ + do { \ + for (unsigned _assert_fe_i = 0; _assert_fe_i < 5; _assert_fe_i++) { \ + assert(f[_assert_fe_i] <= UINT64_C(0x8cccccccccccc)); \ + } \ + } while (0) + +// assert_fe_loose asserts that |f| satisfies bounds: +// +// [[0x0 ~> 0x1a666666666664], +// [0x0 ~> 0x1a666666666664], +// [0x0 ~> 0x1a666666666664], +// [0x0 ~> 0x1a666666666664], +// [0x0 ~> 0x1a666666666664]] +// +// See comments in curve25519_64.c for which functions use these bounds for +// inputs or outputs. +#define assert_fe_loose(f) \ + do { \ + for (unsigned _assert_fe_i = 0; _assert_fe_i < 5; _assert_fe_i++) { \ + assert(f[_assert_fe_i] <= UINT64_C(0x1a666666666664)); \ + } \ + } while (0) #else -#define assert_fe(f) do { \ - for (unsigned _assert_fe_i = 0; _assert_fe_i< 10; _assert_fe_i++) { \ - assert(f[_assert_fe_i] < 1.125*(1<<(26-(_assert_fe_i&1)))); \ - } \ -} while (0) +typedef uint32_t fe_limb_t; +#define FE_NUM_LIMBS 10 -#define assert_fe_loose(f) do { \ - for (unsigned _assert_fe_i = 0; _assert_fe_i< 10; _assert_fe_i++) { \ - assert(f[_assert_fe_i] < 3.375*(1<<(26-(_assert_fe_i&1)))); \ - } \ -} while (0) +// assert_fe asserts that |f| satisfies bounds: +// +// [[0x0 ~> 0x4666666], [0x0 ~> 0x2333333], +// [0x0 ~> 0x4666666], [0x0 ~> 0x2333333], +// [0x0 ~> 0x4666666], [0x0 ~> 0x2333333], +// [0x0 ~> 0x4666666], [0x0 ~> 0x2333333], +// [0x0 ~> 0x4666666], [0x0 ~> 0x2333333]] +// +// See comments in curve25519_32.c for which functions use these bounds for +// inputs or outputs. +#define assert_fe(f) \ + do { \ + for (unsigned _assert_fe_i = 0; _assert_fe_i < 10; _assert_fe_i++) { \ + assert(f[_assert_fe_i] <= \ + ((_assert_fe_i & 1) ? 
0x2333333u : 0x4666666u)); \ + } \ + } while (0) -#define assert_fe_frozen(f) do { \ - for (unsigned _assert_fe_i = 0; _assert_fe_i< 10; _assert_fe_i++) { \ - assert(f[_assert_fe_i] < (1u<<(26-(_assert_fe_i&1)))); \ - } \ -} while (0) +// assert_fe_loose asserts that |f| satisfies bounds: +// +// [[0x0 ~> 0xd333332], [0x0 ~> 0x6999999], +// [0x0 ~> 0xd333332], [0x0 ~> 0x6999999], +// [0x0 ~> 0xd333332], [0x0 ~> 0x6999999], +// [0x0 ~> 0xd333332], [0x0 ~> 0x6999999], +// [0x0 ~> 0xd333332], [0x0 ~> 0x6999999]] +// +// See comments in curve25519_32.c for which functions use these bounds for +// inputs or outputs. +#define assert_fe_loose(f) \ + do { \ + for (unsigned _assert_fe_i = 0; _assert_fe_i < 10; _assert_fe_i++) { \ + assert(f[_assert_fe_i] <= \ + ((_assert_fe_i & 1) ? 0x6999999u : 0xd333332u)); \ + } \ + } while (0) -static void fe_frombytes_impl(uint32_t h[10], const uint8_t s[32]) { - // Ignores top bit of s. - uint32_t a0 = load_4(s); - uint32_t a1 = load_4(s+4); - uint32_t a2 = load_4(s+8); - uint32_t a3 = load_4(s+12); - uint32_t a4 = load_4(s+16); - uint32_t a5 = load_4(s+20); - uint32_t a6 = load_4(s+24); - uint32_t a7 = load_4(s+28); - h[0] = a0&((1<<26)-1); // 26 used, 32-26 left. 
26 - h[1] = (a0>>26) | ((a1&((1<<19)-1))<< 6); // (32-26) + 19 = 6+19 = 25 - h[2] = (a1>>19) | ((a2&((1<<13)-1))<<13); // (32-19) + 13 = 13+13 = 26 - h[3] = (a2>>13) | ((a3&((1<< 6)-1))<<19); // (32-13) + 6 = 19+ 6 = 25 - h[4] = (a3>> 6); // (32- 6) = 26 - h[5] = a4&((1<<25)-1); // 25 - h[6] = (a4>>25) | ((a5&((1<<19)-1))<< 7); // (32-25) + 19 = 7+19 = 26 - h[7] = (a5>>19) | ((a6&((1<<12)-1))<<13); // (32-19) + 12 = 13+12 = 25 - h[8] = (a6>>12) | ((a7&((1<< 6)-1))<<20); // (32-12) + 6 = 20+ 6 = 26 - h[9] = (a7>> 6)&((1<<25)-1); // 25 - assert_fe(h); +#endif // BORINGSSL_CURVE25519_64BIT + +OPENSSL_STATIC_ASSERT(sizeof(fe) == sizeof(fe_limb_t) * FE_NUM_LIMBS, + "fe_limb_t[FE_NUM_LIMBS] is inconsistent with fe"); + +static void fe_frombytes_strict(fe *h, const uint8_t s[32]) { + // |fiat_25519_from_bytes| requires the top-most bit be clear. + assert((s[31] & 0x80) == 0); + fiat_25519_from_bytes(h->v, s); + assert_fe(h->v); } static void fe_frombytes(fe *h, const uint8_t s[32]) { - fe_frombytes_impl(h->v, s); -} - -static void fe_freeze(uint32_t out[10], const uint32_t in1[10]) { - { const uint32_t x17 = in1[9]; - { const uint32_t x18 = in1[8]; - { const uint32_t x16 = in1[7]; - { const uint32_t x14 = in1[6]; - { const uint32_t x12 = in1[5]; - { const uint32_t x10 = in1[4]; - { const uint32_t x8 = in1[3]; - { const uint32_t x6 = in1[2]; - { const uint32_t x4 = in1[1]; - { const uint32_t x2 = in1[0]; - { uint32_t x20; uint8_t/*bool*/ x21 = subborrow_u26(0x0, x2, 0x3ffffed, &x20); - { uint32_t x23; uint8_t/*bool*/ x24 = subborrow_u25(x21, x4, 0x1ffffff, &x23); - { uint32_t x26; uint8_t/*bool*/ x27 = subborrow_u26(x24, x6, 0x3ffffff, &x26); - { uint32_t x29; uint8_t/*bool*/ x30 = subborrow_u25(x27, x8, 0x1ffffff, &x29); - { uint32_t x32; uint8_t/*bool*/ x33 = subborrow_u26(x30, x10, 0x3ffffff, &x32); - { uint32_t x35; uint8_t/*bool*/ x36 = subborrow_u25(x33, x12, 0x1ffffff, &x35); - { uint32_t x38; uint8_t/*bool*/ x39 = subborrow_u26(x36, x14, 0x3ffffff, &x38); - { 
uint32_t x41; uint8_t/*bool*/ x42 = subborrow_u25(x39, x16, 0x1ffffff, &x41); - { uint32_t x44; uint8_t/*bool*/ x45 = subborrow_u26(x42, x18, 0x3ffffff, &x44); - { uint32_t x47; uint8_t/*bool*/ x48 = subborrow_u25(x45, x17, 0x1ffffff, &x47); - { uint32_t x49 = cmovznz32(x48, 0x0, 0xffffffff); - { uint32_t x50 = (x49 & 0x3ffffed); - { uint32_t x52; uint8_t/*bool*/ x53 = addcarryx_u26(0x0, x20, x50, &x52); - { uint32_t x54 = (x49 & 0x1ffffff); - { uint32_t x56; uint8_t/*bool*/ x57 = addcarryx_u25(x53, x23, x54, &x56); - { uint32_t x58 = (x49 & 0x3ffffff); - { uint32_t x60; uint8_t/*bool*/ x61 = addcarryx_u26(x57, x26, x58, &x60); - { uint32_t x62 = (x49 & 0x1ffffff); - { uint32_t x64; uint8_t/*bool*/ x65 = addcarryx_u25(x61, x29, x62, &x64); - { uint32_t x66 = (x49 & 0x3ffffff); - { uint32_t x68; uint8_t/*bool*/ x69 = addcarryx_u26(x65, x32, x66, &x68); - { uint32_t x70 = (x49 & 0x1ffffff); - { uint32_t x72; uint8_t/*bool*/ x73 = addcarryx_u25(x69, x35, x70, &x72); - { uint32_t x74 = (x49 & 0x3ffffff); - { uint32_t x76; uint8_t/*bool*/ x77 = addcarryx_u26(x73, x38, x74, &x76); - { uint32_t x78 = (x49 & 0x1ffffff); - { uint32_t x80; uint8_t/*bool*/ x81 = addcarryx_u25(x77, x41, x78, &x80); - { uint32_t x82 = (x49 & 0x3ffffff); - { uint32_t x84; uint8_t/*bool*/ x85 = addcarryx_u26(x81, x44, x82, &x84); - { uint32_t x86 = (x49 & 0x1ffffff); - { uint32_t x88; addcarryx_u25(x85, x47, x86, &x88); - out[0] = x52; - out[1] = x56; - out[2] = x60; - out[3] = x64; - out[4] = x68; - out[5] = x72; - out[6] = x76; - out[7] = x80; - out[8] = x84; - out[9] = x88; - }}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}} + uint8_t s_copy[32]; + OPENSSL_memcpy(s_copy, s, 32); + s_copy[31] &= 0x7f; + fe_frombytes_strict(h, s_copy); } static void fe_tobytes(uint8_t s[32], const fe *f) { assert_fe(f->v); - uint32_t h[10]; - fe_freeze(h, f->v); - assert_fe_frozen(h); - - s[0] = h[0] >> 0; - s[1] = h[0] >> 8; - s[2] = h[0] >> 16; - s[3] = (h[0] >> 24) | (h[1] << 2); - s[4] = h[1] >> 6; - s[5] = h[1] >> 
14; - s[6] = (h[1] >> 22) | (h[2] << 3); - s[7] = h[2] >> 5; - s[8] = h[2] >> 13; - s[9] = (h[2] >> 21) | (h[3] << 5); - s[10] = h[3] >> 3; - s[11] = h[3] >> 11; - s[12] = (h[3] >> 19) | (h[4] << 6); - s[13] = h[4] >> 2; - s[14] = h[4] >> 10; - s[15] = h[4] >> 18; - s[16] = h[5] >> 0; - s[17] = h[5] >> 8; - s[18] = h[5] >> 16; - s[19] = (h[5] >> 24) | (h[6] << 1); - s[20] = h[6] >> 7; - s[21] = h[6] >> 15; - s[22] = (h[6] >> 23) | (h[7] << 3); - s[23] = h[7] >> 5; - s[24] = h[7] >> 13; - s[25] = (h[7] >> 21) | (h[8] << 4); - s[26] = h[8] >> 4; - s[27] = h[8] >> 12; - s[28] = (h[8] >> 20) | (h[9] << 6); - s[29] = h[9] >> 2; - s[30] = h[9] >> 10; - s[31] = h[9] >> 18; + fiat_25519_to_bytes(s, f->v); } // h = 0 @@ -787,272 +198,36 @@ h->v[0] = 1; } -static void fe_add_impl(uint32_t out[10], const uint32_t in1[10], const uint32_t in2[10]) { - { const uint32_t x20 = in1[9]; - { const uint32_t x21 = in1[8]; - { const uint32_t x19 = in1[7]; - { const uint32_t x17 = in1[6]; - { const uint32_t x15 = in1[5]; - { const uint32_t x13 = in1[4]; - { const uint32_t x11 = in1[3]; - { const uint32_t x9 = in1[2]; - { const uint32_t x7 = in1[1]; - { const uint32_t x5 = in1[0]; - { const uint32_t x38 = in2[9]; - { const uint32_t x39 = in2[8]; - { const uint32_t x37 = in2[7]; - { const uint32_t x35 = in2[6]; - { const uint32_t x33 = in2[5]; - { const uint32_t x31 = in2[4]; - { const uint32_t x29 = in2[3]; - { const uint32_t x27 = in2[2]; - { const uint32_t x25 = in2[1]; - { const uint32_t x23 = in2[0]; - out[0] = (x5 + x23); - out[1] = (x7 + x25); - out[2] = (x9 + x27); - out[3] = (x11 + x29); - out[4] = (x13 + x31); - out[5] = (x15 + x33); - out[6] = (x17 + x35); - out[7] = (x19 + x37); - out[8] = (x21 + x39); - out[9] = (x20 + x38); - }}}}}}}}}}}}}}}}}}}} -} - // h = f + g // Can overlap h with f or g. 
static void fe_add(fe_loose *h, const fe *f, const fe *g) { assert_fe(f->v); assert_fe(g->v); - fe_add_impl(h->v, f->v, g->v); + fiat_25519_add(h->v, f->v, g->v); assert_fe_loose(h->v); } -static void fe_sub_impl(uint32_t out[10], const uint32_t in1[10], const uint32_t in2[10]) { - { const uint32_t x20 = in1[9]; - { const uint32_t x21 = in1[8]; - { const uint32_t x19 = in1[7]; - { const uint32_t x17 = in1[6]; - { const uint32_t x15 = in1[5]; - { const uint32_t x13 = in1[4]; - { const uint32_t x11 = in1[3]; - { const uint32_t x9 = in1[2]; - { const uint32_t x7 = in1[1]; - { const uint32_t x5 = in1[0]; - { const uint32_t x38 = in2[9]; - { const uint32_t x39 = in2[8]; - { const uint32_t x37 = in2[7]; - { const uint32_t x35 = in2[6]; - { const uint32_t x33 = in2[5]; - { const uint32_t x31 = in2[4]; - { const uint32_t x29 = in2[3]; - { const uint32_t x27 = in2[2]; - { const uint32_t x25 = in2[1]; - { const uint32_t x23 = in2[0]; - out[0] = ((0x7ffffda + x5) - x23); - out[1] = ((0x3fffffe + x7) - x25); - out[2] = ((0x7fffffe + x9) - x27); - out[3] = ((0x3fffffe + x11) - x29); - out[4] = ((0x7fffffe + x13) - x31); - out[5] = ((0x3fffffe + x15) - x33); - out[6] = ((0x7fffffe + x17) - x35); - out[7] = ((0x3fffffe + x19) - x37); - out[8] = ((0x7fffffe + x21) - x39); - out[9] = ((0x3fffffe + x20) - x38); - }}}}}}}}}}}}}}}}}}}} -} - // h = f - g // Can overlap h with f or g. 
static void fe_sub(fe_loose *h, const fe *f, const fe *g) { assert_fe(f->v); assert_fe(g->v); - fe_sub_impl(h->v, f->v, g->v); + fiat_25519_sub(h->v, f->v, g->v); assert_fe_loose(h->v); } -static void fe_carry_impl(uint32_t out[10], const uint32_t in1[10]) { - { const uint32_t x17 = in1[9]; - { const uint32_t x18 = in1[8]; - { const uint32_t x16 = in1[7]; - { const uint32_t x14 = in1[6]; - { const uint32_t x12 = in1[5]; - { const uint32_t x10 = in1[4]; - { const uint32_t x8 = in1[3]; - { const uint32_t x6 = in1[2]; - { const uint32_t x4 = in1[1]; - { const uint32_t x2 = in1[0]; - { uint32_t x19 = (x2 >> 0x1a); - { uint32_t x20 = (x2 & 0x3ffffff); - { uint32_t x21 = (x19 + x4); - { uint32_t x22 = (x21 >> 0x19); - { uint32_t x23 = (x21 & 0x1ffffff); - { uint32_t x24 = (x22 + x6); - { uint32_t x25 = (x24 >> 0x1a); - { uint32_t x26 = (x24 & 0x3ffffff); - { uint32_t x27 = (x25 + x8); - { uint32_t x28 = (x27 >> 0x19); - { uint32_t x29 = (x27 & 0x1ffffff); - { uint32_t x30 = (x28 + x10); - { uint32_t x31 = (x30 >> 0x1a); - { uint32_t x32 = (x30 & 0x3ffffff); - { uint32_t x33 = (x31 + x12); - { uint32_t x34 = (x33 >> 0x19); - { uint32_t x35 = (x33 & 0x1ffffff); - { uint32_t x36 = (x34 + x14); - { uint32_t x37 = (x36 >> 0x1a); - { uint32_t x38 = (x36 & 0x3ffffff); - { uint32_t x39 = (x37 + x16); - { uint32_t x40 = (x39 >> 0x19); - { uint32_t x41 = (x39 & 0x1ffffff); - { uint32_t x42 = (x40 + x18); - { uint32_t x43 = (x42 >> 0x1a); - { uint32_t x44 = (x42 & 0x3ffffff); - { uint32_t x45 = (x43 + x17); - { uint32_t x46 = (x45 >> 0x19); - { uint32_t x47 = (x45 & 0x1ffffff); - { uint32_t x48 = (x20 + (0x13 * x46)); - { uint32_t x49 = (x48 >> 0x1a); - { uint32_t x50 = (x48 & 0x3ffffff); - { uint32_t x51 = (x49 + x23); - { uint32_t x52 = (x51 >> 0x19); - { uint32_t x53 = (x51 & 0x1ffffff); - out[0] = x50; - out[1] = x53; - out[2] = (x52 + x26); - out[3] = x29; - out[4] = x32; - out[5] = x35; - out[6] = x38; - out[7] = x41; - out[8] = x44; - out[9] = x47; - 
}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}} -} - static void fe_carry(fe *h, const fe_loose* f) { assert_fe_loose(f->v); - fe_carry_impl(h->v, f->v); + fiat_25519_carry(h->v, f->v); assert_fe(h->v); } -static void fe_mul_impl(uint32_t out[10], const uint32_t in1[10], const uint32_t in2[10]) { +static void fe_mul_impl(fe_limb_t out[FE_NUM_LIMBS], + const fe_limb_t in1[FE_NUM_LIMBS], + const fe_limb_t in2[FE_NUM_LIMBS]) { assert_fe_loose(in1); assert_fe_loose(in2); - { const uint32_t x20 = in1[9]; - { const uint32_t x21 = in1[8]; - { const uint32_t x19 = in1[7]; - { const uint32_t x17 = in1[6]; - { const uint32_t x15 = in1[5]; - { const uint32_t x13 = in1[4]; - { const uint32_t x11 = in1[3]; - { const uint32_t x9 = in1[2]; - { const uint32_t x7 = in1[1]; - { const uint32_t x5 = in1[0]; - { const uint32_t x38 = in2[9]; - { const uint32_t x39 = in2[8]; - { const uint32_t x37 = in2[7]; - { const uint32_t x35 = in2[6]; - { const uint32_t x33 = in2[5]; - { const uint32_t x31 = in2[4]; - { const uint32_t x29 = in2[3]; - { const uint32_t x27 = in2[2]; - { const uint32_t x25 = in2[1]; - { const uint32_t x23 = in2[0]; - { uint64_t x40 = ((uint64_t)x23 * x5); - { uint64_t x41 = (((uint64_t)x23 * x7) + ((uint64_t)x25 * x5)); - { uint64_t x42 = ((((uint64_t)(0x2 * x25) * x7) + ((uint64_t)x23 * x9)) + ((uint64_t)x27 * x5)); - { uint64_t x43 = (((((uint64_t)x25 * x9) + ((uint64_t)x27 * x7)) + ((uint64_t)x23 * x11)) + ((uint64_t)x29 * x5)); - { uint64_t x44 = (((((uint64_t)x27 * x9) + (0x2 * (((uint64_t)x25 * x11) + ((uint64_t)x29 * x7)))) + ((uint64_t)x23 * x13)) + ((uint64_t)x31 * x5)); - { uint64_t x45 = (((((((uint64_t)x27 * x11) + ((uint64_t)x29 * x9)) + ((uint64_t)x25 * x13)) + ((uint64_t)x31 * x7)) + ((uint64_t)x23 * x15)) + ((uint64_t)x33 * x5)); - { uint64_t x46 = (((((0x2 * ((((uint64_t)x29 * x11) + ((uint64_t)x25 * x15)) + ((uint64_t)x33 * x7))) + ((uint64_t)x27 * x13)) + ((uint64_t)x31 * x9)) + ((uint64_t)x23 * x17)) + ((uint64_t)x35 * x5)); - { uint64_t x47 = 
(((((((((uint64_t)x29 * x13) + ((uint64_t)x31 * x11)) + ((uint64_t)x27 * x15)) + ((uint64_t)x33 * x9)) + ((uint64_t)x25 * x17)) + ((uint64_t)x35 * x7)) + ((uint64_t)x23 * x19)) + ((uint64_t)x37 * x5)); - { uint64_t x48 = (((((((uint64_t)x31 * x13) + (0x2 * (((((uint64_t)x29 * x15) + ((uint64_t)x33 * x11)) + ((uint64_t)x25 * x19)) + ((uint64_t)x37 * x7)))) + ((uint64_t)x27 * x17)) + ((uint64_t)x35 * x9)) + ((uint64_t)x23 * x21)) + ((uint64_t)x39 * x5)); - { uint64_t x49 = (((((((((((uint64_t)x31 * x15) + ((uint64_t)x33 * x13)) + ((uint64_t)x29 * x17)) + ((uint64_t)x35 * x11)) + ((uint64_t)x27 * x19)) + ((uint64_t)x37 * x9)) + ((uint64_t)x25 * x21)) + ((uint64_t)x39 * x7)) + ((uint64_t)x23 * x20)) + ((uint64_t)x38 * x5)); - { uint64_t x50 = (((((0x2 * ((((((uint64_t)x33 * x15) + ((uint64_t)x29 * x19)) + ((uint64_t)x37 * x11)) + ((uint64_t)x25 * x20)) + ((uint64_t)x38 * x7))) + ((uint64_t)x31 * x17)) + ((uint64_t)x35 * x13)) + ((uint64_t)x27 * x21)) + ((uint64_t)x39 * x9)); - { uint64_t x51 = (((((((((uint64_t)x33 * x17) + ((uint64_t)x35 * x15)) + ((uint64_t)x31 * x19)) + ((uint64_t)x37 * x13)) + ((uint64_t)x29 * x21)) + ((uint64_t)x39 * x11)) + ((uint64_t)x27 * x20)) + ((uint64_t)x38 * x9)); - { uint64_t x52 = (((((uint64_t)x35 * x17) + (0x2 * (((((uint64_t)x33 * x19) + ((uint64_t)x37 * x15)) + ((uint64_t)x29 * x20)) + ((uint64_t)x38 * x11)))) + ((uint64_t)x31 * x21)) + ((uint64_t)x39 * x13)); - { uint64_t x53 = (((((((uint64_t)x35 * x19) + ((uint64_t)x37 * x17)) + ((uint64_t)x33 * x21)) + ((uint64_t)x39 * x15)) + ((uint64_t)x31 * x20)) + ((uint64_t)x38 * x13)); - { uint64_t x54 = (((0x2 * ((((uint64_t)x37 * x19) + ((uint64_t)x33 * x20)) + ((uint64_t)x38 * x15))) + ((uint64_t)x35 * x21)) + ((uint64_t)x39 * x17)); - { uint64_t x55 = (((((uint64_t)x37 * x21) + ((uint64_t)x39 * x19)) + ((uint64_t)x35 * x20)) + ((uint64_t)x38 * x17)); - { uint64_t x56 = (((uint64_t)x39 * x21) + (0x2 * (((uint64_t)x37 * x20) + ((uint64_t)x38 * x19)))); - { uint64_t x57 = (((uint64_t)x39 * 
x20) + ((uint64_t)x38 * x21)); - { uint64_t x58 = ((uint64_t)(0x2 * x38) * x20); - { uint64_t x59 = (x48 + (x58 << 0x4)); - { uint64_t x60 = (x59 + (x58 << 0x1)); - { uint64_t x61 = (x60 + x58); - { uint64_t x62 = (x47 + (x57 << 0x4)); - { uint64_t x63 = (x62 + (x57 << 0x1)); - { uint64_t x64 = (x63 + x57); - { uint64_t x65 = (x46 + (x56 << 0x4)); - { uint64_t x66 = (x65 + (x56 << 0x1)); - { uint64_t x67 = (x66 + x56); - { uint64_t x68 = (x45 + (x55 << 0x4)); - { uint64_t x69 = (x68 + (x55 << 0x1)); - { uint64_t x70 = (x69 + x55); - { uint64_t x71 = (x44 + (x54 << 0x4)); - { uint64_t x72 = (x71 + (x54 << 0x1)); - { uint64_t x73 = (x72 + x54); - { uint64_t x74 = (x43 + (x53 << 0x4)); - { uint64_t x75 = (x74 + (x53 << 0x1)); - { uint64_t x76 = (x75 + x53); - { uint64_t x77 = (x42 + (x52 << 0x4)); - { uint64_t x78 = (x77 + (x52 << 0x1)); - { uint64_t x79 = (x78 + x52); - { uint64_t x80 = (x41 + (x51 << 0x4)); - { uint64_t x81 = (x80 + (x51 << 0x1)); - { uint64_t x82 = (x81 + x51); - { uint64_t x83 = (x40 + (x50 << 0x4)); - { uint64_t x84 = (x83 + (x50 << 0x1)); - { uint64_t x85 = (x84 + x50); - { uint64_t x86 = (x85 >> 0x1a); - { uint32_t x87 = ((uint32_t)x85 & 0x3ffffff); - { uint64_t x88 = (x86 + x82); - { uint64_t x89 = (x88 >> 0x19); - { uint32_t x90 = ((uint32_t)x88 & 0x1ffffff); - { uint64_t x91 = (x89 + x79); - { uint64_t x92 = (x91 >> 0x1a); - { uint32_t x93 = ((uint32_t)x91 & 0x3ffffff); - { uint64_t x94 = (x92 + x76); - { uint64_t x95 = (x94 >> 0x19); - { uint32_t x96 = ((uint32_t)x94 & 0x1ffffff); - { uint64_t x97 = (x95 + x73); - { uint64_t x98 = (x97 >> 0x1a); - { uint32_t x99 = ((uint32_t)x97 & 0x3ffffff); - { uint64_t x100 = (x98 + x70); - { uint64_t x101 = (x100 >> 0x19); - { uint32_t x102 = ((uint32_t)x100 & 0x1ffffff); - { uint64_t x103 = (x101 + x67); - { uint64_t x104 = (x103 >> 0x1a); - { uint32_t x105 = ((uint32_t)x103 & 0x3ffffff); - { uint64_t x106 = (x104 + x64); - { uint64_t x107 = (x106 >> 0x19); - { uint32_t x108 = ((uint32_t)x106 & 
0x1ffffff); - { uint64_t x109 = (x107 + x61); - { uint64_t x110 = (x109 >> 0x1a); - { uint32_t x111 = ((uint32_t)x109 & 0x3ffffff); - { uint64_t x112 = (x110 + x49); - { uint64_t x113 = (x112 >> 0x19); - { uint32_t x114 = ((uint32_t)x112 & 0x1ffffff); - { uint64_t x115 = (x87 + (0x13 * x113)); - { uint32_t x116 = (uint32_t) (x115 >> 0x1a); - { uint32_t x117 = ((uint32_t)x115 & 0x3ffffff); - { uint32_t x118 = (x116 + x90); - { uint32_t x119 = (x118 >> 0x19); - { uint32_t x120 = (x118 & 0x1ffffff); - out[0] = x117; - out[1] = x120; - out[2] = (x119 + x93); - out[3] = x96; - out[4] = x99; - out[5] = x102; - out[6] = x105; - out[7] = x108; - out[8] = x111; - out[9] = x114; - }}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}} + fiat_25519_carry_mul(out, in1, in2); assert_fe(out); } @@ -1080,297 +255,42 @@ fe_mul_impl(h->v, f->v, g->v); } -static void fe_sqr_impl(uint32_t out[10], const uint32_t in1[10]) { - assert_fe_loose(in1); - { const uint32_t x17 = in1[9]; - { const uint32_t x18 = in1[8]; - { const uint32_t x16 = in1[7]; - { const uint32_t x14 = in1[6]; - { const uint32_t x12 = in1[5]; - { const uint32_t x10 = in1[4]; - { const uint32_t x8 = in1[3]; - { const uint32_t x6 = in1[2]; - { const uint32_t x4 = in1[1]; - { const uint32_t x2 = in1[0]; - { uint64_t x19 = ((uint64_t)x2 * x2); - { uint64_t x20 = ((uint64_t)(0x2 * x2) * x4); - { uint64_t x21 = (0x2 * (((uint64_t)x4 * x4) + ((uint64_t)x2 * x6))); - { uint64_t x22 = (0x2 * (((uint64_t)x4 * x6) + ((uint64_t)x2 * x8))); - { uint64_t x23 = ((((uint64_t)x6 * x6) + ((uint64_t)(0x4 * x4) * x8)) + ((uint64_t)(0x2 * x2) * x10)); - { uint64_t x24 = (0x2 * ((((uint64_t)x6 * x8) + ((uint64_t)x4 * x10)) + ((uint64_t)x2 * x12))); - { uint64_t x25 = (0x2 * (((((uint64_t)x8 * x8) + ((uint64_t)x6 * x10)) + ((uint64_t)x2 * x14)) + ((uint64_t)(0x2 * x4) * x12))); - { uint64_t x26 = (0x2 * (((((uint64_t)x8 * x10) + ((uint64_t)x6 * x12)) + ((uint64_t)x4 * x14)) + ((uint64_t)x2 * 
x16))); - { uint64_t x27 = (((uint64_t)x10 * x10) + (0x2 * ((((uint64_t)x6 * x14) + ((uint64_t)x2 * x18)) + (0x2 * (((uint64_t)x4 * x16) + ((uint64_t)x8 * x12)))))); - { uint64_t x28 = (0x2 * ((((((uint64_t)x10 * x12) + ((uint64_t)x8 * x14)) + ((uint64_t)x6 * x16)) + ((uint64_t)x4 * x18)) + ((uint64_t)x2 * x17))); - { uint64_t x29 = (0x2 * (((((uint64_t)x12 * x12) + ((uint64_t)x10 * x14)) + ((uint64_t)x6 * x18)) + (0x2 * (((uint64_t)x8 * x16) + ((uint64_t)x4 * x17))))); - { uint64_t x30 = (0x2 * (((((uint64_t)x12 * x14) + ((uint64_t)x10 * x16)) + ((uint64_t)x8 * x18)) + ((uint64_t)x6 * x17))); - { uint64_t x31 = (((uint64_t)x14 * x14) + (0x2 * (((uint64_t)x10 * x18) + (0x2 * (((uint64_t)x12 * x16) + ((uint64_t)x8 * x17)))))); - { uint64_t x32 = (0x2 * ((((uint64_t)x14 * x16) + ((uint64_t)x12 * x18)) + ((uint64_t)x10 * x17))); - { uint64_t x33 = (0x2 * ((((uint64_t)x16 * x16) + ((uint64_t)x14 * x18)) + ((uint64_t)(0x2 * x12) * x17))); - { uint64_t x34 = (0x2 * (((uint64_t)x16 * x18) + ((uint64_t)x14 * x17))); - { uint64_t x35 = (((uint64_t)x18 * x18) + ((uint64_t)(0x4 * x16) * x17)); - { uint64_t x36 = ((uint64_t)(0x2 * x18) * x17); - { uint64_t x37 = ((uint64_t)(0x2 * x17) * x17); - { uint64_t x38 = (x27 + (x37 << 0x4)); - { uint64_t x39 = (x38 + (x37 << 0x1)); - { uint64_t x40 = (x39 + x37); - { uint64_t x41 = (x26 + (x36 << 0x4)); - { uint64_t x42 = (x41 + (x36 << 0x1)); - { uint64_t x43 = (x42 + x36); - { uint64_t x44 = (x25 + (x35 << 0x4)); - { uint64_t x45 = (x44 + (x35 << 0x1)); - { uint64_t x46 = (x45 + x35); - { uint64_t x47 = (x24 + (x34 << 0x4)); - { uint64_t x48 = (x47 + (x34 << 0x1)); - { uint64_t x49 = (x48 + x34); - { uint64_t x50 = (x23 + (x33 << 0x4)); - { uint64_t x51 = (x50 + (x33 << 0x1)); - { uint64_t x52 = (x51 + x33); - { uint64_t x53 = (x22 + (x32 << 0x4)); - { uint64_t x54 = (x53 + (x32 << 0x1)); - { uint64_t x55 = (x54 + x32); - { uint64_t x56 = (x21 + (x31 << 0x4)); - { uint64_t x57 = (x56 + (x31 << 0x1)); - { uint64_t x58 = (x57 + x31); - 
{ uint64_t x59 = (x20 + (x30 << 0x4)); - { uint64_t x60 = (x59 + (x30 << 0x1)); - { uint64_t x61 = (x60 + x30); - { uint64_t x62 = (x19 + (x29 << 0x4)); - { uint64_t x63 = (x62 + (x29 << 0x1)); - { uint64_t x64 = (x63 + x29); - { uint64_t x65 = (x64 >> 0x1a); - { uint32_t x66 = ((uint32_t)x64 & 0x3ffffff); - { uint64_t x67 = (x65 + x61); - { uint64_t x68 = (x67 >> 0x19); - { uint32_t x69 = ((uint32_t)x67 & 0x1ffffff); - { uint64_t x70 = (x68 + x58); - { uint64_t x71 = (x70 >> 0x1a); - { uint32_t x72 = ((uint32_t)x70 & 0x3ffffff); - { uint64_t x73 = (x71 + x55); - { uint64_t x74 = (x73 >> 0x19); - { uint32_t x75 = ((uint32_t)x73 & 0x1ffffff); - { uint64_t x76 = (x74 + x52); - { uint64_t x77 = (x76 >> 0x1a); - { uint32_t x78 = ((uint32_t)x76 & 0x3ffffff); - { uint64_t x79 = (x77 + x49); - { uint64_t x80 = (x79 >> 0x19); - { uint32_t x81 = ((uint32_t)x79 & 0x1ffffff); - { uint64_t x82 = (x80 + x46); - { uint64_t x83 = (x82 >> 0x1a); - { uint32_t x84 = ((uint32_t)x82 & 0x3ffffff); - { uint64_t x85 = (x83 + x43); - { uint64_t x86 = (x85 >> 0x19); - { uint32_t x87 = ((uint32_t)x85 & 0x1ffffff); - { uint64_t x88 = (x86 + x40); - { uint64_t x89 = (x88 >> 0x1a); - { uint32_t x90 = ((uint32_t)x88 & 0x3ffffff); - { uint64_t x91 = (x89 + x28); - { uint64_t x92 = (x91 >> 0x19); - { uint32_t x93 = ((uint32_t)x91 & 0x1ffffff); - { uint64_t x94 = (x66 + (0x13 * x92)); - { uint32_t x95 = (uint32_t) (x94 >> 0x1a); - { uint32_t x96 = ((uint32_t)x94 & 0x3ffffff); - { uint32_t x97 = (x95 + x69); - { uint32_t x98 = (x97 >> 0x19); - { uint32_t x99 = (x97 & 0x1ffffff); - out[0] = x96; - out[1] = x99; - out[2] = (x98 + x72); - out[3] = x75; - out[4] = x78; - out[5] = x81; - out[6] = x84; - out[7] = x87; - out[8] = x90; - out[9] = x93; - }}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}} - assert_fe(out); -} - static void fe_sq_tl(fe *h, const fe_loose *f) { - fe_sqr_impl(h->v, f->v); + assert_fe_loose(f->v); + fiat_25519_carry_square(h->v, f->v); + 
assert_fe(h->v); } static void fe_sq_tt(fe *h, const fe *f) { - fe_sqr_impl(h->v, f->v); + assert_fe_loose(f->v); + fiat_25519_carry_square(h->v, f->v); + assert_fe(h->v); } // Replace (f,g) with (g,f) if b == 1; // replace (f,g) with (f,g) if b == 0. // // Preconditions: b in {0,1}. -static void fe_cswap(fe *f, fe *g, unsigned int b) { +static void fe_cswap(fe *f, fe *g, fe_limb_t b) { b = 0-b; - unsigned i; - for (i = 0; i < 10; i++) { - uint32_t x = f->v[i] ^ g->v[i]; + for (unsigned i = 0; i < FE_NUM_LIMBS; i++) { + fe_limb_t x = f->v[i] ^ g->v[i]; x &= b; f->v[i] ^= x; g->v[i] ^= x; } } -// NOTE: based on fiat-crypto fe_mul, edited for in2=121666, 0, 0.. -static void fe_mul_121666_impl(uint32_t out[10], const uint32_t in1[10]) { - { const uint32_t x20 = in1[9]; - { const uint32_t x21 = in1[8]; - { const uint32_t x19 = in1[7]; - { const uint32_t x17 = in1[6]; - { const uint32_t x15 = in1[5]; - { const uint32_t x13 = in1[4]; - { const uint32_t x11 = in1[3]; - { const uint32_t x9 = in1[2]; - { const uint32_t x7 = in1[1]; - { const uint32_t x5 = in1[0]; - { const uint32_t x38 = 0; - { const uint32_t x39 = 0; - { const uint32_t x37 = 0; - { const uint32_t x35 = 0; - { const uint32_t x33 = 0; - { const uint32_t x31 = 0; - { const uint32_t x29 = 0; - { const uint32_t x27 = 0; - { const uint32_t x25 = 0; - { const uint32_t x23 = 121666; - { uint64_t x40 = ((uint64_t)x23 * x5); - { uint64_t x41 = (((uint64_t)x23 * x7) + ((uint64_t)x25 * x5)); - { uint64_t x42 = ((((uint64_t)(0x2 * x25) * x7) + ((uint64_t)x23 * x9)) + ((uint64_t)x27 * x5)); - { uint64_t x43 = (((((uint64_t)x25 * x9) + ((uint64_t)x27 * x7)) + ((uint64_t)x23 * x11)) + ((uint64_t)x29 * x5)); - { uint64_t x44 = (((((uint64_t)x27 * x9) + (0x2 * (((uint64_t)x25 * x11) + ((uint64_t)x29 * x7)))) + ((uint64_t)x23 * x13)) + ((uint64_t)x31 * x5)); - { uint64_t x45 = (((((((uint64_t)x27 * x11) + ((uint64_t)x29 * x9)) + ((uint64_t)x25 * x13)) + ((uint64_t)x31 * x7)) + ((uint64_t)x23 * x15)) + ((uint64_t)x33 * x5)); 
- { uint64_t x46 = (((((0x2 * ((((uint64_t)x29 * x11) + ((uint64_t)x25 * x15)) + ((uint64_t)x33 * x7))) + ((uint64_t)x27 * x13)) + ((uint64_t)x31 * x9)) + ((uint64_t)x23 * x17)) + ((uint64_t)x35 * x5)); - { uint64_t x47 = (((((((((uint64_t)x29 * x13) + ((uint64_t)x31 * x11)) + ((uint64_t)x27 * x15)) + ((uint64_t)x33 * x9)) + ((uint64_t)x25 * x17)) + ((uint64_t)x35 * x7)) + ((uint64_t)x23 * x19)) + ((uint64_t)x37 * x5)); - { uint64_t x48 = (((((((uint64_t)x31 * x13) + (0x2 * (((((uint64_t)x29 * x15) + ((uint64_t)x33 * x11)) + ((uint64_t)x25 * x19)) + ((uint64_t)x37 * x7)))) + ((uint64_t)x27 * x17)) + ((uint64_t)x35 * x9)) + ((uint64_t)x23 * x21)) + ((uint64_t)x39 * x5)); - { uint64_t x49 = (((((((((((uint64_t)x31 * x15) + ((uint64_t)x33 * x13)) + ((uint64_t)x29 * x17)) + ((uint64_t)x35 * x11)) + ((uint64_t)x27 * x19)) + ((uint64_t)x37 * x9)) + ((uint64_t)x25 * x21)) + ((uint64_t)x39 * x7)) + ((uint64_t)x23 * x20)) + ((uint64_t)x38 * x5)); - { uint64_t x50 = (((((0x2 * ((((((uint64_t)x33 * x15) + ((uint64_t)x29 * x19)) + ((uint64_t)x37 * x11)) + ((uint64_t)x25 * x20)) + ((uint64_t)x38 * x7))) + ((uint64_t)x31 * x17)) + ((uint64_t)x35 * x13)) + ((uint64_t)x27 * x21)) + ((uint64_t)x39 * x9)); - { uint64_t x51 = (((((((((uint64_t)x33 * x17) + ((uint64_t)x35 * x15)) + ((uint64_t)x31 * x19)) + ((uint64_t)x37 * x13)) + ((uint64_t)x29 * x21)) + ((uint64_t)x39 * x11)) + ((uint64_t)x27 * x20)) + ((uint64_t)x38 * x9)); - { uint64_t x52 = (((((uint64_t)x35 * x17) + (0x2 * (((((uint64_t)x33 * x19) + ((uint64_t)x37 * x15)) + ((uint64_t)x29 * x20)) + ((uint64_t)x38 * x11)))) + ((uint64_t)x31 * x21)) + ((uint64_t)x39 * x13)); - { uint64_t x53 = (((((((uint64_t)x35 * x19) + ((uint64_t)x37 * x17)) + ((uint64_t)x33 * x21)) + ((uint64_t)x39 * x15)) + ((uint64_t)x31 * x20)) + ((uint64_t)x38 * x13)); - { uint64_t x54 = (((0x2 * ((((uint64_t)x37 * x19) + ((uint64_t)x33 * x20)) + ((uint64_t)x38 * x15))) + ((uint64_t)x35 * x21)) + ((uint64_t)x39 * x17)); - { uint64_t x55 = (((((uint64_t)x37 
* x21) + ((uint64_t)x39 * x19)) + ((uint64_t)x35 * x20)) + ((uint64_t)x38 * x17)); - { uint64_t x56 = (((uint64_t)x39 * x21) + (0x2 * (((uint64_t)x37 * x20) + ((uint64_t)x38 * x19)))); - { uint64_t x57 = (((uint64_t)x39 * x20) + ((uint64_t)x38 * x21)); - { uint64_t x58 = ((uint64_t)(0x2 * x38) * x20); - { uint64_t x59 = (x48 + (x58 << 0x4)); - { uint64_t x60 = (x59 + (x58 << 0x1)); - { uint64_t x61 = (x60 + x58); - { uint64_t x62 = (x47 + (x57 << 0x4)); - { uint64_t x63 = (x62 + (x57 << 0x1)); - { uint64_t x64 = (x63 + x57); - { uint64_t x65 = (x46 + (x56 << 0x4)); - { uint64_t x66 = (x65 + (x56 << 0x1)); - { uint64_t x67 = (x66 + x56); - { uint64_t x68 = (x45 + (x55 << 0x4)); - { uint64_t x69 = (x68 + (x55 << 0x1)); - { uint64_t x70 = (x69 + x55); - { uint64_t x71 = (x44 + (x54 << 0x4)); - { uint64_t x72 = (x71 + (x54 << 0x1)); - { uint64_t x73 = (x72 + x54); - { uint64_t x74 = (x43 + (x53 << 0x4)); - { uint64_t x75 = (x74 + (x53 << 0x1)); - { uint64_t x76 = (x75 + x53); - { uint64_t x77 = (x42 + (x52 << 0x4)); - { uint64_t x78 = (x77 + (x52 << 0x1)); - { uint64_t x79 = (x78 + x52); - { uint64_t x80 = (x41 + (x51 << 0x4)); - { uint64_t x81 = (x80 + (x51 << 0x1)); - { uint64_t x82 = (x81 + x51); - { uint64_t x83 = (x40 + (x50 << 0x4)); - { uint64_t x84 = (x83 + (x50 << 0x1)); - { uint64_t x85 = (x84 + x50); - { uint64_t x86 = (x85 >> 0x1a); - { uint32_t x87 = ((uint32_t)x85 & 0x3ffffff); - { uint64_t x88 = (x86 + x82); - { uint64_t x89 = (x88 >> 0x19); - { uint32_t x90 = ((uint32_t)x88 & 0x1ffffff); - { uint64_t x91 = (x89 + x79); - { uint64_t x92 = (x91 >> 0x1a); - { uint32_t x93 = ((uint32_t)x91 & 0x3ffffff); - { uint64_t x94 = (x92 + x76); - { uint64_t x95 = (x94 >> 0x19); - { uint32_t x96 = ((uint32_t)x94 & 0x1ffffff); - { uint64_t x97 = (x95 + x73); - { uint64_t x98 = (x97 >> 0x1a); - { uint32_t x99 = ((uint32_t)x97 & 0x3ffffff); - { uint64_t x100 = (x98 + x70); - { uint64_t x101 = (x100 >> 0x19); - { uint32_t x102 = ((uint32_t)x100 & 0x1ffffff); - { uint64_t 
x103 = (x101 + x67); - { uint64_t x104 = (x103 >> 0x1a); - { uint32_t x105 = ((uint32_t)x103 & 0x3ffffff); - { uint64_t x106 = (x104 + x64); - { uint64_t x107 = (x106 >> 0x19); - { uint32_t x108 = ((uint32_t)x106 & 0x1ffffff); - { uint64_t x109 = (x107 + x61); - { uint64_t x110 = (x109 >> 0x1a); - { uint32_t x111 = ((uint32_t)x109 & 0x3ffffff); - { uint64_t x112 = (x110 + x49); - { uint64_t x113 = (x112 >> 0x19); - { uint32_t x114 = ((uint32_t)x112 & 0x1ffffff); - { uint64_t x115 = (x87 + (0x13 * x113)); - { uint32_t x116 = (uint32_t) (x115 >> 0x1a); - { uint32_t x117 = ((uint32_t)x115 & 0x3ffffff); - { uint32_t x118 = (x116 + x90); - { uint32_t x119 = (x118 >> 0x19); - { uint32_t x120 = (x118 & 0x1ffffff); - out[0] = x117; - out[1] = x120; - out[2] = (x119 + x93); - out[3] = x96; - out[4] = x99; - out[5] = x102; - out[6] = x105; - out[7] = x108; - out[8] = x111; - out[9] = x114; - }}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}} -} - static void fe_mul121666(fe *h, const fe_loose *f) { assert_fe_loose(f->v); - fe_mul_121666_impl(h->v, f->v); + fiat_25519_carry_scmul_121666(h->v, f->v); assert_fe(h->v); } -// Adapted from Fiat-synthesized |fe_sub_impl| with |out| = 0. 
-static void fe_neg_impl(uint32_t out[10], const uint32_t in2[10]) { - { const uint32_t x20 = 0; - { const uint32_t x21 = 0; - { const uint32_t x19 = 0; - { const uint32_t x17 = 0; - { const uint32_t x15 = 0; - { const uint32_t x13 = 0; - { const uint32_t x11 = 0; - { const uint32_t x9 = 0; - { const uint32_t x7 = 0; - { const uint32_t x5 = 0; - { const uint32_t x38 = in2[9]; - { const uint32_t x39 = in2[8]; - { const uint32_t x37 = in2[7]; - { const uint32_t x35 = in2[6]; - { const uint32_t x33 = in2[5]; - { const uint32_t x31 = in2[4]; - { const uint32_t x29 = in2[3]; - { const uint32_t x27 = in2[2]; - { const uint32_t x25 = in2[1]; - { const uint32_t x23 = in2[0]; - out[0] = ((0x7ffffda + x5) - x23); - out[1] = ((0x3fffffe + x7) - x25); - out[2] = ((0x7fffffe + x9) - x27); - out[3] = ((0x3fffffe + x11) - x29); - out[4] = ((0x7fffffe + x13) - x31); - out[5] = ((0x3fffffe + x15) - x33); - out[6] = ((0x7fffffe + x17) - x35); - out[7] = ((0x3fffffe + x19) - x37); - out[8] = ((0x7fffffe + x21) - x39); - out[9] = ((0x3fffffe + x20) - x38); - }}}}}}}}}}}}}}}}}}}} -} - // h = -f static void fe_neg(fe_loose *h, const fe *f) { assert_fe(f->v); - fe_neg_impl(h->v, f->v); + fiat_25519_opp(h->v, f->v); assert_fe_loose(h->v); } @@ -1378,18 +298,22 @@ // replace (f,g) with (f,g) if b == 0. // // Preconditions: b in {0,1}. -static void fe_cmov(fe_loose *f, const fe_loose *g, unsigned b) { +static void fe_cmov(fe_loose *f, const fe_loose *g, fe_limb_t b) { + // Silence an unused function warning. |fiat_25519_selectznz| isn't quite the + // calling convention the rest of this code wants, so implement it by hand. + // + // TODO(davidben): Switch to fiat's calling convention, or ask fiat to emit a + // different one. 
+ (void)fiat_25519_selectznz; + b = 0-b; - unsigned i; - for (i = 0; i < 10; i++) { - uint32_t x = f->v[i] ^ g->v[i]; + for (unsigned i = 0; i < FE_NUM_LIMBS; i++) { + fe_limb_t x = f->v[i] ^ g->v[i]; x &= b; f->v[i] ^= x; } } -#endif // BORINGSSL_CURVE25519_64BIT - // h = f static void fe_copy(fe *h, const fe *f) { OPENSSL_memmove(h, f, sizeof(fe)); @@ -1813,10 +737,12 @@ unsigned i; for (i = 0; i < 15; i++) { + // The precomputed table is assumed to already clear the top bit, so + // |fe_frombytes_strict| may be used directly. const uint8_t *bytes = &precomp_table[i*(2 * 32)]; fe x, y; - fe_frombytes(&x, bytes); - fe_frombytes(&y, bytes + 32); + fe_frombytes_strict(&x, bytes); + fe_frombytes_strict(&y, bytes + 32); ge_precomp *out = &multiples[i]; fe_add(&out->yplusx, &y, &x);
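Note on the top-bit handling in the hunk above: the following is a minimal, standalone sketch of the copy-and-mask step that |fe_frombytes| performs before handing the bytes to the strict parser. The names demo_clear_top_bit and DEMO_FE_BYTES are hypothetical and exist only for this illustration, and memcpy stands in for OPENSSL_memcpy. It is not part of the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_FE_BYTES 32  // hypothetical name; size of an encoded field element

// Copies |in| to |out| and clears bit 255 (the top bit of the last byte), so
// the result satisfies a parser that requires that bit to be zero. |in| is
// left unmodified.
static void demo_clear_top_bit(uint8_t out[DEMO_FE_BYTES],
                               const uint8_t in[DEMO_FE_BYTES]) {
  memcpy(out, in, DEMO_FE_BYTES);
  out[DEMO_FE_BYTES - 1] &= 0x7f;
  // Same check the strict parser asserts on its input in the patch.
  assert((out[DEMO_FE_BYTES - 1] & 0x80) == 0);
}
```

Working on a copy keeps the caller's buffer const. Inputs that are already known to have the top bit clear, such as the precomputed table, can skip the copy and go to the strict parser directly.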
diff --git a/third_party/fiat/curve25519_32.c b/third_party/fiat/curve25519_32.c
new file mode 100644
index 0000000..820a5c9
--- /dev/null
+++ b/third_party/fiat/curve25519_32.c
@@ -0,0 +1,905 @@ +/* Autogenerated */ +/* curve description: 25519 */ +/* requested operations: carry_mul, carry_square, carry_scmul121666, carry, add, sub, opp, selectznz, to_bytes, from_bytes */ +/* n = 10 (from "10") */ +/* s = 0x8000000000000000000000000000000000000000000000000000000000000000 (from "2^255") */ +/* c = [(1, 19)] (from "1,19") */ +/* machine_wordsize = 32 (from "32") */ + +#include <stdint.h> +typedef unsigned char fiat_25519_uint1; +typedef signed char fiat_25519_int1; + + +/* + * Input Bounds: + * arg1: [0x0 ~> 0x1] + * arg2: [0x0 ~> 0x3ffffff] + * arg3: [0x0 ~> 0x3ffffff] + * Output Bounds: + * out1: [0x0 ~> 0x3ffffff] + * out2: [0x0 ~> 0x1] + */ +static void fiat_25519_addcarryx_u26(uint32_t* out1, fiat_25519_uint1* out2, fiat_25519_uint1 arg1, uint32_t arg2, uint32_t arg3) { + uint32_t x1 = ((arg1 + arg2) + arg3); + uint32_t x2 = (x1 & UINT32_C(0x3ffffff)); + fiat_25519_uint1 x3 = (fiat_25519_uint1)(x1 >> 26); + *out1 = x2; + *out2 = x3; +} + +/* + * Input Bounds: + * arg1: [0x0 ~> 0x1] + * arg2: [0x0 ~> 0x3ffffff] + * arg3: [0x0 ~> 0x3ffffff] + * Output Bounds: + * out1: [0x0 ~> 0x3ffffff] + * out2: [0x0 ~> 0x1] + */ +static void fiat_25519_subborrowx_u26(uint32_t* out1, fiat_25519_uint1* out2, fiat_25519_uint1 arg1, uint32_t arg2, uint32_t arg3) { + int32_t x1 = ((int32_t)(arg2 - arg1) - (int32_t)arg3); + fiat_25519_int1 x2 = (fiat_25519_int1)(x1 >> 26); + uint32_t x3 = (x1 & UINT32_C(0x3ffffff)); + *out1 = x3; + *out2 = (fiat_25519_uint1)(0x0 - x2); +} + +/* + * Input Bounds: + * arg1: [0x0 ~> 0x1] + * arg2: [0x0 ~> 0x1ffffff] + * arg3: [0x0 ~> 0x1ffffff] + * Output Bounds: + * out1: [0x0 ~> 0x1ffffff] + * out2: [0x0 ~> 0x1] + */ +static void fiat_25519_addcarryx_u25(uint32_t* out1, fiat_25519_uint1* out2, fiat_25519_uint1 arg1, uint32_t arg2, uint32_t arg3) { + uint32_t x1 = ((arg1 + arg2) + arg3); + uint32_t x2 = (x1 & UINT32_C(0x1ffffff)); + fiat_25519_uint1 x3 = (fiat_25519_uint1)(x1 >> 25); + *out1 = x2; + *out2 = x3; +} + +/* + * 
Input Bounds: + * arg1: [0x0 ~> 0x1] + * arg2: [0x0 ~> 0x1ffffff] + * arg3: [0x0 ~> 0x1ffffff] + * Output Bounds: + * out1: [0x0 ~> 0x1ffffff] + * out2: [0x0 ~> 0x1] + */ +static void fiat_25519_subborrowx_u25(uint32_t* out1, fiat_25519_uint1* out2, fiat_25519_uint1 arg1, uint32_t arg2, uint32_t arg3) { + int32_t x1 = ((int32_t)(arg2 - arg1) - (int32_t)arg3); + fiat_25519_int1 x2 = (fiat_25519_int1)(x1 >> 25); + uint32_t x3 = (x1 & UINT32_C(0x1ffffff)); + *out1 = x3; + *out2 = (fiat_25519_uint1)(0x0 - x2); +} + +/* + * Input Bounds: + * arg1: [0x0 ~> 0x1] + * arg2: [0x0 ~> 0xffffffff] + * arg3: [0x0 ~> 0xffffffff] + * Output Bounds: + * out1: [0x0 ~> 0xffffffff] + */ +static void fiat_25519_cmovznz_u32(uint32_t* out1, fiat_25519_uint1 arg1, uint32_t arg2, uint32_t arg3) { + fiat_25519_uint1 x1 = (!(!arg1)); + uint32_t x2 = ((fiat_25519_int1)(0x0 - x1) & UINT32_C(0xffffffff)); + uint32_t x3 = ((x2 & arg3) | ((~x2) & arg2)); + *out1 = x3; +} + +/* + * Input Bounds: + * arg1: [[0x0 ~> 0xd333332], [0x0 ~> 0x6999999], [0x0 ~> 0xd333332], [0x0 ~> 0x6999999], [0x0 ~> 0xd333332], [0x0 ~> 0x6999999], [0x0 ~> 0xd333332], [0x0 ~> 0x6999999], [0x0 ~> 0xd333332], [0x0 ~> 0x6999999]] + * arg2: [[0x0 ~> 0xd333332], [0x0 ~> 0x6999999], [0x0 ~> 0xd333332], [0x0 ~> 0x6999999], [0x0 ~> 0xd333332], [0x0 ~> 0x6999999], [0x0 ~> 0xd333332], [0x0 ~> 0x6999999], [0x0 ~> 0xd333332], [0x0 ~> 0x6999999]] + * Output Bounds: + * out1: [[0x0 ~> 0x4666666], [0x0 ~> 0x2333333], [0x0 ~> 0x4666666], [0x0 ~> 0x2333333], [0x0 ~> 0x4666666], [0x0 ~> 0x2333333], [0x0 ~> 0x4666666], [0x0 ~> 0x2333333], [0x0 ~> 0x4666666], [0x0 ~> 0x2333333]] + */ +static void fiat_25519_carry_mul(uint32_t out1[10], const uint32_t arg1[10], const uint32_t arg2[10]) { + uint64_t x1 = ((uint64_t)(arg1[9]) * ((arg2[9]) * ((uint32_t)0x2 * UINT8_C(0x13)))); + uint64_t x2 = ((uint64_t)(arg1[9]) * ((arg2[8]) * (uint32_t)UINT8_C(0x13))); + uint64_t x3 = ((uint64_t)(arg1[9]) * ((arg2[7]) * ((uint32_t)0x2 * UINT8_C(0x13)))); + 
uint64_t x4 = ((uint64_t)(arg1[9]) * ((arg2[6]) * (uint32_t)UINT8_C(0x13))); + uint64_t x5 = ((uint64_t)(arg1[9]) * ((arg2[5]) * ((uint32_t)0x2 * UINT8_C(0x13)))); + uint64_t x6 = ((uint64_t)(arg1[9]) * ((arg2[4]) * (uint32_t)UINT8_C(0x13))); + uint64_t x7 = ((uint64_t)(arg1[9]) * ((arg2[3]) * ((uint32_t)0x2 * UINT8_C(0x13)))); + uint64_t x8 = ((uint64_t)(arg1[9]) * ((arg2[2]) * (uint32_t)UINT8_C(0x13))); + uint64_t x9 = ((uint64_t)(arg1[9]) * ((arg2[1]) * ((uint32_t)0x2 * UINT8_C(0x13)))); + uint64_t x10 = ((uint64_t)(arg1[8]) * ((arg2[9]) * (uint32_t)UINT8_C(0x13))); + uint64_t x11 = ((uint64_t)(arg1[8]) * ((arg2[8]) * (uint32_t)UINT8_C(0x13))); + uint64_t x12 = ((uint64_t)(arg1[8]) * ((arg2[7]) * (uint32_t)UINT8_C(0x13))); + uint64_t x13 = ((uint64_t)(arg1[8]) * ((arg2[6]) * (uint32_t)UINT8_C(0x13))); + uint64_t x14 = ((uint64_t)(arg1[8]) * ((arg2[5]) * (uint32_t)UINT8_C(0x13))); + uint64_t x15 = ((uint64_t)(arg1[8]) * ((arg2[4]) * (uint32_t)UINT8_C(0x13))); + uint64_t x16 = ((uint64_t)(arg1[8]) * ((arg2[3]) * (uint32_t)UINT8_C(0x13))); + uint64_t x17 = ((uint64_t)(arg1[8]) * ((arg2[2]) * (uint32_t)UINT8_C(0x13))); + uint64_t x18 = ((uint64_t)(arg1[7]) * ((arg2[9]) * ((uint32_t)0x2 * UINT8_C(0x13)))); + uint64_t x19 = ((uint64_t)(arg1[7]) * ((arg2[8]) * (uint32_t)UINT8_C(0x13))); + uint64_t x20 = ((uint64_t)(arg1[7]) * ((arg2[7]) * ((uint32_t)0x2 * UINT8_C(0x13)))); + uint64_t x21 = ((uint64_t)(arg1[7]) * ((arg2[6]) * (uint32_t)UINT8_C(0x13))); + uint64_t x22 = ((uint64_t)(arg1[7]) * ((arg2[5]) * ((uint32_t)0x2 * UINT8_C(0x13)))); + uint64_t x23 = ((uint64_t)(arg1[7]) * ((arg2[4]) * (uint32_t)UINT8_C(0x13))); + uint64_t x24 = ((uint64_t)(arg1[7]) * ((arg2[3]) * ((uint32_t)0x2 * UINT8_C(0x13)))); + uint64_t x25 = ((uint64_t)(arg1[6]) * ((arg2[9]) * (uint32_t)UINT8_C(0x13))); + uint64_t x26 = ((uint64_t)(arg1[6]) * ((arg2[8]) * (uint32_t)UINT8_C(0x13))); + uint64_t x27 = ((uint64_t)(arg1[6]) * ((arg2[7]) * (uint32_t)UINT8_C(0x13))); + uint64_t x28 = 
((uint64_t)(arg1[6]) * ((arg2[6]) * (uint32_t)UINT8_C(0x13))); + uint64_t x29 = ((uint64_t)(arg1[6]) * ((arg2[5]) * (uint32_t)UINT8_C(0x13))); + uint64_t x30 = ((uint64_t)(arg1[6]) * ((arg2[4]) * (uint32_t)UINT8_C(0x13))); + uint64_t x31 = ((uint64_t)(arg1[5]) * ((arg2[9]) * ((uint32_t)0x2 * UINT8_C(0x13)))); + uint64_t x32 = ((uint64_t)(arg1[5]) * ((arg2[8]) * (uint32_t)UINT8_C(0x13))); + uint64_t x33 = ((uint64_t)(arg1[5]) * ((arg2[7]) * ((uint32_t)0x2 * UINT8_C(0x13)))); + uint64_t x34 = ((uint64_t)(arg1[5]) * ((arg2[6]) * (uint32_t)UINT8_C(0x13))); + uint64_t x35 = ((uint64_t)(arg1[5]) * ((arg2[5]) * ((uint32_t)0x2 * UINT8_C(0x13)))); + uint64_t x36 = ((uint64_t)(arg1[4]) * ((arg2[9]) * (uint32_t)UINT8_C(0x13))); + uint64_t x37 = ((uint64_t)(arg1[4]) * ((arg2[8]) * (uint32_t)UINT8_C(0x13))); + uint64_t x38 = ((uint64_t)(arg1[4]) * ((arg2[7]) * (uint32_t)UINT8_C(0x13))); + uint64_t x39 = ((uint64_t)(arg1[4]) * ((arg2[6]) * (uint32_t)UINT8_C(0x13))); + uint64_t x40 = ((uint64_t)(arg1[3]) * ((arg2[9]) * ((uint32_t)0x2 * UINT8_C(0x13)))); + uint64_t x41 = ((uint64_t)(arg1[3]) * ((arg2[8]) * (uint32_t)UINT8_C(0x13))); + uint64_t x42 = ((uint64_t)(arg1[3]) * ((arg2[7]) * ((uint32_t)0x2 * UINT8_C(0x13)))); + uint64_t x43 = ((uint64_t)(arg1[2]) * ((arg2[9]) * (uint32_t)UINT8_C(0x13))); + uint64_t x44 = ((uint64_t)(arg1[2]) * ((arg2[8]) * (uint32_t)UINT8_C(0x13))); + uint64_t x45 = ((uint64_t)(arg1[1]) * ((arg2[9]) * ((uint32_t)0x2 * UINT8_C(0x13)))); + uint64_t x46 = ((uint64_t)(arg1[9]) * (arg2[0])); + uint64_t x47 = ((uint64_t)(arg1[8]) * (arg2[1])); + uint64_t x48 = ((uint64_t)(arg1[8]) * (arg2[0])); + uint64_t x49 = ((uint64_t)(arg1[7]) * (arg2[2])); + uint64_t x50 = ((uint64_t)(arg1[7]) * ((arg2[1]) * (uint32_t)0x2)); + uint64_t x51 = ((uint64_t)(arg1[7]) * (arg2[0])); + uint64_t x52 = ((uint64_t)(arg1[6]) * (arg2[3])); + uint64_t x53 = ((uint64_t)(arg1[6]) * (arg2[2])); + uint64_t x54 = ((uint64_t)(arg1[6]) * (arg2[1])); + uint64_t x55 = ((uint64_t)(arg1[6]) * 
(arg2[0])); + uint64_t x56 = ((uint64_t)(arg1[5]) * (arg2[4])); + uint64_t x57 = ((uint64_t)(arg1[5]) * ((arg2[3]) * (uint32_t)0x2)); + uint64_t x58 = ((uint64_t)(arg1[5]) * (arg2[2])); + uint64_t x59 = ((uint64_t)(arg1[5]) * ((arg2[1]) * (uint32_t)0x2)); + uint64_t x60 = ((uint64_t)(arg1[5]) * (arg2[0])); + uint64_t x61 = ((uint64_t)(arg1[4]) * (arg2[5])); + uint64_t x62 = ((uint64_t)(arg1[4]) * (arg2[4])); + uint64_t x63 = ((uint64_t)(arg1[4]) * (arg2[3])); + uint64_t x64 = ((uint64_t)(arg1[4]) * (arg2[2])); + uint64_t x65 = ((uint64_t)(arg1[4]) * (arg2[1])); + uint64_t x66 = ((uint64_t)(arg1[4]) * (arg2[0])); + uint64_t x67 = ((uint64_t)(arg1[3]) * (arg2[6])); + uint64_t x68 = ((uint64_t)(arg1[3]) * ((arg2[5]) * (uint32_t)0x2)); + uint64_t x69 = ((uint64_t)(arg1[3]) * (arg2[4])); + uint64_t x70 = ((uint64_t)(arg1[3]) * ((arg2[3]) * (uint32_t)0x2)); + uint64_t x71 = ((uint64_t)(arg1[3]) * (arg2[2])); + uint64_t x72 = ((uint64_t)(arg1[3]) * ((arg2[1]) * (uint32_t)0x2)); + uint64_t x73 = ((uint64_t)(arg1[3]) * (arg2[0])); + uint64_t x74 = ((uint64_t)(arg1[2]) * (arg2[7])); + uint64_t x75 = ((uint64_t)(arg1[2]) * (arg2[6])); + uint64_t x76 = ((uint64_t)(arg1[2]) * (arg2[5])); + uint64_t x77 = ((uint64_t)(arg1[2]) * (arg2[4])); + uint64_t x78 = ((uint64_t)(arg1[2]) * (arg2[3])); + uint64_t x79 = ((uint64_t)(arg1[2]) * (arg2[2])); + uint64_t x80 = ((uint64_t)(arg1[2]) * (arg2[1])); + uint64_t x81 = ((uint64_t)(arg1[2]) * (arg2[0])); + uint64_t x82 = ((uint64_t)(arg1[1]) * (arg2[8])); + uint64_t x83 = ((uint64_t)(arg1[1]) * ((arg2[7]) * (uint32_t)0x2)); + uint64_t x84 = ((uint64_t)(arg1[1]) * (arg2[6])); + uint64_t x85 = ((uint64_t)(arg1[1]) * ((arg2[5]) * (uint32_t)0x2)); + uint64_t x86 = ((uint64_t)(arg1[1]) * (arg2[4])); + uint64_t x87 = ((uint64_t)(arg1[1]) * ((arg2[3]) * (uint32_t)0x2)); + uint64_t x88 = ((uint64_t)(arg1[1]) * (arg2[2])); + uint64_t x89 = ((uint64_t)(arg1[1]) * ((arg2[1]) * (uint32_t)0x2)); + uint64_t x90 = ((uint64_t)(arg1[1]) * (arg2[0])); + 
uint64_t x91 = ((uint64_t)(arg1[0]) * (arg2[9])); + uint64_t x92 = ((uint64_t)(arg1[0]) * (arg2[8])); + uint64_t x93 = ((uint64_t)(arg1[0]) * (arg2[7])); + uint64_t x94 = ((uint64_t)(arg1[0]) * (arg2[6])); + uint64_t x95 = ((uint64_t)(arg1[0]) * (arg2[5])); + uint64_t x96 = ((uint64_t)(arg1[0]) * (arg2[4])); + uint64_t x97 = ((uint64_t)(arg1[0]) * (arg2[3])); + uint64_t x98 = ((uint64_t)(arg1[0]) * (arg2[2])); + uint64_t x99 = ((uint64_t)(arg1[0]) * (arg2[1])); + uint64_t x100 = ((uint64_t)(arg1[0]) * (arg2[0])); + uint64_t x101 = (x100 + (x45 + (x44 + (x42 + (x39 + (x35 + (x30 + (x24 + (x17 + x9))))))))); + uint64_t x102 = (x101 >> 26); + uint32_t x103 = (uint32_t)(x101 & UINT32_C(0x3ffffff)); + uint64_t x104 = (x91 + (x82 + (x74 + (x67 + (x61 + (x56 + (x52 + (x49 + (x47 + x46))))))))); + uint64_t x105 = (x92 + (x83 + (x75 + (x68 + (x62 + (x57 + (x53 + (x50 + (x48 + x1))))))))); + uint64_t x106 = (x93 + (x84 + (x76 + (x69 + (x63 + (x58 + (x54 + (x51 + (x10 + x2))))))))); + uint64_t x107 = (x94 + (x85 + (x77 + (x70 + (x64 + (x59 + (x55 + (x18 + (x11 + x3))))))))); + uint64_t x108 = (x95 + (x86 + (x78 + (x71 + (x65 + (x60 + (x25 + (x19 + (x12 + x4))))))))); + uint64_t x109 = (x96 + (x87 + (x79 + (x72 + (x66 + (x31 + (x26 + (x20 + (x13 + x5))))))))); + uint64_t x110 = (x97 + (x88 + (x80 + (x73 + (x36 + (x32 + (x27 + (x21 + (x14 + x6))))))))); + uint64_t x111 = (x98 + (x89 + (x81 + (x40 + (x37 + (x33 + (x28 + (x22 + (x15 + x7))))))))); + uint64_t x112 = (x99 + (x90 + (x43 + (x41 + (x38 + (x34 + (x29 + (x23 + (x16 + x8))))))))); + uint64_t x113 = (x102 + x112); + uint64_t x114 = (x113 >> 25); + uint32_t x115 = (uint32_t)(x113 & UINT32_C(0x1ffffff)); + uint64_t x116 = (x114 + x111); + uint64_t x117 = (x116 >> 26); + uint32_t x118 = (uint32_t)(x116 & UINT32_C(0x3ffffff)); + uint64_t x119 = (x117 + x110); + uint64_t x120 = (x119 >> 25); + uint32_t x121 = (uint32_t)(x119 & UINT32_C(0x1ffffff)); + uint64_t x122 = (x120 + x109); + uint64_t x123 = (x122 >> 26); + uint32_t 
x124 = (uint32_t)(x122 & UINT32_C(0x3ffffff)); + uint64_t x125 = (x123 + x108); + uint64_t x126 = (x125 >> 25); + uint32_t x127 = (uint32_t)(x125 & UINT32_C(0x1ffffff)); + uint64_t x128 = (x126 + x107); + uint64_t x129 = (x128 >> 26); + uint32_t x130 = (uint32_t)(x128 & UINT32_C(0x3ffffff)); + uint64_t x131 = (x129 + x106); + uint64_t x132 = (x131 >> 25); + uint32_t x133 = (uint32_t)(x131 & UINT32_C(0x1ffffff)); + uint64_t x134 = (x132 + x105); + uint64_t x135 = (x134 >> 26); + uint32_t x136 = (uint32_t)(x134 & UINT32_C(0x3ffffff)); + uint64_t x137 = (x135 + x104); + uint64_t x138 = (x137 >> 25); + uint32_t x139 = (uint32_t)(x137 & UINT32_C(0x1ffffff)); + uint64_t x140 = (x138 * (uint64_t)UINT8_C(0x13)); + uint64_t x141 = (x103 + x140); + uint32_t x142 = (uint32_t)(x141 >> 26); + uint32_t x143 = (uint32_t)(x141 & UINT32_C(0x3ffffff)); + uint32_t x144 = (x142 + x115); + uint32_t x145 = (x144 >> 25); + uint32_t x146 = (x144 & UINT32_C(0x1ffffff)); + uint32_t x147 = (x145 + x118); + out1[0] = x143; + out1[1] = x146; + out1[2] = x147; + out1[3] = x121; + out1[4] = x124; + out1[5] = x127; + out1[6] = x130; + out1[7] = x133; + out1[8] = x136; + out1[9] = x139; +} + +/* + * Input Bounds: + * arg1: [[0x0 ~> 0xd333332], [0x0 ~> 0x6999999], [0x0 ~> 0xd333332], [0x0 ~> 0x6999999], [0x0 ~> 0xd333332], [0x0 ~> 0x6999999], [0x0 ~> 0xd333332], [0x0 ~> 0x6999999], [0x0 ~> 0xd333332], [0x0 ~> 0x6999999]] + * Output Bounds: + * out1: [[0x0 ~> 0x4666666], [0x0 ~> 0x2333333], [0x0 ~> 0x4666666], [0x0 ~> 0x2333333], [0x0 ~> 0x4666666], [0x0 ~> 0x2333333], [0x0 ~> 0x4666666], [0x0 ~> 0x2333333], [0x0 ~> 0x4666666], [0x0 ~> 0x2333333]] + */ +static void fiat_25519_carry_square(uint32_t out1[10], const uint32_t arg1[10]) { + uint32_t x1 = ((arg1[9]) * (uint32_t)UINT8_C(0x13)); + uint32_t x2 = (x1 * (uint32_t)0x2); + uint32_t x3 = ((arg1[9]) * (uint32_t)0x2); + uint32_t x4 = ((arg1[8]) * (uint32_t)UINT8_C(0x13)); + uint64_t x5 = (x4 * (uint64_t)0x2); + uint32_t x6 = ((arg1[8]) * 
(uint32_t)0x2); + uint32_t x7 = ((arg1[7]) * (uint32_t)UINT8_C(0x13)); + uint32_t x8 = (x7 * (uint32_t)0x2); + uint32_t x9 = ((arg1[7]) * (uint32_t)0x2); + uint32_t x10 = ((arg1[6]) * (uint32_t)UINT8_C(0x13)); + uint64_t x11 = (x10 * (uint64_t)0x2); + uint32_t x12 = ((arg1[6]) * (uint32_t)0x2); + uint32_t x13 = ((arg1[5]) * (uint32_t)UINT8_C(0x13)); + uint32_t x14 = ((arg1[5]) * (uint32_t)0x2); + uint32_t x15 = ((arg1[4]) * (uint32_t)0x2); + uint32_t x16 = ((arg1[3]) * (uint32_t)0x2); + uint32_t x17 = ((arg1[2]) * (uint32_t)0x2); + uint32_t x18 = ((arg1[1]) * (uint32_t)0x2); + uint64_t x19 = ((uint64_t)(arg1[9]) * (x1 * (uint32_t)0x2)); + uint64_t x20 = ((uint64_t)(arg1[8]) * x2); + uint64_t x21 = ((uint64_t)(arg1[8]) * x4); + uint64_t x22 = ((arg1[7]) * (x2 * (uint64_t)0x2)); + uint64_t x23 = ((arg1[7]) * x5); + uint64_t x24 = ((uint64_t)(arg1[7]) * (x7 * (uint32_t)0x2)); + uint64_t x25 = ((uint64_t)(arg1[6]) * x2); + uint64_t x26 = ((arg1[6]) * x5); + uint64_t x27 = ((uint64_t)(arg1[6]) * x8); + uint64_t x28 = ((uint64_t)(arg1[6]) * x10); + uint64_t x29 = ((arg1[5]) * (x2 * (uint64_t)0x2)); + uint64_t x30 = ((arg1[5]) * x5); + uint64_t x31 = ((arg1[5]) * (x8 * (uint64_t)0x2)); + uint64_t x32 = ((arg1[5]) * x11); + uint64_t x33 = ((uint64_t)(arg1[5]) * (x13 * (uint32_t)0x2)); + uint64_t x34 = ((uint64_t)(arg1[4]) * x2); + uint64_t x35 = ((arg1[4]) * x5); + uint64_t x36 = ((uint64_t)(arg1[4]) * x8); + uint64_t x37 = ((arg1[4]) * x11); + uint64_t x38 = ((uint64_t)(arg1[4]) * x14); + uint64_t x39 = ((uint64_t)(arg1[4]) * (arg1[4])); + uint64_t x40 = ((arg1[3]) * (x2 * (uint64_t)0x2)); + uint64_t x41 = ((arg1[3]) * x5); + uint64_t x42 = ((arg1[3]) * (x8 * (uint64_t)0x2)); + uint64_t x43 = ((uint64_t)(arg1[3]) * x12); + uint64_t x44 = ((uint64_t)(arg1[3]) * (x14 * (uint32_t)0x2)); + uint64_t x45 = ((uint64_t)(arg1[3]) * x15); + uint64_t x46 = ((uint64_t)(arg1[3]) * ((arg1[3]) * (uint32_t)0x2)); + uint64_t x47 = ((uint64_t)(arg1[2]) * x2); + uint64_t x48 = ((arg1[2]) * 
x5); + uint64_t x49 = ((uint64_t)(arg1[2]) * x9); + uint64_t x50 = ((uint64_t)(arg1[2]) * x12); + uint64_t x51 = ((uint64_t)(arg1[2]) * x14); + uint64_t x52 = ((uint64_t)(arg1[2]) * x15); + uint64_t x53 = ((uint64_t)(arg1[2]) * x16); + uint64_t x54 = ((uint64_t)(arg1[2]) * (arg1[2])); + uint64_t x55 = ((arg1[1]) * (x2 * (uint64_t)0x2)); + uint64_t x56 = ((uint64_t)(arg1[1]) * x6); + uint64_t x57 = ((uint64_t)(arg1[1]) * (x9 * (uint32_t)0x2)); + uint64_t x58 = ((uint64_t)(arg1[1]) * x12); + uint64_t x59 = ((uint64_t)(arg1[1]) * (x14 * (uint32_t)0x2)); + uint64_t x60 = ((uint64_t)(arg1[1]) * x15); + uint64_t x61 = ((uint64_t)(arg1[1]) * (x16 * (uint32_t)0x2)); + uint64_t x62 = ((uint64_t)(arg1[1]) * x17); + uint64_t x63 = ((uint64_t)(arg1[1]) * ((arg1[1]) * (uint32_t)0x2)); + uint64_t x64 = ((uint64_t)(arg1[0]) * x3); + uint64_t x65 = ((uint64_t)(arg1[0]) * x6); + uint64_t x66 = ((uint64_t)(arg1[0]) * x9); + uint64_t x67 = ((uint64_t)(arg1[0]) * x12); + uint64_t x68 = ((uint64_t)(arg1[0]) * x14); + uint64_t x69 = ((uint64_t)(arg1[0]) * x15); + uint64_t x70 = ((uint64_t)(arg1[0]) * x16); + uint64_t x71 = ((uint64_t)(arg1[0]) * x17); + uint64_t x72 = ((uint64_t)(arg1[0]) * x18); + uint64_t x73 = ((uint64_t)(arg1[0]) * (arg1[0])); + uint64_t x74 = (x73 + (x55 + (x48 + (x42 + (x37 + x33))))); + uint64_t x75 = (x74 >> 26); + uint32_t x76 = (uint32_t)(x74 & UINT32_C(0x3ffffff)); + uint64_t x77 = (x64 + (x56 + (x49 + (x43 + x38)))); + uint64_t x78 = (x65 + (x57 + (x50 + (x44 + (x39 + x19))))); + uint64_t x79 = (x66 + (x58 + (x51 + (x45 + x20)))); + uint64_t x80 = (x67 + (x59 + (x52 + (x46 + (x22 + x21))))); + uint64_t x81 = (x68 + (x60 + (x53 + (x25 + x23)))); + uint64_t x82 = (x69 + (x61 + (x54 + (x29 + (x26 + x24))))); + uint64_t x83 = (x70 + (x62 + (x34 + (x30 + x27)))); + uint64_t x84 = (x71 + (x63 + (x40 + (x35 + (x31 + x28))))); + uint64_t x85 = (x72 + (x47 + (x41 + (x36 + x32)))); + uint64_t x86 = (x75 + x85); + uint64_t x87 = (x86 >> 25); + uint32_t x88 = 
(uint32_t)(x86 & UINT32_C(0x1ffffff)); + uint64_t x89 = (x87 + x84); + uint64_t x90 = (x89 >> 26); + uint32_t x91 = (uint32_t)(x89 & UINT32_C(0x3ffffff)); + uint64_t x92 = (x90 + x83); + uint64_t x93 = (x92 >> 25); + uint32_t x94 = (uint32_t)(x92 & UINT32_C(0x1ffffff)); + uint64_t x95 = (x93 + x82); + uint64_t x96 = (x95 >> 26); + uint32_t x97 = (uint32_t)(x95 & UINT32_C(0x3ffffff)); + uint64_t x98 = (x96 + x81); + uint64_t x99 = (x98 >> 25); + uint32_t x100 = (uint32_t)(x98 & UINT32_C(0x1ffffff)); + uint64_t x101 = (x99 + x80); + uint64_t x102 = (x101 >> 26); + uint32_t x103 = (uint32_t)(x101 & UINT32_C(0x3ffffff)); + uint64_t x104 = (x102 + x79); + uint64_t x105 = (x104 >> 25); + uint32_t x106 = (uint32_t)(x104 & UINT32_C(0x1ffffff)); + uint64_t x107 = (x105 + x78); + uint64_t x108 = (x107 >> 26); + uint32_t x109 = (uint32_t)(x107 & UINT32_C(0x3ffffff)); + uint64_t x110 = (x108 + x77); + uint64_t x111 = (x110 >> 25); + uint32_t x112 = (uint32_t)(x110 & UINT32_C(0x1ffffff)); + uint64_t x113 = (x111 * (uint64_t)UINT8_C(0x13)); + uint64_t x114 = (x76 + x113); + uint32_t x115 = (uint32_t)(x114 >> 26); + uint32_t x116 = (uint32_t)(x114 & UINT32_C(0x3ffffff)); + uint32_t x117 = (x115 + x88); + uint32_t x118 = (x117 >> 25); + uint32_t x119 = (x117 & UINT32_C(0x1ffffff)); + uint32_t x120 = (x118 + x91); + out1[0] = x116; + out1[1] = x119; + out1[2] = x120; + out1[3] = x94; + out1[4] = x97; + out1[5] = x100; + out1[6] = x103; + out1[7] = x106; + out1[8] = x109; + out1[9] = x112; +} + +/* + * Input Bounds: + * arg1: [[0x0 ~> 0xd333332], [0x0 ~> 0x6999999], [0x0 ~> 0xd333332], [0x0 ~> 0x6999999], [0x0 ~> 0xd333332], [0x0 ~> 0x6999999], [0x0 ~> 0xd333332], [0x0 ~> 0x6999999], [0x0 ~> 0xd333332], [0x0 ~> 0x6999999]] + * Output Bounds: + * out1: [[0x0 ~> 0x4666666], [0x0 ~> 0x2333333], [0x0 ~> 0x4666666], [0x0 ~> 0x2333333], [0x0 ~> 0x4666666], [0x0 ~> 0x2333333], [0x0 ~> 0x4666666], [0x0 ~> 0x2333333], [0x0 ~> 0x4666666], [0x0 ~> 0x2333333]] + */ +static void 
fiat_25519_carry_scmul_121666(uint32_t out1[10], const uint32_t arg1[10]) { + uint64_t x1 = ((uint64_t)UINT32_C(0x1db42) * (arg1[9])); + uint64_t x2 = ((uint64_t)UINT32_C(0x1db42) * (arg1[8])); + uint64_t x3 = ((uint64_t)UINT32_C(0x1db42) * (arg1[7])); + uint64_t x4 = ((uint64_t)UINT32_C(0x1db42) * (arg1[6])); + uint64_t x5 = ((uint64_t)UINT32_C(0x1db42) * (arg1[5])); + uint64_t x6 = ((uint64_t)UINT32_C(0x1db42) * (arg1[4])); + uint64_t x7 = ((uint64_t)UINT32_C(0x1db42) * (arg1[3])); + uint64_t x8 = ((uint64_t)UINT32_C(0x1db42) * (arg1[2])); + uint64_t x9 = ((uint64_t)UINT32_C(0x1db42) * (arg1[1])); + uint64_t x10 = ((uint64_t)UINT32_C(0x1db42) * (arg1[0])); + uint32_t x11 = (uint32_t)(x10 >> 26); + uint32_t x12 = (uint32_t)(x10 & UINT32_C(0x3ffffff)); + uint64_t x13 = (x11 + x9); + uint32_t x14 = (uint32_t)(x13 >> 25); + uint32_t x15 = (uint32_t)(x13 & UINT32_C(0x1ffffff)); + uint64_t x16 = (x14 + x8); + uint32_t x17 = (uint32_t)(x16 >> 26); + uint32_t x18 = (uint32_t)(x16 & UINT32_C(0x3ffffff)); + uint64_t x19 = (x17 + x7); + uint32_t x20 = (uint32_t)(x19 >> 25); + uint32_t x21 = (uint32_t)(x19 & UINT32_C(0x1ffffff)); + uint64_t x22 = (x20 + x6); + uint32_t x23 = (uint32_t)(x22 >> 26); + uint32_t x24 = (uint32_t)(x22 & UINT32_C(0x3ffffff)); + uint64_t x25 = (x23 + x5); + uint32_t x26 = (uint32_t)(x25 >> 25); + uint32_t x27 = (uint32_t)(x25 & UINT32_C(0x1ffffff)); + uint64_t x28 = (x26 + x4); + uint32_t x29 = (uint32_t)(x28 >> 26); + uint32_t x30 = (uint32_t)(x28 & UINT32_C(0x3ffffff)); + uint64_t x31 = (x29 + x3); + uint32_t x32 = (uint32_t)(x31 >> 25); + uint32_t x33 = (uint32_t)(x31 & UINT32_C(0x1ffffff)); + uint64_t x34 = (x32 + x2); + uint32_t x35 = (uint32_t)(x34 >> 26); + uint32_t x36 = (uint32_t)(x34 & UINT32_C(0x3ffffff)); + uint64_t x37 = (x35 + x1); + uint32_t x38 = (uint32_t)(x37 >> 25); + uint32_t x39 = (uint32_t)(x37 & UINT32_C(0x1ffffff)); + uint32_t x40 = (x38 * (uint32_t)UINT8_C(0x13)); + uint32_t x41 = (x12 + x40); + uint32_t x42 = (x41 >> 26); + 
uint32_t x43 = (x41 & UINT32_C(0x3ffffff)); + uint32_t x44 = (x42 + x15); + uint32_t x45 = (x44 >> 25); + uint32_t x46 = (x44 & UINT32_C(0x1ffffff)); + uint32_t x47 = (x45 + x18); + out1[0] = x43; + out1[1] = x46; + out1[2] = x47; + out1[3] = x21; + out1[4] = x24; + out1[5] = x27; + out1[6] = x30; + out1[7] = x33; + out1[8] = x36; + out1[9] = x39; +} + +/* + * Input Bounds: + * arg1: [[0x0 ~> 0xd333332], [0x0 ~> 0x6999999], [0x0 ~> 0xd333332], [0x0 ~> 0x6999999], [0x0 ~> 0xd333332], [0x0 ~> 0x6999999], [0x0 ~> 0xd333332], [0x0 ~> 0x6999999], [0x0 ~> 0xd333332], [0x0 ~> 0x6999999]] + * Output Bounds: + * out1: [[0x0 ~> 0x4666666], [0x0 ~> 0x2333333], [0x0 ~> 0x4666666], [0x0 ~> 0x2333333], [0x0 ~> 0x4666666], [0x0 ~> 0x2333333], [0x0 ~> 0x4666666], [0x0 ~> 0x2333333], [0x0 ~> 0x4666666], [0x0 ~> 0x2333333]] + */ +static void fiat_25519_carry(uint32_t out1[10], const uint32_t arg1[10]) { + uint32_t x1 = (arg1[0]); + uint32_t x2 = ((x1 >> 26) + (arg1[1])); + uint32_t x3 = ((x2 >> 25) + (arg1[2])); + uint32_t x4 = ((x3 >> 26) + (arg1[3])); + uint32_t x5 = ((x4 >> 25) + (arg1[4])); + uint32_t x6 = ((x5 >> 26) + (arg1[5])); + uint32_t x7 = ((x6 >> 25) + (arg1[6])); + uint32_t x8 = ((x7 >> 26) + (arg1[7])); + uint32_t x9 = ((x8 >> 25) + (arg1[8])); + uint32_t x10 = ((x9 >> 26) + (arg1[9])); + uint32_t x11 = ((x1 & UINT32_C(0x3ffffff)) + ((x10 >> 25) * (uint32_t)UINT8_C(0x13))); + uint32_t x12 = ((x11 >> 26) + (x2 & UINT32_C(0x1ffffff))); + uint32_t x13 = (x11 & UINT32_C(0x3ffffff)); + uint32_t x14 = (x12 & UINT32_C(0x1ffffff)); + uint32_t x15 = ((x12 >> 25) + (x3 & UINT32_C(0x3ffffff))); + uint32_t x16 = (x4 & UINT32_C(0x1ffffff)); + uint32_t x17 = (x5 & UINT32_C(0x3ffffff)); + uint32_t x18 = (x6 & UINT32_C(0x1ffffff)); + uint32_t x19 = (x7 & UINT32_C(0x3ffffff)); + uint32_t x20 = (x8 & UINT32_C(0x1ffffff)); + uint32_t x21 = (x9 & UINT32_C(0x3ffffff)); + uint32_t x22 = (x10 & UINT32_C(0x1ffffff)); + out1[0] = x13; + out1[1] = x14; + out1[2] = x15; + out1[3] = x16; + 
out1[4] = x17; + out1[5] = x18; + out1[6] = x19; + out1[7] = x20; + out1[8] = x21; + out1[9] = x22; +} + +/* + * Input Bounds: + * arg1: [[0x0 ~> 0x4666666], [0x0 ~> 0x2333333], [0x0 ~> 0x4666666], [0x0 ~> 0x2333333], [0x0 ~> 0x4666666], [0x0 ~> 0x2333333], [0x0 ~> 0x4666666], [0x0 ~> 0x2333333], [0x0 ~> 0x4666666], [0x0 ~> 0x2333333]] + * arg2: [[0x0 ~> 0x4666666], [0x0 ~> 0x2333333], [0x0 ~> 0x4666666], [0x0 ~> 0x2333333], [0x0 ~> 0x4666666], [0x0 ~> 0x2333333], [0x0 ~> 0x4666666], [0x0 ~> 0x2333333], [0x0 ~> 0x4666666], [0x0 ~> 0x2333333]] + * Output Bounds: + * out1: [[0x0 ~> 0xd333332], [0x0 ~> 0x6999999], [0x0 ~> 0xd333332], [0x0 ~> 0x6999999], [0x0 ~> 0xd333332], [0x0 ~> 0x6999999], [0x0 ~> 0xd333332], [0x0 ~> 0x6999999], [0x0 ~> 0xd333332], [0x0 ~> 0x6999999]] + */ +static void fiat_25519_add(uint32_t out1[10], const uint32_t arg1[10], const uint32_t arg2[10]) { + uint32_t x1 = ((arg1[0]) + (arg2[0])); + uint32_t x2 = ((arg1[1]) + (arg2[1])); + uint32_t x3 = ((arg1[2]) + (arg2[2])); + uint32_t x4 = ((arg1[3]) + (arg2[3])); + uint32_t x5 = ((arg1[4]) + (arg2[4])); + uint32_t x6 = ((arg1[5]) + (arg2[5])); + uint32_t x7 = ((arg1[6]) + (arg2[6])); + uint32_t x8 = ((arg1[7]) + (arg2[7])); + uint32_t x9 = ((arg1[8]) + (arg2[8])); + uint32_t x10 = ((arg1[9]) + (arg2[9])); + out1[0] = x1; + out1[1] = x2; + out1[2] = x3; + out1[3] = x4; + out1[4] = x5; + out1[5] = x6; + out1[6] = x7; + out1[7] = x8; + out1[8] = x9; + out1[9] = x10; +} + +/* + * Input Bounds: + * arg1: [[0x0 ~> 0x4666666], [0x0 ~> 0x2333333], [0x0 ~> 0x4666666], [0x0 ~> 0x2333333], [0x0 ~> 0x4666666], [0x0 ~> 0x2333333], [0x0 ~> 0x4666666], [0x0 ~> 0x2333333], [0x0 ~> 0x4666666], [0x0 ~> 0x2333333]] + * arg2: [[0x0 ~> 0x4666666], [0x0 ~> 0x2333333], [0x0 ~> 0x4666666], [0x0 ~> 0x2333333], [0x0 ~> 0x4666666], [0x0 ~> 0x2333333], [0x0 ~> 0x4666666], [0x0 ~> 0x2333333], [0x0 ~> 0x4666666], [0x0 ~> 0x2333333]] + * Output Bounds: + * out1: [[0x0 ~> 0xd333332], [0x0 ~> 0x6999999], [0x0 ~> 0xd333332], [0x0 
~> 0x6999999], [0x0 ~> 0xd333332], [0x0 ~> 0x6999999], [0x0 ~> 0xd333332], [0x0 ~> 0x6999999], [0x0 ~> 0xd333332], [0x0 ~> 0x6999999]] + */ +static void fiat_25519_sub(uint32_t out1[10], const uint32_t arg1[10], const uint32_t arg2[10]) { + uint32_t x1 = ((UINT32_C(0x7ffffda) + (arg1[0])) - (arg2[0])); + uint32_t x2 = ((UINT32_C(0x3fffffe) + (arg1[1])) - (arg2[1])); + uint32_t x3 = ((UINT32_C(0x7fffffe) + (arg1[2])) - (arg2[2])); + uint32_t x4 = ((UINT32_C(0x3fffffe) + (arg1[3])) - (arg2[3])); + uint32_t x5 = ((UINT32_C(0x7fffffe) + (arg1[4])) - (arg2[4])); + uint32_t x6 = ((UINT32_C(0x3fffffe) + (arg1[5])) - (arg2[5])); + uint32_t x7 = ((UINT32_C(0x7fffffe) + (arg1[6])) - (arg2[6])); + uint32_t x8 = ((UINT32_C(0x3fffffe) + (arg1[7])) - (arg2[7])); + uint32_t x9 = ((UINT32_C(0x7fffffe) + (arg1[8])) - (arg2[8])); + uint32_t x10 = ((UINT32_C(0x3fffffe) + (arg1[9])) - (arg2[9])); + out1[0] = x1; + out1[1] = x2; + out1[2] = x3; + out1[3] = x4; + out1[4] = x5; + out1[5] = x6; + out1[6] = x7; + out1[7] = x8; + out1[8] = x9; + out1[9] = x10; +} + +/* + * Input Bounds: + * arg1: [[0x0 ~> 0x4666666], [0x0 ~> 0x2333333], [0x0 ~> 0x4666666], [0x0 ~> 0x2333333], [0x0 ~> 0x4666666], [0x0 ~> 0x2333333], [0x0 ~> 0x4666666], [0x0 ~> 0x2333333], [0x0 ~> 0x4666666], [0x0 ~> 0x2333333]] + * Output Bounds: + * out1: [[0x0 ~> 0xd333332], [0x0 ~> 0x6999999], [0x0 ~> 0xd333332], [0x0 ~> 0x6999999], [0x0 ~> 0xd333332], [0x0 ~> 0x6999999], [0x0 ~> 0xd333332], [0x0 ~> 0x6999999], [0x0 ~> 0xd333332], [0x0 ~> 0x6999999]] + */ +static void fiat_25519_opp(uint32_t out1[10], const uint32_t arg1[10]) { + uint32_t x1 = (UINT32_C(0x7ffffda) - (arg1[0])); + uint32_t x2 = (UINT32_C(0x3fffffe) - (arg1[1])); + uint32_t x3 = (UINT32_C(0x7fffffe) - (arg1[2])); + uint32_t x4 = (UINT32_C(0x3fffffe) - (arg1[3])); + uint32_t x5 = (UINT32_C(0x7fffffe) - (arg1[4])); + uint32_t x6 = (UINT32_C(0x3fffffe) - (arg1[5])); + uint32_t x7 = (UINT32_C(0x7fffffe) - (arg1[6])); + uint32_t x8 = (UINT32_C(0x3fffffe) - 
(arg1[7])); + uint32_t x9 = (UINT32_C(0x7fffffe) - (arg1[8])); + uint32_t x10 = (UINT32_C(0x3fffffe) - (arg1[9])); + out1[0] = x1; + out1[1] = x2; + out1[2] = x3; + out1[3] = x4; + out1[4] = x5; + out1[5] = x6; + out1[6] = x7; + out1[7] = x8; + out1[8] = x9; + out1[9] = x10; +} + +/* + * Input Bounds: + * arg1: [0x0 ~> 0x1] + * arg2: [[0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff]] + * arg3: [[0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff]] + * Output Bounds: + * out1: [[0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff]] + */ +static void fiat_25519_selectznz(uint32_t out1[10], fiat_25519_uint1 arg1, const uint32_t arg2[10], const uint32_t arg3[10]) { + uint32_t x1; + fiat_25519_cmovznz_u32(&x1, arg1, (arg2[0]), (arg3[0])); + uint32_t x2; + fiat_25519_cmovznz_u32(&x2, arg1, (arg2[1]), (arg3[1])); + uint32_t x3; + fiat_25519_cmovznz_u32(&x3, arg1, (arg2[2]), (arg3[2])); + uint32_t x4; + fiat_25519_cmovznz_u32(&x4, arg1, (arg2[3]), (arg3[3])); + uint32_t x5; + fiat_25519_cmovznz_u32(&x5, arg1, (arg2[4]), (arg3[4])); + uint32_t x6; + fiat_25519_cmovznz_u32(&x6, arg1, (arg2[5]), (arg3[5])); + uint32_t x7; + fiat_25519_cmovznz_u32(&x7, arg1, (arg2[6]), (arg3[6])); + uint32_t x8; + fiat_25519_cmovznz_u32(&x8, arg1, (arg2[7]), (arg3[7])); + uint32_t x9; + fiat_25519_cmovznz_u32(&x9, arg1, (arg2[8]), (arg3[8])); + uint32_t x10; + fiat_25519_cmovznz_u32(&x10, arg1, (arg2[9]), (arg3[9])); + out1[0] = x1; + out1[1] = x2; + out1[2] = x3; + out1[3] = x4; + out1[4] = x5; + out1[5] = x6; + out1[6] = x7; + 
out1[7] = x8; + out1[8] = x9; + out1[9] = x10; +} + +/* + * Input Bounds: + * arg1: [[0x0 ~> 0x4666666], [0x0 ~> 0x2333333], [0x0 ~> 0x4666666], [0x0 ~> 0x2333333], [0x0 ~> 0x4666666], [0x0 ~> 0x2333333], [0x0 ~> 0x4666666], [0x0 ~> 0x2333333], [0x0 ~> 0x4666666], [0x0 ~> 0x2333333]] + * Output Bounds: + * out1: [[0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0x7f]] + */ +static void fiat_25519_to_bytes(uint8_t out1[32], const uint32_t arg1[10]) { + uint32_t x1; + fiat_25519_uint1 x2; + fiat_25519_subborrowx_u26(&x1, &x2, 0x0, (arg1[0]), UINT32_C(0x3ffffed)); + uint32_t x3; + fiat_25519_uint1 x4; + fiat_25519_subborrowx_u25(&x3, &x4, x2, (arg1[1]), UINT32_C(0x1ffffff)); + uint32_t x5; + fiat_25519_uint1 x6; + fiat_25519_subborrowx_u26(&x5, &x6, x4, (arg1[2]), UINT32_C(0x3ffffff)); + uint32_t x7; + fiat_25519_uint1 x8; + fiat_25519_subborrowx_u25(&x7, &x8, x6, (arg1[3]), UINT32_C(0x1ffffff)); + uint32_t x9; + fiat_25519_uint1 x10; + fiat_25519_subborrowx_u26(&x9, &x10, x8, (arg1[4]), UINT32_C(0x3ffffff)); + uint32_t x11; + fiat_25519_uint1 x12; + fiat_25519_subborrowx_u25(&x11, &x12, x10, (arg1[5]), UINT32_C(0x1ffffff)); + uint32_t x13; + fiat_25519_uint1 x14; + fiat_25519_subborrowx_u26(&x13, &x14, x12, (arg1[6]), UINT32_C(0x3ffffff)); + uint32_t x15; + fiat_25519_uint1 x16; + fiat_25519_subborrowx_u25(&x15, &x16, x14, (arg1[7]), UINT32_C(0x1ffffff)); + uint32_t x17; + fiat_25519_uint1 x18; + fiat_25519_subborrowx_u26(&x17, &x18, x16, (arg1[8]), UINT32_C(0x3ffffff)); + uint32_t x19; + fiat_25519_uint1 x20; + fiat_25519_subborrowx_u25(&x19, &x20, 
x18, (arg1[9]), UINT32_C(0x1ffffff)); + uint32_t x21; + fiat_25519_cmovznz_u32(&x21, x20, 0x0, UINT32_C(0xffffffff)); + uint32_t x22; + fiat_25519_uint1 x23; + fiat_25519_addcarryx_u26(&x22, &x23, 0x0, (x21 & UINT32_C(0x3ffffed)), x1); + uint32_t x24; + fiat_25519_uint1 x25; + fiat_25519_addcarryx_u25(&x24, &x25, x23, (x21 & UINT32_C(0x1ffffff)), x3); + uint32_t x26; + fiat_25519_uint1 x27; + fiat_25519_addcarryx_u26(&x26, &x27, x25, (x21 & UINT32_C(0x3ffffff)), x5); + uint32_t x28; + fiat_25519_uint1 x29; + fiat_25519_addcarryx_u25(&x28, &x29, x27, (x21 & UINT32_C(0x1ffffff)), x7); + uint32_t x30; + fiat_25519_uint1 x31; + fiat_25519_addcarryx_u26(&x30, &x31, x29, (x21 & UINT32_C(0x3ffffff)), x9); + uint32_t x32; + fiat_25519_uint1 x33; + fiat_25519_addcarryx_u25(&x32, &x33, x31, (x21 & UINT32_C(0x1ffffff)), x11); + uint32_t x34; + fiat_25519_uint1 x35; + fiat_25519_addcarryx_u26(&x34, &x35, x33, (x21 & UINT32_C(0x3ffffff)), x13); + uint32_t x36; + fiat_25519_uint1 x37; + fiat_25519_addcarryx_u25(&x36, &x37, x35, (x21 & UINT32_C(0x1ffffff)), x15); + uint32_t x38; + fiat_25519_uint1 x39; + fiat_25519_addcarryx_u26(&x38, &x39, x37, (x21 & UINT32_C(0x3ffffff)), x17); + uint32_t x40; + fiat_25519_uint1 x41; + fiat_25519_addcarryx_u25(&x40, &x41, x39, (x21 & UINT32_C(0x1ffffff)), x19); + uint32_t x42 = (x40 << 6); + uint32_t x43 = (x38 << 4); + uint32_t x44 = (x36 << 3); + uint32_t x45 = (x34 * (uint32_t)0x2); + uint32_t x46 = (x30 << 6); + uint32_t x47 = (x28 << 5); + uint32_t x48 = (x26 << 3); + uint32_t x49 = (x24 << 2); + uint32_t x50 = (x22 >> 8); + uint8_t x51 = (uint8_t)(x22 & UINT8_C(0xff)); + uint32_t x52 = (x50 >> 8); + uint8_t x53 = (uint8_t)(x50 & UINT8_C(0xff)); + uint8_t x54 = (uint8_t)(x52 >> 8); + uint8_t x55 = (uint8_t)(x52 & UINT8_C(0xff)); + uint32_t x56 = (x54 + x49); + uint32_t x57 = (x56 >> 8); + uint8_t x58 = (uint8_t)(x56 & UINT8_C(0xff)); + uint32_t x59 = (x57 >> 8); + uint8_t x60 = (uint8_t)(x57 & UINT8_C(0xff)); + uint8_t x61 = (uint8_t)(x59 
>> 8); + uint8_t x62 = (uint8_t)(x59 & UINT8_C(0xff)); + uint32_t x63 = (x61 + x48); + uint32_t x64 = (x63 >> 8); + uint8_t x65 = (uint8_t)(x63 & UINT8_C(0xff)); + uint32_t x66 = (x64 >> 8); + uint8_t x67 = (uint8_t)(x64 & UINT8_C(0xff)); + uint8_t x68 = (uint8_t)(x66 >> 8); + uint8_t x69 = (uint8_t)(x66 & UINT8_C(0xff)); + uint32_t x70 = (x68 + x47); + uint32_t x71 = (x70 >> 8); + uint8_t x72 = (uint8_t)(x70 & UINT8_C(0xff)); + uint32_t x73 = (x71 >> 8); + uint8_t x74 = (uint8_t)(x71 & UINT8_C(0xff)); + uint8_t x75 = (uint8_t)(x73 >> 8); + uint8_t x76 = (uint8_t)(x73 & UINT8_C(0xff)); + uint32_t x77 = (x75 + x46); + uint32_t x78 = (x77 >> 8); + uint8_t x79 = (uint8_t)(x77 & UINT8_C(0xff)); + uint32_t x80 = (x78 >> 8); + uint8_t x81 = (uint8_t)(x78 & UINT8_C(0xff)); + uint8_t x82 = (uint8_t)(x80 >> 8); + uint8_t x83 = (uint8_t)(x80 & UINT8_C(0xff)); + uint8_t x84 = (uint8_t)(x82 & UINT8_C(0xff)); + uint32_t x85 = (x32 >> 8); + uint8_t x86 = (uint8_t)(x32 & UINT8_C(0xff)); + uint32_t x87 = (x85 >> 8); + uint8_t x88 = (uint8_t)(x85 & UINT8_C(0xff)); + fiat_25519_uint1 x89 = (fiat_25519_uint1)(x87 >> 8); + uint8_t x90 = (uint8_t)(x87 & UINT8_C(0xff)); + uint32_t x91 = (x89 + x45); + uint32_t x92 = (x91 >> 8); + uint8_t x93 = (uint8_t)(x91 & UINT8_C(0xff)); + uint32_t x94 = (x92 >> 8); + uint8_t x95 = (uint8_t)(x92 & UINT8_C(0xff)); + uint8_t x96 = (uint8_t)(x94 >> 8); + uint8_t x97 = (uint8_t)(x94 & UINT8_C(0xff)); + uint32_t x98 = (x96 + x44); + uint32_t x99 = (x98 >> 8); + uint8_t x100 = (uint8_t)(x98 & UINT8_C(0xff)); + uint32_t x101 = (x99 >> 8); + uint8_t x102 = (uint8_t)(x99 & UINT8_C(0xff)); + uint8_t x103 = (uint8_t)(x101 >> 8); + uint8_t x104 = (uint8_t)(x101 & UINT8_C(0xff)); + uint32_t x105 = (x103 + x43); + uint32_t x106 = (x105 >> 8); + uint8_t x107 = (uint8_t)(x105 & UINT8_C(0xff)); + uint32_t x108 = (x106 >> 8); + uint8_t x109 = (uint8_t)(x106 & UINT8_C(0xff)); + uint8_t x110 = (uint8_t)(x108 >> 8); + uint8_t x111 = (uint8_t)(x108 & UINT8_C(0xff)); + 
uint32_t x112 = (x110 + x42); + uint32_t x113 = (x112 >> 8); + uint8_t x114 = (uint8_t)(x112 & UINT8_C(0xff)); + uint32_t x115 = (x113 >> 8); + uint8_t x116 = (uint8_t)(x113 & UINT8_C(0xff)); + uint8_t x117 = (uint8_t)(x115 >> 8); + uint8_t x118 = (uint8_t)(x115 & UINT8_C(0xff)); + out1[0] = x51; + out1[1] = x53; + out1[2] = x55; + out1[3] = x58; + out1[4] = x60; + out1[5] = x62; + out1[6] = x65; + out1[7] = x67; + out1[8] = x69; + out1[9] = x72; + out1[10] = x74; + out1[11] = x76; + out1[12] = x79; + out1[13] = x81; + out1[14] = x83; + out1[15] = x84; + out1[16] = x86; + out1[17] = x88; + out1[18] = x90; + out1[19] = x93; + out1[20] = x95; + out1[21] = x97; + out1[22] = x100; + out1[23] = x102; + out1[24] = x104; + out1[25] = x107; + out1[26] = x109; + out1[27] = x111; + out1[28] = x114; + out1[29] = x116; + out1[30] = x118; + out1[31] = x117; +} + +/* + * Input Bounds: + * arg1: [[0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0x7f]] + * Output Bounds: + * out1: [[0x0 ~> 0x4666666], [0x0 ~> 0x2333333], [0x0 ~> 0x4666666], [0x0 ~> 0x2333333], [0x0 ~> 0x4666666], [0x0 ~> 0x2333333], [0x0 ~> 0x4666666], [0x0 ~> 0x2333333], [0x0 ~> 0x4666666], [0x0 ~> 0x2333333]] + */ +static void fiat_25519_from_bytes(uint32_t out1[10], const uint8_t arg1[32]) { + uint32_t x1 = ((uint32_t)(arg1[31]) << 18); + uint32_t x2 = ((uint32_t)(arg1[30]) << 10); + uint32_t x3 = ((uint32_t)(arg1[29]) << 2); + uint32_t x4 = ((uint32_t)(arg1[28]) << 20); + uint32_t x5 = ((uint32_t)(arg1[27]) << 12); + uint32_t x6 = ((uint32_t)(arg1[26]) << 4); + uint32_t x7 = 
((uint32_t)(arg1[25]) << 21); + uint32_t x8 = ((uint32_t)(arg1[24]) << 13); + uint32_t x9 = ((uint32_t)(arg1[23]) << 5); + uint32_t x10 = ((uint32_t)(arg1[22]) << 23); + uint32_t x11 = ((uint32_t)(arg1[21]) << 15); + uint32_t x12 = ((uint32_t)(arg1[20]) << 7); + uint32_t x13 = ((uint32_t)(arg1[19]) << 24); + uint32_t x14 = ((uint32_t)(arg1[18]) << 16); + uint32_t x15 = ((uint32_t)(arg1[17]) << 8); + uint8_t x16 = (arg1[16]); + uint32_t x17 = ((uint32_t)(arg1[15]) << 18); + uint32_t x18 = ((uint32_t)(arg1[14]) << 10); + uint32_t x19 = ((uint32_t)(arg1[13]) << 2); + uint32_t x20 = ((uint32_t)(arg1[12]) << 19); + uint32_t x21 = ((uint32_t)(arg1[11]) << 11); + uint32_t x22 = ((uint32_t)(arg1[10]) << 3); + uint32_t x23 = ((uint32_t)(arg1[9]) << 21); + uint32_t x24 = ((uint32_t)(arg1[8]) << 13); + uint32_t x25 = ((uint32_t)(arg1[7]) << 5); + uint32_t x26 = ((uint32_t)(arg1[6]) << 22); + uint32_t x27 = ((uint32_t)(arg1[5]) << 14); + uint32_t x28 = ((uint32_t)(arg1[4]) << 6); + uint32_t x29 = ((uint32_t)(arg1[3]) << 24); + uint32_t x30 = ((uint32_t)(arg1[2]) << 16); + uint32_t x31 = ((uint32_t)(arg1[1]) << 8); + uint8_t x32 = (arg1[0]); + uint32_t x33 = (x32 + (x31 + (x30 + x29))); + uint8_t x34 = (uint8_t)(x33 >> 26); + uint32_t x35 = (x33 & UINT32_C(0x3ffffff)); + uint32_t x36 = (x3 + (x2 + x1)); + uint32_t x37 = (x6 + (x5 + x4)); + uint32_t x38 = (x9 + (x8 + x7)); + uint32_t x39 = (x12 + (x11 + x10)); + uint32_t x40 = (x16 + (x15 + (x14 + x13))); + uint32_t x41 = (x19 + (x18 + x17)); + uint32_t x42 = (x22 + (x21 + x20)); + uint32_t x43 = (x25 + (x24 + x23)); + uint32_t x44 = (x28 + (x27 + x26)); + uint32_t x45 = (x34 + x44); + uint8_t x46 = (uint8_t)(x45 >> 25); + uint32_t x47 = (x45 & UINT32_C(0x1ffffff)); + uint32_t x48 = (x46 + x43); + uint8_t x49 = (uint8_t)(x48 >> 26); + uint32_t x50 = (x48 & UINT32_C(0x3ffffff)); + uint32_t x51 = (x49 + x42); + uint8_t x52 = (uint8_t)(x51 >> 25); + uint32_t x53 = (x51 & UINT32_C(0x1ffffff)); + uint32_t x54 = (x52 + x41); + 
uint32_t x55 = (x54 & UINT32_C(0x3ffffff)); + uint8_t x56 = (uint8_t)(x40 >> 25); + uint32_t x57 = (x40 & UINT32_C(0x1ffffff)); + uint32_t x58 = (x56 + x39); + uint8_t x59 = (uint8_t)(x58 >> 26); + uint32_t x60 = (x58 & UINT32_C(0x3ffffff)); + uint32_t x61 = (x59 + x38); + uint8_t x62 = (uint8_t)(x61 >> 25); + uint32_t x63 = (x61 & UINT32_C(0x1ffffff)); + uint32_t x64 = (x62 + x37); + uint8_t x65 = (uint8_t)(x64 >> 26); + uint32_t x66 = (x64 & UINT32_C(0x3ffffff)); + uint32_t x67 = (x65 + x36); + out1[0] = x35; + out1[1] = x47; + out1[2] = x50; + out1[3] = x53; + out1[4] = x55; + out1[5] = x57; + out1[6] = x60; + out1[7] = x63; + out1[8] = x66; + out1[9] = x67; +} +
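Reviewer note on the 32-bit `fiat_25519_from_bytes` above: the shift constants appear to follow from a layout of ten limbs that alternate between 26 and 25 bits, as the alternating `0x3ffffff` / `0x1ffffff` masks suggest. The following is a minimal sketch under that assumption. The helper names `limb_start_bit`, `limb_of_byte` and `byte_shift` are hypothetical and do not exist in the generated file; they only recompute the shift that each input byte receives.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only; not part of the generated code. */

/* First bit covered by limb `limb`, assuming limbs alternate 26, 25, 26, ...
 * bits, so that limb i starts at ceil(25.5 * i). */
static unsigned limb_start_bit(unsigned limb) {
  assert(limb < 10);
  return (51 * limb + 1) / 2;
}

/* Index of the limb that contains the lowest bit of input byte `byte_index`. */
static unsigned limb_of_byte(unsigned byte_index) {
  assert(byte_index < 32);
  unsigned bit = 8 * byte_index;
  unsigned limb = 9;
  /* Walk down until the limb starts at or before the byte's first bit. */
  while (limb_start_bit(limb) > bit) {
    limb--;
  }
  return limb;
}

/* Left shift applied to the byte before it is added into its limb. */
static unsigned byte_shift(unsigned byte_index) {
  return 8 * byte_index - limb_start_bit(limb_of_byte(byte_index));
}
```

Under this reading, `byte_shift(31)` gives 18 and `byte_shift(19)` gives 24, which matches the `<< 18` and `<< 24` applied to `arg1[31]` and `arg1[19]` in the generated function. Bits of a byte that spill past the end of its limb are handled by the carry steps (`>> 26`, `>> 25`) that follow the sums.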
diff --git a/third_party/fiat/curve25519_64.c b/third_party/fiat/curve25519_64.c new file mode 100644 index 0000000..23bf361 --- /dev/null +++ b/third_party/fiat/curve25519_64.c
@@ -0,0 +1,553 @@ +/* Autogenerated */ +/* curve description: 25519 */ +/* requested operations: carry_mul, carry_square, carry_scmul121666, carry, add, sub, opp, selectznz, to_bytes, from_bytes */ +/* n = 5 (from "5") */ +/* s = 0x8000000000000000000000000000000000000000000000000000000000000000 (from "2^255") */ +/* c = [(1, 19)] (from "1,19") */ +/* machine_wordsize = 64 (from "64") */ + +#include <stdint.h> +typedef unsigned char fiat_25519_uint1; +typedef signed char fiat_25519_int1; +typedef signed __int128 fiat_25519_int128; +typedef unsigned __int128 fiat_25519_uint128; + + +/* + * Input Bounds: + * arg1: [0x0 ~> 0x1] + * arg2: [0x0 ~> 0x7ffffffffffff] + * arg3: [0x0 ~> 0x7ffffffffffff] + * Output Bounds: + * out1: [0x0 ~> 0x7ffffffffffff] + * out2: [0x0 ~> 0x1] + */ +static void fiat_25519_addcarryx_u51(uint64_t* out1, fiat_25519_uint1* out2, fiat_25519_uint1 arg1, uint64_t arg2, uint64_t arg3) { + uint64_t x1 = ((arg1 + arg2) + arg3); + uint64_t x2 = (x1 & UINT64_C(0x7ffffffffffff)); + fiat_25519_uint1 x3 = (fiat_25519_uint1)(x1 >> 51); + *out1 = x2; + *out2 = x3; +} + +/* + * Input Bounds: + * arg1: [0x0 ~> 0x1] + * arg2: [0x0 ~> 0x7ffffffffffff] + * arg3: [0x0 ~> 0x7ffffffffffff] + * Output Bounds: + * out1: [0x0 ~> 0x7ffffffffffff] + * out2: [0x0 ~> 0x1] + */ +static void fiat_25519_subborrowx_u51(uint64_t* out1, fiat_25519_uint1* out2, fiat_25519_uint1 arg1, uint64_t arg2, uint64_t arg3) { + int64_t x1 = ((int64_t)(arg2 - (int64_t)arg1) - (int64_t)arg3); + fiat_25519_int1 x2 = (fiat_25519_int1)(x1 >> 51); + uint64_t x3 = (x1 & UINT64_C(0x7ffffffffffff)); + *out1 = x3; + *out2 = (fiat_25519_uint1)(0x0 - x2); +} + +/* + * Input Bounds: + * arg1: [0x0 ~> 0x1] + * arg2: [0x0 ~> 0xffffffffffffffff] + * arg3: [0x0 ~> 0xffffffffffffffff] + * Output Bounds: + * out1: [0x0 ~> 0xffffffffffffffff] + */ +static void fiat_25519_cmovznz_u64(uint64_t* out1, fiat_25519_uint1 arg1, uint64_t arg2, uint64_t arg3) { + fiat_25519_uint1 x1 = (!(!arg1)); + uint64_t x2 = 
((fiat_25519_int1)(0x0 - x1) & UINT64_C(0xffffffffffffffff)); + uint64_t x3 = ((x2 & arg3) | ((~x2) & arg2)); + *out1 = x3; +} + +/* + * Input Bounds: + * arg1: [[0x0 ~> 0x1a666666666664], [0x0 ~> 0x1a666666666664], [0x0 ~> 0x1a666666666664], [0x0 ~> 0x1a666666666664], [0x0 ~> 0x1a666666666664]] + * arg2: [[0x0 ~> 0x1a666666666664], [0x0 ~> 0x1a666666666664], [0x0 ~> 0x1a666666666664], [0x0 ~> 0x1a666666666664], [0x0 ~> 0x1a666666666664]] + * Output Bounds: + * out1: [[0x0 ~> 0x8cccccccccccc], [0x0 ~> 0x8cccccccccccc], [0x0 ~> 0x8cccccccccccc], [0x0 ~> 0x8cccccccccccc], [0x0 ~> 0x8cccccccccccc]] + */ +static void fiat_25519_carry_mul(uint64_t out1[5], const uint64_t arg1[5], const uint64_t arg2[5]) { + fiat_25519_uint128 x1 = ((fiat_25519_uint128)(arg1[4]) * ((arg2[4]) * (uint64_t)UINT8_C(0x13))); + fiat_25519_uint128 x2 = ((fiat_25519_uint128)(arg1[4]) * ((arg2[3]) * (uint64_t)UINT8_C(0x13))); + fiat_25519_uint128 x3 = ((fiat_25519_uint128)(arg1[4]) * ((arg2[2]) * (uint64_t)UINT8_C(0x13))); + fiat_25519_uint128 x4 = ((fiat_25519_uint128)(arg1[4]) * ((arg2[1]) * (uint64_t)UINT8_C(0x13))); + fiat_25519_uint128 x5 = ((fiat_25519_uint128)(arg1[3]) * ((arg2[4]) * (uint64_t)UINT8_C(0x13))); + fiat_25519_uint128 x6 = ((fiat_25519_uint128)(arg1[3]) * ((arg2[3]) * (uint64_t)UINT8_C(0x13))); + fiat_25519_uint128 x7 = ((fiat_25519_uint128)(arg1[3]) * ((arg2[2]) * (uint64_t)UINT8_C(0x13))); + fiat_25519_uint128 x8 = ((fiat_25519_uint128)(arg1[2]) * ((arg2[4]) * (uint64_t)UINT8_C(0x13))); + fiat_25519_uint128 x9 = ((fiat_25519_uint128)(arg1[2]) * ((arg2[3]) * (uint64_t)UINT8_C(0x13))); + fiat_25519_uint128 x10 = ((fiat_25519_uint128)(arg1[1]) * ((arg2[4]) * (uint64_t)UINT8_C(0x13))); + fiat_25519_uint128 x11 = ((fiat_25519_uint128)(arg1[4]) * (arg2[0])); + fiat_25519_uint128 x12 = ((fiat_25519_uint128)(arg1[3]) * (arg2[1])); + fiat_25519_uint128 x13 = ((fiat_25519_uint128)(arg1[3]) * (arg2[0])); + fiat_25519_uint128 x14 = ((fiat_25519_uint128)(arg1[2]) * (arg2[2])); + 
fiat_25519_uint128 x15 = ((fiat_25519_uint128)(arg1[2]) * (arg2[1])); + fiat_25519_uint128 x16 = ((fiat_25519_uint128)(arg1[2]) * (arg2[0])); + fiat_25519_uint128 x17 = ((fiat_25519_uint128)(arg1[1]) * (arg2[3])); + fiat_25519_uint128 x18 = ((fiat_25519_uint128)(arg1[1]) * (arg2[2])); + fiat_25519_uint128 x19 = ((fiat_25519_uint128)(arg1[1]) * (arg2[1])); + fiat_25519_uint128 x20 = ((fiat_25519_uint128)(arg1[1]) * (arg2[0])); + fiat_25519_uint128 x21 = ((fiat_25519_uint128)(arg1[0]) * (arg2[4])); + fiat_25519_uint128 x22 = ((fiat_25519_uint128)(arg1[0]) * (arg2[3])); + fiat_25519_uint128 x23 = ((fiat_25519_uint128)(arg1[0]) * (arg2[2])); + fiat_25519_uint128 x24 = ((fiat_25519_uint128)(arg1[0]) * (arg2[1])); + fiat_25519_uint128 x25 = ((fiat_25519_uint128)(arg1[0]) * (arg2[0])); + fiat_25519_uint128 x26 = (x25 + (x10 + (x9 + (x7 + x4)))); + uint64_t x27 = (uint64_t)(x26 >> 51); + uint64_t x28 = (uint64_t)(x26 & UINT64_C(0x7ffffffffffff)); + fiat_25519_uint128 x29 = (x21 + (x17 + (x14 + (x12 + x11)))); + fiat_25519_uint128 x30 = (x22 + (x18 + (x15 + (x13 + x1)))); + fiat_25519_uint128 x31 = (x23 + (x19 + (x16 + (x5 + x2)))); + fiat_25519_uint128 x32 = (x24 + (x20 + (x8 + (x6 + x3)))); + fiat_25519_uint128 x33 = (x27 + x32); + uint64_t x34 = (uint64_t)(x33 >> 51); + uint64_t x35 = (uint64_t)(x33 & UINT64_C(0x7ffffffffffff)); + fiat_25519_uint128 x36 = (x34 + x31); + uint64_t x37 = (uint64_t)(x36 >> 51); + uint64_t x38 = (uint64_t)(x36 & UINT64_C(0x7ffffffffffff)); + fiat_25519_uint128 x39 = (x37 + x30); + uint64_t x40 = (uint64_t)(x39 >> 51); + uint64_t x41 = (uint64_t)(x39 & UINT64_C(0x7ffffffffffff)); + fiat_25519_uint128 x42 = (x40 + x29); + uint64_t x43 = (uint64_t)(x42 >> 51); + uint64_t x44 = (uint64_t)(x42 & UINT64_C(0x7ffffffffffff)); + uint64_t x45 = (x43 * (uint64_t)UINT8_C(0x13)); + uint64_t x46 = (x28 + x45); + uint64_t x47 = (x46 >> 51); + uint64_t x48 = (x46 & UINT64_C(0x7ffffffffffff)); + uint64_t x49 = (x47 + x35); + uint64_t x50 = (x49 >> 51); + 
uint64_t x51 = (x49 & UINT64_C(0x7ffffffffffff)); + uint64_t x52 = (x50 + x38); + out1[0] = x48; + out1[1] = x51; + out1[2] = x52; + out1[3] = x41; + out1[4] = x44; +} + +/* + * Input Bounds: + * arg1: [[0x0 ~> 0x1a666666666664], [0x0 ~> 0x1a666666666664], [0x0 ~> 0x1a666666666664], [0x0 ~> 0x1a666666666664], [0x0 ~> 0x1a666666666664]] + * Output Bounds: + * out1: [[0x0 ~> 0x8cccccccccccc], [0x0 ~> 0x8cccccccccccc], [0x0 ~> 0x8cccccccccccc], [0x0 ~> 0x8cccccccccccc], [0x0 ~> 0x8cccccccccccc]] + */ +static void fiat_25519_carry_square(uint64_t out1[5], const uint64_t arg1[5]) { + uint64_t x1 = ((arg1[4]) * (uint64_t)UINT8_C(0x13)); + uint64_t x2 = (x1 * (uint64_t)0x2); + uint64_t x3 = ((arg1[4]) * (uint64_t)0x2); + uint64_t x4 = ((arg1[3]) * (uint64_t)UINT8_C(0x13)); + uint64_t x5 = (x4 * (uint64_t)0x2); + uint64_t x6 = ((arg1[3]) * (uint64_t)0x2); + uint64_t x7 = ((arg1[2]) * (uint64_t)0x2); + uint64_t x8 = ((arg1[1]) * (uint64_t)0x2); + fiat_25519_uint128 x9 = ((fiat_25519_uint128)(arg1[4]) * x1); + fiat_25519_uint128 x10 = ((fiat_25519_uint128)(arg1[3]) * x2); + fiat_25519_uint128 x11 = ((fiat_25519_uint128)(arg1[3]) * x4); + fiat_25519_uint128 x12 = ((fiat_25519_uint128)(arg1[2]) * x2); + fiat_25519_uint128 x13 = ((fiat_25519_uint128)(arg1[2]) * x5); + fiat_25519_uint128 x14 = ((fiat_25519_uint128)(arg1[2]) * (arg1[2])); + fiat_25519_uint128 x15 = ((fiat_25519_uint128)(arg1[1]) * x2); + fiat_25519_uint128 x16 = ((fiat_25519_uint128)(arg1[1]) * x6); + fiat_25519_uint128 x17 = ((fiat_25519_uint128)(arg1[1]) * x7); + fiat_25519_uint128 x18 = ((fiat_25519_uint128)(arg1[1]) * (arg1[1])); + fiat_25519_uint128 x19 = ((fiat_25519_uint128)(arg1[0]) * x3); + fiat_25519_uint128 x20 = ((fiat_25519_uint128)(arg1[0]) * x6); + fiat_25519_uint128 x21 = ((fiat_25519_uint128)(arg1[0]) * x7); + fiat_25519_uint128 x22 = ((fiat_25519_uint128)(arg1[0]) * x8); + fiat_25519_uint128 x23 = ((fiat_25519_uint128)(arg1[0]) * (arg1[0])); + fiat_25519_uint128 x24 = (x23 + (x15 + x13)); + 
uint64_t x25 = (uint64_t)(x24 >> 51); + uint64_t x26 = (uint64_t)(x24 & UINT64_C(0x7ffffffffffff)); + fiat_25519_uint128 x27 = (x19 + (x16 + x14)); + fiat_25519_uint128 x28 = (x20 + (x17 + x9)); + fiat_25519_uint128 x29 = (x21 + (x18 + x10)); + fiat_25519_uint128 x30 = (x22 + (x12 + x11)); + fiat_25519_uint128 x31 = (x25 + x30); + uint64_t x32 = (uint64_t)(x31 >> 51); + uint64_t x33 = (uint64_t)(x31 & UINT64_C(0x7ffffffffffff)); + fiat_25519_uint128 x34 = (x32 + x29); + uint64_t x35 = (uint64_t)(x34 >> 51); + uint64_t x36 = (uint64_t)(x34 & UINT64_C(0x7ffffffffffff)); + fiat_25519_uint128 x37 = (x35 + x28); + uint64_t x38 = (uint64_t)(x37 >> 51); + uint64_t x39 = (uint64_t)(x37 & UINT64_C(0x7ffffffffffff)); + fiat_25519_uint128 x40 = (x38 + x27); + uint64_t x41 = (uint64_t)(x40 >> 51); + uint64_t x42 = (uint64_t)(x40 & UINT64_C(0x7ffffffffffff)); + uint64_t x43 = (x41 * (uint64_t)UINT8_C(0x13)); + uint64_t x44 = (x26 + x43); + uint64_t x45 = (x44 >> 51); + uint64_t x46 = (x44 & UINT64_C(0x7ffffffffffff)); + uint64_t x47 = (x45 + x33); + uint64_t x48 = (x47 >> 51); + uint64_t x49 = (x47 & UINT64_C(0x7ffffffffffff)); + uint64_t x50 = (x48 + x36); + out1[0] = x46; + out1[1] = x49; + out1[2] = x50; + out1[3] = x39; + out1[4] = x42; +} + +/* + * Input Bounds: + * arg1: [[0x0 ~> 0x1a666666666664], [0x0 ~> 0x1a666666666664], [0x0 ~> 0x1a666666666664], [0x0 ~> 0x1a666666666664], [0x0 ~> 0x1a666666666664]] + * Output Bounds: + * out1: [[0x0 ~> 0x8cccccccccccc], [0x0 ~> 0x8cccccccccccc], [0x0 ~> 0x8cccccccccccc], [0x0 ~> 0x8cccccccccccc], [0x0 ~> 0x8cccccccccccc]] + */ +static void fiat_25519_carry_scmul_121666(uint64_t out1[5], const uint64_t arg1[5]) { + fiat_25519_uint128 x1 = (UINT32_C(0x1db42) * (fiat_25519_uint128)(arg1[4])); + fiat_25519_uint128 x2 = (UINT32_C(0x1db42) * (fiat_25519_uint128)(arg1[3])); + fiat_25519_uint128 x3 = (UINT32_C(0x1db42) * (fiat_25519_uint128)(arg1[2])); + fiat_25519_uint128 x4 = (UINT32_C(0x1db42) * (fiat_25519_uint128)(arg1[1])); + 
fiat_25519_uint128 x5 = (UINT32_C(0x1db42) * (fiat_25519_uint128)(arg1[0])); + uint64_t x6 = (uint64_t)(x5 >> 51); + uint64_t x7 = (uint64_t)(x5 & UINT64_C(0x7ffffffffffff)); + fiat_25519_uint128 x8 = (x6 + x4); + uint64_t x9 = (uint64_t)(x8 >> 51); + uint64_t x10 = (uint64_t)(x8 & UINT64_C(0x7ffffffffffff)); + fiat_25519_uint128 x11 = (x9 + x3); + uint64_t x12 = (uint64_t)(x11 >> 51); + uint64_t x13 = (uint64_t)(x11 & UINT64_C(0x7ffffffffffff)); + fiat_25519_uint128 x14 = (x12 + x2); + uint64_t x15 = (uint64_t)(x14 >> 51); + uint64_t x16 = (uint64_t)(x14 & UINT64_C(0x7ffffffffffff)); + fiat_25519_uint128 x17 = (x15 + x1); + uint64_t x18 = (uint64_t)(x17 >> 51); + uint64_t x19 = (uint64_t)(x17 & UINT64_C(0x7ffffffffffff)); + uint64_t x20 = (x18 * (uint64_t)UINT8_C(0x13)); + uint64_t x21 = (x7 + x20); + uint64_t x22 = (x21 >> 51); + uint64_t x23 = (x21 & UINT64_C(0x7ffffffffffff)); + uint64_t x24 = (x22 + x10); + uint64_t x25 = (x24 >> 51); + uint64_t x26 = (x24 & UINT64_C(0x7ffffffffffff)); + uint64_t x27 = (x25 + x13); + out1[0] = x23; + out1[1] = x26; + out1[2] = x27; + out1[3] = x16; + out1[4] = x19; +} + +/* + * Input Bounds: + * arg1: [[0x0 ~> 0x1a666666666664], [0x0 ~> 0x1a666666666664], [0x0 ~> 0x1a666666666664], [0x0 ~> 0x1a666666666664], [0x0 ~> 0x1a666666666664]] + * Output Bounds: + * out1: [[0x0 ~> 0x8cccccccccccc], [0x0 ~> 0x8cccccccccccc], [0x0 ~> 0x8cccccccccccc], [0x0 ~> 0x8cccccccccccc], [0x0 ~> 0x8cccccccccccc]] + */ +static void fiat_25519_carry(uint64_t out1[5], const uint64_t arg1[5]) { + uint64_t x1 = (arg1[0]); + uint64_t x2 = ((x1 >> 51) + (arg1[1])); + uint64_t x3 = ((x2 >> 51) + (arg1[2])); + uint64_t x4 = ((x3 >> 51) + (arg1[3])); + uint64_t x5 = ((x4 >> 51) + (arg1[4])); + uint64_t x6 = ((x1 & UINT64_C(0x7ffffffffffff)) + ((x5 >> 51) * (uint64_t)UINT8_C(0x13))); + uint64_t x7 = ((x6 >> 51) + (x2 & UINT64_C(0x7ffffffffffff))); + uint64_t x8 = (x6 & UINT64_C(0x7ffffffffffff)); + uint64_t x9 = (x7 & UINT64_C(0x7ffffffffffff)); + uint64_t 
x10 = ((x7 >> 51) + (x3 & UINT64_C(0x7ffffffffffff))); + uint64_t x11 = (x4 & UINT64_C(0x7ffffffffffff)); + uint64_t x12 = (x5 & UINT64_C(0x7ffffffffffff)); + out1[0] = x8; + out1[1] = x9; + out1[2] = x10; + out1[3] = x11; + out1[4] = x12; +} + +/* + * Input Bounds: + * arg1: [[0x0 ~> 0x8cccccccccccc], [0x0 ~> 0x8cccccccccccc], [0x0 ~> 0x8cccccccccccc], [0x0 ~> 0x8cccccccccccc], [0x0 ~> 0x8cccccccccccc]] + * arg2: [[0x0 ~> 0x8cccccccccccc], [0x0 ~> 0x8cccccccccccc], [0x0 ~> 0x8cccccccccccc], [0x0 ~> 0x8cccccccccccc], [0x0 ~> 0x8cccccccccccc]] + * Output Bounds: + * out1: [[0x0 ~> 0x1a666666666664], [0x0 ~> 0x1a666666666664], [0x0 ~> 0x1a666666666664], [0x0 ~> 0x1a666666666664], [0x0 ~> 0x1a666666666664]] + */ +static void fiat_25519_add(uint64_t out1[5], const uint64_t arg1[5], const uint64_t arg2[5]) { + uint64_t x1 = ((arg1[0]) + (arg2[0])); + uint64_t x2 = ((arg1[1]) + (arg2[1])); + uint64_t x3 = ((arg1[2]) + (arg2[2])); + uint64_t x4 = ((arg1[3]) + (arg2[3])); + uint64_t x5 = ((arg1[4]) + (arg2[4])); + out1[0] = x1; + out1[1] = x2; + out1[2] = x3; + out1[3] = x4; + out1[4] = x5; +} + +/* + * Input Bounds: + * arg1: [[0x0 ~> 0x8cccccccccccc], [0x0 ~> 0x8cccccccccccc], [0x0 ~> 0x8cccccccccccc], [0x0 ~> 0x8cccccccccccc], [0x0 ~> 0x8cccccccccccc]] + * arg2: [[0x0 ~> 0x8cccccccccccc], [0x0 ~> 0x8cccccccccccc], [0x0 ~> 0x8cccccccccccc], [0x0 ~> 0x8cccccccccccc], [0x0 ~> 0x8cccccccccccc]] + * Output Bounds: + * out1: [[0x0 ~> 0x1a666666666664], [0x0 ~> 0x1a666666666664], [0x0 ~> 0x1a666666666664], [0x0 ~> 0x1a666666666664], [0x0 ~> 0x1a666666666664]] + */ +static void fiat_25519_sub(uint64_t out1[5], const uint64_t arg1[5], const uint64_t arg2[5]) { + uint64_t x1 = ((UINT64_C(0xfffffffffffda) + (arg1[0])) - (arg2[0])); + uint64_t x2 = ((UINT64_C(0xffffffffffffe) + (arg1[1])) - (arg2[1])); + uint64_t x3 = ((UINT64_C(0xffffffffffffe) + (arg1[2])) - (arg2[2])); + uint64_t x4 = ((UINT64_C(0xffffffffffffe) + (arg1[3])) - (arg2[3])); + uint64_t x5 = 
((UINT64_C(0xffffffffffffe) + (arg1[4])) - (arg2[4])); + out1[0] = x1; + out1[1] = x2; + out1[2] = x3; + out1[3] = x4; + out1[4] = x5; +} + +/* + * Input Bounds: + * arg1: [[0x0 ~> 0x8cccccccccccc], [0x0 ~> 0x8cccccccccccc], [0x0 ~> 0x8cccccccccccc], [0x0 ~> 0x8cccccccccccc], [0x0 ~> 0x8cccccccccccc]] + * Output Bounds: + * out1: [[0x0 ~> 0x1a666666666664], [0x0 ~> 0x1a666666666664], [0x0 ~> 0x1a666666666664], [0x0 ~> 0x1a666666666664], [0x0 ~> 0x1a666666666664]] + */ +static void fiat_25519_opp(uint64_t out1[5], const uint64_t arg1[5]) { + uint64_t x1 = (UINT64_C(0xfffffffffffda) - (arg1[0])); + uint64_t x2 = (UINT64_C(0xffffffffffffe) - (arg1[1])); + uint64_t x3 = (UINT64_C(0xffffffffffffe) - (arg1[2])); + uint64_t x4 = (UINT64_C(0xffffffffffffe) - (arg1[3])); + uint64_t x5 = (UINT64_C(0xffffffffffffe) - (arg1[4])); + out1[0] = x1; + out1[1] = x2; + out1[2] = x3; + out1[3] = x4; + out1[4] = x5; +} + +/* + * Input Bounds: + * arg1: [0x0 ~> 0x1] + * arg2: [[0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff]] + * arg3: [[0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff]] + * Output Bounds: + * out1: [[0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff]] + */ +static void fiat_25519_selectznz(uint64_t out1[5], fiat_25519_uint1 arg1, const uint64_t arg2[5], const uint64_t arg3[5]) { + uint64_t x1; + fiat_25519_cmovznz_u64(&x1, arg1, (arg2[0]), (arg3[0])); + uint64_t x2; + fiat_25519_cmovznz_u64(&x2, arg1, (arg2[1]), (arg3[1])); + uint64_t x3; + fiat_25519_cmovznz_u64(&x3, arg1, (arg2[2]), (arg3[2])); + uint64_t x4; + fiat_25519_cmovznz_u64(&x4, arg1, (arg2[3]), (arg3[3])); + uint64_t x5; + fiat_25519_cmovznz_u64(&x5, arg1, (arg2[4]), (arg3[4])); + out1[0] = x1; + out1[1] = x2; + out1[2] = x3; + 
out1[3] = x4; + out1[4] = x5; +} + +/* + * Input Bounds: + * arg1: [[0x0 ~> 0x8cccccccccccc], [0x0 ~> 0x8cccccccccccc], [0x0 ~> 0x8cccccccccccc], [0x0 ~> 0x8cccccccccccc], [0x0 ~> 0x8cccccccccccc]] + * Output Bounds: + * out1: [[0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0x7f]] + */ +static void fiat_25519_to_bytes(uint8_t out1[32], const uint64_t arg1[5]) { + uint64_t x1; + fiat_25519_uint1 x2; + fiat_25519_subborrowx_u51(&x1, &x2, 0x0, (arg1[0]), UINT64_C(0x7ffffffffffed)); + uint64_t x3; + fiat_25519_uint1 x4; + fiat_25519_subborrowx_u51(&x3, &x4, x2, (arg1[1]), UINT64_C(0x7ffffffffffff)); + uint64_t x5; + fiat_25519_uint1 x6; + fiat_25519_subborrowx_u51(&x5, &x6, x4, (arg1[2]), UINT64_C(0x7ffffffffffff)); + uint64_t x7; + fiat_25519_uint1 x8; + fiat_25519_subborrowx_u51(&x7, &x8, x6, (arg1[3]), UINT64_C(0x7ffffffffffff)); + uint64_t x9; + fiat_25519_uint1 x10; + fiat_25519_subborrowx_u51(&x9, &x10, x8, (arg1[4]), UINT64_C(0x7ffffffffffff)); + uint64_t x11; + fiat_25519_cmovznz_u64(&x11, x10, 0x0, UINT64_C(0xffffffffffffffff)); + uint64_t x12; + fiat_25519_uint1 x13; + fiat_25519_addcarryx_u51(&x12, &x13, 0x0, (x11 & UINT64_C(0x7ffffffffffed)), x1); + uint64_t x14; + fiat_25519_uint1 x15; + fiat_25519_addcarryx_u51(&x14, &x15, x13, (x11 & UINT64_C(0x7ffffffffffff)), x3); + uint64_t x16; + fiat_25519_uint1 x17; + fiat_25519_addcarryx_u51(&x16, &x17, x15, (x11 & UINT64_C(0x7ffffffffffff)), x5); + uint64_t x18; + fiat_25519_uint1 x19; + fiat_25519_addcarryx_u51(&x18, &x19, x17, (x11 & UINT64_C(0x7ffffffffffff)), x7); + uint64_t x20; + 
fiat_25519_uint1 x21; + fiat_25519_addcarryx_u51(&x20, &x21, x19, (x11 & UINT64_C(0x7ffffffffffff)), x9); + uint64_t x22 = (x20 << 4); + uint64_t x23 = (x18 * (uint64_t)0x2); + uint64_t x24 = (x16 << 6); + uint64_t x25 = (x14 << 3); + uint64_t x26 = (x12 >> 8); + uint8_t x27 = (uint8_t)(x12 & UINT8_C(0xff)); + uint64_t x28 = (x26 >> 8); + uint8_t x29 = (uint8_t)(x26 & UINT8_C(0xff)); + uint64_t x30 = (x28 >> 8); + uint8_t x31 = (uint8_t)(x28 & UINT8_C(0xff)); + uint64_t x32 = (x30 >> 8); + uint8_t x33 = (uint8_t)(x30 & UINT8_C(0xff)); + uint64_t x34 = (x32 >> 8); + uint8_t x35 = (uint8_t)(x32 & UINT8_C(0xff)); + uint8_t x36 = (uint8_t)(x34 >> 8); + uint8_t x37 = (uint8_t)(x34 & UINT8_C(0xff)); + uint64_t x38 = (x36 + x25); + uint64_t x39 = (x38 >> 8); + uint8_t x40 = (uint8_t)(x38 & UINT8_C(0xff)); + uint64_t x41 = (x39 >> 8); + uint8_t x42 = (uint8_t)(x39 & UINT8_C(0xff)); + uint64_t x43 = (x41 >> 8); + uint8_t x44 = (uint8_t)(x41 & UINT8_C(0xff)); + uint64_t x45 = (x43 >> 8); + uint8_t x46 = (uint8_t)(x43 & UINT8_C(0xff)); + uint64_t x47 = (x45 >> 8); + uint8_t x48 = (uint8_t)(x45 & UINT8_C(0xff)); + uint8_t x49 = (uint8_t)(x47 >> 8); + uint8_t x50 = (uint8_t)(x47 & UINT8_C(0xff)); + uint64_t x51 = (x49 + x24); + uint64_t x52 = (x51 >> 8); + uint8_t x53 = (uint8_t)(x51 & UINT8_C(0xff)); + uint64_t x54 = (x52 >> 8); + uint8_t x55 = (uint8_t)(x52 & UINT8_C(0xff)); + uint64_t x56 = (x54 >> 8); + uint8_t x57 = (uint8_t)(x54 & UINT8_C(0xff)); + uint64_t x58 = (x56 >> 8); + uint8_t x59 = (uint8_t)(x56 & UINT8_C(0xff)); + uint64_t x60 = (x58 >> 8); + uint8_t x61 = (uint8_t)(x58 & UINT8_C(0xff)); + uint64_t x62 = (x60 >> 8); + uint8_t x63 = (uint8_t)(x60 & UINT8_C(0xff)); + fiat_25519_uint1 x64 = (fiat_25519_uint1)(x62 >> 8); + uint8_t x65 = (uint8_t)(x62 & UINT8_C(0xff)); + uint64_t x66 = (x64 + x23); + uint64_t x67 = (x66 >> 8); + uint8_t x68 = (uint8_t)(x66 & UINT8_C(0xff)); + uint64_t x69 = (x67 >> 8); + uint8_t x70 = (uint8_t)(x67 & UINT8_C(0xff)); + uint64_t x71 = 
(x69 >> 8); + uint8_t x72 = (uint8_t)(x69 & UINT8_C(0xff)); + uint64_t x73 = (x71 >> 8); + uint8_t x74 = (uint8_t)(x71 & UINT8_C(0xff)); + uint64_t x75 = (x73 >> 8); + uint8_t x76 = (uint8_t)(x73 & UINT8_C(0xff)); + uint8_t x77 = (uint8_t)(x75 >> 8); + uint8_t x78 = (uint8_t)(x75 & UINT8_C(0xff)); + uint64_t x79 = (x77 + x22); + uint64_t x80 = (x79 >> 8); + uint8_t x81 = (uint8_t)(x79 & UINT8_C(0xff)); + uint64_t x82 = (x80 >> 8); + uint8_t x83 = (uint8_t)(x80 & UINT8_C(0xff)); + uint64_t x84 = (x82 >> 8); + uint8_t x85 = (uint8_t)(x82 & UINT8_C(0xff)); + uint64_t x86 = (x84 >> 8); + uint8_t x87 = (uint8_t)(x84 & UINT8_C(0xff)); + uint64_t x88 = (x86 >> 8); + uint8_t x89 = (uint8_t)(x86 & UINT8_C(0xff)); + uint8_t x90 = (uint8_t)(x88 >> 8); + uint8_t x91 = (uint8_t)(x88 & UINT8_C(0xff)); + out1[0] = x27; + out1[1] = x29; + out1[2] = x31; + out1[3] = x33; + out1[4] = x35; + out1[5] = x37; + out1[6] = x40; + out1[7] = x42; + out1[8] = x44; + out1[9] = x46; + out1[10] = x48; + out1[11] = x50; + out1[12] = x53; + out1[13] = x55; + out1[14] = x57; + out1[15] = x59; + out1[16] = x61; + out1[17] = x63; + out1[18] = x65; + out1[19] = x68; + out1[20] = x70; + out1[21] = x72; + out1[22] = x74; + out1[23] = x76; + out1[24] = x78; + out1[25] = x81; + out1[26] = x83; + out1[27] = x85; + out1[28] = x87; + out1[29] = x89; + out1[30] = x91; + out1[31] = x90; +} + +/* + * Input Bounds: + * arg1: [[0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0x7f]] + * Output Bounds: + * out1: [[0x0 ~> 0x8cccccccccccc], [0x0 ~> 0x8cccccccccccc], [0x0 ~> 0x8cccccccccccc], [0x0 ~> 
0x8cccccccccccc], [0x0 ~> 0x8cccccccccccc]] + */ +static void fiat_25519_from_bytes(uint64_t out1[5], const uint8_t arg1[32]) { + uint64_t x1 = ((uint64_t)(arg1[31]) << 44); + uint64_t x2 = ((uint64_t)(arg1[30]) << 36); + uint64_t x3 = ((uint64_t)(arg1[29]) << 28); + uint64_t x4 = ((uint64_t)(arg1[28]) << 20); + uint64_t x5 = ((uint64_t)(arg1[27]) << 12); + uint64_t x6 = ((uint64_t)(arg1[26]) << 4); + uint64_t x7 = ((uint64_t)(arg1[25]) << 47); + uint64_t x8 = ((uint64_t)(arg1[24]) << 39); + uint64_t x9 = ((uint64_t)(arg1[23]) << 31); + uint64_t x10 = ((uint64_t)(arg1[22]) << 23); + uint64_t x11 = ((uint64_t)(arg1[21]) << 15); + uint64_t x12 = ((uint64_t)(arg1[20]) << 7); + uint64_t x13 = ((uint64_t)(arg1[19]) << 50); + uint64_t x14 = ((uint64_t)(arg1[18]) << 42); + uint64_t x15 = ((uint64_t)(arg1[17]) << 34); + uint64_t x16 = ((uint64_t)(arg1[16]) << 26); + uint64_t x17 = ((uint64_t)(arg1[15]) << 18); + uint64_t x18 = ((uint64_t)(arg1[14]) << 10); + uint64_t x19 = ((uint64_t)(arg1[13]) << 2); + uint64_t x20 = ((uint64_t)(arg1[12]) << 45); + uint64_t x21 = ((uint64_t)(arg1[11]) << 37); + uint64_t x22 = ((uint64_t)(arg1[10]) << 29); + uint64_t x23 = ((uint64_t)(arg1[9]) << 21); + uint64_t x24 = ((uint64_t)(arg1[8]) << 13); + uint64_t x25 = ((uint64_t)(arg1[7]) << 5); + uint64_t x26 = ((uint64_t)(arg1[6]) << 48); + uint64_t x27 = ((uint64_t)(arg1[5]) << 40); + uint64_t x28 = ((uint64_t)(arg1[4]) << 32); + uint64_t x29 = ((uint64_t)(arg1[3]) << 24); + uint64_t x30 = ((uint64_t)(arg1[2]) << 16); + uint64_t x31 = ((uint64_t)(arg1[1]) << 8); + uint8_t x32 = (arg1[0]); + uint64_t x33 = (x32 + (x31 + (x30 + (x29 + (x28 + (x27 + x26)))))); + uint8_t x34 = (uint8_t)(x33 >> 51); + uint64_t x35 = (x33 & UINT64_C(0x7ffffffffffff)); + uint64_t x36 = (x6 + (x5 + (x4 + (x3 + (x2 + x1))))); + uint64_t x37 = (x12 + (x11 + (x10 + (x9 + (x8 + x7))))); + uint64_t x38 = (x19 + (x18 + (x17 + (x16 + (x15 + (x14 + x13)))))); + uint64_t x39 = (x25 + (x24 + (x23 + (x22 + (x21 + x20))))); + 
uint64_t x40 = (x34 + x39); + uint8_t x41 = (uint8_t)(x40 >> 51); + uint64_t x42 = (x40 & UINT64_C(0x7ffffffffffff)); + uint64_t x43 = (x41 + x38); + uint8_t x44 = (uint8_t)(x43 >> 51); + uint64_t x45 = (x43 & UINT64_C(0x7ffffffffffff)); + uint64_t x46 = (x44 + x37); + uint8_t x47 = (uint8_t)(x46 >> 51); + uint64_t x48 = (x46 & UINT64_C(0x7ffffffffffff)); + uint64_t x49 = (x47 + x36); + out1[0] = x35; + out1[1] = x42; + out1[2] = x45; + out1[3] = x48; + out1[4] = x49; +} +
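Reviewer note on the 64-bit file above: the carry chain in `fiat_25519_carry` can be read as a plain loop over five 51-bit limbs, with the carry out of the top limb multiplied by 19 and folded back into limb 0 (because 2^255 is congruent to 19 modulo 2^255 - 19). The sketch below is a hand-written illustration of that reading, not the verified code; `carry_radix51` is a hypothetical name, and it assumes inputs within the bounds stated in the generated comments so that no 64-bit addition overflows.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, unverified illustration of the carry chain; the generated
 * fiat_25519_carry remains the reference. Assumes each in[i] is at most
 * 0x1a666666666664, as in the generated input bounds. */
static void carry_radix51(uint64_t out[5], const uint64_t in[5]) {
  const uint64_t mask = (UINT64_C(1) << 51) - 1;
  uint64_t t[5];
  uint64_t c = 0;

  assert(out != 0 && in != 0);

  /* First pass: propagate carries from limb 0 up to limb 4. */
  for (int i = 0; i < 5; i++) {
    uint64_t v = in[i] + c;
    t[i] = v & mask;
    c = v >> 51;
  }

  /* The carry out of limb 4 has weight 2^255, which equals 19 mod p. */
  t[0] += c * 19;

  /* Short second pass: limb 0 -> limb 1 -> limb 2. Limb 2 is left unmasked,
   * mirroring the generated code, so outputs are only loosely reduced. */
  c = t[0] >> 51;
  t[0] &= mask;
  t[1] += c;
  c = t[1] >> 51;
  t[1] &= mask;
  t[2] += c;

  for (int i = 0; i < 5; i++) {
    out[i] = t[i];
  }
}
```

For example, an input whose only nonzero limb is limb 4 with value 2^51 + 5 comes out as limb 0 = 19 and limb 4 = 5, since the single overflow bit wraps around with weight 19.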
diff --git a/third_party/fiat/p256.c b/third_party/fiat/p256.c index 414b7e0..3c2ce1d 100644 --- a/third_party/fiat/p256.c +++ b/third_party/fiat/p256.c
@@ -46,791 +46,11 @@ // MSVC does not implement uint128_t, and crashes with intrinsics #if defined(BORINGSSL_HAS_UINT128) #define BORINGSSL_NISTP256_64BIT 1 -#endif - -// "intrinsics" - -#if defined(BORINGSSL_NISTP256_64BIT) - -static uint64_t mulx_u64(uint64_t a, uint64_t b, uint64_t *high) { - uint128_t x = (uint128_t)a * b; - *high = (uint64_t) (x >> 64); - return (uint64_t) x; -} - -static uint64_t addcarryx_u64(uint8_t c, uint64_t a, uint64_t b, uint64_t *low) { - uint128_t x = (uint128_t)a + b + c; - *low = (uint64_t) x; - return (uint64_t) (x>>64); -} - -static uint64_t subborrow_u64(uint8_t c, uint64_t a, uint64_t b, uint64_t *low) { - uint128_t t = ((uint128_t) b + c); - uint128_t x = a-t; - *low = (uint64_t) x; - return (uint8_t) (x>>127); -} - -static uint64_t cmovznz_u64(uint64_t t, uint64_t z, uint64_t nz) { - t = -!!t; // all set if nonzero, 0 if 0 - return (t&nz) | ((~t)&z); -} - +#include "p256_64.c" #else - -static uint32_t mulx_u32(uint32_t a, uint32_t b, uint32_t *high) { - uint64_t x = (uint64_t)a * b; - *high = (uint32_t) (x >> 32); - return (uint32_t) x; -} - -static uint32_t addcarryx_u32(uint8_t c, uint32_t a, uint32_t b, uint32_t *low) { - uint64_t x = (uint64_t)a + b + c; - *low = (uint32_t) x; - return (uint32_t) (x>>32); -} - -static uint32_t subborrow_u32(uint8_t c, uint32_t a, uint32_t b, uint32_t *low) { - uint64_t t = ((uint64_t) b + c); - uint64_t x = a-t; - *low = (uint32_t) x; - return (uint8_t) (x>>63); -} - -static uint32_t cmovznz_u32(uint32_t t, uint32_t z, uint32_t nz) { - t = -!!t; // all set if nonzero, 0 if 0 - return (t&nz) | ((~t)&z); -} - +#include "p256_32.c" #endif -// fiat-crypto generated code - -#if defined(BORINGSSL_NISTP256_64BIT) - -static void fe_add(uint64_t out[4], const uint64_t in1[4], const uint64_t in2[4]) { - { const uint64_t x8 = in1[3]; - { const uint64_t x9 = in1[2]; - { const uint64_t x7 = in1[1]; - { const uint64_t x5 = in1[0]; - { const uint64_t x14 = in2[3]; - { const uint64_t x15 = in2[2]; - { 
const uint64_t x13 = in2[1]; - { const uint64_t x11 = in2[0]; - { uint64_t x17; uint8_t x18 = addcarryx_u64(0x0, x5, x11, &x17); - { uint64_t x20; uint8_t x21 = addcarryx_u64(x18, x7, x13, &x20); - { uint64_t x23; uint8_t x24 = addcarryx_u64(x21, x9, x15, &x23); - { uint64_t x26; uint8_t x27 = addcarryx_u64(x24, x8, x14, &x26); - { uint64_t x29; uint8_t x30 = subborrow_u64(0x0, x17, 0xffffffffffffffffL, &x29); - { uint64_t x32; uint8_t x33 = subborrow_u64(x30, x20, 0xffffffff, &x32); - { uint64_t x35; uint8_t x36 = subborrow_u64(x33, x23, 0x0, &x35); - { uint64_t x38; uint8_t x39 = subborrow_u64(x36, x26, 0xffffffff00000001L, &x38); - { uint64_t _1; uint8_t x42 = subborrow_u64(x39, x27, 0x0, &_1); - { uint64_t x43 = cmovznz_u64(x42, x38, x26); - { uint64_t x44 = cmovznz_u64(x42, x35, x23); - { uint64_t x45 = cmovznz_u64(x42, x32, x20); - { uint64_t x46 = cmovznz_u64(x42, x29, x17); - out[0] = x46; - out[1] = x45; - out[2] = x44; - out[3] = x43; - }}}}}}}}}}}}}}}}}}}}} -} - -// fe_op sets out = -in -static void fe_opp(uint64_t out[4], const uint64_t in1[4]) { - const uint64_t x5 = in1[3]; - const uint64_t x6 = in1[2]; - const uint64_t x4 = in1[1]; - const uint64_t x2 = in1[0]; - uint64_t x8; uint8_t x9 = subborrow_u64(0x0, 0x0, x2, &x8); - uint64_t x11; uint8_t x12 = subborrow_u64(x9, 0x0, x4, &x11); - uint64_t x14; uint8_t x15 = subborrow_u64(x12, 0x0, x6, &x14); - uint64_t x17; uint8_t x18 = subborrow_u64(x15, 0x0, x5, &x17); - uint64_t x19 = (uint64_t)cmovznz_u64(x18, 0x0, 0xffffffffffffffffL); - uint64_t x20 = (x19 & 0xffffffffffffffffL); - uint64_t x22; uint8_t x23 = addcarryx_u64(0x0, x8, x20, &x22); - uint64_t x24 = (x19 & 0xffffffff); - uint64_t x26; uint8_t x27 = addcarryx_u64(x23, x11, x24, &x26); - uint64_t x29; uint8_t x30 = addcarryx_u64(x27, x14, 0x0, &x29); - uint64_t x31 = (x19 & 0xffffffff00000001L); - uint64_t x33; addcarryx_u64(x30, x17, x31, &x33); - out[0] = x22; - out[1] = x26; - out[2] = x29; - out[3] = x33; -} - -static void fe_mul(uint64_t 
out[4], const uint64_t in1[4], const uint64_t in2[4]) { - const uint64_t x8 = in1[3]; - const uint64_t x9 = in1[2]; - const uint64_t x7 = in1[1]; - const uint64_t x5 = in1[0]; - const uint64_t x14 = in2[3]; - const uint64_t x15 = in2[2]; - const uint64_t x13 = in2[1]; - const uint64_t x11 = in2[0]; - uint64_t x18; uint64_t x17 = mulx_u64(x5, x11, &x18); - uint64_t x21; uint64_t x20 = mulx_u64(x5, x13, &x21); - uint64_t x24; uint64_t x23 = mulx_u64(x5, x15, &x24); - uint64_t x27; uint64_t x26 = mulx_u64(x5, x14, &x27); - uint64_t x29; uint8_t x30 = addcarryx_u64(0x0, x18, x20, &x29); - uint64_t x32; uint8_t x33 = addcarryx_u64(x30, x21, x23, &x32); - uint64_t x35; uint8_t x36 = addcarryx_u64(x33, x24, x26, &x35); - uint64_t x38; addcarryx_u64(0x0, x36, x27, &x38); - uint64_t x42; uint64_t x41 = mulx_u64(x17, 0xffffffffffffffffL, &x42); - uint64_t x45; uint64_t x44 = mulx_u64(x17, 0xffffffff, &x45); - uint64_t x48; uint64_t x47 = mulx_u64(x17, 0xffffffff00000001L, &x48); - uint64_t x50; uint8_t x51 = addcarryx_u64(0x0, x42, x44, &x50); - uint64_t x53; uint8_t x54 = addcarryx_u64(x51, x45, 0x0, &x53); - uint64_t x56; uint8_t x57 = addcarryx_u64(x54, 0x0, x47, &x56); - uint64_t x59; addcarryx_u64(0x0, x57, x48, &x59); - uint64_t _2; uint8_t x63 = addcarryx_u64(0x0, x17, x41, &_2); - uint64_t x65; uint8_t x66 = addcarryx_u64(x63, x29, x50, &x65); - uint64_t x68; uint8_t x69 = addcarryx_u64(x66, x32, x53, &x68); - uint64_t x71; uint8_t x72 = addcarryx_u64(x69, x35, x56, &x71); - uint64_t x74; uint8_t x75 = addcarryx_u64(x72, x38, x59, &x74); - uint64_t x78; uint64_t x77 = mulx_u64(x7, x11, &x78); - uint64_t x81; uint64_t x80 = mulx_u64(x7, x13, &x81); - uint64_t x84; uint64_t x83 = mulx_u64(x7, x15, &x84); - uint64_t x87; uint64_t x86 = mulx_u64(x7, x14, &x87); - uint64_t x89; uint8_t x90 = addcarryx_u64(0x0, x78, x80, &x89); - uint64_t x92; uint8_t x93 = addcarryx_u64(x90, x81, x83, &x92); - uint64_t x95; uint8_t x96 = addcarryx_u64(x93, x84, x86, &x95); - uint64_t x98; 
addcarryx_u64(0x0, x96, x87, &x98); - uint64_t x101; uint8_t x102 = addcarryx_u64(0x0, x65, x77, &x101); - uint64_t x104; uint8_t x105 = addcarryx_u64(x102, x68, x89, &x104); - uint64_t x107; uint8_t x108 = addcarryx_u64(x105, x71, x92, &x107); - uint64_t x110; uint8_t x111 = addcarryx_u64(x108, x74, x95, &x110); - uint64_t x113; uint8_t x114 = addcarryx_u64(x111, x75, x98, &x113); - uint64_t x117; uint64_t x116 = mulx_u64(x101, 0xffffffffffffffffL, &x117); - uint64_t x120; uint64_t x119 = mulx_u64(x101, 0xffffffff, &x120); - uint64_t x123; uint64_t x122 = mulx_u64(x101, 0xffffffff00000001L, &x123); - uint64_t x125; uint8_t x126 = addcarryx_u64(0x0, x117, x119, &x125); - uint64_t x128; uint8_t x129 = addcarryx_u64(x126, x120, 0x0, &x128); - uint64_t x131; uint8_t x132 = addcarryx_u64(x129, 0x0, x122, &x131); - uint64_t x134; addcarryx_u64(0x0, x132, x123, &x134); - uint64_t _3; uint8_t x138 = addcarryx_u64(0x0, x101, x116, &_3); - uint64_t x140; uint8_t x141 = addcarryx_u64(x138, x104, x125, &x140); - uint64_t x143; uint8_t x144 = addcarryx_u64(x141, x107, x128, &x143); - uint64_t x146; uint8_t x147 = addcarryx_u64(x144, x110, x131, &x146); - uint64_t x149; uint8_t x150 = addcarryx_u64(x147, x113, x134, &x149); - uint8_t x151 = (x150 + x114); - uint64_t x154; uint64_t x153 = mulx_u64(x9, x11, &x154); - uint64_t x157; uint64_t x156 = mulx_u64(x9, x13, &x157); - uint64_t x160; uint64_t x159 = mulx_u64(x9, x15, &x160); - uint64_t x163; uint64_t x162 = mulx_u64(x9, x14, &x163); - uint64_t x165; uint8_t x166 = addcarryx_u64(0x0, x154, x156, &x165); - uint64_t x168; uint8_t x169 = addcarryx_u64(x166, x157, x159, &x168); - uint64_t x171; uint8_t x172 = addcarryx_u64(x169, x160, x162, &x171); - uint64_t x174; addcarryx_u64(0x0, x172, x163, &x174); - uint64_t x177; uint8_t x178 = addcarryx_u64(0x0, x140, x153, &x177); - uint64_t x180; uint8_t x181 = addcarryx_u64(x178, x143, x165, &x180); - uint64_t x183; uint8_t x184 = addcarryx_u64(x181, x146, x168, &x183); - uint64_t 
x186; uint8_t x187 = addcarryx_u64(x184, x149, x171, &x186); - uint64_t x189; uint8_t x190 = addcarryx_u64(x187, x151, x174, &x189); - uint64_t x193; uint64_t x192 = mulx_u64(x177, 0xffffffffffffffffL, &x193); - uint64_t x196; uint64_t x195 = mulx_u64(x177, 0xffffffff, &x196); - uint64_t x199; uint64_t x198 = mulx_u64(x177, 0xffffffff00000001L, &x199); - uint64_t x201; uint8_t x202 = addcarryx_u64(0x0, x193, x195, &x201); - uint64_t x204; uint8_t x205 = addcarryx_u64(x202, x196, 0x0, &x204); - uint64_t x207; uint8_t x208 = addcarryx_u64(x205, 0x0, x198, &x207); - uint64_t x210; addcarryx_u64(0x0, x208, x199, &x210); - uint64_t _4; uint8_t x214 = addcarryx_u64(0x0, x177, x192, &_4); - uint64_t x216; uint8_t x217 = addcarryx_u64(x214, x180, x201, &x216); - uint64_t x219; uint8_t x220 = addcarryx_u64(x217, x183, x204, &x219); - uint64_t x222; uint8_t x223 = addcarryx_u64(x220, x186, x207, &x222); - uint64_t x225; uint8_t x226 = addcarryx_u64(x223, x189, x210, &x225); - uint8_t x227 = (x226 + x190); - uint64_t x230; uint64_t x229 = mulx_u64(x8, x11, &x230); - uint64_t x233; uint64_t x232 = mulx_u64(x8, x13, &x233); - uint64_t x236; uint64_t x235 = mulx_u64(x8, x15, &x236); - uint64_t x239; uint64_t x238 = mulx_u64(x8, x14, &x239); - uint64_t x241; uint8_t x242 = addcarryx_u64(0x0, x230, x232, &x241); - uint64_t x244; uint8_t x245 = addcarryx_u64(x242, x233, x235, &x244); - uint64_t x247; uint8_t x248 = addcarryx_u64(x245, x236, x238, &x247); - uint64_t x250; addcarryx_u64(0x0, x248, x239, &x250); - uint64_t x253; uint8_t x254 = addcarryx_u64(0x0, x216, x229, &x253); - uint64_t x256; uint8_t x257 = addcarryx_u64(x254, x219, x241, &x256); - uint64_t x259; uint8_t x260 = addcarryx_u64(x257, x222, x244, &x259); - uint64_t x262; uint8_t x263 = addcarryx_u64(x260, x225, x247, &x262); - uint64_t x265; uint8_t x266 = addcarryx_u64(x263, x227, x250, &x265); - uint64_t x269; uint64_t x268 = mulx_u64(x253, 0xffffffffffffffffL, &x269); - uint64_t x272; uint64_t x271 = 
mulx_u64(x253, 0xffffffff, &x272); - uint64_t x275; uint64_t x274 = mulx_u64(x253, 0xffffffff00000001L, &x275); - uint64_t x277; uint8_t x278 = addcarryx_u64(0x0, x269, x271, &x277); - uint64_t x280; uint8_t x281 = addcarryx_u64(x278, x272, 0x0, &x280); - uint64_t x283; uint8_t x284 = addcarryx_u64(x281, 0x0, x274, &x283); - uint64_t x286; addcarryx_u64(0x0, x284, x275, &x286); - uint64_t _5; uint8_t x290 = addcarryx_u64(0x0, x253, x268, &_5); - uint64_t x292; uint8_t x293 = addcarryx_u64(x290, x256, x277, &x292); - uint64_t x295; uint8_t x296 = addcarryx_u64(x293, x259, x280, &x295); - uint64_t x298; uint8_t x299 = addcarryx_u64(x296, x262, x283, &x298); - uint64_t x301; uint8_t x302 = addcarryx_u64(x299, x265, x286, &x301); - uint8_t x303 = (x302 + x266); - uint64_t x305; uint8_t x306 = subborrow_u64(0x0, x292, 0xffffffffffffffffL, &x305); - uint64_t x308; uint8_t x309 = subborrow_u64(x306, x295, 0xffffffff, &x308); - uint64_t x311; uint8_t x312 = subborrow_u64(x309, x298, 0x0, &x311); - uint64_t x314; uint8_t x315 = subborrow_u64(x312, x301, 0xffffffff00000001L, &x314); - uint64_t _6; uint8_t x318 = subborrow_u64(x315, x303, 0x0, &_6); - uint64_t x319 = cmovznz_u64(x318, x314, x301); - uint64_t x320 = cmovznz_u64(x318, x311, x298); - uint64_t x321 = cmovznz_u64(x318, x308, x295); - uint64_t x322 = cmovznz_u64(x318, x305, x292); - out[0] = x322; - out[1] = x321; - out[2] = x320; - out[3] = x319; -} - -static void fe_sub(uint64_t out[4], const uint64_t in1[4], const uint64_t in2[4]) { - const uint64_t x8 = in1[3]; - const uint64_t x9 = in1[2]; - const uint64_t x7 = in1[1]; - const uint64_t x5 = in1[0]; - const uint64_t x14 = in2[3]; - const uint64_t x15 = in2[2]; - const uint64_t x13 = in2[1]; - const uint64_t x11 = in2[0]; - uint64_t x17; uint8_t x18 = subborrow_u64(0x0, x5, x11, &x17); - uint64_t x20; uint8_t x21 = subborrow_u64(x18, x7, x13, &x20); - uint64_t x23; uint8_t x24 = subborrow_u64(x21, x9, x15, &x23); - uint64_t x26; uint8_t x27 = subborrow_u64(x24, 
x8, x14, &x26); - uint64_t x28 = (uint64_t)cmovznz_u64(x27, 0x0, 0xffffffffffffffffL); - uint64_t x29 = (x28 & 0xffffffffffffffffL); - uint64_t x31; uint8_t x32 = addcarryx_u64(0x0, x17, x29, &x31); - uint64_t x33 = (x28 & 0xffffffff); - uint64_t x35; uint8_t x36 = addcarryx_u64(x32, x20, x33, &x35); - uint64_t x38; uint8_t x39 = addcarryx_u64(x36, x23, 0x0, &x38); - uint64_t x40 = (x28 & 0xffffffff00000001L); - uint64_t x42; addcarryx_u64(x39, x26, x40, &x42); - out[0] = x31; - out[1] = x35; - out[2] = x38; - out[3] = x42; -} - -#else // 64BIT, else 32BIT - -static void fe_add(uint32_t out[8], const uint32_t in1[8], const uint32_t in2[8]) { - const uint32_t x16 = in1[7]; - const uint32_t x17 = in1[6]; - const uint32_t x15 = in1[5]; - const uint32_t x13 = in1[4]; - const uint32_t x11 = in1[3]; - const uint32_t x9 = in1[2]; - const uint32_t x7 = in1[1]; - const uint32_t x5 = in1[0]; - const uint32_t x30 = in2[7]; - const uint32_t x31 = in2[6]; - const uint32_t x29 = in2[5]; - const uint32_t x27 = in2[4]; - const uint32_t x25 = in2[3]; - const uint32_t x23 = in2[2]; - const uint32_t x21 = in2[1]; - const uint32_t x19 = in2[0]; - uint32_t x33; uint8_t x34 = addcarryx_u32(0x0, x5, x19, &x33); - uint32_t x36; uint8_t x37 = addcarryx_u32(x34, x7, x21, &x36); - uint32_t x39; uint8_t x40 = addcarryx_u32(x37, x9, x23, &x39); - uint32_t x42; uint8_t x43 = addcarryx_u32(x40, x11, x25, &x42); - uint32_t x45; uint8_t x46 = addcarryx_u32(x43, x13, x27, &x45); - uint32_t x48; uint8_t x49 = addcarryx_u32(x46, x15, x29, &x48); - uint32_t x51; uint8_t x52 = addcarryx_u32(x49, x17, x31, &x51); - uint32_t x54; uint8_t x55 = addcarryx_u32(x52, x16, x30, &x54); - uint32_t x57; uint8_t x58 = subborrow_u32(0x0, x33, 0xffffffff, &x57); - uint32_t x60; uint8_t x61 = subborrow_u32(x58, x36, 0xffffffff, &x60); - uint32_t x63; uint8_t x64 = subborrow_u32(x61, x39, 0xffffffff, &x63); - uint32_t x66; uint8_t x67 = subborrow_u32(x64, x42, 0x0, &x66); - uint32_t x69; uint8_t x70 = 
subborrow_u32(x67, x45, 0x0, &x69); - uint32_t x72; uint8_t x73 = subborrow_u32(x70, x48, 0x0, &x72); - uint32_t x75; uint8_t x76 = subborrow_u32(x73, x51, 0x1, &x75); - uint32_t x78; uint8_t x79 = subborrow_u32(x76, x54, 0xffffffff, &x78); - uint32_t _; uint8_t x82 = subborrow_u32(x79, x55, 0x0, &_); - uint32_t x83 = cmovznz_u32(x82, x78, x54); - uint32_t x84 = cmovznz_u32(x82, x75, x51); - uint32_t x85 = cmovznz_u32(x82, x72, x48); - uint32_t x86 = cmovznz_u32(x82, x69, x45); - uint32_t x87 = cmovznz_u32(x82, x66, x42); - uint32_t x88 = cmovznz_u32(x82, x63, x39); - uint32_t x89 = cmovznz_u32(x82, x60, x36); - uint32_t x90 = cmovznz_u32(x82, x57, x33); - out[0] = x90; - out[1] = x89; - out[2] = x88; - out[3] = x87; - out[4] = x86; - out[5] = x85; - out[6] = x84; - out[7] = x83; -} - -static void fe_mul(uint32_t out[8], const uint32_t in1[8], const uint32_t in2[8]) { - const uint32_t x16 = in1[7]; - const uint32_t x17 = in1[6]; - const uint32_t x15 = in1[5]; - const uint32_t x13 = in1[4]; - const uint32_t x11 = in1[3]; - const uint32_t x9 = in1[2]; - const uint32_t x7 = in1[1]; - const uint32_t x5 = in1[0]; - const uint32_t x30 = in2[7]; - const uint32_t x31 = in2[6]; - const uint32_t x29 = in2[5]; - const uint32_t x27 = in2[4]; - const uint32_t x25 = in2[3]; - const uint32_t x23 = in2[2]; - const uint32_t x21 = in2[1]; - const uint32_t x19 = in2[0]; - uint32_t x34; uint32_t x33 = mulx_u32(x5, x19, &x34); - uint32_t x37; uint32_t x36 = mulx_u32(x5, x21, &x37); - uint32_t x40; uint32_t x39 = mulx_u32(x5, x23, &x40); - uint32_t x43; uint32_t x42 = mulx_u32(x5, x25, &x43); - uint32_t x46; uint32_t x45 = mulx_u32(x5, x27, &x46); - uint32_t x49; uint32_t x48 = mulx_u32(x5, x29, &x49); - uint32_t x52; uint32_t x51 = mulx_u32(x5, x31, &x52); - uint32_t x55; uint32_t x54 = mulx_u32(x5, x30, &x55); - uint32_t x57; uint8_t x58 = addcarryx_u32(0x0, x34, x36, &x57); - uint32_t x60; uint8_t x61 = addcarryx_u32(x58, x37, x39, &x60); - uint32_t x63; uint8_t x64 = 
addcarryx_u32(x61, x40, x42, &x63); - uint32_t x66; uint8_t x67 = addcarryx_u32(x64, x43, x45, &x66); - uint32_t x69; uint8_t x70 = addcarryx_u32(x67, x46, x48, &x69); - uint32_t x72; uint8_t x73 = addcarryx_u32(x70, x49, x51, &x72); - uint32_t x75; uint8_t x76 = addcarryx_u32(x73, x52, x54, &x75); - uint32_t x78; addcarryx_u32(0x0, x76, x55, &x78); - uint32_t x82; uint32_t x81 = mulx_u32(x33, 0xffffffff, &x82); - uint32_t x85; uint32_t x84 = mulx_u32(x33, 0xffffffff, &x85); - uint32_t x88; uint32_t x87 = mulx_u32(x33, 0xffffffff, &x88); - uint32_t x91; uint32_t x90 = mulx_u32(x33, 0xffffffff, &x91); - uint32_t x93; uint8_t x94 = addcarryx_u32(0x0, x82, x84, &x93); - uint32_t x96; uint8_t x97 = addcarryx_u32(x94, x85, x87, &x96); - uint32_t x99; uint8_t x100 = addcarryx_u32(x97, x88, 0x0, &x99); - uint8_t x101 = (0x0 + 0x0); - uint32_t _1; uint8_t x104 = addcarryx_u32(0x0, x33, x81, &_1); - uint32_t x106; uint8_t x107 = addcarryx_u32(x104, x57, x93, &x106); - uint32_t x109; uint8_t x110 = addcarryx_u32(x107, x60, x96, &x109); - uint32_t x112; uint8_t x113 = addcarryx_u32(x110, x63, x99, &x112); - uint32_t x115; uint8_t x116 = addcarryx_u32(x113, x66, x100, &x115); - uint32_t x118; uint8_t x119 = addcarryx_u32(x116, x69, x101, &x118); - uint32_t x121; uint8_t x122 = addcarryx_u32(x119, x72, x33, &x121); - uint32_t x124; uint8_t x125 = addcarryx_u32(x122, x75, x90, &x124); - uint32_t x127; uint8_t x128 = addcarryx_u32(x125, x78, x91, &x127); - uint8_t x129 = (x128 + 0x0); - uint32_t x132; uint32_t x131 = mulx_u32(x7, x19, &x132); - uint32_t x135; uint32_t x134 = mulx_u32(x7, x21, &x135); - uint32_t x138; uint32_t x137 = mulx_u32(x7, x23, &x138); - uint32_t x141; uint32_t x140 = mulx_u32(x7, x25, &x141); - uint32_t x144; uint32_t x143 = mulx_u32(x7, x27, &x144); - uint32_t x147; uint32_t x146 = mulx_u32(x7, x29, &x147); - uint32_t x150; uint32_t x149 = mulx_u32(x7, x31, &x150); - uint32_t x153; uint32_t x152 = mulx_u32(x7, x30, &x153); - uint32_t x155; uint8_t x156 = 
addcarryx_u32(0x0, x132, x134, &x155); - uint32_t x158; uint8_t x159 = addcarryx_u32(x156, x135, x137, &x158); - uint32_t x161; uint8_t x162 = addcarryx_u32(x159, x138, x140, &x161); - uint32_t x164; uint8_t x165 = addcarryx_u32(x162, x141, x143, &x164); - uint32_t x167; uint8_t x168 = addcarryx_u32(x165, x144, x146, &x167); - uint32_t x170; uint8_t x171 = addcarryx_u32(x168, x147, x149, &x170); - uint32_t x173; uint8_t x174 = addcarryx_u32(x171, x150, x152, &x173); - uint32_t x176; addcarryx_u32(0x0, x174, x153, &x176); - uint32_t x179; uint8_t x180 = addcarryx_u32(0x0, x106, x131, &x179); - uint32_t x182; uint8_t x183 = addcarryx_u32(x180, x109, x155, &x182); - uint32_t x185; uint8_t x186 = addcarryx_u32(x183, x112, x158, &x185); - uint32_t x188; uint8_t x189 = addcarryx_u32(x186, x115, x161, &x188); - uint32_t x191; uint8_t x192 = addcarryx_u32(x189, x118, x164, &x191); - uint32_t x194; uint8_t x195 = addcarryx_u32(x192, x121, x167, &x194); - uint32_t x197; uint8_t x198 = addcarryx_u32(x195, x124, x170, &x197); - uint32_t x200; uint8_t x201 = addcarryx_u32(x198, x127, x173, &x200); - uint32_t x203; uint8_t x204 = addcarryx_u32(x201, x129, x176, &x203); - uint32_t x207; uint32_t x206 = mulx_u32(x179, 0xffffffff, &x207); - uint32_t x210; uint32_t x209 = mulx_u32(x179, 0xffffffff, &x210); - uint32_t x213; uint32_t x212 = mulx_u32(x179, 0xffffffff, &x213); - uint32_t x216; uint32_t x215 = mulx_u32(x179, 0xffffffff, &x216); - uint32_t x218; uint8_t x219 = addcarryx_u32(0x0, x207, x209, &x218); - uint32_t x221; uint8_t x222 = addcarryx_u32(x219, x210, x212, &x221); - uint32_t x224; uint8_t x225 = addcarryx_u32(x222, x213, 0x0, &x224); - uint8_t x226 = (0x0 + 0x0); - uint32_t _2; uint8_t x229 = addcarryx_u32(0x0, x179, x206, &_2); - uint32_t x231; uint8_t x232 = addcarryx_u32(x229, x182, x218, &x231); - uint32_t x234; uint8_t x235 = addcarryx_u32(x232, x185, x221, &x234); - uint32_t x237; uint8_t x238 = addcarryx_u32(x235, x188, x224, &x237); - uint32_t x240; uint8_t 
x241 = addcarryx_u32(x238, x191, x225, &x240); - uint32_t x243; uint8_t x244 = addcarryx_u32(x241, x194, x226, &x243); - uint32_t x246; uint8_t x247 = addcarryx_u32(x244, x197, x179, &x246); - uint32_t x249; uint8_t x250 = addcarryx_u32(x247, x200, x215, &x249); - uint32_t x252; uint8_t x253 = addcarryx_u32(x250, x203, x216, &x252); - uint8_t x254 = (x253 + x204); - uint32_t x257; uint32_t x256 = mulx_u32(x9, x19, &x257); - uint32_t x260; uint32_t x259 = mulx_u32(x9, x21, &x260); - uint32_t x263; uint32_t x262 = mulx_u32(x9, x23, &x263); - uint32_t x266; uint32_t x265 = mulx_u32(x9, x25, &x266); - uint32_t x269; uint32_t x268 = mulx_u32(x9, x27, &x269); - uint32_t x272; uint32_t x271 = mulx_u32(x9, x29, &x272); - uint32_t x275; uint32_t x274 = mulx_u32(x9, x31, &x275); - uint32_t x278; uint32_t x277 = mulx_u32(x9, x30, &x278); - uint32_t x280; uint8_t x281 = addcarryx_u32(0x0, x257, x259, &x280); - uint32_t x283; uint8_t x284 = addcarryx_u32(x281, x260, x262, &x283); - uint32_t x286; uint8_t x287 = addcarryx_u32(x284, x263, x265, &x286); - uint32_t x289; uint8_t x290 = addcarryx_u32(x287, x266, x268, &x289); - uint32_t x292; uint8_t x293 = addcarryx_u32(x290, x269, x271, &x292); - uint32_t x295; uint8_t x296 = addcarryx_u32(x293, x272, x274, &x295); - uint32_t x298; uint8_t x299 = addcarryx_u32(x296, x275, x277, &x298); - uint32_t x301; addcarryx_u32(0x0, x299, x278, &x301); - uint32_t x304; uint8_t x305 = addcarryx_u32(0x0, x231, x256, &x304); - uint32_t x307; uint8_t x308 = addcarryx_u32(x305, x234, x280, &x307); - uint32_t x310; uint8_t x311 = addcarryx_u32(x308, x237, x283, &x310); - uint32_t x313; uint8_t x314 = addcarryx_u32(x311, x240, x286, &x313); - uint32_t x316; uint8_t x317 = addcarryx_u32(x314, x243, x289, &x316); - uint32_t x319; uint8_t x320 = addcarryx_u32(x317, x246, x292, &x319); - uint32_t x322; uint8_t x323 = addcarryx_u32(x320, x249, x295, &x322); - uint32_t x325; uint8_t x326 = addcarryx_u32(x323, x252, x298, &x325); - uint32_t x328; uint8_t 
x329 = addcarryx_u32(x326, x254, x301, &x328); - uint32_t x332; uint32_t x331 = mulx_u32(x304, 0xffffffff, &x332); - uint32_t x335; uint32_t x334 = mulx_u32(x304, 0xffffffff, &x335); - uint32_t x338; uint32_t x337 = mulx_u32(x304, 0xffffffff, &x338); - uint32_t x341; uint32_t x340 = mulx_u32(x304, 0xffffffff, &x341); - uint32_t x343; uint8_t x344 = addcarryx_u32(0x0, x332, x334, &x343); - uint32_t x346; uint8_t x347 = addcarryx_u32(x344, x335, x337, &x346); - uint32_t x349; uint8_t x350 = addcarryx_u32(x347, x338, 0x0, &x349); - uint8_t x351 = (0x0 + 0x0); - uint32_t _3; uint8_t x354 = addcarryx_u32(0x0, x304, x331, &_3); - uint32_t x356; uint8_t x357 = addcarryx_u32(x354, x307, x343, &x356); - uint32_t x359; uint8_t x360 = addcarryx_u32(x357, x310, x346, &x359); - uint32_t x362; uint8_t x363 = addcarryx_u32(x360, x313, x349, &x362); - uint32_t x365; uint8_t x366 = addcarryx_u32(x363, x316, x350, &x365); - uint32_t x368; uint8_t x369 = addcarryx_u32(x366, x319, x351, &x368); - uint32_t x371; uint8_t x372 = addcarryx_u32(x369, x322, x304, &x371); - uint32_t x374; uint8_t x375 = addcarryx_u32(x372, x325, x340, &x374); - uint32_t x377; uint8_t x378 = addcarryx_u32(x375, x328, x341, &x377); - uint8_t x379 = (x378 + x329); - uint32_t x382; uint32_t x381 = mulx_u32(x11, x19, &x382); - uint32_t x385; uint32_t x384 = mulx_u32(x11, x21, &x385); - uint32_t x388; uint32_t x387 = mulx_u32(x11, x23, &x388); - uint32_t x391; uint32_t x390 = mulx_u32(x11, x25, &x391); - uint32_t x394; uint32_t x393 = mulx_u32(x11, x27, &x394); - uint32_t x397; uint32_t x396 = mulx_u32(x11, x29, &x397); - uint32_t x400; uint32_t x399 = mulx_u32(x11, x31, &x400); - uint32_t x403; uint32_t x402 = mulx_u32(x11, x30, &x403); - uint32_t x405; uint8_t x406 = addcarryx_u32(0x0, x382, x384, &x405); - uint32_t x408; uint8_t x409 = addcarryx_u32(x406, x385, x387, &x408); - uint32_t x411; uint8_t x412 = addcarryx_u32(x409, x388, x390, &x411); - uint32_t x414; uint8_t x415 = addcarryx_u32(x412, x391, x393, 
&x414); - uint32_t x417; uint8_t x418 = addcarryx_u32(x415, x394, x396, &x417); - uint32_t x420; uint8_t x421 = addcarryx_u32(x418, x397, x399, &x420); - uint32_t x423; uint8_t x424 = addcarryx_u32(x421, x400, x402, &x423); - uint32_t x426; addcarryx_u32(0x0, x424, x403, &x426); - uint32_t x429; uint8_t x430 = addcarryx_u32(0x0, x356, x381, &x429); - uint32_t x432; uint8_t x433 = addcarryx_u32(x430, x359, x405, &x432); - uint32_t x435; uint8_t x436 = addcarryx_u32(x433, x362, x408, &x435); - uint32_t x438; uint8_t x439 = addcarryx_u32(x436, x365, x411, &x438); - uint32_t x441; uint8_t x442 = addcarryx_u32(x439, x368, x414, &x441); - uint32_t x444; uint8_t x445 = addcarryx_u32(x442, x371, x417, &x444); - uint32_t x447; uint8_t x448 = addcarryx_u32(x445, x374, x420, &x447); - uint32_t x450; uint8_t x451 = addcarryx_u32(x448, x377, x423, &x450); - uint32_t x453; uint8_t x454 = addcarryx_u32(x451, x379, x426, &x453); - uint32_t x457; uint32_t x456 = mulx_u32(x429, 0xffffffff, &x457); - uint32_t x460; uint32_t x459 = mulx_u32(x429, 0xffffffff, &x460); - uint32_t x463; uint32_t x462 = mulx_u32(x429, 0xffffffff, &x463); - uint32_t x466; uint32_t x465 = mulx_u32(x429, 0xffffffff, &x466); - uint32_t x468; uint8_t x469 = addcarryx_u32(0x0, x457, x459, &x468); - uint32_t x471; uint8_t x472 = addcarryx_u32(x469, x460, x462, &x471); - uint32_t x474; uint8_t x475 = addcarryx_u32(x472, x463, 0x0, &x474); - uint8_t x476 = (0x0 + 0x0); - uint32_t _4; uint8_t x479 = addcarryx_u32(0x0, x429, x456, &_4); - uint32_t x481; uint8_t x482 = addcarryx_u32(x479, x432, x468, &x481); - uint32_t x484; uint8_t x485 = addcarryx_u32(x482, x435, x471, &x484); - uint32_t x487; uint8_t x488 = addcarryx_u32(x485, x438, x474, &x487); - uint32_t x490; uint8_t x491 = addcarryx_u32(x488, x441, x475, &x490); - uint32_t x493; uint8_t x494 = addcarryx_u32(x491, x444, x476, &x493); - uint32_t x496; uint8_t x497 = addcarryx_u32(x494, x447, x429, &x496); - uint32_t x499; uint8_t x500 = addcarryx_u32(x497, x450, 
x465, &x499); - uint32_t x502; uint8_t x503 = addcarryx_u32(x500, x453, x466, &x502); - uint8_t x504 = (x503 + x454); - uint32_t x507; uint32_t x506 = mulx_u32(x13, x19, &x507); - uint32_t x510; uint32_t x509 = mulx_u32(x13, x21, &x510); - uint32_t x513; uint32_t x512 = mulx_u32(x13, x23, &x513); - uint32_t x516; uint32_t x515 = mulx_u32(x13, x25, &x516); - uint32_t x519; uint32_t x518 = mulx_u32(x13, x27, &x519); - uint32_t x522; uint32_t x521 = mulx_u32(x13, x29, &x522); - uint32_t x525; uint32_t x524 = mulx_u32(x13, x31, &x525); - uint32_t x528; uint32_t x527 = mulx_u32(x13, x30, &x528); - uint32_t x530; uint8_t x531 = addcarryx_u32(0x0, x507, x509, &x530); - uint32_t x533; uint8_t x534 = addcarryx_u32(x531, x510, x512, &x533); - uint32_t x536; uint8_t x537 = addcarryx_u32(x534, x513, x515, &x536); - uint32_t x539; uint8_t x540 = addcarryx_u32(x537, x516, x518, &x539); - uint32_t x542; uint8_t x543 = addcarryx_u32(x540, x519, x521, &x542); - uint32_t x545; uint8_t x546 = addcarryx_u32(x543, x522, x524, &x545); - uint32_t x548; uint8_t x549 = addcarryx_u32(x546, x525, x527, &x548); - uint32_t x551; addcarryx_u32(0x0, x549, x528, &x551); - uint32_t x554; uint8_t x555 = addcarryx_u32(0x0, x481, x506, &x554); - uint32_t x557; uint8_t x558 = addcarryx_u32(x555, x484, x530, &x557); - uint32_t x560; uint8_t x561 = addcarryx_u32(x558, x487, x533, &x560); - uint32_t x563; uint8_t x564 = addcarryx_u32(x561, x490, x536, &x563); - uint32_t x566; uint8_t x567 = addcarryx_u32(x564, x493, x539, &x566); - uint32_t x569; uint8_t x570 = addcarryx_u32(x567, x496, x542, &x569); - uint32_t x572; uint8_t x573 = addcarryx_u32(x570, x499, x545, &x572); - uint32_t x575; uint8_t x576 = addcarryx_u32(x573, x502, x548, &x575); - uint32_t x578; uint8_t x579 = addcarryx_u32(x576, x504, x551, &x578); - uint32_t x582; uint32_t x581 = mulx_u32(x554, 0xffffffff, &x582); - uint32_t x585; uint32_t x584 = mulx_u32(x554, 0xffffffff, &x585); - uint32_t x588; uint32_t x587 = mulx_u32(x554, 0xffffffff, 
&x588); - uint32_t x591; uint32_t x590 = mulx_u32(x554, 0xffffffff, &x591); - uint32_t x593; uint8_t x594 = addcarryx_u32(0x0, x582, x584, &x593); - uint32_t x596; uint8_t x597 = addcarryx_u32(x594, x585, x587, &x596); - uint32_t x599; uint8_t x600 = addcarryx_u32(x597, x588, 0x0, &x599); - uint8_t x601 = (0x0 + 0x0); - uint32_t _5; uint8_t x604 = addcarryx_u32(0x0, x554, x581, &_5); - uint32_t x606; uint8_t x607 = addcarryx_u32(x604, x557, x593, &x606); - uint32_t x609; uint8_t x610 = addcarryx_u32(x607, x560, x596, &x609); - uint32_t x612; uint8_t x613 = addcarryx_u32(x610, x563, x599, &x612); - uint32_t x615; uint8_t x616 = addcarryx_u32(x613, x566, x600, &x615); - uint32_t x618; uint8_t x619 = addcarryx_u32(x616, x569, x601, &x618); - uint32_t x621; uint8_t x622 = addcarryx_u32(x619, x572, x554, &x621); - uint32_t x624; uint8_t x625 = addcarryx_u32(x622, x575, x590, &x624); - uint32_t x627; uint8_t x628 = addcarryx_u32(x625, x578, x591, &x627); - uint8_t x629 = (x628 + x579); - uint32_t x632; uint32_t x631 = mulx_u32(x15, x19, &x632); - uint32_t x635; uint32_t x634 = mulx_u32(x15, x21, &x635); - uint32_t x638; uint32_t x637 = mulx_u32(x15, x23, &x638); - uint32_t x641; uint32_t x640 = mulx_u32(x15, x25, &x641); - uint32_t x644; uint32_t x643 = mulx_u32(x15, x27, &x644); - uint32_t x647; uint32_t x646 = mulx_u32(x15, x29, &x647); - uint32_t x650; uint32_t x649 = mulx_u32(x15, x31, &x650); - uint32_t x653; uint32_t x652 = mulx_u32(x15, x30, &x653); - uint32_t x655; uint8_t x656 = addcarryx_u32(0x0, x632, x634, &x655); - uint32_t x658; uint8_t x659 = addcarryx_u32(x656, x635, x637, &x658); - uint32_t x661; uint8_t x662 = addcarryx_u32(x659, x638, x640, &x661); - uint32_t x664; uint8_t x665 = addcarryx_u32(x662, x641, x643, &x664); - uint32_t x667; uint8_t x668 = addcarryx_u32(x665, x644, x646, &x667); - uint32_t x670; uint8_t x671 = addcarryx_u32(x668, x647, x649, &x670); - uint32_t x673; uint8_t x674 = addcarryx_u32(x671, x650, x652, &x673); - uint32_t x676; 
addcarryx_u32(0x0, x674, x653, &x676); - uint32_t x679; uint8_t x680 = addcarryx_u32(0x0, x606, x631, &x679); - uint32_t x682; uint8_t x683 = addcarryx_u32(x680, x609, x655, &x682); - uint32_t x685; uint8_t x686 = addcarryx_u32(x683, x612, x658, &x685); - uint32_t x688; uint8_t x689 = addcarryx_u32(x686, x615, x661, &x688); - uint32_t x691; uint8_t x692 = addcarryx_u32(x689, x618, x664, &x691); - uint32_t x694; uint8_t x695 = addcarryx_u32(x692, x621, x667, &x694); - uint32_t x697; uint8_t x698 = addcarryx_u32(x695, x624, x670, &x697); - uint32_t x700; uint8_t x701 = addcarryx_u32(x698, x627, x673, &x700); - uint32_t x703; uint8_t x704 = addcarryx_u32(x701, x629, x676, &x703); - uint32_t x707; uint32_t x706 = mulx_u32(x679, 0xffffffff, &x707); - uint32_t x710; uint32_t x709 = mulx_u32(x679, 0xffffffff, &x710); - uint32_t x713; uint32_t x712 = mulx_u32(x679, 0xffffffff, &x713); - uint32_t x716; uint32_t x715 = mulx_u32(x679, 0xffffffff, &x716); - uint32_t x718; uint8_t x719 = addcarryx_u32(0x0, x707, x709, &x718); - uint32_t x721; uint8_t x722 = addcarryx_u32(x719, x710, x712, &x721); - uint32_t x724; uint8_t x725 = addcarryx_u32(x722, x713, 0x0, &x724); - uint8_t x726 = (0x0 + 0x0); - uint32_t _6; uint8_t x729 = addcarryx_u32(0x0, x679, x706, &_6); - uint32_t x731; uint8_t x732 = addcarryx_u32(x729, x682, x718, &x731); - uint32_t x734; uint8_t x735 = addcarryx_u32(x732, x685, x721, &x734); - uint32_t x737; uint8_t x738 = addcarryx_u32(x735, x688, x724, &x737); - uint32_t x740; uint8_t x741 = addcarryx_u32(x738, x691, x725, &x740); - uint32_t x743; uint8_t x744 = addcarryx_u32(x741, x694, x726, &x743); - uint32_t x746; uint8_t x747 = addcarryx_u32(x744, x697, x679, &x746); - uint32_t x749; uint8_t x750 = addcarryx_u32(x747, x700, x715, &x749); - uint32_t x752; uint8_t x753 = addcarryx_u32(x750, x703, x716, &x752); - uint8_t x754 = (x753 + x704); - uint32_t x757; uint32_t x756 = mulx_u32(x17, x19, &x757); - uint32_t x760; uint32_t x759 = mulx_u32(x17, x21, &x760); - 
uint32_t x763; uint32_t x762 = mulx_u32(x17, x23, &x763); - uint32_t x766; uint32_t x765 = mulx_u32(x17, x25, &x766); - uint32_t x769; uint32_t x768 = mulx_u32(x17, x27, &x769); - uint32_t x772; uint32_t x771 = mulx_u32(x17, x29, &x772); - uint32_t x775; uint32_t x774 = mulx_u32(x17, x31, &x775); - uint32_t x778; uint32_t x777 = mulx_u32(x17, x30, &x778); - uint32_t x780; uint8_t x781 = addcarryx_u32(0x0, x757, x759, &x780); - uint32_t x783; uint8_t x784 = addcarryx_u32(x781, x760, x762, &x783); - uint32_t x786; uint8_t x787 = addcarryx_u32(x784, x763, x765, &x786); - uint32_t x789; uint8_t x790 = addcarryx_u32(x787, x766, x768, &x789); - uint32_t x792; uint8_t x793 = addcarryx_u32(x790, x769, x771, &x792); - uint32_t x795; uint8_t x796 = addcarryx_u32(x793, x772, x774, &x795); - uint32_t x798; uint8_t x799 = addcarryx_u32(x796, x775, x777, &x798); - uint32_t x801; addcarryx_u32(0x0, x799, x778, &x801); - uint32_t x804; uint8_t x805 = addcarryx_u32(0x0, x731, x756, &x804); - uint32_t x807; uint8_t x808 = addcarryx_u32(x805, x734, x780, &x807); - uint32_t x810; uint8_t x811 = addcarryx_u32(x808, x737, x783, &x810); - uint32_t x813; uint8_t x814 = addcarryx_u32(x811, x740, x786, &x813); - uint32_t x816; uint8_t x817 = addcarryx_u32(x814, x743, x789, &x816); - uint32_t x819; uint8_t x820 = addcarryx_u32(x817, x746, x792, &x819); - uint32_t x822; uint8_t x823 = addcarryx_u32(x820, x749, x795, &x822); - uint32_t x825; uint8_t x826 = addcarryx_u32(x823, x752, x798, &x825); - uint32_t x828; uint8_t x829 = addcarryx_u32(x826, x754, x801, &x828); - uint32_t x832; uint32_t x831 = mulx_u32(x804, 0xffffffff, &x832); - uint32_t x835; uint32_t x834 = mulx_u32(x804, 0xffffffff, &x835); - uint32_t x838; uint32_t x837 = mulx_u32(x804, 0xffffffff, &x838); - uint32_t x841; uint32_t x840 = mulx_u32(x804, 0xffffffff, &x841); - uint32_t x843; uint8_t x844 = addcarryx_u32(0x0, x832, x834, &x843); - uint32_t x846; uint8_t x847 = addcarryx_u32(x844, x835, x837, &x846); - uint32_t x849; 
uint8_t x850 = addcarryx_u32(x847, x838, 0x0, &x849); - uint8_t x851 = (0x0 + 0x0); - uint32_t _7; uint8_t x854 = addcarryx_u32(0x0, x804, x831, &_7); - uint32_t x856; uint8_t x857 = addcarryx_u32(x854, x807, x843, &x856); - uint32_t x859; uint8_t x860 = addcarryx_u32(x857, x810, x846, &x859); - uint32_t x862; uint8_t x863 = addcarryx_u32(x860, x813, x849, &x862); - uint32_t x865; uint8_t x866 = addcarryx_u32(x863, x816, x850, &x865); - uint32_t x868; uint8_t x869 = addcarryx_u32(x866, x819, x851, &x868); - uint32_t x871; uint8_t x872 = addcarryx_u32(x869, x822, x804, &x871); - uint32_t x874; uint8_t x875 = addcarryx_u32(x872, x825, x840, &x874); - uint32_t x877; uint8_t x878 = addcarryx_u32(x875, x828, x841, &x877); - uint8_t x879 = (x878 + x829); - uint32_t x882; uint32_t x881 = mulx_u32(x16, x19, &x882); - uint32_t x885; uint32_t x884 = mulx_u32(x16, x21, &x885); - uint32_t x888; uint32_t x887 = mulx_u32(x16, x23, &x888); - uint32_t x891; uint32_t x890 = mulx_u32(x16, x25, &x891); - uint32_t x894; uint32_t x893 = mulx_u32(x16, x27, &x894); - uint32_t x897; uint32_t x896 = mulx_u32(x16, x29, &x897); - uint32_t x900; uint32_t x899 = mulx_u32(x16, x31, &x900); - uint32_t x903; uint32_t x902 = mulx_u32(x16, x30, &x903); - uint32_t x905; uint8_t x906 = addcarryx_u32(0x0, x882, x884, &x905); - uint32_t x908; uint8_t x909 = addcarryx_u32(x906, x885, x887, &x908); - uint32_t x911; uint8_t x912 = addcarryx_u32(x909, x888, x890, &x911); - uint32_t x914; uint8_t x915 = addcarryx_u32(x912, x891, x893, &x914); - uint32_t x917; uint8_t x918 = addcarryx_u32(x915, x894, x896, &x917); - uint32_t x920; uint8_t x921 = addcarryx_u32(x918, x897, x899, &x920); - uint32_t x923; uint8_t x924 = addcarryx_u32(x921, x900, x902, &x923); - uint32_t x926; addcarryx_u32(0x0, x924, x903, &x926); - uint32_t x929; uint8_t x930 = addcarryx_u32(0x0, x856, x881, &x929); - uint32_t x932; uint8_t x933 = addcarryx_u32(x930, x859, x905, &x932); - uint32_t x935; uint8_t x936 = addcarryx_u32(x933, x862, 
x908, &x935); - uint32_t x938; uint8_t x939 = addcarryx_u32(x936, x865, x911, &x938); - uint32_t x941; uint8_t x942 = addcarryx_u32(x939, x868, x914, &x941); - uint32_t x944; uint8_t x945 = addcarryx_u32(x942, x871, x917, &x944); - uint32_t x947; uint8_t x948 = addcarryx_u32(x945, x874, x920, &x947); - uint32_t x950; uint8_t x951 = addcarryx_u32(x948, x877, x923, &x950); - uint32_t x953; uint8_t x954 = addcarryx_u32(x951, x879, x926, &x953); - uint32_t x957; uint32_t x956 = mulx_u32(x929, 0xffffffff, &x957); - uint32_t x960; uint32_t x959 = mulx_u32(x929, 0xffffffff, &x960); - uint32_t x963; uint32_t x962 = mulx_u32(x929, 0xffffffff, &x963); - uint32_t x966; uint32_t x965 = mulx_u32(x929, 0xffffffff, &x966); - uint32_t x968; uint8_t x969 = addcarryx_u32(0x0, x957, x959, &x968); - uint32_t x971; uint8_t x972 = addcarryx_u32(x969, x960, x962, &x971); - uint32_t x974; uint8_t x975 = addcarryx_u32(x972, x963, 0x0, &x974); - uint8_t x976 = (0x0 + 0x0); - uint32_t _8; uint8_t x979 = addcarryx_u32(0x0, x929, x956, &_8); - uint32_t x981; uint8_t x982 = addcarryx_u32(x979, x932, x968, &x981); - uint32_t x984; uint8_t x985 = addcarryx_u32(x982, x935, x971, &x984); - uint32_t x987; uint8_t x988 = addcarryx_u32(x985, x938, x974, &x987); - uint32_t x990; uint8_t x991 = addcarryx_u32(x988, x941, x975, &x990); - uint32_t x993; uint8_t x994 = addcarryx_u32(x991, x944, x976, &x993); - uint32_t x996; uint8_t x997 = addcarryx_u32(x994, x947, x929, &x996); - uint32_t x999; uint8_t x1000 = addcarryx_u32(x997, x950, x965, &x999); - uint32_t x1002; uint8_t x1003 = addcarryx_u32(x1000, x953, x966, &x1002); - uint8_t x1004 = (x1003 + x954); - uint32_t x1006; uint8_t x1007 = subborrow_u32(0x0, x981, 0xffffffff, &x1006); - uint32_t x1009; uint8_t x1010 = subborrow_u32(x1007, x984, 0xffffffff, &x1009); - uint32_t x1012; uint8_t x1013 = subborrow_u32(x1010, x987, 0xffffffff, &x1012); - uint32_t x1015; uint8_t x1016 = subborrow_u32(x1013, x990, 0x0, &x1015); - uint32_t x1018; uint8_t x1019 = 
subborrow_u32(x1016, x993, 0x0, &x1018); - uint32_t x1021; uint8_t x1022 = subborrow_u32(x1019, x996, 0x0, &x1021); - uint32_t x1024; uint8_t x1025 = subborrow_u32(x1022, x999, 0x1, &x1024); - uint32_t x1027; uint8_t x1028 = subborrow_u32(x1025, x1002, 0xffffffff, &x1027); - uint32_t _9; uint8_t x1031 = subborrow_u32(x1028, x1004, 0x0, &_9); - uint32_t x1032 = cmovznz_u32(x1031, x1027, x1002); - uint32_t x1033 = cmovznz_u32(x1031, x1024, x999); - uint32_t x1034 = cmovznz_u32(x1031, x1021, x996); - uint32_t x1035 = cmovznz_u32(x1031, x1018, x993); - uint32_t x1036 = cmovznz_u32(x1031, x1015, x990); - uint32_t x1037 = cmovznz_u32(x1031, x1012, x987); - uint32_t x1038 = cmovznz_u32(x1031, x1009, x984); - uint32_t x1039 = cmovznz_u32(x1031, x1006, x981); - out[0] = x1039; - out[1] = x1038; - out[2] = x1037; - out[3] = x1036; - out[4] = x1035; - out[5] = x1034; - out[6] = x1033; - out[7] = x1032; -} - -// NOTE: the following functions are generated from fiat-crypto, from the same -// template as their 64-bit counterparts above, but the correctness proof of -// the template was not composed with the correctness proof of the -// specialization pipeline. This is because Coq unexplainedly loops on trying -// to synthesize opp and sub using the normal pipeline. 
- -static void fe_sub(uint32_t out[8], const uint32_t in1[8], const uint32_t in2[8]) { - const uint32_t x14 = in1[7]; - const uint32_t x15 = in1[6]; - const uint32_t x13 = in1[5]; - const uint32_t x11 = in1[4]; - const uint32_t x9 = in1[3]; - const uint32_t x7 = in1[2]; - const uint32_t x5 = in1[1]; - const uint32_t x3 = in1[0]; - const uint32_t x28 = in2[7]; - const uint32_t x29 = in2[6]; - const uint32_t x27 = in2[5]; - const uint32_t x25 = in2[4]; - const uint32_t x23 = in2[3]; - const uint32_t x21 = in2[2]; - const uint32_t x19 = in2[1]; - const uint32_t x17 = in2[0]; - uint32_t x31; uint8_t x32 = subborrow_u32(0x0, x3, x17, &x31); - uint32_t x34; uint8_t x35 = subborrow_u32(x32, x5, x19, &x34); - uint32_t x37; uint8_t x38 = subborrow_u32(x35, x7, x21, &x37); - uint32_t x40; uint8_t x41 = subborrow_u32(x38, x9, x23, &x40); - uint32_t x43; uint8_t x44 = subborrow_u32(x41, x11, x25, &x43); - uint32_t x46; uint8_t x47 = subborrow_u32(x44, x13, x27, &x46); - uint32_t x49; uint8_t x50 = subborrow_u32(x47, x15, x29, &x49); - uint32_t x52; uint8_t x53 = subborrow_u32(x50, x14, x28, &x52); - uint32_t x54 = cmovznz_u32(x53, 0x0, 0xffffffff); - uint32_t x56; uint8_t x57 = addcarryx_u32(0x0, x31, (x54 & 0xffffffff), &x56); - uint32_t x59; uint8_t x60 = addcarryx_u32(x57, x34, (x54 & 0xffffffff), &x59); - uint32_t x62; uint8_t x63 = addcarryx_u32(x60, x37, (x54 & 0xffffffff), &x62); - uint32_t x65; uint8_t x66 = addcarryx_u32(x63, x40, 0x0, &x65); - uint32_t x68; uint8_t x69 = addcarryx_u32(x66, x43, 0x0, &x68); - uint32_t x71; uint8_t x72 = addcarryx_u32(x69, x46, 0x0, &x71); - uint32_t x74; uint8_t x75 = addcarryx_u32(x72, x49, ((uint8_t)x54 & 0x1), &x74); - uint32_t x77; addcarryx_u32(x75, x52, (x54 & 0xffffffff), &x77); - out[0] = x56; - out[1] = x59; - out[2] = x62; - out[3] = x65; - out[4] = x68; - out[5] = x71; - out[6] = x74; - out[7] = x77; -} - -// fe_op sets out = -in -static void fe_opp(uint32_t out[8], const uint32_t in1[8]) { - const uint32_t x12 = in1[7]; - 
const uint32_t x13 = in1[6]; - const uint32_t x11 = in1[5]; - const uint32_t x9 = in1[4]; - const uint32_t x7 = in1[3]; - const uint32_t x5 = in1[2]; - const uint32_t x3 = in1[1]; - const uint32_t x1 = in1[0]; - uint32_t x15; uint8_t x16 = subborrow_u32(0x0, 0x0, x1, &x15); - uint32_t x18; uint8_t x19 = subborrow_u32(x16, 0x0, x3, &x18); - uint32_t x21; uint8_t x22 = subborrow_u32(x19, 0x0, x5, &x21); - uint32_t x24; uint8_t x25 = subborrow_u32(x22, 0x0, x7, &x24); - uint32_t x27; uint8_t x28 = subborrow_u32(x25, 0x0, x9, &x27); - uint32_t x30; uint8_t x31 = subborrow_u32(x28, 0x0, x11, &x30); - uint32_t x33; uint8_t x34 = subborrow_u32(x31, 0x0, x13, &x33); - uint32_t x36; uint8_t x37 = subborrow_u32(x34, 0x0, x12, &x36); - uint32_t x38 = cmovznz_u32(x37, 0x0, 0xffffffff); - uint32_t x40; uint8_t x41 = addcarryx_u32(0x0, x15, (x38 & 0xffffffff), &x40); - uint32_t x43; uint8_t x44 = addcarryx_u32(x41, x18, (x38 & 0xffffffff), &x43); - uint32_t x46; uint8_t x47 = addcarryx_u32(x44, x21, (x38 & 0xffffffff), &x46); - uint32_t x49; uint8_t x50 = addcarryx_u32(x47, x24, 0x0, &x49); - uint32_t x52; uint8_t x53 = addcarryx_u32(x50, x27, 0x0, &x52); - uint32_t x55; uint8_t x56 = addcarryx_u32(x53, x30, 0x0, &x55); - uint32_t x58; uint8_t x59 = addcarryx_u32(x56, x33, ((uint8_t)x38 & 0x1), &x58); - uint32_t x61; addcarryx_u32(x59, x36, (x38 & 0xffffffff), &x61); - out[0] = x40; - out[1] = x43; - out[2] = x46; - out[3] = x49; - out[4] = x52; - out[5] = x55; - out[6] = x58; - out[7] = x61; -} - -#endif // utility functions, handwritten @@ -840,22 +60,28 @@ #define NLIMBS 4 typedef uint64_t limb_t; -#define cmovznz_limb cmovznz_u64 typedef uint64_t fe[NLIMBS]; #else // 64BIT; else 32BIT #define NLIMBS 8 typedef uint32_t limb_t; -#define cmovznz_limb cmovznz_u32 typedef uint32_t fe[NLIMBS]; #endif // 64BIT +#define fe_add fiat_p256_add +#define fe_sub fiat_p256_sub +#define fe_opp fiat_p256_opp + +#define fe_mul fiat_p256_mul +#define fe_sqr fiat_p256_square + +#define 
fe_tobytes fiat_p256_to_bytes +#define fe_frombytes fiat_p256_from_bytes + static limb_t fe_nz(const limb_t in1[NLIMBS]) { - limb_t ret = 0; - for (int i = 0; i < NLIMBS; i++) { - ret |= in1[i]; - } + limb_t ret; + fiat_p256_nonzero(&ret, in1); return ret; } @@ -867,33 +93,11 @@ static void fe_cmovznz(limb_t out[NLIMBS], limb_t t, const limb_t z[NLIMBS], const limb_t nz[NLIMBS]) { - for (int i = 0; i < NLIMBS; i++) { - out[i] = cmovznz_limb(t, z[i], nz[i]); - } -} - -static void fe_sqr(fe out, const fe in) { - fe_mul(out, in, in); -} - -static void fe_tobytes(uint8_t out[NBYTES], const fe in) { - for (int i = 0; i<NBYTES; i++) { - out[i] = (uint8_t)(in[i/sizeof(in[0])] >> (8*(i%sizeof(in[0])))); - } -} - -static void fe_frombytes(fe out, const uint8_t in[NBYTES]) { - for (int i = 0; i<NLIMBS; i++) { - out[i] = 0; - } - for (int i = 0; i<NBYTES; i++) { - out[i/sizeof(out[0])] |= ((limb_t)in[i]) << (8*(i%sizeof(out[0]))); - } + fiat_p256_selectznz(out, !!t, z, nz); } static void fe_from_montgomery(fe x) { - static const limb_t kOne[NLIMBS] = {1, 0}; - fe_mul(x, x, kOne); + fiat_p256_from_montgomery(x, x); } static void fe_from_generic(fe out, const EC_FELEM *in) {
diff --git a/third_party/fiat/p256_32.c b/third_party/fiat/p256_32.c
new file mode 100644
index 0000000..faaa0b0
--- /dev/null
+++ b/third_party/fiat/p256_32.c
@@ -0,0 +1,3220 @@
+/* Autogenerated */
+/* curve description: p256 */
+/* requested operations: (all) */
+/* m = 0xffffffff00000001000000000000000000000000ffffffffffffffffffffffff (from "2^256 - 2^224 + 2^192 + 2^96 - 1") */
+/* machine_wordsize = 32 (from "32") */
+/* */
+/* NOTE: In addition to the bounds specified above each function, all */
+/* functions synthesized for this Montgomery arithmetic require the */
+/* input to be strictly less than the prime modulus (m), and also */
+/* require the input to be in the unique saturated representation. */
+/* All functions also ensure that these two properties are true of */
+/* return values. */
+
+#include <stdint.h>
+typedef unsigned char fiat_p256_uint1;
+typedef signed char fiat_p256_int1;
+
+
+/*
+ * Input Bounds:
+ *   arg1: [0x0 ~> 0x1]
+ *   arg2: [0x0 ~> 0xffffffff]
+ *   arg3: [0x0 ~> 0xffffffff]
+ * Output Bounds:
+ *   out1: [0x0 ~> 0xffffffff]
+ *   out2: [0x0 ~> 0x1]
+ */
+static void fiat_p256_addcarryx_u32(uint32_t* out1, fiat_p256_uint1* out2, fiat_p256_uint1 arg1, uint32_t arg2, uint32_t arg3) {
+  uint64_t x1 = ((arg1 + (uint64_t)arg2) + arg3);
+  uint32_t x2 = (uint32_t)(x1 & UINT32_C(0xffffffff));
+  fiat_p256_uint1 x3 = (fiat_p256_uint1)(x1 >> 32);
+  *out1 = x2;
+  *out2 = x3;
+}
+
+/*
+ * Input Bounds:
+ *   arg1: [0x0 ~> 0x1]
+ *   arg2: [0x0 ~> 0xffffffff]
+ *   arg3: [0x0 ~> 0xffffffff]
+ * Output Bounds:
+ *   out1: [0x0 ~> 0xffffffff]
+ *   out2: [0x0 ~> 0x1]
+ */
+static void fiat_p256_subborrowx_u32(uint32_t* out1, fiat_p256_uint1* out2, fiat_p256_uint1 arg1, uint32_t arg2, uint32_t arg3) {
+  int64_t x1 = ((arg2 - (int64_t)arg1) - arg3);
+  fiat_p256_int1 x2 = (fiat_p256_int1)(x1 >> 32);
+  uint32_t x3 = (uint32_t)(x1 & UINT32_C(0xffffffff));
+  *out1 = x3;
+  *out2 = (fiat_p256_uint1)(0x0 - x2);
+}
+
+/*
+ * Input Bounds:
+ *   arg1: [0x0 ~> 0xffffffff]
+ *   arg2: [0x0 ~> 0xffffffff]
+ * Output Bounds:
+ *   out1: [0x0 ~> 0xffffffff]
+ *   out2: [0x0 ~> 0xffffffff]
+ */
+static void fiat_p256_mulx_u32(uint32_t* out1,
uint32_t* out2, uint32_t arg1, uint32_t arg2) { + uint64_t x1 = ((uint64_t)arg1 * arg2); + uint32_t x2 = (uint32_t)(x1 & UINT32_C(0xffffffff)); + uint32_t x3 = (uint32_t)(x1 >> 32); + *out1 = x2; + *out2 = x3; +} + +/* + * Input Bounds: + * arg1: [0x0 ~> 0x1] + * arg2: [0x0 ~> 0xffffffff] + * arg3: [0x0 ~> 0xffffffff] + * Output Bounds: + * out1: [0x0 ~> 0xffffffff] + */ +static void fiat_p256_cmovznz_u32(uint32_t* out1, fiat_p256_uint1 arg1, uint32_t arg2, uint32_t arg3) { + fiat_p256_uint1 x1 = (!(!arg1)); + uint32_t x2 = ((fiat_p256_int1)(0x0 - x1) & UINT32_C(0xffffffff)); + uint32_t x3 = ((x2 & arg3) | ((~x2) & arg2)); + *out1 = x3; +} + +/* + * Input Bounds: + * arg1: [[0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff]] + * arg2: [[0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff]] + * Output Bounds: + * out1: [[0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff]] + */ +static void fiat_p256_mul(uint32_t out1[8], const uint32_t arg1[8], const uint32_t arg2[8]) { + uint32_t x1 = (arg1[1]); + uint32_t x2 = (arg1[2]); + uint32_t x3 = (arg1[3]); + uint32_t x4 = (arg1[4]); + uint32_t x5 = (arg1[5]); + uint32_t x6 = (arg1[6]); + uint32_t x7 = (arg1[7]); + uint32_t x8 = (arg1[0]); + uint32_t x9; + uint32_t x10; + fiat_p256_mulx_u32(&x9, &x10, x8, (arg2[7])); + uint32_t x11; + uint32_t x12; + fiat_p256_mulx_u32(&x11, &x12, x8, (arg2[6])); + uint32_t x13; + uint32_t x14; + fiat_p256_mulx_u32(&x13, &x14, x8, (arg2[5])); + uint32_t x15; + uint32_t x16; + fiat_p256_mulx_u32(&x15, &x16, x8, (arg2[4])); + uint32_t x17; + uint32_t x18; + fiat_p256_mulx_u32(&x17, &x18, x8, (arg2[3])); + uint32_t x19; + uint32_t x20; + 
fiat_p256_mulx_u32(&x19, &x20, x8, (arg2[2])); + uint32_t x21; + uint32_t x22; + fiat_p256_mulx_u32(&x21, &x22, x8, (arg2[1])); + uint32_t x23; + uint32_t x24; + fiat_p256_mulx_u32(&x23, &x24, x8, (arg2[0])); + uint32_t x25; + fiat_p256_uint1 x26; + fiat_p256_addcarryx_u32(&x25, &x26, 0x0, x21, x24); + uint32_t x27; + fiat_p256_uint1 x28; + fiat_p256_addcarryx_u32(&x27, &x28, x26, x19, x22); + uint32_t x29; + fiat_p256_uint1 x30; + fiat_p256_addcarryx_u32(&x29, &x30, x28, x17, x20); + uint32_t x31; + fiat_p256_uint1 x32; + fiat_p256_addcarryx_u32(&x31, &x32, x30, x15, x18); + uint32_t x33; + fiat_p256_uint1 x34; + fiat_p256_addcarryx_u32(&x33, &x34, x32, x13, x16); + uint32_t x35; + fiat_p256_uint1 x36; + fiat_p256_addcarryx_u32(&x35, &x36, x34, x11, x14); + uint32_t x37; + fiat_p256_uint1 x38; + fiat_p256_addcarryx_u32(&x37, &x38, x36, x9, x12); + uint32_t x39; + fiat_p256_uint1 x40; + fiat_p256_addcarryx_u32(&x39, &x40, x38, 0x0, x10); + uint32_t x41; + uint32_t x42; + fiat_p256_mulx_u32(&x41, &x42, x23, UINT32_C(0xffffffff)); + uint32_t x43; + uint32_t x44; + fiat_p256_mulx_u32(&x43, &x44, x23, UINT32_C(0xffffffff)); + uint32_t x45; + uint32_t x46; + fiat_p256_mulx_u32(&x45, &x46, x23, UINT32_C(0xffffffff)); + uint32_t x47; + uint32_t x48; + fiat_p256_mulx_u32(&x47, &x48, x23, UINT32_C(0xffffffff)); + uint32_t x49; + fiat_p256_uint1 x50; + fiat_p256_addcarryx_u32(&x49, &x50, 0x0, x45, x48); + uint32_t x51; + fiat_p256_uint1 x52; + fiat_p256_addcarryx_u32(&x51, &x52, x50, x43, x46); + uint32_t x53; + fiat_p256_uint1 x54; + fiat_p256_addcarryx_u32(&x53, &x54, x52, 0x0, x44); + uint32_t x55; + fiat_p256_uint1 x56; + fiat_p256_addcarryx_u32(&x55, &x56, 0x0, x47, x23); + uint32_t x57; + fiat_p256_uint1 x58; + fiat_p256_addcarryx_u32(&x57, &x58, x56, x49, x25); + uint32_t x59; + fiat_p256_uint1 x60; + fiat_p256_addcarryx_u32(&x59, &x60, x58, x51, x27); + uint32_t x61; + fiat_p256_uint1 x62; + fiat_p256_addcarryx_u32(&x61, &x62, x60, x53, x29); + uint32_t x63; + 
fiat_p256_uint1 x64; + fiat_p256_addcarryx_u32(&x63, &x64, x62, 0x0, x31); + uint32_t x65; + fiat_p256_uint1 x66; + fiat_p256_addcarryx_u32(&x65, &x66, x64, 0x0, x33); + uint32_t x67; + fiat_p256_uint1 x68; + fiat_p256_addcarryx_u32(&x67, &x68, x66, x23, x35); + uint32_t x69; + fiat_p256_uint1 x70; + fiat_p256_addcarryx_u32(&x69, &x70, x68, x41, x37); + uint32_t x71; + fiat_p256_uint1 x72; + fiat_p256_addcarryx_u32(&x71, &x72, x70, x42, x39); + uint32_t x73; + fiat_p256_uint1 x74; + fiat_p256_addcarryx_u32(&x73, &x74, x72, 0x0, 0x0); + uint32_t x75; + uint32_t x76; + fiat_p256_mulx_u32(&x75, &x76, x1, (arg2[7])); + uint32_t x77; + uint32_t x78; + fiat_p256_mulx_u32(&x77, &x78, x1, (arg2[6])); + uint32_t x79; + uint32_t x80; + fiat_p256_mulx_u32(&x79, &x80, x1, (arg2[5])); + uint32_t x81; + uint32_t x82; + fiat_p256_mulx_u32(&x81, &x82, x1, (arg2[4])); + uint32_t x83; + uint32_t x84; + fiat_p256_mulx_u32(&x83, &x84, x1, (arg2[3])); + uint32_t x85; + uint32_t x86; + fiat_p256_mulx_u32(&x85, &x86, x1, (arg2[2])); + uint32_t x87; + uint32_t x88; + fiat_p256_mulx_u32(&x87, &x88, x1, (arg2[1])); + uint32_t x89; + uint32_t x90; + fiat_p256_mulx_u32(&x89, &x90, x1, (arg2[0])); + uint32_t x91; + fiat_p256_uint1 x92; + fiat_p256_addcarryx_u32(&x91, &x92, 0x0, x87, x90); + uint32_t x93; + fiat_p256_uint1 x94; + fiat_p256_addcarryx_u32(&x93, &x94, x92, x85, x88); + uint32_t x95; + fiat_p256_uint1 x96; + fiat_p256_addcarryx_u32(&x95, &x96, x94, x83, x86); + uint32_t x97; + fiat_p256_uint1 x98; + fiat_p256_addcarryx_u32(&x97, &x98, x96, x81, x84); + uint32_t x99; + fiat_p256_uint1 x100; + fiat_p256_addcarryx_u32(&x99, &x100, x98, x79, x82); + uint32_t x101; + fiat_p256_uint1 x102; + fiat_p256_addcarryx_u32(&x101, &x102, x100, x77, x80); + uint32_t x103; + fiat_p256_uint1 x104; + fiat_p256_addcarryx_u32(&x103, &x104, x102, x75, x78); + uint32_t x105; + fiat_p256_uint1 x106; + fiat_p256_addcarryx_u32(&x105, &x106, x104, 0x0, x76); + uint32_t x107; + fiat_p256_uint1 x108; + 
fiat_p256_addcarryx_u32(&x107, &x108, 0x0, x89, x57); + uint32_t x109; + fiat_p256_uint1 x110; + fiat_p256_addcarryx_u32(&x109, &x110, x108, x91, x59); + uint32_t x111; + fiat_p256_uint1 x112; + fiat_p256_addcarryx_u32(&x111, &x112, x110, x93, x61); + uint32_t x113; + fiat_p256_uint1 x114; + fiat_p256_addcarryx_u32(&x113, &x114, x112, x95, x63); + uint32_t x115; + fiat_p256_uint1 x116; + fiat_p256_addcarryx_u32(&x115, &x116, x114, x97, x65); + uint32_t x117; + fiat_p256_uint1 x118; + fiat_p256_addcarryx_u32(&x117, &x118, x116, x99, x67); + uint32_t x119; + fiat_p256_uint1 x120; + fiat_p256_addcarryx_u32(&x119, &x120, x118, x101, x69); + uint32_t x121; + fiat_p256_uint1 x122; + fiat_p256_addcarryx_u32(&x121, &x122, x120, x103, x71); + uint32_t x123; + fiat_p256_uint1 x124; + fiat_p256_addcarryx_u32(&x123, &x124, x122, x105, (fiat_p256_uint1)x73); + uint32_t x125; + uint32_t x126; + fiat_p256_mulx_u32(&x125, &x126, x107, UINT32_C(0xffffffff)); + uint32_t x127; + uint32_t x128; + fiat_p256_mulx_u32(&x127, &x128, x107, UINT32_C(0xffffffff)); + uint32_t x129; + uint32_t x130; + fiat_p256_mulx_u32(&x129, &x130, x107, UINT32_C(0xffffffff)); + uint32_t x131; + uint32_t x132; + fiat_p256_mulx_u32(&x131, &x132, x107, UINT32_C(0xffffffff)); + uint32_t x133; + fiat_p256_uint1 x134; + fiat_p256_addcarryx_u32(&x133, &x134, 0x0, x129, x132); + uint32_t x135; + fiat_p256_uint1 x136; + fiat_p256_addcarryx_u32(&x135, &x136, x134, x127, x130); + uint32_t x137; + fiat_p256_uint1 x138; + fiat_p256_addcarryx_u32(&x137, &x138, x136, 0x0, x128); + uint32_t x139; + fiat_p256_uint1 x140; + fiat_p256_addcarryx_u32(&x139, &x140, 0x0, x131, x107); + uint32_t x141; + fiat_p256_uint1 x142; + fiat_p256_addcarryx_u32(&x141, &x142, x140, x133, x109); + uint32_t x143; + fiat_p256_uint1 x144; + fiat_p256_addcarryx_u32(&x143, &x144, x142, x135, x111); + uint32_t x145; + fiat_p256_uint1 x146; + fiat_p256_addcarryx_u32(&x145, &x146, x144, x137, x113); + uint32_t x147; + fiat_p256_uint1 x148; + 
fiat_p256_addcarryx_u32(&x147, &x148, x146, 0x0, x115); + uint32_t x149; + fiat_p256_uint1 x150; + fiat_p256_addcarryx_u32(&x149, &x150, x148, 0x0, x117); + uint32_t x151; + fiat_p256_uint1 x152; + fiat_p256_addcarryx_u32(&x151, &x152, x150, x107, x119); + uint32_t x153; + fiat_p256_uint1 x154; + fiat_p256_addcarryx_u32(&x153, &x154, x152, x125, x121); + uint32_t x155; + fiat_p256_uint1 x156; + fiat_p256_addcarryx_u32(&x155, &x156, x154, x126, x123); + uint32_t x157; + fiat_p256_uint1 x158; + fiat_p256_addcarryx_u32(&x157, &x158, x156, 0x0, x124); + uint32_t x159; + uint32_t x160; + fiat_p256_mulx_u32(&x159, &x160, x2, (arg2[7])); + uint32_t x161; + uint32_t x162; + fiat_p256_mulx_u32(&x161, &x162, x2, (arg2[6])); + uint32_t x163; + uint32_t x164; + fiat_p256_mulx_u32(&x163, &x164, x2, (arg2[5])); + uint32_t x165; + uint32_t x166; + fiat_p256_mulx_u32(&x165, &x166, x2, (arg2[4])); + uint32_t x167; + uint32_t x168; + fiat_p256_mulx_u32(&x167, &x168, x2, (arg2[3])); + uint32_t x169; + uint32_t x170; + fiat_p256_mulx_u32(&x169, &x170, x2, (arg2[2])); + uint32_t x171; + uint32_t x172; + fiat_p256_mulx_u32(&x171, &x172, x2, (arg2[1])); + uint32_t x173; + uint32_t x174; + fiat_p256_mulx_u32(&x173, &x174, x2, (arg2[0])); + uint32_t x175; + fiat_p256_uint1 x176; + fiat_p256_addcarryx_u32(&x175, &x176, 0x0, x171, x174); + uint32_t x177; + fiat_p256_uint1 x178; + fiat_p256_addcarryx_u32(&x177, &x178, x176, x169, x172); + uint32_t x179; + fiat_p256_uint1 x180; + fiat_p256_addcarryx_u32(&x179, &x180, x178, x167, x170); + uint32_t x181; + fiat_p256_uint1 x182; + fiat_p256_addcarryx_u32(&x181, &x182, x180, x165, x168); + uint32_t x183; + fiat_p256_uint1 x184; + fiat_p256_addcarryx_u32(&x183, &x184, x182, x163, x166); + uint32_t x185; + fiat_p256_uint1 x186; + fiat_p256_addcarryx_u32(&x185, &x186, x184, x161, x164); + uint32_t x187; + fiat_p256_uint1 x188; + fiat_p256_addcarryx_u32(&x187, &x188, x186, x159, x162); + uint32_t x189; + fiat_p256_uint1 x190; + 
fiat_p256_addcarryx_u32(&x189, &x190, x188, 0x0, x160); + uint32_t x191; + fiat_p256_uint1 x192; + fiat_p256_addcarryx_u32(&x191, &x192, 0x0, x173, x141); + uint32_t x193; + fiat_p256_uint1 x194; + fiat_p256_addcarryx_u32(&x193, &x194, x192, x175, x143); + uint32_t x195; + fiat_p256_uint1 x196; + fiat_p256_addcarryx_u32(&x195, &x196, x194, x177, x145); + uint32_t x197; + fiat_p256_uint1 x198; + fiat_p256_addcarryx_u32(&x197, &x198, x196, x179, x147); + uint32_t x199; + fiat_p256_uint1 x200; + fiat_p256_addcarryx_u32(&x199, &x200, x198, x181, x149); + uint32_t x201; + fiat_p256_uint1 x202; + fiat_p256_addcarryx_u32(&x201, &x202, x200, x183, x151); + uint32_t x203; + fiat_p256_uint1 x204; + fiat_p256_addcarryx_u32(&x203, &x204, x202, x185, x153); + uint32_t x205; + fiat_p256_uint1 x206; + fiat_p256_addcarryx_u32(&x205, &x206, x204, x187, x155); + uint32_t x207; + fiat_p256_uint1 x208; + fiat_p256_addcarryx_u32(&x207, &x208, x206, x189, x157); + uint32_t x209; + uint32_t x210; + fiat_p256_mulx_u32(&x209, &x210, x191, UINT32_C(0xffffffff)); + uint32_t x211; + uint32_t x212; + fiat_p256_mulx_u32(&x211, &x212, x191, UINT32_C(0xffffffff)); + uint32_t x213; + uint32_t x214; + fiat_p256_mulx_u32(&x213, &x214, x191, UINT32_C(0xffffffff)); + uint32_t x215; + uint32_t x216; + fiat_p256_mulx_u32(&x215, &x216, x191, UINT32_C(0xffffffff)); + uint32_t x217; + fiat_p256_uint1 x218; + fiat_p256_addcarryx_u32(&x217, &x218, 0x0, x213, x216); + uint32_t x219; + fiat_p256_uint1 x220; + fiat_p256_addcarryx_u32(&x219, &x220, x218, x211, x214); + uint32_t x221; + fiat_p256_uint1 x222; + fiat_p256_addcarryx_u32(&x221, &x222, x220, 0x0, x212); + uint32_t x223; + fiat_p256_uint1 x224; + fiat_p256_addcarryx_u32(&x223, &x224, 0x0, x215, x191); + uint32_t x225; + fiat_p256_uint1 x226; + fiat_p256_addcarryx_u32(&x225, &x226, x224, x217, x193); + uint32_t x227; + fiat_p256_uint1 x228; + fiat_p256_addcarryx_u32(&x227, &x228, x226, x219, x195); + uint32_t x229; + fiat_p256_uint1 x230; + 
fiat_p256_addcarryx_u32(&x229, &x230, x228, x221, x197); + uint32_t x231; + fiat_p256_uint1 x232; + fiat_p256_addcarryx_u32(&x231, &x232, x230, 0x0, x199); + uint32_t x233; + fiat_p256_uint1 x234; + fiat_p256_addcarryx_u32(&x233, &x234, x232, 0x0, x201); + uint32_t x235; + fiat_p256_uint1 x236; + fiat_p256_addcarryx_u32(&x235, &x236, x234, x191, x203); + uint32_t x237; + fiat_p256_uint1 x238; + fiat_p256_addcarryx_u32(&x237, &x238, x236, x209, x205); + uint32_t x239; + fiat_p256_uint1 x240; + fiat_p256_addcarryx_u32(&x239, &x240, x238, x210, x207); + uint32_t x241; + fiat_p256_uint1 x242; + fiat_p256_addcarryx_u32(&x241, &x242, x240, 0x0, x208); + uint32_t x243; + uint32_t x244; + fiat_p256_mulx_u32(&x243, &x244, x3, (arg2[7])); + uint32_t x245; + uint32_t x246; + fiat_p256_mulx_u32(&x245, &x246, x3, (arg2[6])); + uint32_t x247; + uint32_t x248; + fiat_p256_mulx_u32(&x247, &x248, x3, (arg2[5])); + uint32_t x249; + uint32_t x250; + fiat_p256_mulx_u32(&x249, &x250, x3, (arg2[4])); + uint32_t x251; + uint32_t x252; + fiat_p256_mulx_u32(&x251, &x252, x3, (arg2[3])); + uint32_t x253; + uint32_t x254; + fiat_p256_mulx_u32(&x253, &x254, x3, (arg2[2])); + uint32_t x255; + uint32_t x256; + fiat_p256_mulx_u32(&x255, &x256, x3, (arg2[1])); + uint32_t x257; + uint32_t x258; + fiat_p256_mulx_u32(&x257, &x258, x3, (arg2[0])); + uint32_t x259; + fiat_p256_uint1 x260; + fiat_p256_addcarryx_u32(&x259, &x260, 0x0, x255, x258); + uint32_t x261; + fiat_p256_uint1 x262; + fiat_p256_addcarryx_u32(&x261, &x262, x260, x253, x256); + uint32_t x263; + fiat_p256_uint1 x264; + fiat_p256_addcarryx_u32(&x263, &x264, x262, x251, x254); + uint32_t x265; + fiat_p256_uint1 x266; + fiat_p256_addcarryx_u32(&x265, &x266, x264, x249, x252); + uint32_t x267; + fiat_p256_uint1 x268; + fiat_p256_addcarryx_u32(&x267, &x268, x266, x247, x250); + uint32_t x269; + fiat_p256_uint1 x270; + fiat_p256_addcarryx_u32(&x269, &x270, x268, x245, x248); + uint32_t x271; + fiat_p256_uint1 x272; + 
fiat_p256_addcarryx_u32(&x271, &x272, x270, x243, x246); + uint32_t x273; + fiat_p256_uint1 x274; + fiat_p256_addcarryx_u32(&x273, &x274, x272, 0x0, x244); + uint32_t x275; + fiat_p256_uint1 x276; + fiat_p256_addcarryx_u32(&x275, &x276, 0x0, x257, x225); + uint32_t x277; + fiat_p256_uint1 x278; + fiat_p256_addcarryx_u32(&x277, &x278, x276, x259, x227); + uint32_t x279; + fiat_p256_uint1 x280; + fiat_p256_addcarryx_u32(&x279, &x280, x278, x261, x229); + uint32_t x281; + fiat_p256_uint1 x282; + fiat_p256_addcarryx_u32(&x281, &x282, x280, x263, x231); + uint32_t x283; + fiat_p256_uint1 x284; + fiat_p256_addcarryx_u32(&x283, &x284, x282, x265, x233); + uint32_t x285; + fiat_p256_uint1 x286; + fiat_p256_addcarryx_u32(&x285, &x286, x284, x267, x235); + uint32_t x287; + fiat_p256_uint1 x288; + fiat_p256_addcarryx_u32(&x287, &x288, x286, x269, x237); + uint32_t x289; + fiat_p256_uint1 x290; + fiat_p256_addcarryx_u32(&x289, &x290, x288, x271, x239); + uint32_t x291; + fiat_p256_uint1 x292; + fiat_p256_addcarryx_u32(&x291, &x292, x290, x273, x241); + uint32_t x293; + uint32_t x294; + fiat_p256_mulx_u32(&x293, &x294, x275, UINT32_C(0xffffffff)); + uint32_t x295; + uint32_t x296; + fiat_p256_mulx_u32(&x295, &x296, x275, UINT32_C(0xffffffff)); + uint32_t x297; + uint32_t x298; + fiat_p256_mulx_u32(&x297, &x298, x275, UINT32_C(0xffffffff)); + uint32_t x299; + uint32_t x300; + fiat_p256_mulx_u32(&x299, &x300, x275, UINT32_C(0xffffffff)); + uint32_t x301; + fiat_p256_uint1 x302; + fiat_p256_addcarryx_u32(&x301, &x302, 0x0, x297, x300); + uint32_t x303; + fiat_p256_uint1 x304; + fiat_p256_addcarryx_u32(&x303, &x304, x302, x295, x298); + uint32_t x305; + fiat_p256_uint1 x306; + fiat_p256_addcarryx_u32(&x305, &x306, x304, 0x0, x296); + uint32_t x307; + fiat_p256_uint1 x308; + fiat_p256_addcarryx_u32(&x307, &x308, 0x0, x299, x275); + uint32_t x309; + fiat_p256_uint1 x310; + fiat_p256_addcarryx_u32(&x309, &x310, x308, x301, x277); + uint32_t x311; + fiat_p256_uint1 x312; + 
fiat_p256_addcarryx_u32(&x311, &x312, x310, x303, x279); + uint32_t x313; + fiat_p256_uint1 x314; + fiat_p256_addcarryx_u32(&x313, &x314, x312, x305, x281); + uint32_t x315; + fiat_p256_uint1 x316; + fiat_p256_addcarryx_u32(&x315, &x316, x314, 0x0, x283); + uint32_t x317; + fiat_p256_uint1 x318; + fiat_p256_addcarryx_u32(&x317, &x318, x316, 0x0, x285); + uint32_t x319; + fiat_p256_uint1 x320; + fiat_p256_addcarryx_u32(&x319, &x320, x318, x275, x287); + uint32_t x321; + fiat_p256_uint1 x322; + fiat_p256_addcarryx_u32(&x321, &x322, x320, x293, x289); + uint32_t x323; + fiat_p256_uint1 x324; + fiat_p256_addcarryx_u32(&x323, &x324, x322, x294, x291); + uint32_t x325; + fiat_p256_uint1 x326; + fiat_p256_addcarryx_u32(&x325, &x326, x324, 0x0, x292); + uint32_t x327; + uint32_t x328; + fiat_p256_mulx_u32(&x327, &x328, x4, (arg2[7])); + uint32_t x329; + uint32_t x330; + fiat_p256_mulx_u32(&x329, &x330, x4, (arg2[6])); + uint32_t x331; + uint32_t x332; + fiat_p256_mulx_u32(&x331, &x332, x4, (arg2[5])); + uint32_t x333; + uint32_t x334; + fiat_p256_mulx_u32(&x333, &x334, x4, (arg2[4])); + uint32_t x335; + uint32_t x336; + fiat_p256_mulx_u32(&x335, &x336, x4, (arg2[3])); + uint32_t x337; + uint32_t x338; + fiat_p256_mulx_u32(&x337, &x338, x4, (arg2[2])); + uint32_t x339; + uint32_t x340; + fiat_p256_mulx_u32(&x339, &x340, x4, (arg2[1])); + uint32_t x341; + uint32_t x342; + fiat_p256_mulx_u32(&x341, &x342, x4, (arg2[0])); + uint32_t x343; + fiat_p256_uint1 x344; + fiat_p256_addcarryx_u32(&x343, &x344, 0x0, x339, x342); + uint32_t x345; + fiat_p256_uint1 x346; + fiat_p256_addcarryx_u32(&x345, &x346, x344, x337, x340); + uint32_t x347; + fiat_p256_uint1 x348; + fiat_p256_addcarryx_u32(&x347, &x348, x346, x335, x338); + uint32_t x349; + fiat_p256_uint1 x350; + fiat_p256_addcarryx_u32(&x349, &x350, x348, x333, x336); + uint32_t x351; + fiat_p256_uint1 x352; + fiat_p256_addcarryx_u32(&x351, &x352, x350, x331, x334); + uint32_t x353; + fiat_p256_uint1 x354; + 
fiat_p256_addcarryx_u32(&x353, &x354, x352, x329, x332); + uint32_t x355; + fiat_p256_uint1 x356; + fiat_p256_addcarryx_u32(&x355, &x356, x354, x327, x330); + uint32_t x357; + fiat_p256_uint1 x358; + fiat_p256_addcarryx_u32(&x357, &x358, x356, 0x0, x328); + uint32_t x359; + fiat_p256_uint1 x360; + fiat_p256_addcarryx_u32(&x359, &x360, 0x0, x341, x309); + uint32_t x361; + fiat_p256_uint1 x362; + fiat_p256_addcarryx_u32(&x361, &x362, x360, x343, x311); + uint32_t x363; + fiat_p256_uint1 x364; + fiat_p256_addcarryx_u32(&x363, &x364, x362, x345, x313); + uint32_t x365; + fiat_p256_uint1 x366; + fiat_p256_addcarryx_u32(&x365, &x366, x364, x347, x315); + uint32_t x367; + fiat_p256_uint1 x368; + fiat_p256_addcarryx_u32(&x367, &x368, x366, x349, x317); + uint32_t x369; + fiat_p256_uint1 x370; + fiat_p256_addcarryx_u32(&x369, &x370, x368, x351, x319); + uint32_t x371; + fiat_p256_uint1 x372; + fiat_p256_addcarryx_u32(&x371, &x372, x370, x353, x321); + uint32_t x373; + fiat_p256_uint1 x374; + fiat_p256_addcarryx_u32(&x373, &x374, x372, x355, x323); + uint32_t x375; + fiat_p256_uint1 x376; + fiat_p256_addcarryx_u32(&x375, &x376, x374, x357, x325); + uint32_t x377; + uint32_t x378; + fiat_p256_mulx_u32(&x377, &x378, x359, UINT32_C(0xffffffff)); + uint32_t x379; + uint32_t x380; + fiat_p256_mulx_u32(&x379, &x380, x359, UINT32_C(0xffffffff)); + uint32_t x381; + uint32_t x382; + fiat_p256_mulx_u32(&x381, &x382, x359, UINT32_C(0xffffffff)); + uint32_t x383; + uint32_t x384; + fiat_p256_mulx_u32(&x383, &x384, x359, UINT32_C(0xffffffff)); + uint32_t x385; + fiat_p256_uint1 x386; + fiat_p256_addcarryx_u32(&x385, &x386, 0x0, x381, x384); + uint32_t x387; + fiat_p256_uint1 x388; + fiat_p256_addcarryx_u32(&x387, &x388, x386, x379, x382); + uint32_t x389; + fiat_p256_uint1 x390; + fiat_p256_addcarryx_u32(&x389, &x390, x388, 0x0, x380); + uint32_t x391; + fiat_p256_uint1 x392; + fiat_p256_addcarryx_u32(&x391, &x392, 0x0, x383, x359); + uint32_t x393; + fiat_p256_uint1 x394; + 
fiat_p256_addcarryx_u32(&x393, &x394, x392, x385, x361); + uint32_t x395; + fiat_p256_uint1 x396; + fiat_p256_addcarryx_u32(&x395, &x396, x394, x387, x363); + uint32_t x397; + fiat_p256_uint1 x398; + fiat_p256_addcarryx_u32(&x397, &x398, x396, x389, x365); + uint32_t x399; + fiat_p256_uint1 x400; + fiat_p256_addcarryx_u32(&x399, &x400, x398, 0x0, x367); + uint32_t x401; + fiat_p256_uint1 x402; + fiat_p256_addcarryx_u32(&x401, &x402, x400, 0x0, x369); + uint32_t x403; + fiat_p256_uint1 x404; + fiat_p256_addcarryx_u32(&x403, &x404, x402, x359, x371); + uint32_t x405; + fiat_p256_uint1 x406; + fiat_p256_addcarryx_u32(&x405, &x406, x404, x377, x373); + uint32_t x407; + fiat_p256_uint1 x408; + fiat_p256_addcarryx_u32(&x407, &x408, x406, x378, x375); + uint32_t x409; + fiat_p256_uint1 x410; + fiat_p256_addcarryx_u32(&x409, &x410, x408, 0x0, x376); + uint32_t x411; + uint32_t x412; + fiat_p256_mulx_u32(&x411, &x412, x5, (arg2[7])); + uint32_t x413; + uint32_t x414; + fiat_p256_mulx_u32(&x413, &x414, x5, (arg2[6])); + uint32_t x415; + uint32_t x416; + fiat_p256_mulx_u32(&x415, &x416, x5, (arg2[5])); + uint32_t x417; + uint32_t x418; + fiat_p256_mulx_u32(&x417, &x418, x5, (arg2[4])); + uint32_t x419; + uint32_t x420; + fiat_p256_mulx_u32(&x419, &x420, x5, (arg2[3])); + uint32_t x421; + uint32_t x422; + fiat_p256_mulx_u32(&x421, &x422, x5, (arg2[2])); + uint32_t x423; + uint32_t x424; + fiat_p256_mulx_u32(&x423, &x424, x5, (arg2[1])); + uint32_t x425; + uint32_t x426; + fiat_p256_mulx_u32(&x425, &x426, x5, (arg2[0])); + uint32_t x427; + fiat_p256_uint1 x428; + fiat_p256_addcarryx_u32(&x427, &x428, 0x0, x423, x426); + uint32_t x429; + fiat_p256_uint1 x430; + fiat_p256_addcarryx_u32(&x429, &x430, x428, x421, x424); + uint32_t x431; + fiat_p256_uint1 x432; + fiat_p256_addcarryx_u32(&x431, &x432, x430, x419, x422); + uint32_t x433; + fiat_p256_uint1 x434; + fiat_p256_addcarryx_u32(&x433, &x434, x432, x417, x420); + uint32_t x435; + fiat_p256_uint1 x436; + 
fiat_p256_addcarryx_u32(&x435, &x436, x434, x415, x418); + uint32_t x437; + fiat_p256_uint1 x438; + fiat_p256_addcarryx_u32(&x437, &x438, x436, x413, x416); + uint32_t x439; + fiat_p256_uint1 x440; + fiat_p256_addcarryx_u32(&x439, &x440, x438, x411, x414); + uint32_t x441; + fiat_p256_uint1 x442; + fiat_p256_addcarryx_u32(&x441, &x442, x440, 0x0, x412); + uint32_t x443; + fiat_p256_uint1 x444; + fiat_p256_addcarryx_u32(&x443, &x444, 0x0, x425, x393); + uint32_t x445; + fiat_p256_uint1 x446; + fiat_p256_addcarryx_u32(&x445, &x446, x444, x427, x395); + uint32_t x447; + fiat_p256_uint1 x448; + fiat_p256_addcarryx_u32(&x447, &x448, x446, x429, x397); + uint32_t x449; + fiat_p256_uint1 x450; + fiat_p256_addcarryx_u32(&x449, &x450, x448, x431, x399); + uint32_t x451; + fiat_p256_uint1 x452; + fiat_p256_addcarryx_u32(&x451, &x452, x450, x433, x401); + uint32_t x453; + fiat_p256_uint1 x454; + fiat_p256_addcarryx_u32(&x453, &x454, x452, x435, x403); + uint32_t x455; + fiat_p256_uint1 x456; + fiat_p256_addcarryx_u32(&x455, &x456, x454, x437, x405); + uint32_t x457; + fiat_p256_uint1 x458; + fiat_p256_addcarryx_u32(&x457, &x458, x456, x439, x407); + uint32_t x459; + fiat_p256_uint1 x460; + fiat_p256_addcarryx_u32(&x459, &x460, x458, x441, x409); + uint32_t x461; + uint32_t x462; + fiat_p256_mulx_u32(&x461, &x462, x443, UINT32_C(0xffffffff)); + uint32_t x463; + uint32_t x464; + fiat_p256_mulx_u32(&x463, &x464, x443, UINT32_C(0xffffffff)); + uint32_t x465; + uint32_t x466; + fiat_p256_mulx_u32(&x465, &x466, x443, UINT32_C(0xffffffff)); + uint32_t x467; + uint32_t x468; + fiat_p256_mulx_u32(&x467, &x468, x443, UINT32_C(0xffffffff)); + uint32_t x469; + fiat_p256_uint1 x470; + fiat_p256_addcarryx_u32(&x469, &x470, 0x0, x465, x468); + uint32_t x471; + fiat_p256_uint1 x472; + fiat_p256_addcarryx_u32(&x471, &x472, x470, x463, x466); + uint32_t x473; + fiat_p256_uint1 x474; + fiat_p256_addcarryx_u32(&x473, &x474, x472, 0x0, x464); + uint32_t x475; + fiat_p256_uint1 x476; + 
fiat_p256_addcarryx_u32(&x475, &x476, 0x0, x467, x443); + uint32_t x477; + fiat_p256_uint1 x478; + fiat_p256_addcarryx_u32(&x477, &x478, x476, x469, x445); + uint32_t x479; + fiat_p256_uint1 x480; + fiat_p256_addcarryx_u32(&x479, &x480, x478, x471, x447); + uint32_t x481; + fiat_p256_uint1 x482; + fiat_p256_addcarryx_u32(&x481, &x482, x480, x473, x449); + uint32_t x483; + fiat_p256_uint1 x484; + fiat_p256_addcarryx_u32(&x483, &x484, x482, 0x0, x451); + uint32_t x485; + fiat_p256_uint1 x486; + fiat_p256_addcarryx_u32(&x485, &x486, x484, 0x0, x453); + uint32_t x487; + fiat_p256_uint1 x488; + fiat_p256_addcarryx_u32(&x487, &x488, x486, x443, x455); + uint32_t x489; + fiat_p256_uint1 x490; + fiat_p256_addcarryx_u32(&x489, &x490, x488, x461, x457); + uint32_t x491; + fiat_p256_uint1 x492; + fiat_p256_addcarryx_u32(&x491, &x492, x490, x462, x459); + uint32_t x493; + fiat_p256_uint1 x494; + fiat_p256_addcarryx_u32(&x493, &x494, x492, 0x0, x460); + uint32_t x495; + uint32_t x496; + fiat_p256_mulx_u32(&x495, &x496, x6, (arg2[7])); + uint32_t x497; + uint32_t x498; + fiat_p256_mulx_u32(&x497, &x498, x6, (arg2[6])); + uint32_t x499; + uint32_t x500; + fiat_p256_mulx_u32(&x499, &x500, x6, (arg2[5])); + uint32_t x501; + uint32_t x502; + fiat_p256_mulx_u32(&x501, &x502, x6, (arg2[4])); + uint32_t x503; + uint32_t x504; + fiat_p256_mulx_u32(&x503, &x504, x6, (arg2[3])); + uint32_t x505; + uint32_t x506; + fiat_p256_mulx_u32(&x505, &x506, x6, (arg2[2])); + uint32_t x507; + uint32_t x508; + fiat_p256_mulx_u32(&x507, &x508, x6, (arg2[1])); + uint32_t x509; + uint32_t x510; + fiat_p256_mulx_u32(&x509, &x510, x6, (arg2[0])); + uint32_t x511; + fiat_p256_uint1 x512; + fiat_p256_addcarryx_u32(&x511, &x512, 0x0, x507, x510); + uint32_t x513; + fiat_p256_uint1 x514; + fiat_p256_addcarryx_u32(&x513, &x514, x512, x505, x508); + uint32_t x515; + fiat_p256_uint1 x516; + fiat_p256_addcarryx_u32(&x515, &x516, x514, x503, x506); + uint32_t x517; + fiat_p256_uint1 x518; + 
fiat_p256_addcarryx_u32(&x517, &x518, x516, x501, x504); + uint32_t x519; + fiat_p256_uint1 x520; + fiat_p256_addcarryx_u32(&x519, &x520, x518, x499, x502); + uint32_t x521; + fiat_p256_uint1 x522; + fiat_p256_addcarryx_u32(&x521, &x522, x520, x497, x500); + uint32_t x523; + fiat_p256_uint1 x524; + fiat_p256_addcarryx_u32(&x523, &x524, x522, x495, x498); + uint32_t x525; + fiat_p256_uint1 x526; + fiat_p256_addcarryx_u32(&x525, &x526, x524, 0x0, x496); + uint32_t x527; + fiat_p256_uint1 x528; + fiat_p256_addcarryx_u32(&x527, &x528, 0x0, x509, x477); + uint32_t x529; + fiat_p256_uint1 x530; + fiat_p256_addcarryx_u32(&x529, &x530, x528, x511, x479); + uint32_t x531; + fiat_p256_uint1 x532; + fiat_p256_addcarryx_u32(&x531, &x532, x530, x513, x481); + uint32_t x533; + fiat_p256_uint1 x534; + fiat_p256_addcarryx_u32(&x533, &x534, x532, x515, x483); + uint32_t x535; + fiat_p256_uint1 x536; + fiat_p256_addcarryx_u32(&x535, &x536, x534, x517, x485); + uint32_t x537; + fiat_p256_uint1 x538; + fiat_p256_addcarryx_u32(&x537, &x538, x536, x519, x487); + uint32_t x539; + fiat_p256_uint1 x540; + fiat_p256_addcarryx_u32(&x539, &x540, x538, x521, x489); + uint32_t x541; + fiat_p256_uint1 x542; + fiat_p256_addcarryx_u32(&x541, &x542, x540, x523, x491); + uint32_t x543; + fiat_p256_uint1 x544; + fiat_p256_addcarryx_u32(&x543, &x544, x542, x525, x493); + uint32_t x545; + uint32_t x546; + fiat_p256_mulx_u32(&x545, &x546, x527, UINT32_C(0xffffffff)); + uint32_t x547; + uint32_t x548; + fiat_p256_mulx_u32(&x547, &x548, x527, UINT32_C(0xffffffff)); + uint32_t x549; + uint32_t x550; + fiat_p256_mulx_u32(&x549, &x550, x527, UINT32_C(0xffffffff)); + uint32_t x551; + uint32_t x552; + fiat_p256_mulx_u32(&x551, &x552, x527, UINT32_C(0xffffffff)); + uint32_t x553; + fiat_p256_uint1 x554; + fiat_p256_addcarryx_u32(&x553, &x554, 0x0, x549, x552); + uint32_t x555; + fiat_p256_uint1 x556; + fiat_p256_addcarryx_u32(&x555, &x556, x554, x547, x550); + uint32_t x557; + fiat_p256_uint1 x558; + 
fiat_p256_addcarryx_u32(&x557, &x558, x556, 0x0, x548); + uint32_t x559; + fiat_p256_uint1 x560; + fiat_p256_addcarryx_u32(&x559, &x560, 0x0, x551, x527); + uint32_t x561; + fiat_p256_uint1 x562; + fiat_p256_addcarryx_u32(&x561, &x562, x560, x553, x529); + uint32_t x563; + fiat_p256_uint1 x564; + fiat_p256_addcarryx_u32(&x563, &x564, x562, x555, x531); + uint32_t x565; + fiat_p256_uint1 x566; + fiat_p256_addcarryx_u32(&x565, &x566, x564, x557, x533); + uint32_t x567; + fiat_p256_uint1 x568; + fiat_p256_addcarryx_u32(&x567, &x568, x566, 0x0, x535); + uint32_t x569; + fiat_p256_uint1 x570; + fiat_p256_addcarryx_u32(&x569, &x570, x568, 0x0, x537); + uint32_t x571; + fiat_p256_uint1 x572; + fiat_p256_addcarryx_u32(&x571, &x572, x570, x527, x539); + uint32_t x573; + fiat_p256_uint1 x574; + fiat_p256_addcarryx_u32(&x573, &x574, x572, x545, x541); + uint32_t x575; + fiat_p256_uint1 x576; + fiat_p256_addcarryx_u32(&x575, &x576, x574, x546, x543); + uint32_t x577; + fiat_p256_uint1 x578; + fiat_p256_addcarryx_u32(&x577, &x578, x576, 0x0, x544); + uint32_t x579; + uint32_t x580; + fiat_p256_mulx_u32(&x579, &x580, x7, (arg2[7])); + uint32_t x581; + uint32_t x582; + fiat_p256_mulx_u32(&x581, &x582, x7, (arg2[6])); + uint32_t x583; + uint32_t x584; + fiat_p256_mulx_u32(&x583, &x584, x7, (arg2[5])); + uint32_t x585; + uint32_t x586; + fiat_p256_mulx_u32(&x585, &x586, x7, (arg2[4])); + uint32_t x587; + uint32_t x588; + fiat_p256_mulx_u32(&x587, &x588, x7, (arg2[3])); + uint32_t x589; + uint32_t x590; + fiat_p256_mulx_u32(&x589, &x590, x7, (arg2[2])); + uint32_t x591; + uint32_t x592; + fiat_p256_mulx_u32(&x591, &x592, x7, (arg2[1])); + uint32_t x593; + uint32_t x594; + fiat_p256_mulx_u32(&x593, &x594, x7, (arg2[0])); + uint32_t x595; + fiat_p256_uint1 x596; + fiat_p256_addcarryx_u32(&x595, &x596, 0x0, x591, x594); + uint32_t x597; + fiat_p256_uint1 x598; + fiat_p256_addcarryx_u32(&x597, &x598, x596, x589, x592); + uint32_t x599; + fiat_p256_uint1 x600; + 
fiat_p256_addcarryx_u32(&x599, &x600, x598, x587, x590); + uint32_t x601; + fiat_p256_uint1 x602; + fiat_p256_addcarryx_u32(&x601, &x602, x600, x585, x588); + uint32_t x603; + fiat_p256_uint1 x604; + fiat_p256_addcarryx_u32(&x603, &x604, x602, x583, x586); + uint32_t x605; + fiat_p256_uint1 x606; + fiat_p256_addcarryx_u32(&x605, &x606, x604, x581, x584); + uint32_t x607; + fiat_p256_uint1 x608; + fiat_p256_addcarryx_u32(&x607, &x608, x606, x579, x582); + uint32_t x609; + fiat_p256_uint1 x610; + fiat_p256_addcarryx_u32(&x609, &x610, x608, 0x0, x580); + uint32_t x611; + fiat_p256_uint1 x612; + fiat_p256_addcarryx_u32(&x611, &x612, 0x0, x593, x561); + uint32_t x613; + fiat_p256_uint1 x614; + fiat_p256_addcarryx_u32(&x613, &x614, x612, x595, x563); + uint32_t x615; + fiat_p256_uint1 x616; + fiat_p256_addcarryx_u32(&x615, &x616, x614, x597, x565); + uint32_t x617; + fiat_p256_uint1 x618; + fiat_p256_addcarryx_u32(&x617, &x618, x616, x599, x567); + uint32_t x619; + fiat_p256_uint1 x620; + fiat_p256_addcarryx_u32(&x619, &x620, x618, x601, x569); + uint32_t x621; + fiat_p256_uint1 x622; + fiat_p256_addcarryx_u32(&x621, &x622, x620, x603, x571); + uint32_t x623; + fiat_p256_uint1 x624; + fiat_p256_addcarryx_u32(&x623, &x624, x622, x605, x573); + uint32_t x625; + fiat_p256_uint1 x626; + fiat_p256_addcarryx_u32(&x625, &x626, x624, x607, x575); + uint32_t x627; + fiat_p256_uint1 x628; + fiat_p256_addcarryx_u32(&x627, &x628, x626, x609, x577); + uint32_t x629; + uint32_t x630; + fiat_p256_mulx_u32(&x629, &x630, x611, UINT32_C(0xffffffff)); + uint32_t x631; + uint32_t x632; + fiat_p256_mulx_u32(&x631, &x632, x611, UINT32_C(0xffffffff)); + uint32_t x633; + uint32_t x634; + fiat_p256_mulx_u32(&x633, &x634, x611, UINT32_C(0xffffffff)); + uint32_t x635; + uint32_t x636; + fiat_p256_mulx_u32(&x635, &x636, x611, UINT32_C(0xffffffff)); + uint32_t x637; + fiat_p256_uint1 x638; + fiat_p256_addcarryx_u32(&x637, &x638, 0x0, x633, x636); + uint32_t x639; + fiat_p256_uint1 x640; + 
fiat_p256_addcarryx_u32(&x639, &x640, x638, x631, x634); + uint32_t x641; + fiat_p256_uint1 x642; + fiat_p256_addcarryx_u32(&x641, &x642, x640, 0x0, x632); + uint32_t x643; + fiat_p256_uint1 x644; + fiat_p256_addcarryx_u32(&x643, &x644, 0x0, x635, x611); + uint32_t x645; + fiat_p256_uint1 x646; + fiat_p256_addcarryx_u32(&x645, &x646, x644, x637, x613); + uint32_t x647; + fiat_p256_uint1 x648; + fiat_p256_addcarryx_u32(&x647, &x648, x646, x639, x615); + uint32_t x649; + fiat_p256_uint1 x650; + fiat_p256_addcarryx_u32(&x649, &x650, x648, x641, x617); + uint32_t x651; + fiat_p256_uint1 x652; + fiat_p256_addcarryx_u32(&x651, &x652, x650, 0x0, x619); + uint32_t x653; + fiat_p256_uint1 x654; + fiat_p256_addcarryx_u32(&x653, &x654, x652, 0x0, x621); + uint32_t x655; + fiat_p256_uint1 x656; + fiat_p256_addcarryx_u32(&x655, &x656, x654, x611, x623); + uint32_t x657; + fiat_p256_uint1 x658; + fiat_p256_addcarryx_u32(&x657, &x658, x656, x629, x625); + uint32_t x659; + fiat_p256_uint1 x660; + fiat_p256_addcarryx_u32(&x659, &x660, x658, x630, x627); + uint32_t x661; + fiat_p256_uint1 x662; + fiat_p256_addcarryx_u32(&x661, &x662, x660, 0x0, x628); + uint32_t x663; + fiat_p256_uint1 x664; + fiat_p256_subborrowx_u32(&x663, &x664, 0x0, x645, UINT32_C(0xffffffff)); + uint32_t x665; + fiat_p256_uint1 x666; + fiat_p256_subborrowx_u32(&x665, &x666, x664, x647, UINT32_C(0xffffffff)); + uint32_t x667; + fiat_p256_uint1 x668; + fiat_p256_subborrowx_u32(&x667, &x668, x666, x649, UINT32_C(0xffffffff)); + uint32_t x669; + fiat_p256_uint1 x670; + fiat_p256_subborrowx_u32(&x669, &x670, x668, x651, 0x0); + uint32_t x671; + fiat_p256_uint1 x672; + fiat_p256_subborrowx_u32(&x671, &x672, x670, x653, 0x0); + uint32_t x673; + fiat_p256_uint1 x674; + fiat_p256_subborrowx_u32(&x673, &x674, x672, x655, 0x0); + uint32_t x675; + fiat_p256_uint1 x676; + fiat_p256_subborrowx_u32(&x675, &x676, x674, x657, 0x1); + uint32_t x677; + fiat_p256_uint1 x678; + fiat_p256_subborrowx_u32(&x677, &x678, x676, x659, 
UINT32_C(0xffffffff));
+  uint32_t x679;
+  fiat_p256_uint1 x680;
+  fiat_p256_subborrowx_u32(&x679, &x680, x678, x661, 0x0);
+  uint32_t x681;
+  fiat_p256_cmovznz_u32(&x681, x680, x663, x645);
+  uint32_t x682;
+  fiat_p256_cmovznz_u32(&x682, x680, x665, x647);
+  uint32_t x683;
+  fiat_p256_cmovznz_u32(&x683, x680, x667, x649);
+  uint32_t x684;
+  fiat_p256_cmovznz_u32(&x684, x680, x669, x651);
+  uint32_t x685;
+  fiat_p256_cmovznz_u32(&x685, x680, x671, x653);
+  uint32_t x686;
+  fiat_p256_cmovznz_u32(&x686, x680, x673, x655);
+  uint32_t x687;
+  fiat_p256_cmovznz_u32(&x687, x680, x675, x657);
+  uint32_t x688;
+  fiat_p256_cmovznz_u32(&x688, x680, x677, x659);
+  out1[0] = x681;
+  out1[1] = x682;
+  out1[2] = x683;
+  out1[3] = x684;
+  out1[4] = x685;
+  out1[5] = x686;
+  out1[6] = x687;
+  out1[7] = x688;
+}
+
+/*
+ * Input Bounds:
+ *   arg1: [[0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff]]
+ * Output Bounds:
+ *   out1: [[0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff]]
+ */
+static void fiat_p256_square(uint32_t out1[8], const uint32_t arg1[8]) {
+  uint32_t x1 = (arg1[1]);
+  uint32_t x2 = (arg1[2]);
+  uint32_t x3 = (arg1[3]);
+  uint32_t x4 = (arg1[4]);
+  uint32_t x5 = (arg1[5]);
+  uint32_t x6 = (arg1[6]);
+  uint32_t x7 = (arg1[7]);
+  uint32_t x8 = (arg1[0]);
+  uint32_t x9;
+  uint32_t x10;
+  fiat_p256_mulx_u32(&x9, &x10, x8, (arg1[7]));
+  uint32_t x11;
+  uint32_t x12;
+  fiat_p256_mulx_u32(&x11, &x12, x8, (arg1[6]));
+  uint32_t x13;
+  uint32_t x14;
+  fiat_p256_mulx_u32(&x13, &x14, x8, (arg1[5]));
+  uint32_t x15;
+  uint32_t x16;
+  fiat_p256_mulx_u32(&x15, &x16, x8, (arg1[4]));
+  uint32_t x17;
+  uint32_t x18;
+  fiat_p256_mulx_u32(&x17, &x18, x8, (arg1[3]));
+  uint32_t x19;
+  uint32_t x20;
+  fiat_p256_mulx_u32(&x19, &x20, x8, (arg1[2]));
+  uint32_t
x21; + uint32_t x22; + fiat_p256_mulx_u32(&x21, &x22, x8, (arg1[1])); + uint32_t x23; + uint32_t x24; + fiat_p256_mulx_u32(&x23, &x24, x8, (arg1[0])); + uint32_t x25; + fiat_p256_uint1 x26; + fiat_p256_addcarryx_u32(&x25, &x26, 0x0, x21, x24); + uint32_t x27; + fiat_p256_uint1 x28; + fiat_p256_addcarryx_u32(&x27, &x28, x26, x19, x22); + uint32_t x29; + fiat_p256_uint1 x30; + fiat_p256_addcarryx_u32(&x29, &x30, x28, x17, x20); + uint32_t x31; + fiat_p256_uint1 x32; + fiat_p256_addcarryx_u32(&x31, &x32, x30, x15, x18); + uint32_t x33; + fiat_p256_uint1 x34; + fiat_p256_addcarryx_u32(&x33, &x34, x32, x13, x16); + uint32_t x35; + fiat_p256_uint1 x36; + fiat_p256_addcarryx_u32(&x35, &x36, x34, x11, x14); + uint32_t x37; + fiat_p256_uint1 x38; + fiat_p256_addcarryx_u32(&x37, &x38, x36, x9, x12); + uint32_t x39; + fiat_p256_uint1 x40; + fiat_p256_addcarryx_u32(&x39, &x40, x38, 0x0, x10); + uint32_t x41; + uint32_t x42; + fiat_p256_mulx_u32(&x41, &x42, x23, UINT32_C(0xffffffff)); + uint32_t x43; + uint32_t x44; + fiat_p256_mulx_u32(&x43, &x44, x23, UINT32_C(0xffffffff)); + uint32_t x45; + uint32_t x46; + fiat_p256_mulx_u32(&x45, &x46, x23, UINT32_C(0xffffffff)); + uint32_t x47; + uint32_t x48; + fiat_p256_mulx_u32(&x47, &x48, x23, UINT32_C(0xffffffff)); + uint32_t x49; + fiat_p256_uint1 x50; + fiat_p256_addcarryx_u32(&x49, &x50, 0x0, x45, x48); + uint32_t x51; + fiat_p256_uint1 x52; + fiat_p256_addcarryx_u32(&x51, &x52, x50, x43, x46); + uint32_t x53; + fiat_p256_uint1 x54; + fiat_p256_addcarryx_u32(&x53, &x54, x52, 0x0, x44); + uint32_t x55; + fiat_p256_uint1 x56; + fiat_p256_addcarryx_u32(&x55, &x56, 0x0, x47, x23); + uint32_t x57; + fiat_p256_uint1 x58; + fiat_p256_addcarryx_u32(&x57, &x58, x56, x49, x25); + uint32_t x59; + fiat_p256_uint1 x60; + fiat_p256_addcarryx_u32(&x59, &x60, x58, x51, x27); + uint32_t x61; + fiat_p256_uint1 x62; + fiat_p256_addcarryx_u32(&x61, &x62, x60, x53, x29); + uint32_t x63; + fiat_p256_uint1 x64; + fiat_p256_addcarryx_u32(&x63, &x64, x62, 
0x0, x31); + uint32_t x65; + fiat_p256_uint1 x66; + fiat_p256_addcarryx_u32(&x65, &x66, x64, 0x0, x33); + uint32_t x67; + fiat_p256_uint1 x68; + fiat_p256_addcarryx_u32(&x67, &x68, x66, x23, x35); + uint32_t x69; + fiat_p256_uint1 x70; + fiat_p256_addcarryx_u32(&x69, &x70, x68, x41, x37); + uint32_t x71; + fiat_p256_uint1 x72; + fiat_p256_addcarryx_u32(&x71, &x72, x70, x42, x39); + uint32_t x73; + fiat_p256_uint1 x74; + fiat_p256_addcarryx_u32(&x73, &x74, x72, 0x0, 0x0); + uint32_t x75; + uint32_t x76; + fiat_p256_mulx_u32(&x75, &x76, x1, (arg1[7])); + uint32_t x77; + uint32_t x78; + fiat_p256_mulx_u32(&x77, &x78, x1, (arg1[6])); + uint32_t x79; + uint32_t x80; + fiat_p256_mulx_u32(&x79, &x80, x1, (arg1[5])); + uint32_t x81; + uint32_t x82; + fiat_p256_mulx_u32(&x81, &x82, x1, (arg1[4])); + uint32_t x83; + uint32_t x84; + fiat_p256_mulx_u32(&x83, &x84, x1, (arg1[3])); + uint32_t x85; + uint32_t x86; + fiat_p256_mulx_u32(&x85, &x86, x1, (arg1[2])); + uint32_t x87; + uint32_t x88; + fiat_p256_mulx_u32(&x87, &x88, x1, (arg1[1])); + uint32_t x89; + uint32_t x90; + fiat_p256_mulx_u32(&x89, &x90, x1, (arg1[0])); + uint32_t x91; + fiat_p256_uint1 x92; + fiat_p256_addcarryx_u32(&x91, &x92, 0x0, x87, x90); + uint32_t x93; + fiat_p256_uint1 x94; + fiat_p256_addcarryx_u32(&x93, &x94, x92, x85, x88); + uint32_t x95; + fiat_p256_uint1 x96; + fiat_p256_addcarryx_u32(&x95, &x96, x94, x83, x86); + uint32_t x97; + fiat_p256_uint1 x98; + fiat_p256_addcarryx_u32(&x97, &x98, x96, x81, x84); + uint32_t x99; + fiat_p256_uint1 x100; + fiat_p256_addcarryx_u32(&x99, &x100, x98, x79, x82); + uint32_t x101; + fiat_p256_uint1 x102; + fiat_p256_addcarryx_u32(&x101, &x102, x100, x77, x80); + uint32_t x103; + fiat_p256_uint1 x104; + fiat_p256_addcarryx_u32(&x103, &x104, x102, x75, x78); + uint32_t x105; + fiat_p256_uint1 x106; + fiat_p256_addcarryx_u32(&x105, &x106, x104, 0x0, x76); + uint32_t x107; + fiat_p256_uint1 x108; + fiat_p256_addcarryx_u32(&x107, &x108, 0x0, x89, x57); + uint32_t x109; 
+ fiat_p256_uint1 x110; + fiat_p256_addcarryx_u32(&x109, &x110, x108, x91, x59); + uint32_t x111; + fiat_p256_uint1 x112; + fiat_p256_addcarryx_u32(&x111, &x112, x110, x93, x61); + uint32_t x113; + fiat_p256_uint1 x114; + fiat_p256_addcarryx_u32(&x113, &x114, x112, x95, x63); + uint32_t x115; + fiat_p256_uint1 x116; + fiat_p256_addcarryx_u32(&x115, &x116, x114, x97, x65); + uint32_t x117; + fiat_p256_uint1 x118; + fiat_p256_addcarryx_u32(&x117, &x118, x116, x99, x67); + uint32_t x119; + fiat_p256_uint1 x120; + fiat_p256_addcarryx_u32(&x119, &x120, x118, x101, x69); + uint32_t x121; + fiat_p256_uint1 x122; + fiat_p256_addcarryx_u32(&x121, &x122, x120, x103, x71); + uint32_t x123; + fiat_p256_uint1 x124; + fiat_p256_addcarryx_u32(&x123, &x124, x122, x105, (fiat_p256_uint1)x73); + uint32_t x125; + uint32_t x126; + fiat_p256_mulx_u32(&x125, &x126, x107, UINT32_C(0xffffffff)); + uint32_t x127; + uint32_t x128; + fiat_p256_mulx_u32(&x127, &x128, x107, UINT32_C(0xffffffff)); + uint32_t x129; + uint32_t x130; + fiat_p256_mulx_u32(&x129, &x130, x107, UINT32_C(0xffffffff)); + uint32_t x131; + uint32_t x132; + fiat_p256_mulx_u32(&x131, &x132, x107, UINT32_C(0xffffffff)); + uint32_t x133; + fiat_p256_uint1 x134; + fiat_p256_addcarryx_u32(&x133, &x134, 0x0, x129, x132); + uint32_t x135; + fiat_p256_uint1 x136; + fiat_p256_addcarryx_u32(&x135, &x136, x134, x127, x130); + uint32_t x137; + fiat_p256_uint1 x138; + fiat_p256_addcarryx_u32(&x137, &x138, x136, 0x0, x128); + uint32_t x139; + fiat_p256_uint1 x140; + fiat_p256_addcarryx_u32(&x139, &x140, 0x0, x131, x107); + uint32_t x141; + fiat_p256_uint1 x142; + fiat_p256_addcarryx_u32(&x141, &x142, x140, x133, x109); + uint32_t x143; + fiat_p256_uint1 x144; + fiat_p256_addcarryx_u32(&x143, &x144, x142, x135, x111); + uint32_t x145; + fiat_p256_uint1 x146; + fiat_p256_addcarryx_u32(&x145, &x146, x144, x137, x113); + uint32_t x147; + fiat_p256_uint1 x148; + fiat_p256_addcarryx_u32(&x147, &x148, x146, 0x0, x115); + uint32_t x149; + 
fiat_p256_uint1 x150; + fiat_p256_addcarryx_u32(&x149, &x150, x148, 0x0, x117); + uint32_t x151; + fiat_p256_uint1 x152; + fiat_p256_addcarryx_u32(&x151, &x152, x150, x107, x119); + uint32_t x153; + fiat_p256_uint1 x154; + fiat_p256_addcarryx_u32(&x153, &x154, x152, x125, x121); + uint32_t x155; + fiat_p256_uint1 x156; + fiat_p256_addcarryx_u32(&x155, &x156, x154, x126, x123); + uint32_t x157; + fiat_p256_uint1 x158; + fiat_p256_addcarryx_u32(&x157, &x158, x156, 0x0, x124); + uint32_t x159; + uint32_t x160; + fiat_p256_mulx_u32(&x159, &x160, x2, (arg1[7])); + uint32_t x161; + uint32_t x162; + fiat_p256_mulx_u32(&x161, &x162, x2, (arg1[6])); + uint32_t x163; + uint32_t x164; + fiat_p256_mulx_u32(&x163, &x164, x2, (arg1[5])); + uint32_t x165; + uint32_t x166; + fiat_p256_mulx_u32(&x165, &x166, x2, (arg1[4])); + uint32_t x167; + uint32_t x168; + fiat_p256_mulx_u32(&x167, &x168, x2, (arg1[3])); + uint32_t x169; + uint32_t x170; + fiat_p256_mulx_u32(&x169, &x170, x2, (arg1[2])); + uint32_t x171; + uint32_t x172; + fiat_p256_mulx_u32(&x171, &x172, x2, (arg1[1])); + uint32_t x173; + uint32_t x174; + fiat_p256_mulx_u32(&x173, &x174, x2, (arg1[0])); + uint32_t x175; + fiat_p256_uint1 x176; + fiat_p256_addcarryx_u32(&x175, &x176, 0x0, x171, x174); + uint32_t x177; + fiat_p256_uint1 x178; + fiat_p256_addcarryx_u32(&x177, &x178, x176, x169, x172); + uint32_t x179; + fiat_p256_uint1 x180; + fiat_p256_addcarryx_u32(&x179, &x180, x178, x167, x170); + uint32_t x181; + fiat_p256_uint1 x182; + fiat_p256_addcarryx_u32(&x181, &x182, x180, x165, x168); + uint32_t x183; + fiat_p256_uint1 x184; + fiat_p256_addcarryx_u32(&x183, &x184, x182, x163, x166); + uint32_t x185; + fiat_p256_uint1 x186; + fiat_p256_addcarryx_u32(&x185, &x186, x184, x161, x164); + uint32_t x187; + fiat_p256_uint1 x188; + fiat_p256_addcarryx_u32(&x187, &x188, x186, x159, x162); + uint32_t x189; + fiat_p256_uint1 x190; + fiat_p256_addcarryx_u32(&x189, &x190, x188, 0x0, x160); + uint32_t x191; + fiat_p256_uint1 x192; + 
fiat_p256_addcarryx_u32(&x191, &x192, 0x0, x173, x141); + uint32_t x193; + fiat_p256_uint1 x194; + fiat_p256_addcarryx_u32(&x193, &x194, x192, x175, x143); + uint32_t x195; + fiat_p256_uint1 x196; + fiat_p256_addcarryx_u32(&x195, &x196, x194, x177, x145); + uint32_t x197; + fiat_p256_uint1 x198; + fiat_p256_addcarryx_u32(&x197, &x198, x196, x179, x147); + uint32_t x199; + fiat_p256_uint1 x200; + fiat_p256_addcarryx_u32(&x199, &x200, x198, x181, x149); + uint32_t x201; + fiat_p256_uint1 x202; + fiat_p256_addcarryx_u32(&x201, &x202, x200, x183, x151); + uint32_t x203; + fiat_p256_uint1 x204; + fiat_p256_addcarryx_u32(&x203, &x204, x202, x185, x153); + uint32_t x205; + fiat_p256_uint1 x206; + fiat_p256_addcarryx_u32(&x205, &x206, x204, x187, x155); + uint32_t x207; + fiat_p256_uint1 x208; + fiat_p256_addcarryx_u32(&x207, &x208, x206, x189, x157); + uint32_t x209; + uint32_t x210; + fiat_p256_mulx_u32(&x209, &x210, x191, UINT32_C(0xffffffff)); + uint32_t x211; + uint32_t x212; + fiat_p256_mulx_u32(&x211, &x212, x191, UINT32_C(0xffffffff)); + uint32_t x213; + uint32_t x214; + fiat_p256_mulx_u32(&x213, &x214, x191, UINT32_C(0xffffffff)); + uint32_t x215; + uint32_t x216; + fiat_p256_mulx_u32(&x215, &x216, x191, UINT32_C(0xffffffff)); + uint32_t x217; + fiat_p256_uint1 x218; + fiat_p256_addcarryx_u32(&x217, &x218, 0x0, x213, x216); + uint32_t x219; + fiat_p256_uint1 x220; + fiat_p256_addcarryx_u32(&x219, &x220, x218, x211, x214); + uint32_t x221; + fiat_p256_uint1 x222; + fiat_p256_addcarryx_u32(&x221, &x222, x220, 0x0, x212); + uint32_t x223; + fiat_p256_uint1 x224; + fiat_p256_addcarryx_u32(&x223, &x224, 0x0, x215, x191); + uint32_t x225; + fiat_p256_uint1 x226; + fiat_p256_addcarryx_u32(&x225, &x226, x224, x217, x193); + uint32_t x227; + fiat_p256_uint1 x228; + fiat_p256_addcarryx_u32(&x227, &x228, x226, x219, x195); + uint32_t x229; + fiat_p256_uint1 x230; + fiat_p256_addcarryx_u32(&x229, &x230, x228, x221, x197); + uint32_t x231; + fiat_p256_uint1 x232; + 
fiat_p256_addcarryx_u32(&x231, &x232, x230, 0x0, x199); + uint32_t x233; + fiat_p256_uint1 x234; + fiat_p256_addcarryx_u32(&x233, &x234, x232, 0x0, x201); + uint32_t x235; + fiat_p256_uint1 x236; + fiat_p256_addcarryx_u32(&x235, &x236, x234, x191, x203); + uint32_t x237; + fiat_p256_uint1 x238; + fiat_p256_addcarryx_u32(&x237, &x238, x236, x209, x205); + uint32_t x239; + fiat_p256_uint1 x240; + fiat_p256_addcarryx_u32(&x239, &x240, x238, x210, x207); + uint32_t x241; + fiat_p256_uint1 x242; + fiat_p256_addcarryx_u32(&x241, &x242, x240, 0x0, x208); + uint32_t x243; + uint32_t x244; + fiat_p256_mulx_u32(&x243, &x244, x3, (arg1[7])); + uint32_t x245; + uint32_t x246; + fiat_p256_mulx_u32(&x245, &x246, x3, (arg1[6])); + uint32_t x247; + uint32_t x248; + fiat_p256_mulx_u32(&x247, &x248, x3, (arg1[5])); + uint32_t x249; + uint32_t x250; + fiat_p256_mulx_u32(&x249, &x250, x3, (arg1[4])); + uint32_t x251; + uint32_t x252; + fiat_p256_mulx_u32(&x251, &x252, x3, (arg1[3])); + uint32_t x253; + uint32_t x254; + fiat_p256_mulx_u32(&x253, &x254, x3, (arg1[2])); + uint32_t x255; + uint32_t x256; + fiat_p256_mulx_u32(&x255, &x256, x3, (arg1[1])); + uint32_t x257; + uint32_t x258; + fiat_p256_mulx_u32(&x257, &x258, x3, (arg1[0])); + uint32_t x259; + fiat_p256_uint1 x260; + fiat_p256_addcarryx_u32(&x259, &x260, 0x0, x255, x258); + uint32_t x261; + fiat_p256_uint1 x262; + fiat_p256_addcarryx_u32(&x261, &x262, x260, x253, x256); + uint32_t x263; + fiat_p256_uint1 x264; + fiat_p256_addcarryx_u32(&x263, &x264, x262, x251, x254); + uint32_t x265; + fiat_p256_uint1 x266; + fiat_p256_addcarryx_u32(&x265, &x266, x264, x249, x252); + uint32_t x267; + fiat_p256_uint1 x268; + fiat_p256_addcarryx_u32(&x267, &x268, x266, x247, x250); + uint32_t x269; + fiat_p256_uint1 x270; + fiat_p256_addcarryx_u32(&x269, &x270, x268, x245, x248); + uint32_t x271; + fiat_p256_uint1 x272; + fiat_p256_addcarryx_u32(&x271, &x272, x270, x243, x246); + uint32_t x273; + fiat_p256_uint1 x274; + 
fiat_p256_addcarryx_u32(&x273, &x274, x272, 0x0, x244); + uint32_t x275; + fiat_p256_uint1 x276; + fiat_p256_addcarryx_u32(&x275, &x276, 0x0, x257, x225); + uint32_t x277; + fiat_p256_uint1 x278; + fiat_p256_addcarryx_u32(&x277, &x278, x276, x259, x227); + uint32_t x279; + fiat_p256_uint1 x280; + fiat_p256_addcarryx_u32(&x279, &x280, x278, x261, x229); + uint32_t x281; + fiat_p256_uint1 x282; + fiat_p256_addcarryx_u32(&x281, &x282, x280, x263, x231); + uint32_t x283; + fiat_p256_uint1 x284; + fiat_p256_addcarryx_u32(&x283, &x284, x282, x265, x233); + uint32_t x285; + fiat_p256_uint1 x286; + fiat_p256_addcarryx_u32(&x285, &x286, x284, x267, x235); + uint32_t x287; + fiat_p256_uint1 x288; + fiat_p256_addcarryx_u32(&x287, &x288, x286, x269, x237); + uint32_t x289; + fiat_p256_uint1 x290; + fiat_p256_addcarryx_u32(&x289, &x290, x288, x271, x239); + uint32_t x291; + fiat_p256_uint1 x292; + fiat_p256_addcarryx_u32(&x291, &x292, x290, x273, x241); + uint32_t x293; + uint32_t x294; + fiat_p256_mulx_u32(&x293, &x294, x275, UINT32_C(0xffffffff)); + uint32_t x295; + uint32_t x296; + fiat_p256_mulx_u32(&x295, &x296, x275, UINT32_C(0xffffffff)); + uint32_t x297; + uint32_t x298; + fiat_p256_mulx_u32(&x297, &x298, x275, UINT32_C(0xffffffff)); + uint32_t x299; + uint32_t x300; + fiat_p256_mulx_u32(&x299, &x300, x275, UINT32_C(0xffffffff)); + uint32_t x301; + fiat_p256_uint1 x302; + fiat_p256_addcarryx_u32(&x301, &x302, 0x0, x297, x300); + uint32_t x303; + fiat_p256_uint1 x304; + fiat_p256_addcarryx_u32(&x303, &x304, x302, x295, x298); + uint32_t x305; + fiat_p256_uint1 x306; + fiat_p256_addcarryx_u32(&x305, &x306, x304, 0x0, x296); + uint32_t x307; + fiat_p256_uint1 x308; + fiat_p256_addcarryx_u32(&x307, &x308, 0x0, x299, x275); + uint32_t x309; + fiat_p256_uint1 x310; + fiat_p256_addcarryx_u32(&x309, &x310, x308, x301, x277); + uint32_t x311; + fiat_p256_uint1 x312; + fiat_p256_addcarryx_u32(&x311, &x312, x310, x303, x279); + uint32_t x313; + fiat_p256_uint1 x314; + 
fiat_p256_addcarryx_u32(&x313, &x314, x312, x305, x281); + uint32_t x315; + fiat_p256_uint1 x316; + fiat_p256_addcarryx_u32(&x315, &x316, x314, 0x0, x283); + uint32_t x317; + fiat_p256_uint1 x318; + fiat_p256_addcarryx_u32(&x317, &x318, x316, 0x0, x285); + uint32_t x319; + fiat_p256_uint1 x320; + fiat_p256_addcarryx_u32(&x319, &x320, x318, x275, x287); + uint32_t x321; + fiat_p256_uint1 x322; + fiat_p256_addcarryx_u32(&x321, &x322, x320, x293, x289); + uint32_t x323; + fiat_p256_uint1 x324; + fiat_p256_addcarryx_u32(&x323, &x324, x322, x294, x291); + uint32_t x325; + fiat_p256_uint1 x326; + fiat_p256_addcarryx_u32(&x325, &x326, x324, 0x0, x292); + uint32_t x327; + uint32_t x328; + fiat_p256_mulx_u32(&x327, &x328, x4, (arg1[7])); + uint32_t x329; + uint32_t x330; + fiat_p256_mulx_u32(&x329, &x330, x4, (arg1[6])); + uint32_t x331; + uint32_t x332; + fiat_p256_mulx_u32(&x331, &x332, x4, (arg1[5])); + uint32_t x333; + uint32_t x334; + fiat_p256_mulx_u32(&x333, &x334, x4, (arg1[4])); + uint32_t x335; + uint32_t x336; + fiat_p256_mulx_u32(&x335, &x336, x4, (arg1[3])); + uint32_t x337; + uint32_t x338; + fiat_p256_mulx_u32(&x337, &x338, x4, (arg1[2])); + uint32_t x339; + uint32_t x340; + fiat_p256_mulx_u32(&x339, &x340, x4, (arg1[1])); + uint32_t x341; + uint32_t x342; + fiat_p256_mulx_u32(&x341, &x342, x4, (arg1[0])); + uint32_t x343; + fiat_p256_uint1 x344; + fiat_p256_addcarryx_u32(&x343, &x344, 0x0, x339, x342); + uint32_t x345; + fiat_p256_uint1 x346; + fiat_p256_addcarryx_u32(&x345, &x346, x344, x337, x340); + uint32_t x347; + fiat_p256_uint1 x348; + fiat_p256_addcarryx_u32(&x347, &x348, x346, x335, x338); + uint32_t x349; + fiat_p256_uint1 x350; + fiat_p256_addcarryx_u32(&x349, &x350, x348, x333, x336); + uint32_t x351; + fiat_p256_uint1 x352; + fiat_p256_addcarryx_u32(&x351, &x352, x350, x331, x334); + uint32_t x353; + fiat_p256_uint1 x354; + fiat_p256_addcarryx_u32(&x353, &x354, x352, x329, x332); + uint32_t x355; + fiat_p256_uint1 x356; + 
fiat_p256_addcarryx_u32(&x355, &x356, x354, x327, x330); + uint32_t x357; + fiat_p256_uint1 x358; + fiat_p256_addcarryx_u32(&x357, &x358, x356, 0x0, x328); + uint32_t x359; + fiat_p256_uint1 x360; + fiat_p256_addcarryx_u32(&x359, &x360, 0x0, x341, x309); + uint32_t x361; + fiat_p256_uint1 x362; + fiat_p256_addcarryx_u32(&x361, &x362, x360, x343, x311); + uint32_t x363; + fiat_p256_uint1 x364; + fiat_p256_addcarryx_u32(&x363, &x364, x362, x345, x313); + uint32_t x365; + fiat_p256_uint1 x366; + fiat_p256_addcarryx_u32(&x365, &x366, x364, x347, x315); + uint32_t x367; + fiat_p256_uint1 x368; + fiat_p256_addcarryx_u32(&x367, &x368, x366, x349, x317); + uint32_t x369; + fiat_p256_uint1 x370; + fiat_p256_addcarryx_u32(&x369, &x370, x368, x351, x319); + uint32_t x371; + fiat_p256_uint1 x372; + fiat_p256_addcarryx_u32(&x371, &x372, x370, x353, x321); + uint32_t x373; + fiat_p256_uint1 x374; + fiat_p256_addcarryx_u32(&x373, &x374, x372, x355, x323); + uint32_t x375; + fiat_p256_uint1 x376; + fiat_p256_addcarryx_u32(&x375, &x376, x374, x357, x325); + uint32_t x377; + uint32_t x378; + fiat_p256_mulx_u32(&x377, &x378, x359, UINT32_C(0xffffffff)); + uint32_t x379; + uint32_t x380; + fiat_p256_mulx_u32(&x379, &x380, x359, UINT32_C(0xffffffff)); + uint32_t x381; + uint32_t x382; + fiat_p256_mulx_u32(&x381, &x382, x359, UINT32_C(0xffffffff)); + uint32_t x383; + uint32_t x384; + fiat_p256_mulx_u32(&x383, &x384, x359, UINT32_C(0xffffffff)); + uint32_t x385; + fiat_p256_uint1 x386; + fiat_p256_addcarryx_u32(&x385, &x386, 0x0, x381, x384); + uint32_t x387; + fiat_p256_uint1 x388; + fiat_p256_addcarryx_u32(&x387, &x388, x386, x379, x382); + uint32_t x389; + fiat_p256_uint1 x390; + fiat_p256_addcarryx_u32(&x389, &x390, x388, 0x0, x380); + uint32_t x391; + fiat_p256_uint1 x392; + fiat_p256_addcarryx_u32(&x391, &x392, 0x0, x383, x359); + uint32_t x393; + fiat_p256_uint1 x394; + fiat_p256_addcarryx_u32(&x393, &x394, x392, x385, x361); + uint32_t x395; + fiat_p256_uint1 x396; + 
fiat_p256_addcarryx_u32(&x395, &x396, x394, x387, x363); + uint32_t x397; + fiat_p256_uint1 x398; + fiat_p256_addcarryx_u32(&x397, &x398, x396, x389, x365); + uint32_t x399; + fiat_p256_uint1 x400; + fiat_p256_addcarryx_u32(&x399, &x400, x398, 0x0, x367); + uint32_t x401; + fiat_p256_uint1 x402; + fiat_p256_addcarryx_u32(&x401, &x402, x400, 0x0, x369); + uint32_t x403; + fiat_p256_uint1 x404; + fiat_p256_addcarryx_u32(&x403, &x404, x402, x359, x371); + uint32_t x405; + fiat_p256_uint1 x406; + fiat_p256_addcarryx_u32(&x405, &x406, x404, x377, x373); + uint32_t x407; + fiat_p256_uint1 x408; + fiat_p256_addcarryx_u32(&x407, &x408, x406, x378, x375); + uint32_t x409; + fiat_p256_uint1 x410; + fiat_p256_addcarryx_u32(&x409, &x410, x408, 0x0, x376); + uint32_t x411; + uint32_t x412; + fiat_p256_mulx_u32(&x411, &x412, x5, (arg1[7])); + uint32_t x413; + uint32_t x414; + fiat_p256_mulx_u32(&x413, &x414, x5, (arg1[6])); + uint32_t x415; + uint32_t x416; + fiat_p256_mulx_u32(&x415, &x416, x5, (arg1[5])); + uint32_t x417; + uint32_t x418; + fiat_p256_mulx_u32(&x417, &x418, x5, (arg1[4])); + uint32_t x419; + uint32_t x420; + fiat_p256_mulx_u32(&x419, &x420, x5, (arg1[3])); + uint32_t x421; + uint32_t x422; + fiat_p256_mulx_u32(&x421, &x422, x5, (arg1[2])); + uint32_t x423; + uint32_t x424; + fiat_p256_mulx_u32(&x423, &x424, x5, (arg1[1])); + uint32_t x425; + uint32_t x426; + fiat_p256_mulx_u32(&x425, &x426, x5, (arg1[0])); + uint32_t x427; + fiat_p256_uint1 x428; + fiat_p256_addcarryx_u32(&x427, &x428, 0x0, x423, x426); + uint32_t x429; + fiat_p256_uint1 x430; + fiat_p256_addcarryx_u32(&x429, &x430, x428, x421, x424); + uint32_t x431; + fiat_p256_uint1 x432; + fiat_p256_addcarryx_u32(&x431, &x432, x430, x419, x422); + uint32_t x433; + fiat_p256_uint1 x434; + fiat_p256_addcarryx_u32(&x433, &x434, x432, x417, x420); + uint32_t x435; + fiat_p256_uint1 x436; + fiat_p256_addcarryx_u32(&x435, &x436, x434, x415, x418); + uint32_t x437; + fiat_p256_uint1 x438; + 
fiat_p256_addcarryx_u32(&x437, &x438, x436, x413, x416); + uint32_t x439; + fiat_p256_uint1 x440; + fiat_p256_addcarryx_u32(&x439, &x440, x438, x411, x414); + uint32_t x441; + fiat_p256_uint1 x442; + fiat_p256_addcarryx_u32(&x441, &x442, x440, 0x0, x412); + uint32_t x443; + fiat_p256_uint1 x444; + fiat_p256_addcarryx_u32(&x443, &x444, 0x0, x425, x393); + uint32_t x445; + fiat_p256_uint1 x446; + fiat_p256_addcarryx_u32(&x445, &x446, x444, x427, x395); + uint32_t x447; + fiat_p256_uint1 x448; + fiat_p256_addcarryx_u32(&x447, &x448, x446, x429, x397); + uint32_t x449; + fiat_p256_uint1 x450; + fiat_p256_addcarryx_u32(&x449, &x450, x448, x431, x399); + uint32_t x451; + fiat_p256_uint1 x452; + fiat_p256_addcarryx_u32(&x451, &x452, x450, x433, x401); + uint32_t x453; + fiat_p256_uint1 x454; + fiat_p256_addcarryx_u32(&x453, &x454, x452, x435, x403); + uint32_t x455; + fiat_p256_uint1 x456; + fiat_p256_addcarryx_u32(&x455, &x456, x454, x437, x405); + uint32_t x457; + fiat_p256_uint1 x458; + fiat_p256_addcarryx_u32(&x457, &x458, x456, x439, x407); + uint32_t x459; + fiat_p256_uint1 x460; + fiat_p256_addcarryx_u32(&x459, &x460, x458, x441, x409); + uint32_t x461; + uint32_t x462; + fiat_p256_mulx_u32(&x461, &x462, x443, UINT32_C(0xffffffff)); + uint32_t x463; + uint32_t x464; + fiat_p256_mulx_u32(&x463, &x464, x443, UINT32_C(0xffffffff)); + uint32_t x465; + uint32_t x466; + fiat_p256_mulx_u32(&x465, &x466, x443, UINT32_C(0xffffffff)); + uint32_t x467; + uint32_t x468; + fiat_p256_mulx_u32(&x467, &x468, x443, UINT32_C(0xffffffff)); + uint32_t x469; + fiat_p256_uint1 x470; + fiat_p256_addcarryx_u32(&x469, &x470, 0x0, x465, x468); + uint32_t x471; + fiat_p256_uint1 x472; + fiat_p256_addcarryx_u32(&x471, &x472, x470, x463, x466); + uint32_t x473; + fiat_p256_uint1 x474; + fiat_p256_addcarryx_u32(&x473, &x474, x472, 0x0, x464); + uint32_t x475; + fiat_p256_uint1 x476; + fiat_p256_addcarryx_u32(&x475, &x476, 0x0, x467, x443); + uint32_t x477; + fiat_p256_uint1 x478; + 
fiat_p256_addcarryx_u32(&x477, &x478, x476, x469, x445); + uint32_t x479; + fiat_p256_uint1 x480; + fiat_p256_addcarryx_u32(&x479, &x480, x478, x471, x447); + uint32_t x481; + fiat_p256_uint1 x482; + fiat_p256_addcarryx_u32(&x481, &x482, x480, x473, x449); + uint32_t x483; + fiat_p256_uint1 x484; + fiat_p256_addcarryx_u32(&x483, &x484, x482, 0x0, x451); + uint32_t x485; + fiat_p256_uint1 x486; + fiat_p256_addcarryx_u32(&x485, &x486, x484, 0x0, x453); + uint32_t x487; + fiat_p256_uint1 x488; + fiat_p256_addcarryx_u32(&x487, &x488, x486, x443, x455); + uint32_t x489; + fiat_p256_uint1 x490; + fiat_p256_addcarryx_u32(&x489, &x490, x488, x461, x457); + uint32_t x491; + fiat_p256_uint1 x492; + fiat_p256_addcarryx_u32(&x491, &x492, x490, x462, x459); + uint32_t x493; + fiat_p256_uint1 x494; + fiat_p256_addcarryx_u32(&x493, &x494, x492, 0x0, x460); + uint32_t x495; + uint32_t x496; + fiat_p256_mulx_u32(&x495, &x496, x6, (arg1[7])); + uint32_t x497; + uint32_t x498; + fiat_p256_mulx_u32(&x497, &x498, x6, (arg1[6])); + uint32_t x499; + uint32_t x500; + fiat_p256_mulx_u32(&x499, &x500, x6, (arg1[5])); + uint32_t x501; + uint32_t x502; + fiat_p256_mulx_u32(&x501, &x502, x6, (arg1[4])); + uint32_t x503; + uint32_t x504; + fiat_p256_mulx_u32(&x503, &x504, x6, (arg1[3])); + uint32_t x505; + uint32_t x506; + fiat_p256_mulx_u32(&x505, &x506, x6, (arg1[2])); + uint32_t x507; + uint32_t x508; + fiat_p256_mulx_u32(&x507, &x508, x6, (arg1[1])); + uint32_t x509; + uint32_t x510; + fiat_p256_mulx_u32(&x509, &x510, x6, (arg1[0])); + uint32_t x511; + fiat_p256_uint1 x512; + fiat_p256_addcarryx_u32(&x511, &x512, 0x0, x507, x510); + uint32_t x513; + fiat_p256_uint1 x514; + fiat_p256_addcarryx_u32(&x513, &x514, x512, x505, x508); + uint32_t x515; + fiat_p256_uint1 x516; + fiat_p256_addcarryx_u32(&x515, &x516, x514, x503, x506); + uint32_t x517; + fiat_p256_uint1 x518; + fiat_p256_addcarryx_u32(&x517, &x518, x516, x501, x504); + uint32_t x519; + fiat_p256_uint1 x520; + 
fiat_p256_addcarryx_u32(&x519, &x520, x518, x499, x502); + uint32_t x521; + fiat_p256_uint1 x522; + fiat_p256_addcarryx_u32(&x521, &x522, x520, x497, x500); + uint32_t x523; + fiat_p256_uint1 x524; + fiat_p256_addcarryx_u32(&x523, &x524, x522, x495, x498); + uint32_t x525; + fiat_p256_uint1 x526; + fiat_p256_addcarryx_u32(&x525, &x526, x524, 0x0, x496); + uint32_t x527; + fiat_p256_uint1 x528; + fiat_p256_addcarryx_u32(&x527, &x528, 0x0, x509, x477); + uint32_t x529; + fiat_p256_uint1 x530; + fiat_p256_addcarryx_u32(&x529, &x530, x528, x511, x479); + uint32_t x531; + fiat_p256_uint1 x532; + fiat_p256_addcarryx_u32(&x531, &x532, x530, x513, x481); + uint32_t x533; + fiat_p256_uint1 x534; + fiat_p256_addcarryx_u32(&x533, &x534, x532, x515, x483); + uint32_t x535; + fiat_p256_uint1 x536; + fiat_p256_addcarryx_u32(&x535, &x536, x534, x517, x485); + uint32_t x537; + fiat_p256_uint1 x538; + fiat_p256_addcarryx_u32(&x537, &x538, x536, x519, x487); + uint32_t x539; + fiat_p256_uint1 x540; + fiat_p256_addcarryx_u32(&x539, &x540, x538, x521, x489); + uint32_t x541; + fiat_p256_uint1 x542; + fiat_p256_addcarryx_u32(&x541, &x542, x540, x523, x491); + uint32_t x543; + fiat_p256_uint1 x544; + fiat_p256_addcarryx_u32(&x543, &x544, x542, x525, x493); + uint32_t x545; + uint32_t x546; + fiat_p256_mulx_u32(&x545, &x546, x527, UINT32_C(0xffffffff)); + uint32_t x547; + uint32_t x548; + fiat_p256_mulx_u32(&x547, &x548, x527, UINT32_C(0xffffffff)); + uint32_t x549; + uint32_t x550; + fiat_p256_mulx_u32(&x549, &x550, x527, UINT32_C(0xffffffff)); + uint32_t x551; + uint32_t x552; + fiat_p256_mulx_u32(&x551, &x552, x527, UINT32_C(0xffffffff)); + uint32_t x553; + fiat_p256_uint1 x554; + fiat_p256_addcarryx_u32(&x553, &x554, 0x0, x549, x552); + uint32_t x555; + fiat_p256_uint1 x556; + fiat_p256_addcarryx_u32(&x555, &x556, x554, x547, x550); + uint32_t x557; + fiat_p256_uint1 x558; + fiat_p256_addcarryx_u32(&x557, &x558, x556, 0x0, x548); + uint32_t x559; + fiat_p256_uint1 x560; + 
fiat_p256_addcarryx_u32(&x559, &x560, 0x0, x551, x527); + uint32_t x561; + fiat_p256_uint1 x562; + fiat_p256_addcarryx_u32(&x561, &x562, x560, x553, x529); + uint32_t x563; + fiat_p256_uint1 x564; + fiat_p256_addcarryx_u32(&x563, &x564, x562, x555, x531); + uint32_t x565; + fiat_p256_uint1 x566; + fiat_p256_addcarryx_u32(&x565, &x566, x564, x557, x533); + uint32_t x567; + fiat_p256_uint1 x568; + fiat_p256_addcarryx_u32(&x567, &x568, x566, 0x0, x535); + uint32_t x569; + fiat_p256_uint1 x570; + fiat_p256_addcarryx_u32(&x569, &x570, x568, 0x0, x537); + uint32_t x571; + fiat_p256_uint1 x572; + fiat_p256_addcarryx_u32(&x571, &x572, x570, x527, x539); + uint32_t x573; + fiat_p256_uint1 x574; + fiat_p256_addcarryx_u32(&x573, &x574, x572, x545, x541); + uint32_t x575; + fiat_p256_uint1 x576; + fiat_p256_addcarryx_u32(&x575, &x576, x574, x546, x543); + uint32_t x577; + fiat_p256_uint1 x578; + fiat_p256_addcarryx_u32(&x577, &x578, x576, 0x0, x544); + uint32_t x579; + uint32_t x580; + fiat_p256_mulx_u32(&x579, &x580, x7, (arg1[7])); + uint32_t x581; + uint32_t x582; + fiat_p256_mulx_u32(&x581, &x582, x7, (arg1[6])); + uint32_t x583; + uint32_t x584; + fiat_p256_mulx_u32(&x583, &x584, x7, (arg1[5])); + uint32_t x585; + uint32_t x586; + fiat_p256_mulx_u32(&x585, &x586, x7, (arg1[4])); + uint32_t x587; + uint32_t x588; + fiat_p256_mulx_u32(&x587, &x588, x7, (arg1[3])); + uint32_t x589; + uint32_t x590; + fiat_p256_mulx_u32(&x589, &x590, x7, (arg1[2])); + uint32_t x591; + uint32_t x592; + fiat_p256_mulx_u32(&x591, &x592, x7, (arg1[1])); + uint32_t x593; + uint32_t x594; + fiat_p256_mulx_u32(&x593, &x594, x7, (arg1[0])); + uint32_t x595; + fiat_p256_uint1 x596; + fiat_p256_addcarryx_u32(&x595, &x596, 0x0, x591, x594); + uint32_t x597; + fiat_p256_uint1 x598; + fiat_p256_addcarryx_u32(&x597, &x598, x596, x589, x592); + uint32_t x599; + fiat_p256_uint1 x600; + fiat_p256_addcarryx_u32(&x599, &x600, x598, x587, x590); + uint32_t x601; + fiat_p256_uint1 x602; + 
fiat_p256_addcarryx_u32(&x601, &x602, x600, x585, x588); + uint32_t x603; + fiat_p256_uint1 x604; + fiat_p256_addcarryx_u32(&x603, &x604, x602, x583, x586); + uint32_t x605; + fiat_p256_uint1 x606; + fiat_p256_addcarryx_u32(&x605, &x606, x604, x581, x584); + uint32_t x607; + fiat_p256_uint1 x608; + fiat_p256_addcarryx_u32(&x607, &x608, x606, x579, x582); + uint32_t x609; + fiat_p256_uint1 x610; + fiat_p256_addcarryx_u32(&x609, &x610, x608, 0x0, x580); + uint32_t x611; + fiat_p256_uint1 x612; + fiat_p256_addcarryx_u32(&x611, &x612, 0x0, x593, x561); + uint32_t x613; + fiat_p256_uint1 x614; + fiat_p256_addcarryx_u32(&x613, &x614, x612, x595, x563); + uint32_t x615; + fiat_p256_uint1 x616; + fiat_p256_addcarryx_u32(&x615, &x616, x614, x597, x565); + uint32_t x617; + fiat_p256_uint1 x618; + fiat_p256_addcarryx_u32(&x617, &x618, x616, x599, x567); + uint32_t x619; + fiat_p256_uint1 x620; + fiat_p256_addcarryx_u32(&x619, &x620, x618, x601, x569); + uint32_t x621; + fiat_p256_uint1 x622; + fiat_p256_addcarryx_u32(&x621, &x622, x620, x603, x571); + uint32_t x623; + fiat_p256_uint1 x624; + fiat_p256_addcarryx_u32(&x623, &x624, x622, x605, x573); + uint32_t x625; + fiat_p256_uint1 x626; + fiat_p256_addcarryx_u32(&x625, &x626, x624, x607, x575); + uint32_t x627; + fiat_p256_uint1 x628; + fiat_p256_addcarryx_u32(&x627, &x628, x626, x609, x577); + uint32_t x629; + uint32_t x630; + fiat_p256_mulx_u32(&x629, &x630, x611, UINT32_C(0xffffffff)); + uint32_t x631; + uint32_t x632; + fiat_p256_mulx_u32(&x631, &x632, x611, UINT32_C(0xffffffff)); + uint32_t x633; + uint32_t x634; + fiat_p256_mulx_u32(&x633, &x634, x611, UINT32_C(0xffffffff)); + uint32_t x635; + uint32_t x636; + fiat_p256_mulx_u32(&x635, &x636, x611, UINT32_C(0xffffffff)); + uint32_t x637; + fiat_p256_uint1 x638; + fiat_p256_addcarryx_u32(&x637, &x638, 0x0, x633, x636); + uint32_t x639; + fiat_p256_uint1 x640; + fiat_p256_addcarryx_u32(&x639, &x640, x638, x631, x634); + uint32_t x641; + fiat_p256_uint1 x642; + 
fiat_p256_addcarryx_u32(&x641, &x642, x640, 0x0, x632); + uint32_t x643; + fiat_p256_uint1 x644; + fiat_p256_addcarryx_u32(&x643, &x644, 0x0, x635, x611); + uint32_t x645; + fiat_p256_uint1 x646; + fiat_p256_addcarryx_u32(&x645, &x646, x644, x637, x613); + uint32_t x647; + fiat_p256_uint1 x648; + fiat_p256_addcarryx_u32(&x647, &x648, x646, x639, x615); + uint32_t x649; + fiat_p256_uint1 x650; + fiat_p256_addcarryx_u32(&x649, &x650, x648, x641, x617); + uint32_t x651; + fiat_p256_uint1 x652; + fiat_p256_addcarryx_u32(&x651, &x652, x650, 0x0, x619); + uint32_t x653; + fiat_p256_uint1 x654; + fiat_p256_addcarryx_u32(&x653, &x654, x652, 0x0, x621); + uint32_t x655; + fiat_p256_uint1 x656; + fiat_p256_addcarryx_u32(&x655, &x656, x654, x611, x623); + uint32_t x657; + fiat_p256_uint1 x658; + fiat_p256_addcarryx_u32(&x657, &x658, x656, x629, x625); + uint32_t x659; + fiat_p256_uint1 x660; + fiat_p256_addcarryx_u32(&x659, &x660, x658, x630, x627); + uint32_t x661; + fiat_p256_uint1 x662; + fiat_p256_addcarryx_u32(&x661, &x662, x660, 0x0, x628); + uint32_t x663; + fiat_p256_uint1 x664; + fiat_p256_subborrowx_u32(&x663, &x664, 0x0, x645, UINT32_C(0xffffffff)); + uint32_t x665; + fiat_p256_uint1 x666; + fiat_p256_subborrowx_u32(&x665, &x666, x664, x647, UINT32_C(0xffffffff)); + uint32_t x667; + fiat_p256_uint1 x668; + fiat_p256_subborrowx_u32(&x667, &x668, x666, x649, UINT32_C(0xffffffff)); + uint32_t x669; + fiat_p256_uint1 x670; + fiat_p256_subborrowx_u32(&x669, &x670, x668, x651, 0x0); + uint32_t x671; + fiat_p256_uint1 x672; + fiat_p256_subborrowx_u32(&x671, &x672, x670, x653, 0x0); + uint32_t x673; + fiat_p256_uint1 x674; + fiat_p256_subborrowx_u32(&x673, &x674, x672, x655, 0x0); + uint32_t x675; + fiat_p256_uint1 x676; + fiat_p256_subborrowx_u32(&x675, &x676, x674, x657, 0x1); + uint32_t x677; + fiat_p256_uint1 x678; + fiat_p256_subborrowx_u32(&x677, &x678, x676, x659, UINT32_C(0xffffffff)); + uint32_t x679; + fiat_p256_uint1 x680; + fiat_p256_subborrowx_u32(&x679, 
&x680, x678, x661, 0x0); + uint32_t x681; + fiat_p256_cmovznz_u32(&x681, x680, x663, x645); + uint32_t x682; + fiat_p256_cmovznz_u32(&x682, x680, x665, x647); + uint32_t x683; + fiat_p256_cmovznz_u32(&x683, x680, x667, x649); + uint32_t x684; + fiat_p256_cmovznz_u32(&x684, x680, x669, x651); + uint32_t x685; + fiat_p256_cmovznz_u32(&x685, x680, x671, x653); + uint32_t x686; + fiat_p256_cmovznz_u32(&x686, x680, x673, x655); + uint32_t x687; + fiat_p256_cmovznz_u32(&x687, x680, x675, x657); + uint32_t x688; + fiat_p256_cmovznz_u32(&x688, x680, x677, x659); + out1[0] = x681; + out1[1] = x682; + out1[2] = x683; + out1[3] = x684; + out1[4] = x685; + out1[5] = x686; + out1[6] = x687; + out1[7] = x688; +} + +/* + * Input Bounds: + * arg1: [[0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff]] + * arg2: [[0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff]] + * Output Bounds: + * out1: [[0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff]] + */ +static void fiat_p256_add(uint32_t out1[8], const uint32_t arg1[8], const uint32_t arg2[8]) { + uint32_t x1; + fiat_p256_uint1 x2; + fiat_p256_addcarryx_u32(&x1, &x2, 0x0, (arg2[0]), (arg1[0])); + uint32_t x3; + fiat_p256_uint1 x4; + fiat_p256_addcarryx_u32(&x3, &x4, x2, (arg2[1]), (arg1[1])); + uint32_t x5; + fiat_p256_uint1 x6; + fiat_p256_addcarryx_u32(&x5, &x6, x4, (arg2[2]), (arg1[2])); + uint32_t x7; + fiat_p256_uint1 x8; + fiat_p256_addcarryx_u32(&x7, &x8, x6, (arg2[3]), (arg1[3])); + uint32_t x9; + fiat_p256_uint1 x10; + fiat_p256_addcarryx_u32(&x9, &x10, x8, (arg2[4]), (arg1[4])); + uint32_t x11; + fiat_p256_uint1 x12; + fiat_p256_addcarryx_u32(&x11, &x12, x10, (arg2[5]), (arg1[5])); + 
uint32_t x13; + fiat_p256_uint1 x14; + fiat_p256_addcarryx_u32(&x13, &x14, x12, (arg2[6]), (arg1[6])); + uint32_t x15; + fiat_p256_uint1 x16; + fiat_p256_addcarryx_u32(&x15, &x16, x14, (arg2[7]), (arg1[7])); + uint32_t x17; + fiat_p256_uint1 x18; + fiat_p256_subborrowx_u32(&x17, &x18, 0x0, x1, UINT32_C(0xffffffff)); + uint32_t x19; + fiat_p256_uint1 x20; + fiat_p256_subborrowx_u32(&x19, &x20, x18, x3, UINT32_C(0xffffffff)); + uint32_t x21; + fiat_p256_uint1 x22; + fiat_p256_subborrowx_u32(&x21, &x22, x20, x5, UINT32_C(0xffffffff)); + uint32_t x23; + fiat_p256_uint1 x24; + fiat_p256_subborrowx_u32(&x23, &x24, x22, x7, 0x0); + uint32_t x25; + fiat_p256_uint1 x26; + fiat_p256_subborrowx_u32(&x25, &x26, x24, x9, 0x0); + uint32_t x27; + fiat_p256_uint1 x28; + fiat_p256_subborrowx_u32(&x27, &x28, x26, x11, 0x0); + uint32_t x29; + fiat_p256_uint1 x30; + fiat_p256_subborrowx_u32(&x29, &x30, x28, x13, 0x1); + uint32_t x31; + fiat_p256_uint1 x32; + fiat_p256_subborrowx_u32(&x31, &x32, x30, x15, UINT32_C(0xffffffff)); + uint32_t x33; + fiat_p256_uint1 x34; + fiat_p256_subborrowx_u32(&x33, &x34, x32, x16, 0x0); + uint32_t x35; + fiat_p256_cmovznz_u32(&x35, x34, x17, x1); + uint32_t x36; + fiat_p256_cmovznz_u32(&x36, x34, x19, x3); + uint32_t x37; + fiat_p256_cmovznz_u32(&x37, x34, x21, x5); + uint32_t x38; + fiat_p256_cmovznz_u32(&x38, x34, x23, x7); + uint32_t x39; + fiat_p256_cmovznz_u32(&x39, x34, x25, x9); + uint32_t x40; + fiat_p256_cmovznz_u32(&x40, x34, x27, x11); + uint32_t x41; + fiat_p256_cmovznz_u32(&x41, x34, x29, x13); + uint32_t x42; + fiat_p256_cmovznz_u32(&x42, x34, x31, x15); + out1[0] = x35; + out1[1] = x36; + out1[2] = x37; + out1[3] = x38; + out1[4] = x39; + out1[5] = x40; + out1[6] = x41; + out1[7] = x42; +} + +/* + * Input Bounds: + * arg1: [[0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff]] + * arg2: [[0x0 ~> 0xffffffff], [0x0 ~> 
0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff]] + * Output Bounds: + * out1: [[0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff]] + */ +static void fiat_p256_sub(uint32_t out1[8], const uint32_t arg1[8], const uint32_t arg2[8]) { + uint32_t x1; + fiat_p256_uint1 x2; + fiat_p256_subborrowx_u32(&x1, &x2, 0x0, (arg1[0]), (arg2[0])); + uint32_t x3; + fiat_p256_uint1 x4; + fiat_p256_subborrowx_u32(&x3, &x4, x2, (arg1[1]), (arg2[1])); + uint32_t x5; + fiat_p256_uint1 x6; + fiat_p256_subborrowx_u32(&x5, &x6, x4, (arg1[2]), (arg2[2])); + uint32_t x7; + fiat_p256_uint1 x8; + fiat_p256_subborrowx_u32(&x7, &x8, x6, (arg1[3]), (arg2[3])); + uint32_t x9; + fiat_p256_uint1 x10; + fiat_p256_subborrowx_u32(&x9, &x10, x8, (arg1[4]), (arg2[4])); + uint32_t x11; + fiat_p256_uint1 x12; + fiat_p256_subborrowx_u32(&x11, &x12, x10, (arg1[5]), (arg2[5])); + uint32_t x13; + fiat_p256_uint1 x14; + fiat_p256_subborrowx_u32(&x13, &x14, x12, (arg1[6]), (arg2[6])); + uint32_t x15; + fiat_p256_uint1 x16; + fiat_p256_subborrowx_u32(&x15, &x16, x14, (arg1[7]), (arg2[7])); + uint32_t x17; + fiat_p256_cmovznz_u32(&x17, x16, 0x0, UINT32_C(0xffffffff)); + uint32_t x18; + fiat_p256_uint1 x19; + fiat_p256_addcarryx_u32(&x18, &x19, 0x0, (x17 & UINT32_C(0xffffffff)), x1); + uint32_t x20; + fiat_p256_uint1 x21; + fiat_p256_addcarryx_u32(&x20, &x21, x19, (x17 & UINT32_C(0xffffffff)), x3); + uint32_t x22; + fiat_p256_uint1 x23; + fiat_p256_addcarryx_u32(&x22, &x23, x21, (x17 & UINT32_C(0xffffffff)), x5); + uint32_t x24; + fiat_p256_uint1 x25; + fiat_p256_addcarryx_u32(&x24, &x25, x23, 0x0, x7); + uint32_t x26; + fiat_p256_uint1 x27; + fiat_p256_addcarryx_u32(&x26, &x27, x25, 0x0, x9); + uint32_t x28; + fiat_p256_uint1 x29; + fiat_p256_addcarryx_u32(&x28, &x29, x27, 0x0, x11); + uint32_t x30; + 
fiat_p256_uint1 x31; + fiat_p256_addcarryx_u32(&x30, &x31, x29, (fiat_p256_uint1)(x17 & 0x1), x13); + uint32_t x32; + fiat_p256_uint1 x33; + fiat_p256_addcarryx_u32(&x32, &x33, x31, (x17 & UINT32_C(0xffffffff)), x15); + out1[0] = x18; + out1[1] = x20; + out1[2] = x22; + out1[3] = x24; + out1[4] = x26; + out1[5] = x28; + out1[6] = x30; + out1[7] = x32; +} + +/* + * Input Bounds: + * arg1: [[0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff]] + * Output Bounds: + * out1: [[0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff]] + */ +static void fiat_p256_opp(uint32_t out1[8], const uint32_t arg1[8]) { + uint32_t x1; + fiat_p256_uint1 x2; + fiat_p256_subborrowx_u32(&x1, &x2, 0x0, 0x0, (arg1[0])); + uint32_t x3; + fiat_p256_uint1 x4; + fiat_p256_subborrowx_u32(&x3, &x4, x2, 0x0, (arg1[1])); + uint32_t x5; + fiat_p256_uint1 x6; + fiat_p256_subborrowx_u32(&x5, &x6, x4, 0x0, (arg1[2])); + uint32_t x7; + fiat_p256_uint1 x8; + fiat_p256_subborrowx_u32(&x7, &x8, x6, 0x0, (arg1[3])); + uint32_t x9; + fiat_p256_uint1 x10; + fiat_p256_subborrowx_u32(&x9, &x10, x8, 0x0, (arg1[4])); + uint32_t x11; + fiat_p256_uint1 x12; + fiat_p256_subborrowx_u32(&x11, &x12, x10, 0x0, (arg1[5])); + uint32_t x13; + fiat_p256_uint1 x14; + fiat_p256_subborrowx_u32(&x13, &x14, x12, 0x0, (arg1[6])); + uint32_t x15; + fiat_p256_uint1 x16; + fiat_p256_subborrowx_u32(&x15, &x16, x14, 0x0, (arg1[7])); + uint32_t x17; + fiat_p256_cmovznz_u32(&x17, x16, 0x0, UINT32_C(0xffffffff)); + uint32_t x18; + fiat_p256_uint1 x19; + fiat_p256_addcarryx_u32(&x18, &x19, 0x0, (x17 & UINT32_C(0xffffffff)), x1); + uint32_t x20; + fiat_p256_uint1 x21; + fiat_p256_addcarryx_u32(&x20, &x21, x19, (x17 & UINT32_C(0xffffffff)), x3); + uint32_t x22; + fiat_p256_uint1 x23; + fiat_p256_addcarryx_u32(&x22, 
&x23, x21, (x17 & UINT32_C(0xffffffff)), x5); + uint32_t x24; + fiat_p256_uint1 x25; + fiat_p256_addcarryx_u32(&x24, &x25, x23, 0x0, x7); + uint32_t x26; + fiat_p256_uint1 x27; + fiat_p256_addcarryx_u32(&x26, &x27, x25, 0x0, x9); + uint32_t x28; + fiat_p256_uint1 x29; + fiat_p256_addcarryx_u32(&x28, &x29, x27, 0x0, x11); + uint32_t x30; + fiat_p256_uint1 x31; + fiat_p256_addcarryx_u32(&x30, &x31, x29, (fiat_p256_uint1)(x17 & 0x1), x13); + uint32_t x32; + fiat_p256_uint1 x33; + fiat_p256_addcarryx_u32(&x32, &x33, x31, (x17 & UINT32_C(0xffffffff)), x15); + out1[0] = x18; + out1[1] = x20; + out1[2] = x22; + out1[3] = x24; + out1[4] = x26; + out1[5] = x28; + out1[6] = x30; + out1[7] = x32; +} + +/* + * Input Bounds: + * arg1: [[0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff]] + * Output Bounds: + * out1: [[0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff]] + */ +static void fiat_p256_from_montgomery(uint32_t out1[8], const uint32_t arg1[8]) { + uint32_t x1 = (arg1[0]); + uint32_t x2; + uint32_t x3; + fiat_p256_mulx_u32(&x2, &x3, x1, UINT32_C(0xffffffff)); + uint32_t x4; + uint32_t x5; + fiat_p256_mulx_u32(&x4, &x5, x1, UINT32_C(0xffffffff)); + uint32_t x6; + uint32_t x7; + fiat_p256_mulx_u32(&x6, &x7, x1, UINT32_C(0xffffffff)); + uint32_t x8; + uint32_t x9; + fiat_p256_mulx_u32(&x8, &x9, x1, UINT32_C(0xffffffff)); + uint32_t x10; + fiat_p256_uint1 x11; + fiat_p256_addcarryx_u32(&x10, &x11, 0x0, x6, x9); + uint32_t x12; + fiat_p256_uint1 x13; + fiat_p256_addcarryx_u32(&x12, &x13, x11, x4, x7); + uint32_t x14; + fiat_p256_uint1 x15; + fiat_p256_addcarryx_u32(&x14, &x15, 0x0, x8, x1); + uint32_t x16; + fiat_p256_uint1 x17; + fiat_p256_addcarryx_u32(&x16, &x17, x15, x10, 0x0); + uint32_t x18; + fiat_p256_uint1 x19; + 
fiat_p256_addcarryx_u32(&x18, &x19, x17, x12, 0x0); + uint32_t x20; + fiat_p256_uint1 x21; + fiat_p256_addcarryx_u32(&x20, &x21, x13, 0x0, x5); + uint32_t x22; + fiat_p256_uint1 x23; + fiat_p256_addcarryx_u32(&x22, &x23, x19, x20, 0x0); + uint32_t x24; + fiat_p256_uint1 x25; + fiat_p256_addcarryx_u32(&x24, &x25, 0x0, (arg1[1]), x16); + uint32_t x26; + fiat_p256_uint1 x27; + fiat_p256_addcarryx_u32(&x26, &x27, x25, 0x0, x18); + uint32_t x28; + fiat_p256_uint1 x29; + fiat_p256_addcarryx_u32(&x28, &x29, x27, 0x0, x22); + uint32_t x30; + uint32_t x31; + fiat_p256_mulx_u32(&x30, &x31, x24, UINT32_C(0xffffffff)); + uint32_t x32; + uint32_t x33; + fiat_p256_mulx_u32(&x32, &x33, x24, UINT32_C(0xffffffff)); + uint32_t x34; + uint32_t x35; + fiat_p256_mulx_u32(&x34, &x35, x24, UINT32_C(0xffffffff)); + uint32_t x36; + uint32_t x37; + fiat_p256_mulx_u32(&x36, &x37, x24, UINT32_C(0xffffffff)); + uint32_t x38; + fiat_p256_uint1 x39; + fiat_p256_addcarryx_u32(&x38, &x39, 0x0, x34, x37); + uint32_t x40; + fiat_p256_uint1 x41; + fiat_p256_addcarryx_u32(&x40, &x41, x39, x32, x35); + uint32_t x42; + fiat_p256_uint1 x43; + fiat_p256_addcarryx_u32(&x42, &x43, 0x0, x36, x24); + uint32_t x44; + fiat_p256_uint1 x45; + fiat_p256_addcarryx_u32(&x44, &x45, x43, x38, x26); + uint32_t x46; + fiat_p256_uint1 x47; + fiat_p256_addcarryx_u32(&x46, &x47, x45, x40, x28); + uint32_t x48; + fiat_p256_uint1 x49; + fiat_p256_addcarryx_u32(&x48, &x49, x23, 0x0, 0x0); + uint32_t x50; + fiat_p256_uint1 x51; + fiat_p256_addcarryx_u32(&x50, &x51, x29, 0x0, (fiat_p256_uint1)x48); + uint32_t x52; + fiat_p256_uint1 x53; + fiat_p256_addcarryx_u32(&x52, &x53, x41, 0x0, x33); + uint32_t x54; + fiat_p256_uint1 x55; + fiat_p256_addcarryx_u32(&x54, &x55, x47, x52, x50); + uint32_t x56; + fiat_p256_uint1 x57; + fiat_p256_addcarryx_u32(&x56, &x57, 0x0, x24, x2); + uint32_t x58; + fiat_p256_uint1 x59; + fiat_p256_addcarryx_u32(&x58, &x59, x57, x30, x3); + uint32_t x60; + fiat_p256_uint1 x61; + 
fiat_p256_addcarryx_u32(&x60, &x61, 0x0, (arg1[2]), x44); + uint32_t x62; + fiat_p256_uint1 x63; + fiat_p256_addcarryx_u32(&x62, &x63, x61, 0x0, x46); + uint32_t x64; + fiat_p256_uint1 x65; + fiat_p256_addcarryx_u32(&x64, &x65, x63, 0x0, x54); + uint32_t x66; + uint32_t x67; + fiat_p256_mulx_u32(&x66, &x67, x60, UINT32_C(0xffffffff)); + uint32_t x68; + uint32_t x69; + fiat_p256_mulx_u32(&x68, &x69, x60, UINT32_C(0xffffffff)); + uint32_t x70; + uint32_t x71; + fiat_p256_mulx_u32(&x70, &x71, x60, UINT32_C(0xffffffff)); + uint32_t x72; + uint32_t x73; + fiat_p256_mulx_u32(&x72, &x73, x60, UINT32_C(0xffffffff)); + uint32_t x74; + fiat_p256_uint1 x75; + fiat_p256_addcarryx_u32(&x74, &x75, 0x0, x70, x73); + uint32_t x76; + fiat_p256_uint1 x77; + fiat_p256_addcarryx_u32(&x76, &x77, x75, x68, x71); + uint32_t x78; + fiat_p256_uint1 x79; + fiat_p256_addcarryx_u32(&x78, &x79, 0x0, x72, x60); + uint32_t x80; + fiat_p256_uint1 x81; + fiat_p256_addcarryx_u32(&x80, &x81, x79, x74, x62); + uint32_t x82; + fiat_p256_uint1 x83; + fiat_p256_addcarryx_u32(&x82, &x83, x81, x76, x64); + uint32_t x84; + fiat_p256_uint1 x85; + fiat_p256_addcarryx_u32(&x84, &x85, x55, 0x0, 0x0); + uint32_t x86; + fiat_p256_uint1 x87; + fiat_p256_addcarryx_u32(&x86, &x87, x65, 0x0, (fiat_p256_uint1)x84); + uint32_t x88; + fiat_p256_uint1 x89; + fiat_p256_addcarryx_u32(&x88, &x89, x77, 0x0, x69); + uint32_t x90; + fiat_p256_uint1 x91; + fiat_p256_addcarryx_u32(&x90, &x91, x83, x88, x86); + uint32_t x92; + fiat_p256_uint1 x93; + fiat_p256_addcarryx_u32(&x92, &x93, x91, 0x0, x1); + uint32_t x94; + fiat_p256_uint1 x95; + fiat_p256_addcarryx_u32(&x94, &x95, x93, 0x0, x56); + uint32_t x96; + fiat_p256_uint1 x97; + fiat_p256_addcarryx_u32(&x96, &x97, x95, x60, x58); + uint32_t x98; + fiat_p256_uint1 x99; + fiat_p256_addcarryx_u32(&x98, &x99, x59, x31, 0x0); + uint32_t x100; + fiat_p256_uint1 x101; + fiat_p256_addcarryx_u32(&x100, &x101, x97, x66, x98); + uint32_t x102; + fiat_p256_uint1 x103; + 
fiat_p256_addcarryx_u32(&x102, &x103, 0x0, (arg1[3]), x80); + uint32_t x104; + fiat_p256_uint1 x105; + fiat_p256_addcarryx_u32(&x104, &x105, x103, 0x0, x82); + uint32_t x106; + fiat_p256_uint1 x107; + fiat_p256_addcarryx_u32(&x106, &x107, x105, 0x0, x90); + uint32_t x108; + fiat_p256_uint1 x109; + fiat_p256_addcarryx_u32(&x108, &x109, x107, 0x0, x92); + uint32_t x110; + fiat_p256_uint1 x111; + fiat_p256_addcarryx_u32(&x110, &x111, x109, 0x0, x94); + uint32_t x112; + fiat_p256_uint1 x113; + fiat_p256_addcarryx_u32(&x112, &x113, x111, 0x0, x96); + uint32_t x114; + fiat_p256_uint1 x115; + fiat_p256_addcarryx_u32(&x114, &x115, x113, 0x0, x100); + uint32_t x116; + fiat_p256_uint1 x117; + fiat_p256_addcarryx_u32(&x116, &x117, x101, x67, 0x0); + uint32_t x118; + fiat_p256_uint1 x119; + fiat_p256_addcarryx_u32(&x118, &x119, x115, 0x0, x116); + uint32_t x120; + uint32_t x121; + fiat_p256_mulx_u32(&x120, &x121, x102, UINT32_C(0xffffffff)); + uint32_t x122; + uint32_t x123; + fiat_p256_mulx_u32(&x122, &x123, x102, UINT32_C(0xffffffff)); + uint32_t x124; + uint32_t x125; + fiat_p256_mulx_u32(&x124, &x125, x102, UINT32_C(0xffffffff)); + uint32_t x126; + uint32_t x127; + fiat_p256_mulx_u32(&x126, &x127, x102, UINT32_C(0xffffffff)); + uint32_t x128; + fiat_p256_uint1 x129; + fiat_p256_addcarryx_u32(&x128, &x129, 0x0, x124, x127); + uint32_t x130; + fiat_p256_uint1 x131; + fiat_p256_addcarryx_u32(&x130, &x131, x129, x122, x125); + uint32_t x132; + fiat_p256_uint1 x133; + fiat_p256_addcarryx_u32(&x132, &x133, 0x0, x126, x102); + uint32_t x134; + fiat_p256_uint1 x135; + fiat_p256_addcarryx_u32(&x134, &x135, x133, x128, x104); + uint32_t x136; + fiat_p256_uint1 x137; + fiat_p256_addcarryx_u32(&x136, &x137, x135, x130, x106); + uint32_t x138; + fiat_p256_uint1 x139; + fiat_p256_addcarryx_u32(&x138, &x139, x131, 0x0, x123); + uint32_t x140; + fiat_p256_uint1 x141; + fiat_p256_addcarryx_u32(&x140, &x141, x137, x138, x108); + uint32_t x142; + fiat_p256_uint1 x143; + 
fiat_p256_addcarryx_u32(&x142, &x143, x141, 0x0, x110); + uint32_t x144; + fiat_p256_uint1 x145; + fiat_p256_addcarryx_u32(&x144, &x145, x143, 0x0, x112); + uint32_t x146; + fiat_p256_uint1 x147; + fiat_p256_addcarryx_u32(&x146, &x147, x145, x102, x114); + uint32_t x148; + fiat_p256_uint1 x149; + fiat_p256_addcarryx_u32(&x148, &x149, x147, x120, x118); + uint32_t x150; + fiat_p256_uint1 x151; + fiat_p256_addcarryx_u32(&x150, &x151, x119, 0x0, 0x0); + uint32_t x152; + fiat_p256_uint1 x153; + fiat_p256_addcarryx_u32(&x152, &x153, x149, x121, (fiat_p256_uint1)x150); + uint32_t x154; + fiat_p256_uint1 x155; + fiat_p256_addcarryx_u32(&x154, &x155, 0x0, (arg1[4]), x134); + uint32_t x156; + fiat_p256_uint1 x157; + fiat_p256_addcarryx_u32(&x156, &x157, x155, 0x0, x136); + uint32_t x158; + fiat_p256_uint1 x159; + fiat_p256_addcarryx_u32(&x158, &x159, x157, 0x0, x140); + uint32_t x160; + fiat_p256_uint1 x161; + fiat_p256_addcarryx_u32(&x160, &x161, x159, 0x0, x142); + uint32_t x162; + fiat_p256_uint1 x163; + fiat_p256_addcarryx_u32(&x162, &x163, x161, 0x0, x144); + uint32_t x164; + fiat_p256_uint1 x165; + fiat_p256_addcarryx_u32(&x164, &x165, x163, 0x0, x146); + uint32_t x166; + fiat_p256_uint1 x167; + fiat_p256_addcarryx_u32(&x166, &x167, x165, 0x0, x148); + uint32_t x168; + fiat_p256_uint1 x169; + fiat_p256_addcarryx_u32(&x168, &x169, x167, 0x0, x152); + uint32_t x170; + uint32_t x171; + fiat_p256_mulx_u32(&x170, &x171, x154, UINT32_C(0xffffffff)); + uint32_t x172; + uint32_t x173; + fiat_p256_mulx_u32(&x172, &x173, x154, UINT32_C(0xffffffff)); + uint32_t x174; + uint32_t x175; + fiat_p256_mulx_u32(&x174, &x175, x154, UINT32_C(0xffffffff)); + uint32_t x176; + uint32_t x177; + fiat_p256_mulx_u32(&x176, &x177, x154, UINT32_C(0xffffffff)); + uint32_t x178; + fiat_p256_uint1 x179; + fiat_p256_addcarryx_u32(&x178, &x179, 0x0, x174, x177); + uint32_t x180; + fiat_p256_uint1 x181; + fiat_p256_addcarryx_u32(&x180, &x181, x179, x172, x175); + uint32_t x182; + fiat_p256_uint1 x183; 
+ fiat_p256_addcarryx_u32(&x182, &x183, 0x0, x176, x154); + uint32_t x184; + fiat_p256_uint1 x185; + fiat_p256_addcarryx_u32(&x184, &x185, x183, x178, x156); + uint32_t x186; + fiat_p256_uint1 x187; + fiat_p256_addcarryx_u32(&x186, &x187, x185, x180, x158); + uint32_t x188; + fiat_p256_uint1 x189; + fiat_p256_addcarryx_u32(&x188, &x189, x181, 0x0, x173); + uint32_t x190; + fiat_p256_uint1 x191; + fiat_p256_addcarryx_u32(&x190, &x191, x187, x188, x160); + uint32_t x192; + fiat_p256_uint1 x193; + fiat_p256_addcarryx_u32(&x192, &x193, x191, 0x0, x162); + uint32_t x194; + fiat_p256_uint1 x195; + fiat_p256_addcarryx_u32(&x194, &x195, x193, 0x0, x164); + uint32_t x196; + fiat_p256_uint1 x197; + fiat_p256_addcarryx_u32(&x196, &x197, x195, x154, x166); + uint32_t x198; + fiat_p256_uint1 x199; + fiat_p256_addcarryx_u32(&x198, &x199, x197, x170, x168); + uint32_t x200; + fiat_p256_uint1 x201; + fiat_p256_addcarryx_u32(&x200, &x201, x153, 0x0, 0x0); + uint32_t x202; + fiat_p256_uint1 x203; + fiat_p256_addcarryx_u32(&x202, &x203, x169, 0x0, (fiat_p256_uint1)x200); + uint32_t x204; + fiat_p256_uint1 x205; + fiat_p256_addcarryx_u32(&x204, &x205, x199, x171, x202); + uint32_t x206; + fiat_p256_uint1 x207; + fiat_p256_addcarryx_u32(&x206, &x207, 0x0, (arg1[5]), x184); + uint32_t x208; + fiat_p256_uint1 x209; + fiat_p256_addcarryx_u32(&x208, &x209, x207, 0x0, x186); + uint32_t x210; + fiat_p256_uint1 x211; + fiat_p256_addcarryx_u32(&x210, &x211, x209, 0x0, x190); + uint32_t x212; + fiat_p256_uint1 x213; + fiat_p256_addcarryx_u32(&x212, &x213, x211, 0x0, x192); + uint32_t x214; + fiat_p256_uint1 x215; + fiat_p256_addcarryx_u32(&x214, &x215, x213, 0x0, x194); + uint32_t x216; + fiat_p256_uint1 x217; + fiat_p256_addcarryx_u32(&x216, &x217, x215, 0x0, x196); + uint32_t x218; + fiat_p256_uint1 x219; + fiat_p256_addcarryx_u32(&x218, &x219, x217, 0x0, x198); + uint32_t x220; + fiat_p256_uint1 x221; + fiat_p256_addcarryx_u32(&x220, &x221, x219, 0x0, x204); + uint32_t x222; + uint32_t x223; 
+ fiat_p256_mulx_u32(&x222, &x223, x206, UINT32_C(0xffffffff)); + uint32_t x224; + uint32_t x225; + fiat_p256_mulx_u32(&x224, &x225, x206, UINT32_C(0xffffffff)); + uint32_t x226; + uint32_t x227; + fiat_p256_mulx_u32(&x226, &x227, x206, UINT32_C(0xffffffff)); + uint32_t x228; + uint32_t x229; + fiat_p256_mulx_u32(&x228, &x229, x206, UINT32_C(0xffffffff)); + uint32_t x230; + fiat_p256_uint1 x231; + fiat_p256_addcarryx_u32(&x230, &x231, 0x0, x226, x229); + uint32_t x232; + fiat_p256_uint1 x233; + fiat_p256_addcarryx_u32(&x232, &x233, x231, x224, x227); + uint32_t x234; + fiat_p256_uint1 x235; + fiat_p256_addcarryx_u32(&x234, &x235, 0x0, x228, x206); + uint32_t x236; + fiat_p256_uint1 x237; + fiat_p256_addcarryx_u32(&x236, &x237, x235, x230, x208); + uint32_t x238; + fiat_p256_uint1 x239; + fiat_p256_addcarryx_u32(&x238, &x239, x237, x232, x210); + uint32_t x240; + fiat_p256_uint1 x241; + fiat_p256_addcarryx_u32(&x240, &x241, x233, 0x0, x225); + uint32_t x242; + fiat_p256_uint1 x243; + fiat_p256_addcarryx_u32(&x242, &x243, x239, x240, x212); + uint32_t x244; + fiat_p256_uint1 x245; + fiat_p256_addcarryx_u32(&x244, &x245, x243, 0x0, x214); + uint32_t x246; + fiat_p256_uint1 x247; + fiat_p256_addcarryx_u32(&x246, &x247, x245, 0x0, x216); + uint32_t x248; + fiat_p256_uint1 x249; + fiat_p256_addcarryx_u32(&x248, &x249, x247, x206, x218); + uint32_t x250; + fiat_p256_uint1 x251; + fiat_p256_addcarryx_u32(&x250, &x251, x249, x222, x220); + uint32_t x252; + fiat_p256_uint1 x253; + fiat_p256_addcarryx_u32(&x252, &x253, x205, 0x0, 0x0); + uint32_t x254; + fiat_p256_uint1 x255; + fiat_p256_addcarryx_u32(&x254, &x255, x221, 0x0, (fiat_p256_uint1)x252); + uint32_t x256; + fiat_p256_uint1 x257; + fiat_p256_addcarryx_u32(&x256, &x257, x251, x223, x254); + uint32_t x258; + fiat_p256_uint1 x259; + fiat_p256_addcarryx_u32(&x258, &x259, 0x0, (arg1[6]), x236); + uint32_t x260; + fiat_p256_uint1 x261; + fiat_p256_addcarryx_u32(&x260, &x261, x259, 0x0, x238); + uint32_t x262; + 
fiat_p256_uint1 x263; + fiat_p256_addcarryx_u32(&x262, &x263, x261, 0x0, x242); + uint32_t x264; + fiat_p256_uint1 x265; + fiat_p256_addcarryx_u32(&x264, &x265, x263, 0x0, x244); + uint32_t x266; + fiat_p256_uint1 x267; + fiat_p256_addcarryx_u32(&x266, &x267, x265, 0x0, x246); + uint32_t x268; + fiat_p256_uint1 x269; + fiat_p256_addcarryx_u32(&x268, &x269, x267, 0x0, x248); + uint32_t x270; + fiat_p256_uint1 x271; + fiat_p256_addcarryx_u32(&x270, &x271, x269, 0x0, x250); + uint32_t x272; + fiat_p256_uint1 x273; + fiat_p256_addcarryx_u32(&x272, &x273, x271, 0x0, x256); + uint32_t x274; + uint32_t x275; + fiat_p256_mulx_u32(&x274, &x275, x258, UINT32_C(0xffffffff)); + uint32_t x276; + uint32_t x277; + fiat_p256_mulx_u32(&x276, &x277, x258, UINT32_C(0xffffffff)); + uint32_t x278; + uint32_t x279; + fiat_p256_mulx_u32(&x278, &x279, x258, UINT32_C(0xffffffff)); + uint32_t x280; + uint32_t x281; + fiat_p256_mulx_u32(&x280, &x281, x258, UINT32_C(0xffffffff)); + uint32_t x282; + fiat_p256_uint1 x283; + fiat_p256_addcarryx_u32(&x282, &x283, 0x0, x278, x281); + uint32_t x284; + fiat_p256_uint1 x285; + fiat_p256_addcarryx_u32(&x284, &x285, x283, x276, x279); + uint32_t x286; + fiat_p256_uint1 x287; + fiat_p256_addcarryx_u32(&x286, &x287, 0x0, x280, x258); + uint32_t x288; + fiat_p256_uint1 x289; + fiat_p256_addcarryx_u32(&x288, &x289, x287, x282, x260); + uint32_t x290; + fiat_p256_uint1 x291; + fiat_p256_addcarryx_u32(&x290, &x291, x289, x284, x262); + uint32_t x292; + fiat_p256_uint1 x293; + fiat_p256_addcarryx_u32(&x292, &x293, x285, 0x0, x277); + uint32_t x294; + fiat_p256_uint1 x295; + fiat_p256_addcarryx_u32(&x294, &x295, x291, x292, x264); + uint32_t x296; + fiat_p256_uint1 x297; + fiat_p256_addcarryx_u32(&x296, &x297, x295, 0x0, x266); + uint32_t x298; + fiat_p256_uint1 x299; + fiat_p256_addcarryx_u32(&x298, &x299, x297, 0x0, x268); + uint32_t x300; + fiat_p256_uint1 x301; + fiat_p256_addcarryx_u32(&x300, &x301, x299, x258, x270); + uint32_t x302; + fiat_p256_uint1 
x303; + fiat_p256_addcarryx_u32(&x302, &x303, x301, x274, x272); + uint32_t x304; + fiat_p256_uint1 x305; + fiat_p256_addcarryx_u32(&x304, &x305, x257, 0x0, 0x0); + uint32_t x306; + fiat_p256_uint1 x307; + fiat_p256_addcarryx_u32(&x306, &x307, x273, 0x0, (fiat_p256_uint1)x304); + uint32_t x308; + fiat_p256_uint1 x309; + fiat_p256_addcarryx_u32(&x308, &x309, x303, x275, x306); + uint32_t x310; + fiat_p256_uint1 x311; + fiat_p256_addcarryx_u32(&x310, &x311, 0x0, (arg1[7]), x288); + uint32_t x312; + fiat_p256_uint1 x313; + fiat_p256_addcarryx_u32(&x312, &x313, x311, 0x0, x290); + uint32_t x314; + fiat_p256_uint1 x315; + fiat_p256_addcarryx_u32(&x314, &x315, x313, 0x0, x294); + uint32_t x316; + fiat_p256_uint1 x317; + fiat_p256_addcarryx_u32(&x316, &x317, x315, 0x0, x296); + uint32_t x318; + fiat_p256_uint1 x319; + fiat_p256_addcarryx_u32(&x318, &x319, x317, 0x0, x298); + uint32_t x320; + fiat_p256_uint1 x321; + fiat_p256_addcarryx_u32(&x320, &x321, x319, 0x0, x300); + uint32_t x322; + fiat_p256_uint1 x323; + fiat_p256_addcarryx_u32(&x322, &x323, x321, 0x0, x302); + uint32_t x324; + fiat_p256_uint1 x325; + fiat_p256_addcarryx_u32(&x324, &x325, x323, 0x0, x308); + uint32_t x326; + uint32_t x327; + fiat_p256_mulx_u32(&x326, &x327, x310, UINT32_C(0xffffffff)); + uint32_t x328; + uint32_t x329; + fiat_p256_mulx_u32(&x328, &x329, x310, UINT32_C(0xffffffff)); + uint32_t x330; + uint32_t x331; + fiat_p256_mulx_u32(&x330, &x331, x310, UINT32_C(0xffffffff)); + uint32_t x332; + uint32_t x333; + fiat_p256_mulx_u32(&x332, &x333, x310, UINT32_C(0xffffffff)); + uint32_t x334; + fiat_p256_uint1 x335; + fiat_p256_addcarryx_u32(&x334, &x335, 0x0, x330, x333); + uint32_t x336; + fiat_p256_uint1 x337; + fiat_p256_addcarryx_u32(&x336, &x337, x335, x328, x331); + uint32_t x338; + fiat_p256_uint1 x339; + fiat_p256_addcarryx_u32(&x338, &x339, 0x0, x332, x310); + uint32_t x340; + fiat_p256_uint1 x341; + fiat_p256_addcarryx_u32(&x340, &x341, x339, x334, x312); + uint32_t x342; + 
fiat_p256_uint1 x343; + fiat_p256_addcarryx_u32(&x342, &x343, x341, x336, x314); + uint32_t x344; + fiat_p256_uint1 x345; + fiat_p256_addcarryx_u32(&x344, &x345, x337, 0x0, x329); + uint32_t x346; + fiat_p256_uint1 x347; + fiat_p256_addcarryx_u32(&x346, &x347, x343, x344, x316); + uint32_t x348; + fiat_p256_uint1 x349; + fiat_p256_addcarryx_u32(&x348, &x349, x347, 0x0, x318); + uint32_t x350; + fiat_p256_uint1 x351; + fiat_p256_addcarryx_u32(&x350, &x351, x349, 0x0, x320); + uint32_t x352; + fiat_p256_uint1 x353; + fiat_p256_addcarryx_u32(&x352, &x353, x351, x310, x322); + uint32_t x354; + fiat_p256_uint1 x355; + fiat_p256_addcarryx_u32(&x354, &x355, x353, x326, x324); + uint32_t x356; + fiat_p256_uint1 x357; + fiat_p256_addcarryx_u32(&x356, &x357, x309, 0x0, 0x0); + uint32_t x358; + fiat_p256_uint1 x359; + fiat_p256_addcarryx_u32(&x358, &x359, x325, 0x0, (fiat_p256_uint1)x356); + uint32_t x360; + fiat_p256_uint1 x361; + fiat_p256_addcarryx_u32(&x360, &x361, x355, x327, x358); + uint32_t x362; + fiat_p256_uint1 x363; + fiat_p256_subborrowx_u32(&x362, &x363, 0x0, x340, UINT32_C(0xffffffff)); + uint32_t x364; + fiat_p256_uint1 x365; + fiat_p256_subborrowx_u32(&x364, &x365, x363, x342, UINT32_C(0xffffffff)); + uint32_t x366; + fiat_p256_uint1 x367; + fiat_p256_subborrowx_u32(&x366, &x367, x365, x346, UINT32_C(0xffffffff)); + uint32_t x368; + fiat_p256_uint1 x369; + fiat_p256_subborrowx_u32(&x368, &x369, x367, x348, 0x0); + uint32_t x370; + fiat_p256_uint1 x371; + fiat_p256_subborrowx_u32(&x370, &x371, x369, x350, 0x0); + uint32_t x372; + fiat_p256_uint1 x373; + fiat_p256_subborrowx_u32(&x372, &x373, x371, x352, 0x0); + uint32_t x374; + fiat_p256_uint1 x375; + fiat_p256_subborrowx_u32(&x374, &x375, x373, x354, 0x1); + uint32_t x376; + fiat_p256_uint1 x377; + fiat_p256_subborrowx_u32(&x376, &x377, x375, x360, UINT32_C(0xffffffff)); + uint32_t x378; + fiat_p256_uint1 x379; + fiat_p256_addcarryx_u32(&x378, &x379, x361, 0x0, 0x0); + uint32_t x380; + fiat_p256_uint1 x381; + 
fiat_p256_subborrowx_u32(&x380, &x381, x377, (fiat_p256_uint1)x378, 0x0); + uint32_t x382; + fiat_p256_cmovznz_u32(&x382, x381, x362, x340); + uint32_t x383; + fiat_p256_cmovznz_u32(&x383, x381, x364, x342); + uint32_t x384; + fiat_p256_cmovznz_u32(&x384, x381, x366, x346); + uint32_t x385; + fiat_p256_cmovznz_u32(&x385, x381, x368, x348); + uint32_t x386; + fiat_p256_cmovznz_u32(&x386, x381, x370, x350); + uint32_t x387; + fiat_p256_cmovznz_u32(&x387, x381, x372, x352); + uint32_t x388; + fiat_p256_cmovznz_u32(&x388, x381, x374, x354); + uint32_t x389; + fiat_p256_cmovznz_u32(&x389, x381, x376, x360); + out1[0] = x382; + out1[1] = x383; + out1[2] = x384; + out1[3] = x385; + out1[4] = x386; + out1[5] = x387; + out1[6] = x388; + out1[7] = x389; +} + +/* + * Input Bounds: + * arg1: [[0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff]] + * Output Bounds: + * out1: [0x0 ~> 0xffffffff] + */ +static void fiat_p256_nonzero(uint32_t* out1, const uint32_t arg1[8]) { + uint32_t x1 = ((arg1[0]) | ((arg1[1]) | ((arg1[2]) | ((arg1[3]) | ((arg1[4]) | ((arg1[5]) | ((arg1[6]) | ((arg1[7]) | (uint32_t)0x0)))))))); + *out1 = x1; +} + +/* + * Input Bounds: + * arg1: [0x0 ~> 0x1] + * arg2: [[0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff]] + * arg3: [[0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff]] + * Output Bounds: + * out1: [[0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff]] + */ +static void fiat_p256_selectznz(uint32_t out1[8], fiat_p256_uint1 arg1, const uint32_t arg2[8], const uint32_t arg3[8]) { + uint32_t x1; + 
fiat_p256_cmovznz_u32(&x1, arg1, (arg2[0]), (arg3[0])); + uint32_t x2; + fiat_p256_cmovznz_u32(&x2, arg1, (arg2[1]), (arg3[1])); + uint32_t x3; + fiat_p256_cmovznz_u32(&x3, arg1, (arg2[2]), (arg3[2])); + uint32_t x4; + fiat_p256_cmovznz_u32(&x4, arg1, (arg2[3]), (arg3[3])); + uint32_t x5; + fiat_p256_cmovznz_u32(&x5, arg1, (arg2[4]), (arg3[4])); + uint32_t x6; + fiat_p256_cmovznz_u32(&x6, arg1, (arg2[5]), (arg3[5])); + uint32_t x7; + fiat_p256_cmovznz_u32(&x7, arg1, (arg2[6]), (arg3[6])); + uint32_t x8; + fiat_p256_cmovznz_u32(&x8, arg1, (arg2[7]), (arg3[7])); + out1[0] = x1; + out1[1] = x2; + out1[2] = x3; + out1[3] = x4; + out1[4] = x5; + out1[5] = x6; + out1[6] = x7; + out1[7] = x8; +} + +/* + * Input Bounds: + * arg1: [[0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff]] + * Output Bounds: + * out1: [[0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff]] + */ +static void fiat_p256_to_bytes(uint8_t out1[32], const uint32_t arg1[8]) { + uint32_t x1 = (arg1[7]); + uint32_t x2 = (arg1[6]); + uint32_t x3 = (arg1[5]); + uint32_t x4 = (arg1[4]); + uint32_t x5 = (arg1[3]); + uint32_t x6 = (arg1[2]); + uint32_t x7 = (arg1[1]); + uint32_t x8 = (arg1[0]); + uint32_t x9 = (x8 >> 8); + uint8_t x10 = (uint8_t)(x8 & UINT8_C(0xff)); + uint32_t x11 = (x9 >> 8); + uint8_t x12 = (uint8_t)(x9 & UINT8_C(0xff)); + uint8_t x13 = (uint8_t)(x11 >> 8); + uint8_t x14 = (uint8_t)(x11 & UINT8_C(0xff)); + uint8_t x15 = (uint8_t)(x13 & UINT8_C(0xff)); + 
uint32_t x16 = (x7 >> 8); + uint8_t x17 = (uint8_t)(x7 & UINT8_C(0xff)); + uint32_t x18 = (x16 >> 8); + uint8_t x19 = (uint8_t)(x16 & UINT8_C(0xff)); + uint8_t x20 = (uint8_t)(x18 >> 8); + uint8_t x21 = (uint8_t)(x18 & UINT8_C(0xff)); + uint8_t x22 = (uint8_t)(x20 & UINT8_C(0xff)); + uint32_t x23 = (x6 >> 8); + uint8_t x24 = (uint8_t)(x6 & UINT8_C(0xff)); + uint32_t x25 = (x23 >> 8); + uint8_t x26 = (uint8_t)(x23 & UINT8_C(0xff)); + uint8_t x27 = (uint8_t)(x25 >> 8); + uint8_t x28 = (uint8_t)(x25 & UINT8_C(0xff)); + uint8_t x29 = (uint8_t)(x27 & UINT8_C(0xff)); + uint32_t x30 = (x5 >> 8); + uint8_t x31 = (uint8_t)(x5 & UINT8_C(0xff)); + uint32_t x32 = (x30 >> 8); + uint8_t x33 = (uint8_t)(x30 & UINT8_C(0xff)); + uint8_t x34 = (uint8_t)(x32 >> 8); + uint8_t x35 = (uint8_t)(x32 & UINT8_C(0xff)); + uint8_t x36 = (uint8_t)(x34 & UINT8_C(0xff)); + uint32_t x37 = (x4 >> 8); + uint8_t x38 = (uint8_t)(x4 & UINT8_C(0xff)); + uint32_t x39 = (x37 >> 8); + uint8_t x40 = (uint8_t)(x37 & UINT8_C(0xff)); + uint8_t x41 = (uint8_t)(x39 >> 8); + uint8_t x42 = (uint8_t)(x39 & UINT8_C(0xff)); + uint8_t x43 = (uint8_t)(x41 & UINT8_C(0xff)); + uint32_t x44 = (x3 >> 8); + uint8_t x45 = (uint8_t)(x3 & UINT8_C(0xff)); + uint32_t x46 = (x44 >> 8); + uint8_t x47 = (uint8_t)(x44 & UINT8_C(0xff)); + uint8_t x48 = (uint8_t)(x46 >> 8); + uint8_t x49 = (uint8_t)(x46 & UINT8_C(0xff)); + uint8_t x50 = (uint8_t)(x48 & UINT8_C(0xff)); + uint32_t x51 = (x2 >> 8); + uint8_t x52 = (uint8_t)(x2 & UINT8_C(0xff)); + uint32_t x53 = (x51 >> 8); + uint8_t x54 = (uint8_t)(x51 & UINT8_C(0xff)); + uint8_t x55 = (uint8_t)(x53 >> 8); + uint8_t x56 = (uint8_t)(x53 & UINT8_C(0xff)); + uint8_t x57 = (uint8_t)(x55 & UINT8_C(0xff)); + uint32_t x58 = (x1 >> 8); + uint8_t x59 = (uint8_t)(x1 & UINT8_C(0xff)); + uint32_t x60 = (x58 >> 8); + uint8_t x61 = (uint8_t)(x58 & UINT8_C(0xff)); + uint8_t x62 = (uint8_t)(x60 >> 8); + uint8_t x63 = (uint8_t)(x60 & UINT8_C(0xff)); + out1[0] = x10; + out1[1] = x12; + out1[2] = x14; + 
out1[3] = x15; + out1[4] = x17; + out1[5] = x19; + out1[6] = x21; + out1[7] = x22; + out1[8] = x24; + out1[9] = x26; + out1[10] = x28; + out1[11] = x29; + out1[12] = x31; + out1[13] = x33; + out1[14] = x35; + out1[15] = x36; + out1[16] = x38; + out1[17] = x40; + out1[18] = x42; + out1[19] = x43; + out1[20] = x45; + out1[21] = x47; + out1[22] = x49; + out1[23] = x50; + out1[24] = x52; + out1[25] = x54; + out1[26] = x56; + out1[27] = x57; + out1[28] = x59; + out1[29] = x61; + out1[30] = x63; + out1[31] = x62; +} + +/* + * Input Bounds: + * arg1: [[0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff]] + * Output Bounds: + * out1: [[0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff], [0x0 ~> 0xffffffff]] + */ +static void fiat_p256_from_bytes(uint32_t out1[8], const uint8_t arg1[32]) { + uint32_t x1 = ((uint32_t)(arg1[31]) << 24); + uint32_t x2 = ((uint32_t)(arg1[30]) << 16); + uint32_t x3 = ((uint32_t)(arg1[29]) << 8); + uint8_t x4 = (arg1[28]); + uint32_t x5 = ((uint32_t)(arg1[27]) << 24); + uint32_t x6 = ((uint32_t)(arg1[26]) << 16); + uint32_t x7 = ((uint32_t)(arg1[25]) << 8); + uint8_t x8 = (arg1[24]); + uint32_t x9 = ((uint32_t)(arg1[23]) << 24); + uint32_t x10 = ((uint32_t)(arg1[22]) << 16); + uint32_t x11 = ((uint32_t)(arg1[21]) << 8); + uint8_t x12 = (arg1[20]); + uint32_t x13 = ((uint32_t)(arg1[19]) << 24); + uint32_t x14 = ((uint32_t)(arg1[18]) << 16); + uint32_t x15 = ((uint32_t)(arg1[17]) << 8); + uint8_t x16 = (arg1[16]); + uint32_t x17 = 
((uint32_t)(arg1[15]) << 24); + uint32_t x18 = ((uint32_t)(arg1[14]) << 16); + uint32_t x19 = ((uint32_t)(arg1[13]) << 8); + uint8_t x20 = (arg1[12]); + uint32_t x21 = ((uint32_t)(arg1[11]) << 24); + uint32_t x22 = ((uint32_t)(arg1[10]) << 16); + uint32_t x23 = ((uint32_t)(arg1[9]) << 8); + uint8_t x24 = (arg1[8]); + uint32_t x25 = ((uint32_t)(arg1[7]) << 24); + uint32_t x26 = ((uint32_t)(arg1[6]) << 16); + uint32_t x27 = ((uint32_t)(arg1[5]) << 8); + uint8_t x28 = (arg1[4]); + uint32_t x29 = ((uint32_t)(arg1[3]) << 24); + uint32_t x30 = ((uint32_t)(arg1[2]) << 16); + uint32_t x31 = ((uint32_t)(arg1[1]) << 8); + uint8_t x32 = (arg1[0]); + uint32_t x33 = (x32 + (x31 + (x30 + x29))); + uint32_t x34 = (x33 & UINT32_C(0xffffffff)); + uint32_t x35 = (x4 + (x3 + (x2 + x1))); + uint32_t x36 = (x8 + (x7 + (x6 + x5))); + uint32_t x37 = (x12 + (x11 + (x10 + x9))); + uint32_t x38 = (x16 + (x15 + (x14 + x13))); + uint32_t x39 = (x20 + (x19 + (x18 + x17))); + uint32_t x40 = (x24 + (x23 + (x22 + x21))); + uint32_t x41 = (x28 + (x27 + (x26 + x25))); + uint32_t x42 = (x41 & UINT32_C(0xffffffff)); + uint32_t x43 = (x40 & UINT32_C(0xffffffff)); + uint32_t x44 = (x39 & UINT32_C(0xffffffff)); + uint32_t x45 = (x38 & UINT32_C(0xffffffff)); + uint32_t x46 = (x37 & UINT32_C(0xffffffff)); + uint32_t x47 = (x36 & UINT32_C(0xffffffff)); + out1[0] = x34; + out1[1] = x42; + out1[2] = x43; + out1[3] = x44; + out1[4] = x45; + out1[5] = x46; + out1[6] = x47; + out1[7] = x35; +} +
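Reviewer note on the byte layout: fiat_p256_to_bytes and fiat_p256_from_bytes above are fully unrolled, which makes the byte order hard to see at a glance. Reading the assignments (out1[0] takes the low byte of arg1[0], out1[31] takes the high byte of arg1[7]), the layout appears to be plain little-endian over eight 32-bit limbs, least significant limb first. The loop-based sketch below illustrates that layout only. The names limbs_to_bytes_le and bytes_to_limbs_le are hypothetical and are not part of the generated file, and unlike the generated code this sketch carries no bounds proofs.

```c
#include <stdint.h>

/* Hypothetical helper (not in the generated file): serialize eight 32-bit
 * limbs, least significant limb first, into 32 little-endian bytes. */
static void limbs_to_bytes_le(uint8_t out[32], const uint32_t in[8]) {
  for (int i = 0; i < 8; i++) {
    for (int j = 0; j < 4; j++) {
      /* Byte j of limb i lands at offset 4*i + j. */
      out[4 * i + j] = (uint8_t)(in[i] >> (8 * j));
    }
  }
}

/* Hypothetical inverse: rebuild the eight limbs from 32 little-endian bytes. */
static void bytes_to_limbs_le(uint32_t out[8], const uint8_t in[32]) {
  for (int i = 0; i < 8; i++) {
    out[i] = (uint32_t)in[4 * i] |
             ((uint32_t)in[4 * i + 1] << 8) |
             ((uint32_t)in[4 * i + 2] << 16) |
             ((uint32_t)in[4 * i + 3] << 24);
  }
}
```

Under this reading, the limb array {0x03020100, 0x07060504, ...} serializes to the bytes 0x00, 0x01, 0x02, and so on, and feeding those bytes back through the inverse returns the original limbs. The sketch only moves bytes; it does not check that the value is below the modulus, which the NOTE at the top of the generated files states as a precondition.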
diff --git a/third_party/fiat/p256_64.c b/third_party/fiat/p256_64.c
new file mode 100644
index 0000000..8e449c6
--- /dev/null
+++ b/third_party/fiat/p256_64.c
@@ -0,0 +1,1211 @@ +/* Autogenerated */ +/* curve description: p256 */ +/* requested operations: (all) */ +/* m = 0xffffffff00000001000000000000000000000000ffffffffffffffffffffffff (from "2^256 - 2^224 + 2^192 + 2^96 - 1") */ +/* machine_wordsize = 64 (from "64") */ +/* */ +/* NOTE: In addition to the bounds specified above each function, all */ +/* functions synthesized for this Montgomery arithmetic require the */ +/* input to be strictly less than the prime modulus (m), and also */ +/* require the input to be in the unique saturated representation. */ +/* All functions also ensure that these two properties are true of */ +/* return values. */ + +#include <stdint.h> +typedef unsigned char fiat_p256_uint1; +typedef signed char fiat_p256_int1; +typedef signed __int128 fiat_p256_int128; +typedef unsigned __int128 fiat_p256_uint128; + + +/* + * Input Bounds: + * arg1: [0x0 ~> 0x1] + * arg2: [0x0 ~> 0xffffffffffffffff] + * arg3: [0x0 ~> 0xffffffffffffffff] + * Output Bounds: + * out1: [0x0 ~> 0xffffffffffffffff] + * out2: [0x0 ~> 0x1] + */ +static void fiat_p256_addcarryx_u64(uint64_t* out1, fiat_p256_uint1* out2, fiat_p256_uint1 arg1, uint64_t arg2, uint64_t arg3) { + fiat_p256_uint128 x1 = ((arg1 + (fiat_p256_uint128)arg2) + arg3); + uint64_t x2 = (uint64_t)(x1 & UINT64_C(0xffffffffffffffff)); + fiat_p256_uint1 x3 = (fiat_p256_uint1)(x1 >> 64); + *out1 = x2; + *out2 = x3; +} + +/* + * Input Bounds: + * arg1: [0x0 ~> 0x1] + * arg2: [0x0 ~> 0xffffffffffffffff] + * arg3: [0x0 ~> 0xffffffffffffffff] + * Output Bounds: + * out1: [0x0 ~> 0xffffffffffffffff] + * out2: [0x0 ~> 0x1] + */ +static void fiat_p256_subborrowx_u64(uint64_t* out1, fiat_p256_uint1* out2, fiat_p256_uint1 arg1, uint64_t arg2, uint64_t arg3) { + fiat_p256_int128 x1 = ((arg2 - (fiat_p256_int128)arg1) - arg3); + fiat_p256_int1 x2 = (fiat_p256_int1)(x1 >> 64); + uint64_t x3 = (uint64_t)(x1 & UINT64_C(0xffffffffffffffff)); + *out1 = x3; + *out2 = (fiat_p256_uint1)(0x0 - x2); +} + +/* + * Input Bounds: + * 
arg1: [0x0 ~> 0xffffffffffffffff] + * arg2: [0x0 ~> 0xffffffffffffffff] + * Output Bounds: + * out1: [0x0 ~> 0xffffffffffffffff] + * out2: [0x0 ~> 0xffffffffffffffff] + */ +static void fiat_p256_mulx_u64(uint64_t* out1, uint64_t* out2, uint64_t arg1, uint64_t arg2) { + fiat_p256_uint128 x1 = ((fiat_p256_uint128)arg1 * arg2); + uint64_t x2 = (uint64_t)(x1 & UINT64_C(0xffffffffffffffff)); + uint64_t x3 = (uint64_t)(x1 >> 64); + *out1 = x2; + *out2 = x3; +} + +/* + * Input Bounds: + * arg1: [0x0 ~> 0x1] + * arg2: [0x0 ~> 0xffffffffffffffff] + * arg3: [0x0 ~> 0xffffffffffffffff] + * Output Bounds: + * out1: [0x0 ~> 0xffffffffffffffff] + */ +static void fiat_p256_cmovznz_u64(uint64_t* out1, fiat_p256_uint1 arg1, uint64_t arg2, uint64_t arg3) { + fiat_p256_uint1 x1 = (!(!arg1)); + uint64_t x2 = ((fiat_p256_int1)(0x0 - x1) & UINT64_C(0xffffffffffffffff)); + uint64_t x3 = ((x2 & arg3) | ((~x2) & arg2)); + *out1 = x3; +} + +/* + * Input Bounds: + * arg1: [[0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff]] + * arg2: [[0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff]] + * Output Bounds: + * out1: [[0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff]] + */ +static void fiat_p256_mul(uint64_t out1[4], const uint64_t arg1[4], const uint64_t arg2[4]) { + uint64_t x1 = (arg1[1]); + uint64_t x2 = (arg1[2]); + uint64_t x3 = (arg1[3]); + uint64_t x4 = (arg1[0]); + uint64_t x5; + uint64_t x6; + fiat_p256_mulx_u64(&x5, &x6, x4, (arg2[3])); + uint64_t x7; + uint64_t x8; + fiat_p256_mulx_u64(&x7, &x8, x4, (arg2[2])); + uint64_t x9; + uint64_t x10; + fiat_p256_mulx_u64(&x9, &x10, x4, (arg2[1])); + uint64_t x11; + uint64_t x12; + fiat_p256_mulx_u64(&x11, &x12, x4, (arg2[0])); + uint64_t x13; + fiat_p256_uint1 x14; + fiat_p256_addcarryx_u64(&x13, &x14, 0x0, x9, x12); + uint64_t x15; + fiat_p256_uint1 
x16; + fiat_p256_addcarryx_u64(&x15, &x16, x14, x7, x10); + uint64_t x17; + fiat_p256_uint1 x18; + fiat_p256_addcarryx_u64(&x17, &x18, x16, x5, x8); + uint64_t x19; + fiat_p256_uint1 x20; + fiat_p256_addcarryx_u64(&x19, &x20, x18, 0x0, x6); + uint64_t x21; + uint64_t x22; + fiat_p256_mulx_u64(&x21, &x22, x11, UINT64_C(0xffffffff00000001)); + uint64_t x23; + uint64_t x24; + fiat_p256_mulx_u64(&x23, &x24, x11, UINT32_C(0xffffffff)); + uint64_t x25; + uint64_t x26; + fiat_p256_mulx_u64(&x25, &x26, x11, UINT64_C(0xffffffffffffffff)); + uint64_t x27; + fiat_p256_uint1 x28; + fiat_p256_addcarryx_u64(&x27, &x28, 0x0, x23, x26); + uint64_t x29; + fiat_p256_uint1 x30; + fiat_p256_addcarryx_u64(&x29, &x30, x28, 0x0, x24); + uint64_t x31; + fiat_p256_uint1 x32; + fiat_p256_addcarryx_u64(&x31, &x32, 0x0, x25, x11); + uint64_t x33; + fiat_p256_uint1 x34; + fiat_p256_addcarryx_u64(&x33, &x34, x32, x27, x13); + uint64_t x35; + fiat_p256_uint1 x36; + fiat_p256_addcarryx_u64(&x35, &x36, x34, x29, x15); + uint64_t x37; + fiat_p256_uint1 x38; + fiat_p256_addcarryx_u64(&x37, &x38, x36, x21, x17); + uint64_t x39; + fiat_p256_uint1 x40; + fiat_p256_addcarryx_u64(&x39, &x40, x38, x22, x19); + uint64_t x41; + fiat_p256_uint1 x42; + fiat_p256_addcarryx_u64(&x41, &x42, x40, 0x0, 0x0); + uint64_t x43; + uint64_t x44; + fiat_p256_mulx_u64(&x43, &x44, x1, (arg2[3])); + uint64_t x45; + uint64_t x46; + fiat_p256_mulx_u64(&x45, &x46, x1, (arg2[2])); + uint64_t x47; + uint64_t x48; + fiat_p256_mulx_u64(&x47, &x48, x1, (arg2[1])); + uint64_t x49; + uint64_t x50; + fiat_p256_mulx_u64(&x49, &x50, x1, (arg2[0])); + uint64_t x51; + fiat_p256_uint1 x52; + fiat_p256_addcarryx_u64(&x51, &x52, 0x0, x47, x50); + uint64_t x53; + fiat_p256_uint1 x54; + fiat_p256_addcarryx_u64(&x53, &x54, x52, x45, x48); + uint64_t x55; + fiat_p256_uint1 x56; + fiat_p256_addcarryx_u64(&x55, &x56, x54, x43, x46); + uint64_t x57; + fiat_p256_uint1 x58; + fiat_p256_addcarryx_u64(&x57, &x58, x56, 0x0, x44); + uint64_t x59; + 
fiat_p256_uint1 x60; + fiat_p256_addcarryx_u64(&x59, &x60, 0x0, x49, x33); + uint64_t x61; + fiat_p256_uint1 x62; + fiat_p256_addcarryx_u64(&x61, &x62, x60, x51, x35); + uint64_t x63; + fiat_p256_uint1 x64; + fiat_p256_addcarryx_u64(&x63, &x64, x62, x53, x37); + uint64_t x65; + fiat_p256_uint1 x66; + fiat_p256_addcarryx_u64(&x65, &x66, x64, x55, x39); + uint64_t x67; + fiat_p256_uint1 x68; + fiat_p256_addcarryx_u64(&x67, &x68, x66, x57, (fiat_p256_uint1)x41); + uint64_t x69; + uint64_t x70; + fiat_p256_mulx_u64(&x69, &x70, x59, UINT64_C(0xffffffff00000001)); + uint64_t x71; + uint64_t x72; + fiat_p256_mulx_u64(&x71, &x72, x59, UINT32_C(0xffffffff)); + uint64_t x73; + uint64_t x74; + fiat_p256_mulx_u64(&x73, &x74, x59, UINT64_C(0xffffffffffffffff)); + uint64_t x75; + fiat_p256_uint1 x76; + fiat_p256_addcarryx_u64(&x75, &x76, 0x0, x71, x74); + uint64_t x77; + fiat_p256_uint1 x78; + fiat_p256_addcarryx_u64(&x77, &x78, x76, 0x0, x72); + uint64_t x79; + fiat_p256_uint1 x80; + fiat_p256_addcarryx_u64(&x79, &x80, 0x0, x73, x59); + uint64_t x81; + fiat_p256_uint1 x82; + fiat_p256_addcarryx_u64(&x81, &x82, x80, x75, x61); + uint64_t x83; + fiat_p256_uint1 x84; + fiat_p256_addcarryx_u64(&x83, &x84, x82, x77, x63); + uint64_t x85; + fiat_p256_uint1 x86; + fiat_p256_addcarryx_u64(&x85, &x86, x84, x69, x65); + uint64_t x87; + fiat_p256_uint1 x88; + fiat_p256_addcarryx_u64(&x87, &x88, x86, x70, x67); + uint64_t x89; + fiat_p256_uint1 x90; + fiat_p256_addcarryx_u64(&x89, &x90, x88, 0x0, x68); + uint64_t x91; + uint64_t x92; + fiat_p256_mulx_u64(&x91, &x92, x2, (arg2[3])); + uint64_t x93; + uint64_t x94; + fiat_p256_mulx_u64(&x93, &x94, x2, (arg2[2])); + uint64_t x95; + uint64_t x96; + fiat_p256_mulx_u64(&x95, &x96, x2, (arg2[1])); + uint64_t x97; + uint64_t x98; + fiat_p256_mulx_u64(&x97, &x98, x2, (arg2[0])); + uint64_t x99; + fiat_p256_uint1 x100; + fiat_p256_addcarryx_u64(&x99, &x100, 0x0, x95, x98); + uint64_t x101; + fiat_p256_uint1 x102; + fiat_p256_addcarryx_u64(&x101, 
&x102, x100, x93, x96); + uint64_t x103; + fiat_p256_uint1 x104; + fiat_p256_addcarryx_u64(&x103, &x104, x102, x91, x94); + uint64_t x105; + fiat_p256_uint1 x106; + fiat_p256_addcarryx_u64(&x105, &x106, x104, 0x0, x92); + uint64_t x107; + fiat_p256_uint1 x108; + fiat_p256_addcarryx_u64(&x107, &x108, 0x0, x97, x81); + uint64_t x109; + fiat_p256_uint1 x110; + fiat_p256_addcarryx_u64(&x109, &x110, x108, x99, x83); + uint64_t x111; + fiat_p256_uint1 x112; + fiat_p256_addcarryx_u64(&x111, &x112, x110, x101, x85); + uint64_t x113; + fiat_p256_uint1 x114; + fiat_p256_addcarryx_u64(&x113, &x114, x112, x103, x87); + uint64_t x115; + fiat_p256_uint1 x116; + fiat_p256_addcarryx_u64(&x115, &x116, x114, x105, x89); + uint64_t x117; + uint64_t x118; + fiat_p256_mulx_u64(&x117, &x118, x107, UINT64_C(0xffffffff00000001)); + uint64_t x119; + uint64_t x120; + fiat_p256_mulx_u64(&x119, &x120, x107, UINT32_C(0xffffffff)); + uint64_t x121; + uint64_t x122; + fiat_p256_mulx_u64(&x121, &x122, x107, UINT64_C(0xffffffffffffffff)); + uint64_t x123; + fiat_p256_uint1 x124; + fiat_p256_addcarryx_u64(&x123, &x124, 0x0, x119, x122); + uint64_t x125; + fiat_p256_uint1 x126; + fiat_p256_addcarryx_u64(&x125, &x126, x124, 0x0, x120); + uint64_t x127; + fiat_p256_uint1 x128; + fiat_p256_addcarryx_u64(&x127, &x128, 0x0, x121, x107); + uint64_t x129; + fiat_p256_uint1 x130; + fiat_p256_addcarryx_u64(&x129, &x130, x128, x123, x109); + uint64_t x131; + fiat_p256_uint1 x132; + fiat_p256_addcarryx_u64(&x131, &x132, x130, x125, x111); + uint64_t x133; + fiat_p256_uint1 x134; + fiat_p256_addcarryx_u64(&x133, &x134, x132, x117, x113); + uint64_t x135; + fiat_p256_uint1 x136; + fiat_p256_addcarryx_u64(&x135, &x136, x134, x118, x115); + uint64_t x137; + fiat_p256_uint1 x138; + fiat_p256_addcarryx_u64(&x137, &x138, x136, 0x0, x116); + uint64_t x139; + uint64_t x140; + fiat_p256_mulx_u64(&x139, &x140, x3, (arg2[3])); + uint64_t x141; + uint64_t x142; + fiat_p256_mulx_u64(&x141, &x142, x3, (arg2[2])); + uint64_t 
x143; + uint64_t x144; + fiat_p256_mulx_u64(&x143, &x144, x3, (arg2[1])); + uint64_t x145; + uint64_t x146; + fiat_p256_mulx_u64(&x145, &x146, x3, (arg2[0])); + uint64_t x147; + fiat_p256_uint1 x148; + fiat_p256_addcarryx_u64(&x147, &x148, 0x0, x143, x146); + uint64_t x149; + fiat_p256_uint1 x150; + fiat_p256_addcarryx_u64(&x149, &x150, x148, x141, x144); + uint64_t x151; + fiat_p256_uint1 x152; + fiat_p256_addcarryx_u64(&x151, &x152, x150, x139, x142); + uint64_t x153; + fiat_p256_uint1 x154; + fiat_p256_addcarryx_u64(&x153, &x154, x152, 0x0, x140); + uint64_t x155; + fiat_p256_uint1 x156; + fiat_p256_addcarryx_u64(&x155, &x156, 0x0, x145, x129); + uint64_t x157; + fiat_p256_uint1 x158; + fiat_p256_addcarryx_u64(&x157, &x158, x156, x147, x131); + uint64_t x159; + fiat_p256_uint1 x160; + fiat_p256_addcarryx_u64(&x159, &x160, x158, x149, x133); + uint64_t x161; + fiat_p256_uint1 x162; + fiat_p256_addcarryx_u64(&x161, &x162, x160, x151, x135); + uint64_t x163; + fiat_p256_uint1 x164; + fiat_p256_addcarryx_u64(&x163, &x164, x162, x153, x137); + uint64_t x165; + uint64_t x166; + fiat_p256_mulx_u64(&x165, &x166, x155, UINT64_C(0xffffffff00000001)); + uint64_t x167; + uint64_t x168; + fiat_p256_mulx_u64(&x167, &x168, x155, UINT32_C(0xffffffff)); + uint64_t x169; + uint64_t x170; + fiat_p256_mulx_u64(&x169, &x170, x155, UINT64_C(0xffffffffffffffff)); + uint64_t x171; + fiat_p256_uint1 x172; + fiat_p256_addcarryx_u64(&x171, &x172, 0x0, x167, x170); + uint64_t x173; + fiat_p256_uint1 x174; + fiat_p256_addcarryx_u64(&x173, &x174, x172, 0x0, x168); + uint64_t x175; + fiat_p256_uint1 x176; + fiat_p256_addcarryx_u64(&x175, &x176, 0x0, x169, x155); + uint64_t x177; + fiat_p256_uint1 x178; + fiat_p256_addcarryx_u64(&x177, &x178, x176, x171, x157); + uint64_t x179; + fiat_p256_uint1 x180; + fiat_p256_addcarryx_u64(&x179, &x180, x178, x173, x159); + uint64_t x181; + fiat_p256_uint1 x182; + fiat_p256_addcarryx_u64(&x181, &x182, x180, x165, x161); + uint64_t x183; + fiat_p256_uint1 
x184; + fiat_p256_addcarryx_u64(&x183, &x184, x182, x166, x163); + uint64_t x185; + fiat_p256_uint1 x186; + fiat_p256_addcarryx_u64(&x185, &x186, x184, 0x0, x164); + uint64_t x187; + fiat_p256_uint1 x188; + fiat_p256_subborrowx_u64(&x187, &x188, 0x0, x177, UINT64_C(0xffffffffffffffff)); + uint64_t x189; + fiat_p256_uint1 x190; + fiat_p256_subborrowx_u64(&x189, &x190, x188, x179, UINT32_C(0xffffffff)); + uint64_t x191; + fiat_p256_uint1 x192; + fiat_p256_subborrowx_u64(&x191, &x192, x190, x181, 0x0); + uint64_t x193; + fiat_p256_uint1 x194; + fiat_p256_subborrowx_u64(&x193, &x194, x192, x183, UINT64_C(0xffffffff00000001)); + uint64_t x195; + fiat_p256_uint1 x196; + fiat_p256_subborrowx_u64(&x195, &x196, x194, x185, 0x0); + uint64_t x197; + fiat_p256_cmovznz_u64(&x197, x196, x187, x177); + uint64_t x198; + fiat_p256_cmovznz_u64(&x198, x196, x189, x179); + uint64_t x199; + fiat_p256_cmovznz_u64(&x199, x196, x191, x181); + uint64_t x200; + fiat_p256_cmovznz_u64(&x200, x196, x193, x183); + out1[0] = x197; + out1[1] = x198; + out1[2] = x199; + out1[3] = x200; +} + +/* + * Input Bounds: + * arg1: [[0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff]] + * Output Bounds: + * out1: [[0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff]] + */ +static void fiat_p256_square(uint64_t out1[4], const uint64_t arg1[4]) { + uint64_t x1 = (arg1[1]); + uint64_t x2 = (arg1[2]); + uint64_t x3 = (arg1[3]); + uint64_t x4 = (arg1[0]); + uint64_t x5; + uint64_t x6; + fiat_p256_mulx_u64(&x5, &x6, x4, (arg1[3])); + uint64_t x7; + uint64_t x8; + fiat_p256_mulx_u64(&x7, &x8, x4, (arg1[2])); + uint64_t x9; + uint64_t x10; + fiat_p256_mulx_u64(&x9, &x10, x4, (arg1[1])); + uint64_t x11; + uint64_t x12; + fiat_p256_mulx_u64(&x11, &x12, x4, (arg1[0])); + uint64_t x13; + fiat_p256_uint1 x14; + fiat_p256_addcarryx_u64(&x13, &x14, 0x0, x9, x12); + uint64_t x15; + fiat_p256_uint1 
x16; + fiat_p256_addcarryx_u64(&x15, &x16, x14, x7, x10); + uint64_t x17; + fiat_p256_uint1 x18; + fiat_p256_addcarryx_u64(&x17, &x18, x16, x5, x8); + uint64_t x19; + fiat_p256_uint1 x20; + fiat_p256_addcarryx_u64(&x19, &x20, x18, 0x0, x6); + uint64_t x21; + uint64_t x22; + fiat_p256_mulx_u64(&x21, &x22, x11, UINT64_C(0xffffffff00000001)); + uint64_t x23; + uint64_t x24; + fiat_p256_mulx_u64(&x23, &x24, x11, UINT32_C(0xffffffff)); + uint64_t x25; + uint64_t x26; + fiat_p256_mulx_u64(&x25, &x26, x11, UINT64_C(0xffffffffffffffff)); + uint64_t x27; + fiat_p256_uint1 x28; + fiat_p256_addcarryx_u64(&x27, &x28, 0x0, x23, x26); + uint64_t x29; + fiat_p256_uint1 x30; + fiat_p256_addcarryx_u64(&x29, &x30, x28, 0x0, x24); + uint64_t x31; + fiat_p256_uint1 x32; + fiat_p256_addcarryx_u64(&x31, &x32, 0x0, x25, x11); + uint64_t x33; + fiat_p256_uint1 x34; + fiat_p256_addcarryx_u64(&x33, &x34, x32, x27, x13); + uint64_t x35; + fiat_p256_uint1 x36; + fiat_p256_addcarryx_u64(&x35, &x36, x34, x29, x15); + uint64_t x37; + fiat_p256_uint1 x38; + fiat_p256_addcarryx_u64(&x37, &x38, x36, x21, x17); + uint64_t x39; + fiat_p256_uint1 x40; + fiat_p256_addcarryx_u64(&x39, &x40, x38, x22, x19); + uint64_t x41; + fiat_p256_uint1 x42; + fiat_p256_addcarryx_u64(&x41, &x42, x40, 0x0, 0x0); + uint64_t x43; + uint64_t x44; + fiat_p256_mulx_u64(&x43, &x44, x1, (arg1[3])); + uint64_t x45; + uint64_t x46; + fiat_p256_mulx_u64(&x45, &x46, x1, (arg1[2])); + uint64_t x47; + uint64_t x48; + fiat_p256_mulx_u64(&x47, &x48, x1, (arg1[1])); + uint64_t x49; + uint64_t x50; + fiat_p256_mulx_u64(&x49, &x50, x1, (arg1[0])); + uint64_t x51; + fiat_p256_uint1 x52; + fiat_p256_addcarryx_u64(&x51, &x52, 0x0, x47, x50); + uint64_t x53; + fiat_p256_uint1 x54; + fiat_p256_addcarryx_u64(&x53, &x54, x52, x45, x48); + uint64_t x55; + fiat_p256_uint1 x56; + fiat_p256_addcarryx_u64(&x55, &x56, x54, x43, x46); + uint64_t x57; + fiat_p256_uint1 x58; + fiat_p256_addcarryx_u64(&x57, &x58, x56, 0x0, x44); + uint64_t x59; + 
fiat_p256_uint1 x60; + fiat_p256_addcarryx_u64(&x59, &x60, 0x0, x49, x33); + uint64_t x61; + fiat_p256_uint1 x62; + fiat_p256_addcarryx_u64(&x61, &x62, x60, x51, x35); + uint64_t x63; + fiat_p256_uint1 x64; + fiat_p256_addcarryx_u64(&x63, &x64, x62, x53, x37); + uint64_t x65; + fiat_p256_uint1 x66; + fiat_p256_addcarryx_u64(&x65, &x66, x64, x55, x39); + uint64_t x67; + fiat_p256_uint1 x68; + fiat_p256_addcarryx_u64(&x67, &x68, x66, x57, (fiat_p256_uint1)x41); + uint64_t x69; + uint64_t x70; + fiat_p256_mulx_u64(&x69, &x70, x59, UINT64_C(0xffffffff00000001)); + uint64_t x71; + uint64_t x72; + fiat_p256_mulx_u64(&x71, &x72, x59, UINT32_C(0xffffffff)); + uint64_t x73; + uint64_t x74; + fiat_p256_mulx_u64(&x73, &x74, x59, UINT64_C(0xffffffffffffffff)); + uint64_t x75; + fiat_p256_uint1 x76; + fiat_p256_addcarryx_u64(&x75, &x76, 0x0, x71, x74); + uint64_t x77; + fiat_p256_uint1 x78; + fiat_p256_addcarryx_u64(&x77, &x78, x76, 0x0, x72); + uint64_t x79; + fiat_p256_uint1 x80; + fiat_p256_addcarryx_u64(&x79, &x80, 0x0, x73, x59); + uint64_t x81; + fiat_p256_uint1 x82; + fiat_p256_addcarryx_u64(&x81, &x82, x80, x75, x61); + uint64_t x83; + fiat_p256_uint1 x84; + fiat_p256_addcarryx_u64(&x83, &x84, x82, x77, x63); + uint64_t x85; + fiat_p256_uint1 x86; + fiat_p256_addcarryx_u64(&x85, &x86, x84, x69, x65); + uint64_t x87; + fiat_p256_uint1 x88; + fiat_p256_addcarryx_u64(&x87, &x88, x86, x70, x67); + uint64_t x89; + fiat_p256_uint1 x90; + fiat_p256_addcarryx_u64(&x89, &x90, x88, 0x0, x68); + uint64_t x91; + uint64_t x92; + fiat_p256_mulx_u64(&x91, &x92, x2, (arg1[3])); + uint64_t x93; + uint64_t x94; + fiat_p256_mulx_u64(&x93, &x94, x2, (arg1[2])); + uint64_t x95; + uint64_t x96; + fiat_p256_mulx_u64(&x95, &x96, x2, (arg1[1])); + uint64_t x97; + uint64_t x98; + fiat_p256_mulx_u64(&x97, &x98, x2, (arg1[0])); + uint64_t x99; + fiat_p256_uint1 x100; + fiat_p256_addcarryx_u64(&x99, &x100, 0x0, x95, x98); + uint64_t x101; + fiat_p256_uint1 x102; + fiat_p256_addcarryx_u64(&x101, 
&x102, x100, x93, x96); + uint64_t x103; + fiat_p256_uint1 x104; + fiat_p256_addcarryx_u64(&x103, &x104, x102, x91, x94); + uint64_t x105; + fiat_p256_uint1 x106; + fiat_p256_addcarryx_u64(&x105, &x106, x104, 0x0, x92); + uint64_t x107; + fiat_p256_uint1 x108; + fiat_p256_addcarryx_u64(&x107, &x108, 0x0, x97, x81); + uint64_t x109; + fiat_p256_uint1 x110; + fiat_p256_addcarryx_u64(&x109, &x110, x108, x99, x83); + uint64_t x111; + fiat_p256_uint1 x112; + fiat_p256_addcarryx_u64(&x111, &x112, x110, x101, x85); + uint64_t x113; + fiat_p256_uint1 x114; + fiat_p256_addcarryx_u64(&x113, &x114, x112, x103, x87); + uint64_t x115; + fiat_p256_uint1 x116; + fiat_p256_addcarryx_u64(&x115, &x116, x114, x105, x89); + uint64_t x117; + uint64_t x118; + fiat_p256_mulx_u64(&x117, &x118, x107, UINT64_C(0xffffffff00000001)); + uint64_t x119; + uint64_t x120; + fiat_p256_mulx_u64(&x119, &x120, x107, UINT32_C(0xffffffff)); + uint64_t x121; + uint64_t x122; + fiat_p256_mulx_u64(&x121, &x122, x107, UINT64_C(0xffffffffffffffff)); + uint64_t x123; + fiat_p256_uint1 x124; + fiat_p256_addcarryx_u64(&x123, &x124, 0x0, x119, x122); + uint64_t x125; + fiat_p256_uint1 x126; + fiat_p256_addcarryx_u64(&x125, &x126, x124, 0x0, x120); + uint64_t x127; + fiat_p256_uint1 x128; + fiat_p256_addcarryx_u64(&x127, &x128, 0x0, x121, x107); + uint64_t x129; + fiat_p256_uint1 x130; + fiat_p256_addcarryx_u64(&x129, &x130, x128, x123, x109); + uint64_t x131; + fiat_p256_uint1 x132; + fiat_p256_addcarryx_u64(&x131, &x132, x130, x125, x111); + uint64_t x133; + fiat_p256_uint1 x134; + fiat_p256_addcarryx_u64(&x133, &x134, x132, x117, x113); + uint64_t x135; + fiat_p256_uint1 x136; + fiat_p256_addcarryx_u64(&x135, &x136, x134, x118, x115); + uint64_t x137; + fiat_p256_uint1 x138; + fiat_p256_addcarryx_u64(&x137, &x138, x136, 0x0, x116); + uint64_t x139; + uint64_t x140; + fiat_p256_mulx_u64(&x139, &x140, x3, (arg1[3])); + uint64_t x141; + uint64_t x142; + fiat_p256_mulx_u64(&x141, &x142, x3, (arg1[2])); + uint64_t 
x143; + uint64_t x144; + fiat_p256_mulx_u64(&x143, &x144, x3, (arg1[1])); + uint64_t x145; + uint64_t x146; + fiat_p256_mulx_u64(&x145, &x146, x3, (arg1[0])); + uint64_t x147; + fiat_p256_uint1 x148; + fiat_p256_addcarryx_u64(&x147, &x148, 0x0, x143, x146); + uint64_t x149; + fiat_p256_uint1 x150; + fiat_p256_addcarryx_u64(&x149, &x150, x148, x141, x144); + uint64_t x151; + fiat_p256_uint1 x152; + fiat_p256_addcarryx_u64(&x151, &x152, x150, x139, x142); + uint64_t x153; + fiat_p256_uint1 x154; + fiat_p256_addcarryx_u64(&x153, &x154, x152, 0x0, x140); + uint64_t x155; + fiat_p256_uint1 x156; + fiat_p256_addcarryx_u64(&x155, &x156, 0x0, x145, x129); + uint64_t x157; + fiat_p256_uint1 x158; + fiat_p256_addcarryx_u64(&x157, &x158, x156, x147, x131); + uint64_t x159; + fiat_p256_uint1 x160; + fiat_p256_addcarryx_u64(&x159, &x160, x158, x149, x133); + uint64_t x161; + fiat_p256_uint1 x162; + fiat_p256_addcarryx_u64(&x161, &x162, x160, x151, x135); + uint64_t x163; + fiat_p256_uint1 x164; + fiat_p256_addcarryx_u64(&x163, &x164, x162, x153, x137); + uint64_t x165; + uint64_t x166; + fiat_p256_mulx_u64(&x165, &x166, x155, UINT64_C(0xffffffff00000001)); + uint64_t x167; + uint64_t x168; + fiat_p256_mulx_u64(&x167, &x168, x155, UINT32_C(0xffffffff)); + uint64_t x169; + uint64_t x170; + fiat_p256_mulx_u64(&x169, &x170, x155, UINT64_C(0xffffffffffffffff)); + uint64_t x171; + fiat_p256_uint1 x172; + fiat_p256_addcarryx_u64(&x171, &x172, 0x0, x167, x170); + uint64_t x173; + fiat_p256_uint1 x174; + fiat_p256_addcarryx_u64(&x173, &x174, x172, 0x0, x168); + uint64_t x175; + fiat_p256_uint1 x176; + fiat_p256_addcarryx_u64(&x175, &x176, 0x0, x169, x155); + uint64_t x177; + fiat_p256_uint1 x178; + fiat_p256_addcarryx_u64(&x177, &x178, x176, x171, x157); + uint64_t x179; + fiat_p256_uint1 x180; + fiat_p256_addcarryx_u64(&x179, &x180, x178, x173, x159); + uint64_t x181; + fiat_p256_uint1 x182; + fiat_p256_addcarryx_u64(&x181, &x182, x180, x165, x161); + uint64_t x183; + fiat_p256_uint1 
x184; + fiat_p256_addcarryx_u64(&x183, &x184, x182, x166, x163); + uint64_t x185; + fiat_p256_uint1 x186; + fiat_p256_addcarryx_u64(&x185, &x186, x184, 0x0, x164); + uint64_t x187; + fiat_p256_uint1 x188; + fiat_p256_subborrowx_u64(&x187, &x188, 0x0, x177, UINT64_C(0xffffffffffffffff)); + uint64_t x189; + fiat_p256_uint1 x190; + fiat_p256_subborrowx_u64(&x189, &x190, x188, x179, UINT32_C(0xffffffff)); + uint64_t x191; + fiat_p256_uint1 x192; + fiat_p256_subborrowx_u64(&x191, &x192, x190, x181, 0x0); + uint64_t x193; + fiat_p256_uint1 x194; + fiat_p256_subborrowx_u64(&x193, &x194, x192, x183, UINT64_C(0xffffffff00000001)); + uint64_t x195; + fiat_p256_uint1 x196; + fiat_p256_subborrowx_u64(&x195, &x196, x194, x185, 0x0); + uint64_t x197; + fiat_p256_cmovznz_u64(&x197, x196, x187, x177); + uint64_t x198; + fiat_p256_cmovznz_u64(&x198, x196, x189, x179); + uint64_t x199; + fiat_p256_cmovznz_u64(&x199, x196, x191, x181); + uint64_t x200; + fiat_p256_cmovznz_u64(&x200, x196, x193, x183); + out1[0] = x197; + out1[1] = x198; + out1[2] = x199; + out1[3] = x200; +} + +/* + * Input Bounds: + * arg1: [[0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff]] + * arg2: [[0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff]] + * Output Bounds: + * out1: [[0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff]] + */ +static void fiat_p256_add(uint64_t out1[4], const uint64_t arg1[4], const uint64_t arg2[4]) { + uint64_t x1; + fiat_p256_uint1 x2; + fiat_p256_addcarryx_u64(&x1, &x2, 0x0, (arg2[0]), (arg1[0])); + uint64_t x3; + fiat_p256_uint1 x4; + fiat_p256_addcarryx_u64(&x3, &x4, x2, (arg2[1]), (arg1[1])); + uint64_t x5; + fiat_p256_uint1 x6; + fiat_p256_addcarryx_u64(&x5, &x6, x4, (arg2[2]), (arg1[2])); + uint64_t x7; + fiat_p256_uint1 x8; + fiat_p256_addcarryx_u64(&x7, &x8, x6, (arg2[3]), (arg1[3])); + 
uint64_t x9; + fiat_p256_uint1 x10; + fiat_p256_subborrowx_u64(&x9, &x10, 0x0, x1, UINT64_C(0xffffffffffffffff)); + uint64_t x11; + fiat_p256_uint1 x12; + fiat_p256_subborrowx_u64(&x11, &x12, x10, x3, UINT32_C(0xffffffff)); + uint64_t x13; + fiat_p256_uint1 x14; + fiat_p256_subborrowx_u64(&x13, &x14, x12, x5, 0x0); + uint64_t x15; + fiat_p256_uint1 x16; + fiat_p256_subborrowx_u64(&x15, &x16, x14, x7, UINT64_C(0xffffffff00000001)); + uint64_t x17; + fiat_p256_uint1 x18; + fiat_p256_subborrowx_u64(&x17, &x18, x16, x8, 0x0); + uint64_t x19; + fiat_p256_cmovznz_u64(&x19, x18, x9, x1); + uint64_t x20; + fiat_p256_cmovznz_u64(&x20, x18, x11, x3); + uint64_t x21; + fiat_p256_cmovznz_u64(&x21, x18, x13, x5); + uint64_t x22; + fiat_p256_cmovznz_u64(&x22, x18, x15, x7); + out1[0] = x19; + out1[1] = x20; + out1[2] = x21; + out1[3] = x22; +} + +/* + * Input Bounds: + * arg1: [[0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff]] + * arg2: [[0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff]] + * Output Bounds: + * out1: [[0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff]] + */ +static void fiat_p256_sub(uint64_t out1[4], const uint64_t arg1[4], const uint64_t arg2[4]) { + uint64_t x1; + fiat_p256_uint1 x2; + fiat_p256_subborrowx_u64(&x1, &x2, 0x0, (arg1[0]), (arg2[0])); + uint64_t x3; + fiat_p256_uint1 x4; + fiat_p256_subborrowx_u64(&x3, &x4, x2, (arg1[1]), (arg2[1])); + uint64_t x5; + fiat_p256_uint1 x6; + fiat_p256_subborrowx_u64(&x5, &x6, x4, (arg1[2]), (arg2[2])); + uint64_t x7; + fiat_p256_uint1 x8; + fiat_p256_subborrowx_u64(&x7, &x8, x6, (arg1[3]), (arg2[3])); + uint64_t x9; + fiat_p256_cmovznz_u64(&x9, x8, 0x0, UINT64_C(0xffffffffffffffff)); + uint64_t x10; + fiat_p256_uint1 x11; + fiat_p256_addcarryx_u64(&x10, &x11, 0x0, (x9 & UINT64_C(0xffffffffffffffff)), x1); + uint64_t x12; + 
fiat_p256_uint1 x13; + fiat_p256_addcarryx_u64(&x12, &x13, x11, (x9 & UINT32_C(0xffffffff)), x3); + uint64_t x14; + fiat_p256_uint1 x15; + fiat_p256_addcarryx_u64(&x14, &x15, x13, 0x0, x5); + uint64_t x16; + fiat_p256_uint1 x17; + fiat_p256_addcarryx_u64(&x16, &x17, x15, (x9 & UINT64_C(0xffffffff00000001)), x7); + out1[0] = x10; + out1[1] = x12; + out1[2] = x14; + out1[3] = x16; +} + +/* + * Input Bounds: + * arg1: [[0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff]] + * Output Bounds: + * out1: [[0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff]] + */ +static void fiat_p256_opp(uint64_t out1[4], const uint64_t arg1[4]) { + uint64_t x1; + fiat_p256_uint1 x2; + fiat_p256_subborrowx_u64(&x1, &x2, 0x0, 0x0, (arg1[0])); + uint64_t x3; + fiat_p256_uint1 x4; + fiat_p256_subborrowx_u64(&x3, &x4, x2, 0x0, (arg1[1])); + uint64_t x5; + fiat_p256_uint1 x6; + fiat_p256_subborrowx_u64(&x5, &x6, x4, 0x0, (arg1[2])); + uint64_t x7; + fiat_p256_uint1 x8; + fiat_p256_subborrowx_u64(&x7, &x8, x6, 0x0, (arg1[3])); + uint64_t x9; + fiat_p256_cmovznz_u64(&x9, x8, 0x0, UINT64_C(0xffffffffffffffff)); + uint64_t x10; + fiat_p256_uint1 x11; + fiat_p256_addcarryx_u64(&x10, &x11, 0x0, (x9 & UINT64_C(0xffffffffffffffff)), x1); + uint64_t x12; + fiat_p256_uint1 x13; + fiat_p256_addcarryx_u64(&x12, &x13, x11, (x9 & UINT32_C(0xffffffff)), x3); + uint64_t x14; + fiat_p256_uint1 x15; + fiat_p256_addcarryx_u64(&x14, &x15, x13, 0x0, x5); + uint64_t x16; + fiat_p256_uint1 x17; + fiat_p256_addcarryx_u64(&x16, &x17, x15, (x9 & UINT64_C(0xffffffff00000001)), x7); + out1[0] = x10; + out1[1] = x12; + out1[2] = x14; + out1[3] = x16; +} + +/* + * Input Bounds: + * arg1: [[0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff]] + * Output Bounds: + * out1: [[0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff], [0x0 ~> 
0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff]] + */ +static void fiat_p256_from_montgomery(uint64_t out1[4], const uint64_t arg1[4]) { + uint64_t x1 = (arg1[0]); + uint64_t x2; + uint64_t x3; + fiat_p256_mulx_u64(&x2, &x3, x1, UINT64_C(0xffffffff00000001)); + uint64_t x4; + uint64_t x5; + fiat_p256_mulx_u64(&x4, &x5, x1, UINT32_C(0xffffffff)); + uint64_t x6; + uint64_t x7; + fiat_p256_mulx_u64(&x6, &x7, x1, UINT64_C(0xffffffffffffffff)); + uint64_t x8; + fiat_p256_uint1 x9; + fiat_p256_addcarryx_u64(&x8, &x9, 0x0, x4, x7); + uint64_t x10; + fiat_p256_uint1 x11; + fiat_p256_addcarryx_u64(&x10, &x11, 0x0, x6, x1); + uint64_t x12; + fiat_p256_uint1 x13; + fiat_p256_addcarryx_u64(&x12, &x13, x11, x8, 0x0); + uint64_t x14; + fiat_p256_uint1 x15; + fiat_p256_addcarryx_u64(&x14, &x15, 0x0, (arg1[1]), x12); + uint64_t x16; + uint64_t x17; + fiat_p256_mulx_u64(&x16, &x17, x14, UINT64_C(0xffffffff00000001)); + uint64_t x18; + uint64_t x19; + fiat_p256_mulx_u64(&x18, &x19, x14, UINT32_C(0xffffffff)); + uint64_t x20; + uint64_t x21; + fiat_p256_mulx_u64(&x20, &x21, x14, UINT64_C(0xffffffffffffffff)); + uint64_t x22; + fiat_p256_uint1 x23; + fiat_p256_addcarryx_u64(&x22, &x23, 0x0, x18, x21); + uint64_t x24; + fiat_p256_uint1 x25; + fiat_p256_addcarryx_u64(&x24, &x25, x9, 0x0, x5); + uint64_t x26; + fiat_p256_uint1 x27; + fiat_p256_addcarryx_u64(&x26, &x27, x13, x24, 0x0); + uint64_t x28; + fiat_p256_uint1 x29; + fiat_p256_addcarryx_u64(&x28, &x29, x15, 0x0, x26); + uint64_t x30; + fiat_p256_uint1 x31; + fiat_p256_addcarryx_u64(&x30, &x31, 0x0, x20, x14); + uint64_t x32; + fiat_p256_uint1 x33; + fiat_p256_addcarryx_u64(&x32, &x33, x31, x22, x28); + uint64_t x34; + fiat_p256_uint1 x35; + fiat_p256_addcarryx_u64(&x34, &x35, x23, 0x0, x19); + uint64_t x36; + fiat_p256_uint1 x37; + fiat_p256_addcarryx_u64(&x36, &x37, x33, x34, x2); + uint64_t x38; + fiat_p256_uint1 x39; + fiat_p256_addcarryx_u64(&x38, &x39, x37, x16, x3); + uint64_t x40; + fiat_p256_uint1 x41; + 
fiat_p256_addcarryx_u64(&x40, &x41, 0x0, (arg1[2]), x32); + uint64_t x42; + fiat_p256_uint1 x43; + fiat_p256_addcarryx_u64(&x42, &x43, x41, 0x0, x36); + uint64_t x44; + fiat_p256_uint1 x45; + fiat_p256_addcarryx_u64(&x44, &x45, x43, 0x0, x38); + uint64_t x46; + uint64_t x47; + fiat_p256_mulx_u64(&x46, &x47, x40, UINT64_C(0xffffffff00000001)); + uint64_t x48; + uint64_t x49; + fiat_p256_mulx_u64(&x48, &x49, x40, UINT32_C(0xffffffff)); + uint64_t x50; + uint64_t x51; + fiat_p256_mulx_u64(&x50, &x51, x40, UINT64_C(0xffffffffffffffff)); + uint64_t x52; + fiat_p256_uint1 x53; + fiat_p256_addcarryx_u64(&x52, &x53, 0x0, x48, x51); + uint64_t x54; + fiat_p256_uint1 x55; + fiat_p256_addcarryx_u64(&x54, &x55, 0x0, x50, x40); + uint64_t x56; + fiat_p256_uint1 x57; + fiat_p256_addcarryx_u64(&x56, &x57, x55, x52, x42); + uint64_t x58; + fiat_p256_uint1 x59; + fiat_p256_addcarryx_u64(&x58, &x59, x53, 0x0, x49); + uint64_t x60; + fiat_p256_uint1 x61; + fiat_p256_addcarryx_u64(&x60, &x61, x57, x58, x44); + uint64_t x62; + fiat_p256_uint1 x63; + fiat_p256_addcarryx_u64(&x62, &x63, x39, x17, 0x0); + uint64_t x64; + fiat_p256_uint1 x65; + fiat_p256_addcarryx_u64(&x64, &x65, x45, 0x0, x62); + uint64_t x66; + fiat_p256_uint1 x67; + fiat_p256_addcarryx_u64(&x66, &x67, x61, x46, x64); + uint64_t x68; + fiat_p256_uint1 x69; + fiat_p256_addcarryx_u64(&x68, &x69, 0x0, (arg1[3]), x56); + uint64_t x70; + fiat_p256_uint1 x71; + fiat_p256_addcarryx_u64(&x70, &x71, x69, 0x0, x60); + uint64_t x72; + fiat_p256_uint1 x73; + fiat_p256_addcarryx_u64(&x72, &x73, x71, 0x0, x66); + uint64_t x74; + uint64_t x75; + fiat_p256_mulx_u64(&x74, &x75, x68, UINT64_C(0xffffffff00000001)); + uint64_t x76; + uint64_t x77; + fiat_p256_mulx_u64(&x76, &x77, x68, UINT32_C(0xffffffff)); + uint64_t x78; + uint64_t x79; + fiat_p256_mulx_u64(&x78, &x79, x68, UINT64_C(0xffffffffffffffff)); + uint64_t x80; + fiat_p256_uint1 x81; + fiat_p256_addcarryx_u64(&x80, &x81, 0x0, x76, x79); + uint64_t x82; + fiat_p256_uint1 x83; + 
fiat_p256_addcarryx_u64(&x82, &x83, 0x0, x78, x68); + uint64_t x84; + fiat_p256_uint1 x85; + fiat_p256_addcarryx_u64(&x84, &x85, x83, x80, x70); + uint64_t x86; + fiat_p256_uint1 x87; + fiat_p256_addcarryx_u64(&x86, &x87, x81, 0x0, x77); + uint64_t x88; + fiat_p256_uint1 x89; + fiat_p256_addcarryx_u64(&x88, &x89, x85, x86, x72); + uint64_t x90; + fiat_p256_uint1 x91; + fiat_p256_addcarryx_u64(&x90, &x91, x67, x47, 0x0); + uint64_t x92; + fiat_p256_uint1 x93; + fiat_p256_addcarryx_u64(&x92, &x93, x73, 0x0, x90); + uint64_t x94; + fiat_p256_uint1 x95; + fiat_p256_addcarryx_u64(&x94, &x95, x89, x74, x92); + uint64_t x96; + fiat_p256_uint1 x97; + fiat_p256_addcarryx_u64(&x96, &x97, x95, x75, 0x0); + uint64_t x98; + fiat_p256_uint1 x99; + fiat_p256_subborrowx_u64(&x98, &x99, 0x0, x84, UINT64_C(0xffffffffffffffff)); + uint64_t x100; + fiat_p256_uint1 x101; + fiat_p256_subborrowx_u64(&x100, &x101, x99, x88, UINT32_C(0xffffffff)); + uint64_t x102; + fiat_p256_uint1 x103; + fiat_p256_subborrowx_u64(&x102, &x103, x101, x94, 0x0); + uint64_t x104; + fiat_p256_uint1 x105; + fiat_p256_subborrowx_u64(&x104, &x105, x103, x96, UINT64_C(0xffffffff00000001)); + uint64_t x106; + fiat_p256_uint1 x107; + fiat_p256_subborrowx_u64(&x106, &x107, x105, 0x0, 0x0); + uint64_t x108; + fiat_p256_cmovznz_u64(&x108, x107, x98, x84); + uint64_t x109; + fiat_p256_cmovznz_u64(&x109, x107, x100, x88); + uint64_t x110; + fiat_p256_cmovznz_u64(&x110, x107, x102, x94); + uint64_t x111; + fiat_p256_cmovznz_u64(&x111, x107, x104, x96); + out1[0] = x108; + out1[1] = x109; + out1[2] = x110; + out1[3] = x111; +} + +/* + * Input Bounds: + * arg1: [[0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff]] + * Output Bounds: + * out1: [0x0 ~> 0xffffffffffffffff] + */ +static void fiat_p256_nonzero(uint64_t* out1, const uint64_t arg1[4]) { + uint64_t x1 = ((arg1[0]) | ((arg1[1]) | ((arg1[2]) | ((arg1[3]) | (uint64_t)0x0)))); + *out1 = x1; +} + +/* + * 
Input Bounds: + * arg1: [0x0 ~> 0x1] + * arg2: [[0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff]] + * arg3: [[0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff]] + * Output Bounds: + * out1: [[0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff]] + */ +static void fiat_p256_selectznz(uint64_t out1[4], fiat_p256_uint1 arg1, const uint64_t arg2[4], const uint64_t arg3[4]) { + uint64_t x1; + fiat_p256_cmovznz_u64(&x1, arg1, (arg2[0]), (arg3[0])); + uint64_t x2; + fiat_p256_cmovznz_u64(&x2, arg1, (arg2[1]), (arg3[1])); + uint64_t x3; + fiat_p256_cmovznz_u64(&x3, arg1, (arg2[2]), (arg3[2])); + uint64_t x4; + fiat_p256_cmovznz_u64(&x4, arg1, (arg2[3]), (arg3[3])); + out1[0] = x1; + out1[1] = x2; + out1[2] = x3; + out1[3] = x4; +} + +/* + * Input Bounds: + * arg1: [[0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff]] + * Output Bounds: + * out1: [[0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff]] + */ +static void fiat_p256_to_bytes(uint8_t out1[32], const uint64_t arg1[4]) { + uint64_t x1 = (arg1[3]); + uint64_t x2 = (arg1[2]); + uint64_t x3 = (arg1[1]); + uint64_t x4 = (arg1[0]); + uint64_t x5 = (x4 >> 8); + uint8_t x6 = (uint8_t)(x4 & UINT8_C(0xff)); + uint64_t x7 = (x5 >> 8); + uint8_t x8 = (uint8_t)(x5 & UINT8_C(0xff)); + uint64_t x9 = (x7 >> 8); + uint8_t x10 = (uint8_t)(x7 & UINT8_C(0xff)); + 
uint64_t x11 = (x9 >> 8); + uint8_t x12 = (uint8_t)(x9 & UINT8_C(0xff)); + uint64_t x13 = (x11 >> 8); + uint8_t x14 = (uint8_t)(x11 & UINT8_C(0xff)); + uint64_t x15 = (x13 >> 8); + uint8_t x16 = (uint8_t)(x13 & UINT8_C(0xff)); + uint8_t x17 = (uint8_t)(x15 >> 8); + uint8_t x18 = (uint8_t)(x15 & UINT8_C(0xff)); + uint8_t x19 = (uint8_t)(x17 & UINT8_C(0xff)); + uint64_t x20 = (x3 >> 8); + uint8_t x21 = (uint8_t)(x3 & UINT8_C(0xff)); + uint64_t x22 = (x20 >> 8); + uint8_t x23 = (uint8_t)(x20 & UINT8_C(0xff)); + uint64_t x24 = (x22 >> 8); + uint8_t x25 = (uint8_t)(x22 & UINT8_C(0xff)); + uint64_t x26 = (x24 >> 8); + uint8_t x27 = (uint8_t)(x24 & UINT8_C(0xff)); + uint64_t x28 = (x26 >> 8); + uint8_t x29 = (uint8_t)(x26 & UINT8_C(0xff)); + uint64_t x30 = (x28 >> 8); + uint8_t x31 = (uint8_t)(x28 & UINT8_C(0xff)); + uint8_t x32 = (uint8_t)(x30 >> 8); + uint8_t x33 = (uint8_t)(x30 & UINT8_C(0xff)); + uint8_t x34 = (uint8_t)(x32 & UINT8_C(0xff)); + uint64_t x35 = (x2 >> 8); + uint8_t x36 = (uint8_t)(x2 & UINT8_C(0xff)); + uint64_t x37 = (x35 >> 8); + uint8_t x38 = (uint8_t)(x35 & UINT8_C(0xff)); + uint64_t x39 = (x37 >> 8); + uint8_t x40 = (uint8_t)(x37 & UINT8_C(0xff)); + uint64_t x41 = (x39 >> 8); + uint8_t x42 = (uint8_t)(x39 & UINT8_C(0xff)); + uint64_t x43 = (x41 >> 8); + uint8_t x44 = (uint8_t)(x41 & UINT8_C(0xff)); + uint64_t x45 = (x43 >> 8); + uint8_t x46 = (uint8_t)(x43 & UINT8_C(0xff)); + uint8_t x47 = (uint8_t)(x45 >> 8); + uint8_t x48 = (uint8_t)(x45 & UINT8_C(0xff)); + uint8_t x49 = (uint8_t)(x47 & UINT8_C(0xff)); + uint64_t x50 = (x1 >> 8); + uint8_t x51 = (uint8_t)(x1 & UINT8_C(0xff)); + uint64_t x52 = (x50 >> 8); + uint8_t x53 = (uint8_t)(x50 & UINT8_C(0xff)); + uint64_t x54 = (x52 >> 8); + uint8_t x55 = (uint8_t)(x52 & UINT8_C(0xff)); + uint64_t x56 = (x54 >> 8); + uint8_t x57 = (uint8_t)(x54 & UINT8_C(0xff)); + uint64_t x58 = (x56 >> 8); + uint8_t x59 = (uint8_t)(x56 & UINT8_C(0xff)); + uint64_t x60 = (x58 >> 8); + uint8_t x61 = (uint8_t)(x58 & 
UINT8_C(0xff)); + uint8_t x62 = (uint8_t)(x60 >> 8); + uint8_t x63 = (uint8_t)(x60 & UINT8_C(0xff)); + out1[0] = x6; + out1[1] = x8; + out1[2] = x10; + out1[3] = x12; + out1[4] = x14; + out1[5] = x16; + out1[6] = x18; + out1[7] = x19; + out1[8] = x21; + out1[9] = x23; + out1[10] = x25; + out1[11] = x27; + out1[12] = x29; + out1[13] = x31; + out1[14] = x33; + out1[15] = x34; + out1[16] = x36; + out1[17] = x38; + out1[18] = x40; + out1[19] = x42; + out1[20] = x44; + out1[21] = x46; + out1[22] = x48; + out1[23] = x49; + out1[24] = x51; + out1[25] = x53; + out1[26] = x55; + out1[27] = x57; + out1[28] = x59; + out1[29] = x61; + out1[30] = x63; + out1[31] = x62; +} + +/* + * Input Bounds: + * arg1: [[0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff], [0x0 ~> 0xff]] + * Output Bounds: + * out1: [[0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff]] + */ +static void fiat_p256_from_bytes(uint64_t out1[4], const uint8_t arg1[32]) { + uint64_t x1 = ((uint64_t)(arg1[31]) << 56); + uint64_t x2 = ((uint64_t)(arg1[30]) << 48); + uint64_t x3 = ((uint64_t)(arg1[29]) << 40); + uint64_t x4 = ((uint64_t)(arg1[28]) << 32); + uint64_t x5 = ((uint64_t)(arg1[27]) << 24); + uint64_t x6 = ((uint64_t)(arg1[26]) << 16); + uint64_t x7 = ((uint64_t)(arg1[25]) << 8); + uint8_t x8 = (arg1[24]); + uint64_t x9 = ((uint64_t)(arg1[23]) << 56); + uint64_t x10 = ((uint64_t)(arg1[22]) << 48); + uint64_t x11 = ((uint64_t)(arg1[21]) << 40); + uint64_t x12 = ((uint64_t)(arg1[20]) << 32); + uint64_t x13 = ((uint64_t)(arg1[19]) << 24); + 
uint64_t x14 = ((uint64_t)(arg1[18]) << 16); + uint64_t x15 = ((uint64_t)(arg1[17]) << 8); + uint8_t x16 = (arg1[16]); + uint64_t x17 = ((uint64_t)(arg1[15]) << 56); + uint64_t x18 = ((uint64_t)(arg1[14]) << 48); + uint64_t x19 = ((uint64_t)(arg1[13]) << 40); + uint64_t x20 = ((uint64_t)(arg1[12]) << 32); + uint64_t x21 = ((uint64_t)(arg1[11]) << 24); + uint64_t x22 = ((uint64_t)(arg1[10]) << 16); + uint64_t x23 = ((uint64_t)(arg1[9]) << 8); + uint8_t x24 = (arg1[8]); + uint64_t x25 = ((uint64_t)(arg1[7]) << 56); + uint64_t x26 = ((uint64_t)(arg1[6]) << 48); + uint64_t x27 = ((uint64_t)(arg1[5]) << 40); + uint64_t x28 = ((uint64_t)(arg1[4]) << 32); + uint64_t x29 = ((uint64_t)(arg1[3]) << 24); + uint64_t x30 = ((uint64_t)(arg1[2]) << 16); + uint64_t x31 = ((uint64_t)(arg1[1]) << 8); + uint8_t x32 = (arg1[0]); + uint64_t x33 = (x32 + (x31 + (x30 + (x29 + (x28 + (x27 + (x26 + x25))))))); + uint64_t x34 = (x33 & UINT64_C(0xffffffffffffffff)); + uint64_t x35 = (x8 + (x7 + (x6 + (x5 + (x4 + (x3 + (x2 + x1))))))); + uint64_t x36 = (x16 + (x15 + (x14 + (x13 + (x12 + (x11 + (x10 + x9))))))); + uint64_t x37 = (x24 + (x23 + (x22 + (x21 + (x20 + (x19 + (x18 + x17))))))); + uint64_t x38 = (x37 & UINT64_C(0xffffffffffffffff)); + uint64_t x39 = (x36 & UINT64_C(0xffffffffffffffff)); + out1[0] = x34; + out1[1] = x38; + out1[2] = x39; + out1[3] = x35; +} +