* [PATCH v5 0/4] The SipHash Patchset @ 2016-12-15 20:29 Jason A. Donenfeld 2016-12-15 20:30 ` [PATCH v5 1/4] siphash: add cryptographically secure PRF Jason A. Donenfeld ` (4 more replies) 0 siblings, 5 replies; 82+ messages in thread From: Jason A. Donenfeld @ 2016-12-15 20:29 UTC (permalink / raw) To: Netdev, kernel-hardening, LKML, linux-crypto, David Laight, Ted Tso, Hannes Frederic Sowa, Linus Torvalds, Eric Biggers, Tom Herbert, George Spelvin, Vegard Nossum, ak, davem, luto Cc: Jason A. Donenfeld Hey folks, I think we're approaching the end of the review for this patchset and we're getting somewhat close to it being ready to be queued up. At this point, I've incorporated all of the extremely helpful and instructive suggestions from the list. For this v5, we now accept u64[2] as the key, so that alignment is taken care of naturally. For other alignment issues, we have both the fast aligned version and the unaligned version, depending on what's necessary. We've worked out the issues for struct padding. The functions now take a void pointer to avoid ugly casting, which also helps us shed the inline helper functions which were not very pretty. The replacements of MD5 have been benchmarked and show a big increase in speed. We've even come up with a better naming scheme for dword/qword. All in all it's shaping up nicely. So, if this series looks good to you, please send along your Reviewed-by, so we can begin to get this completed. If there are still lingering issues, let me know and I'll incorporate them into a v6 if necessary. Thanks, Jason Jason A. 
Donenfeld (4): siphash: add cryptographically secure PRF siphash: add Nu{32,64} helpers secure_seq: use SipHash in place of MD5 random: use SipHash in place of MD5 drivers/char/random.c | 32 +++---- include/linux/siphash.h | 65 ++++++++++++++ lib/Kconfig.debug | 6 +- lib/Makefile | 5 +- lib/siphash.c | 223 ++++++++++++++++++++++++++++++++++++++++++++++++ lib/test_siphash.c | 101 ++++++++++++++++++++++ net/core/secure_seq.c | 133 +++++++++++------------------ 7 files changed, 460 insertions(+), 105 deletions(-) create mode 100644 include/linux/siphash.h create mode 100644 lib/siphash.c create mode 100644 lib/test_siphash.c -- 2.11.0 ^ permalink raw reply [flat|nested] 82+ messages in thread
* [PATCH v5 1/4] siphash: add cryptographically secure PRF 2016-12-15 20:29 [PATCH v5 0/4] The SipHash Patchset Jason A. Donenfeld @ 2016-12-15 20:30 ` Jason A. Donenfeld 2016-12-15 22:42 ` George Spelvin ` (2 more replies) 2016-12-15 20:30 ` [PATCH v5 2/4] siphash: add Nu{32,64} helpers Jason A. Donenfeld ` (3 subsequent siblings) 4 siblings, 3 replies; 82+ messages in thread From: Jason A. Donenfeld @ 2016-12-15 20:30 UTC (permalink / raw) To: Netdev, kernel-hardening, LKML, linux-crypto, David Laight, Ted Tso, Hannes Frederic Sowa, Linus Torvalds, Eric Biggers, Tom Herbert, George Spelvin, Vegard Nossum, ak, davem, luto Cc: Jason A. Donenfeld, Jean-Philippe Aumasson, Daniel J . Bernstein SipHash is a 64-bit keyed hash function that is actually a cryptographically secure PRF, like HMAC. Except SipHash is super fast, and is meant to be used as a hashtable keyed lookup function, or as a general PRF for short input use cases, such as sequence numbers or RNG chaining. For the first usage: There are a variety of attacks known as "hashtable poisoning" in which an attacker forms some data such that the hash of that data will be the same, and then proceeds to fill up all entries of a hashbucket. This is a realistic and well-known denial-of-service vector. Currently hashtables use jhash, which is fast but not secure, and some kind of rotating key scheme (or none at all, which isn't good). SipHash is meant as a replacement for jhash in these cases. There are a number of places in the kernel that are vulnerable to hashtable poisoning attacks, either via userspace vectors or network vectors, and there's not a reliable mechanism inside the kernel at the moment to fix it. The first step toward fixing these issues is actually getting a secure primitive into the kernel for developers to use. Then we can, bit by bit, port things over to it as deemed appropriate. 
While SipHash is extremely fast for a cryptographically secure function, it is likely a tiny bit slower than the insecure jhash, and so replacements will be evaluated on a case-by-case basis based on whether or not the difference in speed is negligible and whether or not the current jhash usage poses a real security risk. For the second usage: A few places in the kernel are using MD5 for creating secure sequence numbers, port numbers, or fast random numbers. SipHash is a faster, more fitting, and more secure replacement for MD5 in those situations. Replacing MD5 with SipHash for these uses is obvious and straightforward, and so is submitted along with this patch series. There shouldn't be much of a debate over its efficacy. Dozens of languages are already using this internally for their hash tables and PRFs. Some of the BSDs already use this in their kernels. SipHash is a widely known high-speed solution to a widely known set of problems, and it's time we catch up. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Cc: Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com> Cc: Daniel J. Bernstein <djb@cr.yp.to> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Eric Biggers <ebiggers3@gmail.com> Cc: David Laight <David.Laight@aculab.com> --- include/linux/siphash.h | 32 +++++++++++ lib/Kconfig.debug | 6 +-- lib/Makefile | 5 +- lib/siphash.c | 138 ++++++++++++++++++++++++++++++++++++++++++++++++ lib/test_siphash.c | 83 +++++++++++++++++++++++++++++ 5 files changed, 259 insertions(+), 5 deletions(-) create mode 100644 include/linux/siphash.h create mode 100644 lib/siphash.c create mode 100644 lib/test_siphash.c diff --git a/include/linux/siphash.h b/include/linux/siphash.h new file mode 100644 index 000000000000..145cf5667078 --- /dev/null +++ b/include/linux/siphash.h @@ -0,0 +1,32 @@ +/* Copyright (C) 2016 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved. + * + * This file is provided under a dual BSD/GPLv2 license. 
+ * + * SipHash: a fast short-input PRF + * https://131002.net/siphash/ + * + * This implementation is specifically for SipHash2-4. + */ + +#ifndef _LINUX_SIPHASH_H +#define _LINUX_SIPHASH_H + +#include <linux/types.h> + +#define SIPHASH_ALIGNMENT 8 + +typedef u64 siphash_key_t[2]; + +u64 siphash(const void *data, size_t len, const siphash_key_t key); + +#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS +static inline u64 siphash_unaligned(const void *data, size_t len, + const siphash_key_t key) +{ + return siphash(data, len, key); +} +#else +u64 siphash_unaligned(const void *data, size_t len, const siphash_key_t key); +#endif + +#endif /* _LINUX_SIPHASH_H */ diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug index 7446097f72bd..86254ea99b45 100644 --- a/lib/Kconfig.debug +++ b/lib/Kconfig.debug @@ -1843,9 +1843,9 @@ config TEST_HASH tristate "Perform selftest on hash functions" default n help - Enable this option to test the kernel's integer (<linux/hash,h>) - and string (<linux/stringhash.h>) hash functions on boot - (or module load). + Enable this option to test the kernel's integer (<linux/hash.h>), + string (<linux/stringhash.h>), and siphash (<linux/siphash.h>) + hash functions on boot (or module load). This is intended to help people writing architecture-specific optimized versions. If unsure, say N. 
diff --git a/lib/Makefile b/lib/Makefile index 50144a3aeebd..71d398b04a74 100644 --- a/lib/Makefile +++ b/lib/Makefile @@ -22,7 +22,8 @@ lib-y := ctype.o string.o vsprintf.o cmdline.o \ sha1.o chacha20.o md5.o irq_regs.o argv_split.o \ flex_proportions.o ratelimit.o show_mem.o \ is_single_threaded.o plist.o decompress.o kobject_uevent.o \ - earlycpio.o seq_buf.o nmi_backtrace.o nodemask.o win_minmax.o + earlycpio.o seq_buf.o siphash.o \ + nmi_backtrace.o nodemask.o win_minmax.o lib-$(CONFIG_MMU) += ioremap.o lib-$(CONFIG_SMP) += cpumask.o @@ -44,7 +45,7 @@ obj-$(CONFIG_TEST_HEXDUMP) += test_hexdump.o obj-y += kstrtox.o obj-$(CONFIG_TEST_BPF) += test_bpf.o obj-$(CONFIG_TEST_FIRMWARE) += test_firmware.o -obj-$(CONFIG_TEST_HASH) += test_hash.o +obj-$(CONFIG_TEST_HASH) += test_hash.o test_siphash.o obj-$(CONFIG_TEST_KASAN) += test_kasan.o obj-$(CONFIG_TEST_KSTRTOX) += test-kstrtox.o obj-$(CONFIG_TEST_LKM) += test_module.o diff --git a/lib/siphash.c b/lib/siphash.c new file mode 100644 index 000000000000..afc13cbb1b78 --- /dev/null +++ b/lib/siphash.c @@ -0,0 +1,138 @@ +/* Copyright (C) 2016 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved. + * + * This file is provided under a dual BSD/GPLv2 license. + * + * SipHash: a fast short-input PRF + * https://131002.net/siphash/ + * + * This implementation is specifically for SipHash2-4. 
+ */ + +#include <linux/siphash.h> +#include <linux/kernel.h> +#include <asm/unaligned.h> + +#if defined(CONFIG_DCACHE_WORD_ACCESS) && BITS_PER_LONG == 64 +#include <linux/dcache.h> +#include <asm/word-at-a-time.h> +#endif + +#define SIPROUND \ + do { \ + v0 += v1; v1 = rol64(v1, 13); v1 ^= v0; v0 = rol64(v0, 32); \ + v2 += v3; v3 = rol64(v3, 16); v3 ^= v2; \ + v0 += v3; v3 = rol64(v3, 21); v3 ^= v0; \ + v2 += v1; v1 = rol64(v1, 17); v1 ^= v2; v2 = rol64(v2, 32); \ + } while(0) + +/** + * siphash - compute 64-bit siphash PRF value + * @data: buffer to hash, must be aligned to SIPHASH_ALIGNMENT + * @size: size of @data + * @key: the siphash key + */ +u64 siphash(const void *data, size_t len, const siphash_key_t key) +{ + u64 v0 = 0x736f6d6570736575ULL; + u64 v1 = 0x646f72616e646f6dULL; + u64 v2 = 0x6c7967656e657261ULL; + u64 v3 = 0x7465646279746573ULL; + u64 b = ((u64)len) << 56; + u64 m; + const u8 *end = data + len - (len % sizeof(u64)); + const u8 left = len & (sizeof(u64) - 1); + v3 ^= key[1]; + v2 ^= key[0]; + v1 ^= key[1]; + v0 ^= key[0]; + for (; data != end; data += sizeof(u64)) { + m = le64_to_cpup(data); + v3 ^= m; + SIPROUND; + SIPROUND; + v0 ^= m; + } +#if defined(CONFIG_DCACHE_WORD_ACCESS) && BITS_PER_LONG == 64 + if (left) + b |= le64_to_cpu((__force __le64)(load_unaligned_zeropad(data) & + bytemask_from_count(left))); +#else + switch (left) { + case 7: b |= ((u64)end[6]) << 48; + case 6: b |= ((u64)end[5]) << 40; + case 5: b |= ((u64)end[4]) << 32; + case 4: b |= le32_to_cpup(data); break; + case 3: b |= ((u64)end[2]) << 16; + case 2: b |= le16_to_cpup(data); break; + case 1: b |= end[0]; + } +#endif + v3 ^= b; + SIPROUND; + SIPROUND; + v0 ^= b; + v2 ^= 0xff; + SIPROUND; + SIPROUND; + SIPROUND; + SIPROUND; + return (v0 ^ v1) ^ (v2 ^ v3); +} +EXPORT_SYMBOL(siphash); + +#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS +/** + * siphash - compute 64-bit siphash PRF value, without alignment requirements + * @data: buffer to hash + * @size: size of @data + * 
@key: the siphash key + */ +u64 siphash_unaligned(const void *data, size_t len, const siphash_key_t key) +{ + u64 v0 = 0x736f6d6570736575ULL; + u64 v1 = 0x646f72616e646f6dULL; + u64 v2 = 0x6c7967656e657261ULL; + u64 v3 = 0x7465646279746573ULL; + u64 b = ((u64)len) << 56; + u64 m; + const u8 *end = data + len - (len % sizeof(u64)); + const u8 left = len & (sizeof(u64) - 1); + v3 ^= key[1]; + v2 ^= key[0]; + v1 ^= key[1]; + v0 ^= key[0]; + for (; data != end; data += sizeof(u64)) { + m = get_unaligned_le64(data); + v3 ^= m; + SIPROUND; + SIPROUND; + v0 ^= m; + } +#if defined(CONFIG_DCACHE_WORD_ACCESS) && BITS_PER_LONG == 64 + if (left) + b |= le64_to_cpu((__force __le64)(load_unaligned_zeropad(data) & + bytemask_from_count(left))); +#else + switch (left) { + case 7: b |= ((u64)end[6]) << 48; + case 6: b |= ((u64)end[5]) << 40; + case 5: b |= ((u64)end[4]) << 32; + case 4: b |= get_unaligned_le32(end); break; + case 3: b |= ((u64)end[2]) << 16; + case 2: b |= get_unaligned_le16(end); break; + case 1: b |= bytes[0]; + } +#endif + v3 ^= b; + SIPROUND; + SIPROUND; + v0 ^= b; + v2 ^= 0xff; + SIPROUND; + SIPROUND; + SIPROUND; + SIPROUND; + return (v0 ^ v1) ^ (v2 ^ v3); +} +EXPORT_SYMBOL(siphash_unaligned); +#endif diff --git a/lib/test_siphash.c b/lib/test_siphash.c new file mode 100644 index 000000000000..93549e4e22c5 --- /dev/null +++ b/lib/test_siphash.c @@ -0,0 +1,83 @@ +/* Test cases for siphash.c + * + * Copyright (C) 2016 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved. + * + * This file is provided under a dual BSD/GPLv2 license. + * + * SipHash: a fast short-input PRF + * https://131002.net/siphash/ + * + * This implementation is specifically for SipHash2-4. 
+ */ + +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + +#include <linux/siphash.h> +#include <linux/kernel.h> +#include <linux/string.h> +#include <linux/errno.h> +#include <linux/module.h> + +/* Test vectors taken from official reference source available at: + * https://131002.net/siphash/siphash24.c + */ +static const u64 test_vectors[64] = { + 0x726fdb47dd0e0e31ULL, 0x74f839c593dc67fdULL, 0x0d6c8009d9a94f5aULL, + 0x85676696d7fb7e2dULL, 0xcf2794e0277187b7ULL, 0x18765564cd99a68dULL, + 0xcbc9466e58fee3ceULL, 0xab0200f58b01d137ULL, 0x93f5f5799a932462ULL, + 0x9e0082df0ba9e4b0ULL, 0x7a5dbbc594ddb9f3ULL, 0xf4b32f46226bada7ULL, + 0x751e8fbc860ee5fbULL, 0x14ea5627c0843d90ULL, 0xf723ca908e7af2eeULL, + 0xa129ca6149be45e5ULL, 0x3f2acc7f57c29bdbULL, 0x699ae9f52cbe4794ULL, + 0x4bc1b3f0968dd39cULL, 0xbb6dc91da77961bdULL, 0xbed65cf21aa2ee98ULL, + 0xd0f2cbb02e3b67c7ULL, 0x93536795e3a33e88ULL, 0xa80c038ccd5ccec8ULL, + 0xb8ad50c6f649af94ULL, 0xbce192de8a85b8eaULL, 0x17d835b85bbb15f3ULL, + 0x2f2e6163076bcfadULL, 0xde4daaaca71dc9a5ULL, 0xa6a2506687956571ULL, + 0xad87a3535c49ef28ULL, 0x32d892fad841c342ULL, 0x7127512f72f27cceULL, + 0xa7f32346f95978e3ULL, 0x12e0b01abb051238ULL, 0x15e034d40fa197aeULL, + 0x314dffbe0815a3b4ULL, 0x027990f029623981ULL, 0xcadcd4e59ef40c4dULL, + 0x9abfd8766a33735cULL, 0x0e3ea96b5304a7d0ULL, 0xad0c42d6fc585992ULL, + 0x187306c89bc215a9ULL, 0xd4a60abcf3792b95ULL, 0xf935451de4f21df2ULL, + 0xa9538f0419755787ULL, 0xdb9acddff56ca510ULL, 0xd06c98cd5c0975ebULL, + 0xe612a3cb9ecba951ULL, 0xc766e62cfcadaf96ULL, 0xee64435a9752fe72ULL, + 0xa192d576b245165aULL, 0x0a8787bf8ecb74b2ULL, 0x81b3e73d20b49b6fULL, + 0x7fa8220ba3b2eceaULL, 0x245731c13ca42499ULL, 0xb78dbfaf3a8d83bdULL, + 0xea1ad565322a1a0bULL, 0x60e61c23a3795013ULL, 0x6606d7e446282b93ULL, + 0x6ca4ecb15c5f91e1ULL, 0x9f626da15c9625f3ULL, 0xe51b38608ef25f57ULL, + 0x958a324ceb064572ULL +}; +static const siphash_key_t test_key = + { 0x0706050403020100ULL , 0x0f0e0d0c0b0a0908ULL }; + +static int __init 
siphash_test_init(void) +{ + u8 in[64] __aligned(SIPHASH_ALIGNMENT); + u8 in_unaligned[65]; + u8 i; + int ret = 0; + + for (i = 0; i < 64; ++i) { + in[i] = i; + in_unaligned[i + 1] = i; + if (siphash(in, i, test_key) != test_vectors[i]) { + pr_info("self-test aligned %u: FAIL\n", i + 1); + ret = -EINVAL; + } + if (siphash_unaligned(in_unaligned + 1, i, test_key) != test_vectors[i]) { + pr_info("self-test unaligned %u: FAIL\n", i + 1); + ret = -EINVAL; + } + } + if (!ret) + pr_info("self-tests: pass\n"); + return ret; +} + +static void __exit siphash_test_exit(void) +{ +} + +module_init(siphash_test_init); +module_exit(siphash_test_exit); + +MODULE_AUTHOR("Jason A. Donenfeld <Jason@zx2c4.com>"); +MODULE_LICENSE("Dual BSD/GPL"); -- 2.11.0 ^ permalink raw reply related [flat|nested] 82+ messages in thread
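For readers who want to try the algorithm outside the kernel, here is a minimal userspace sketch of SipHash-2-4 that follows the structure of lib/siphash.c in the patch above. It is an illustration under stated assumptions, not the patch's code: the names siphash24() and siphash24_counting() are hypothetical, and the input is loaded byte by byte, so the sketch needs neither alignment nor the kernel's le64_to_cpup() and load_unaligned_zeropad() helpers. The key and the expected values come from test_key and test_vectors[] in lib/test_siphash.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rotate left; r must be in 1..63, which holds for every constant used here. */
static inline uint64_t rol64(uint64_t x, unsigned int r)
{
	return (x << r) | (x >> (64 - r));
}

/* Same round function as the SIPROUND macro in the patch. */
#define SIPROUND \
	do { \
	v0 += v1; v1 = rol64(v1, 13); v1 ^= v0; v0 = rol64(v0, 32); \
	v2 += v3; v3 = rol64(v3, 16); v3 ^= v2; \
	v0 += v3; v3 = rol64(v3, 21); v3 ^= v0; \
	v2 += v1; v1 = rol64(v1, 17); v1 ^= v2; v2 = rol64(v2, 32); \
	} while (0)

/* Hypothetical userspace stand-in for the kernel's siphash(). */
uint64_t siphash24(const void *data, size_t len, const uint64_t key[2])
{
	const uint8_t *p = data;
	const uint8_t *end = p + (len - (len % 8));
	const size_t left = len % 8;
	uint64_t v0 = 0x736f6d6570736575ULL ^ key[0];
	uint64_t v1 = 0x646f72616e646f6dULL ^ key[1];
	uint64_t v2 = 0x6c7967656e657261ULL ^ key[0];
	uint64_t v3 = 0x7465646279746573ULL ^ key[1];
	uint64_t b = (uint64_t)len << 56;
	uint64_t m;
	size_t i;

	/* Absorb each full 8-byte word, read as little-endian. */
	for (; p != end; p += 8) {
		m = 0;
		for (i = 0; i < 8; i++)
			m |= (uint64_t)p[i] << (8 * i);
		v3 ^= m;
		SIPROUND;
		SIPROUND;
		v0 ^= m;
	}
	/* Fold the 0..7 trailing bytes into the length-tagged last word. */
	for (i = 0; i < left; i++)
		b |= (uint64_t)end[i] << (8 * i);
	v3 ^= b;
	SIPROUND;
	SIPROUND;
	v0 ^= b;
	v2 ^= 0xff;
	SIPROUND;
	SIPROUND;
	SIPROUND;
	SIPROUND;
	return (v0 ^ v1) ^ (v2 ^ v3);
}

/* Hash the bytes 0, 1, ..., len - 1 under the patch's test key. */
uint64_t siphash24_counting(size_t len)
{
	static const uint64_t test_key[2] = {
		0x0706050403020100ULL, 0x0f0e0d0c0b0a0908ULL
	};
	uint8_t in[64];
	size_t i;

	assert(len <= sizeof(in));
	for (i = 0; i < len; i++)
		in[i] = (uint8_t)i;
	return siphash24(in, len, test_key);
}
```

Calling siphash24_counting(i) for i in 0..63 should reproduce test_vectors[i] from the self-test, which is a quick way to cross-check any other port against this series.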
* Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF 2016-12-15 20:30 ` [PATCH v5 1/4] siphash: add cryptographically secure PRF Jason A. Donenfeld @ 2016-12-15 22:42 ` George Spelvin 2016-12-16 2:14 ` kbuild test robot 2016-12-17 14:55 ` Jeffrey Walton 2 siblings, 0 replies; 82+ messages in thread From: George Spelvin @ 2016-12-15 22:42 UTC (permalink / raw) To: ak, davem, David.Laight, ebiggers3, hannes, Jason, kernel-hardening, linux-crypto, linux-kernel, linux, luto, netdev, tom, torvalds, tytso, vegard.nossum Cc: djb, jeanphilippe.aumasson > While SipHash is extremely fast for a cryptographically secure function, > it is likely a tiny bit slower than the insecure jhash, and so replacements > will be evaluated on a case-by-case basis based on whether or not the > difference in speed is negligible and whether or not the current jhash usage > poses a real security risk. To quantify that, jhash is 27 instructions per 12 bytes of input, with a dependency path length of 13 instructions. (24/12 in __jhash_mix, plus 3/1 for adding the input to the state.) The final add + __jhash_final is 24 instructions with a path length of 15, which is close enough for this handwaving. Call it 18n instructions and 8n cycles for 8n bytes. SipHash (on a 64-bit machine) is 14 instructions with a dependency path length of 4 *per round*. Two rounds per 8 bytes, plus two adds and one cycle per input word, plus four rounds to finish makes 30n+46 instructions and 9n+16 cycles for 8n bytes. So *if* you have a 64-bit 4-way superscalar machine, it's not that much slower once it gets going, but the four-round finalization is quite noticeable for short inputs. For typical kernel input lengths "within a factor of 2" is probably more accurate than "a tiny bit". You lose a factor of 2 if your machine is 2-way or non-superscalar, and a second factor of 2 if it's a 32-bit machine. 
I mention this because there are a lot of home routers and other network appliances running Linux on 32-bit ARM and MIPS processors. For those, it's a factor of *eight*, which is a lot more than "a tiny bit". The real killer is if you don't have enough registers; SipHash performs horribly on i386 because it uses more state than i386 has registers. (If i386 performance is desired, you might ask Jean-Philippe for some rotate constants for a 32-bit variant with 64 bits of key. Note that SipHash's security proof requires that key length + input length is strictly less than the state size, so for a 4x32-bit variant, while you could stretch the key length a little, you'd have a hard limit at 95 bits.) A second point, the final XOR in SipHash is either a (very minor) design mistake, or an opportunity for optimization, depending on how you look at it. Look at the end of the function: >+ SIPROUND; >+ SIPROUND; >+ return (v0 ^ v1) ^ (v2 ^ v3); Expanding that out, you get: + v0 += v1; v1 = rol64(v1, 13); v1 ^= v0; v0 = rol64(v0, 32); + v2 += v3; v3 = rol64(v3, 16); v3 ^= v2; + v0 += v3; v3 = rol64(v3, 21); v3 ^= v0; + v2 += v1; v1 = rol64(v1, 17); v1 ^= v2; v2 = rol64(v2, 32); + return v0 ^ v1 ^ v2 ^ v3; Since the final XOR includes both v0 and v3, it's undoing the "v3 ^= v0" two lines earlier, so the value of v0 doesn't matter after its XOR into v1 on line one. 
The final SIPROUND and return can then be optimized to + v0 += v1; v1 = rol64(v1, 13); v1 ^= v0; + v2 += v3; v3 = rol64(v3, 16); v3 ^= v2; + v3 = rol64(v3, 21); + v2 += v1; v1 = rol64(v1, 17); v1 ^= v2; v2 = rol64(v2, 32); + return v1 ^ v2 ^ v3; A 32-bit implementation could further tweak the 4 instructions of v1 ^= v2; v2 = rol64(v2, 32); v1 ^= v2; gcc 6.2.1 -O3 compiles it to basically: v1.low ^= v2.low; v1.high ^= v2.high; v1.low ^= v2.high; v1.high ^= v2.low; but it could be written as: v2.low ^= v2.high; v1.low ^= v2.low; v1.high ^= v2.low; Alternatively, if it's for private use only (key not shared with other systems), a slightly stronger variant would "return v1 ^ v3;". (The final swap of v2 is dead code, but a compiler can spot that easily.) ^ permalink raw reply [flat|nested] 82+ messages in thread
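The folding of the last round described above can be checked mechanically. The sketch below is a userspace illustration, not code from the patch: final_reference() is the last SIPROUND plus the return as posted, final_folded() is the shortened form from this message, and check_final_fold() asserts that they agree on a few arbitrary states. All three function names are hypothetical. The algebra behind it: after "v0 += v3; v3 = rol64(v3, 21); v3 ^= v0;", the value v0 ^ v3 is just the rotated v3, so v0 drops out of the final XOR.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate left; r must be in 1..63. */
static inline uint64_t rol64(uint64_t x, unsigned int r)
{
	return (x << r) | (x >> (64 - r));
}

/* Last SIPROUND followed by the return, as written in the patch. */
uint64_t final_reference(uint64_t v0, uint64_t v1, uint64_t v2, uint64_t v3)
{
	v0 += v1; v1 = rol64(v1, 13); v1 ^= v0; v0 = rol64(v0, 32);
	v2 += v3; v3 = rol64(v3, 16); v3 ^= v2;
	v0 += v3; v3 = rol64(v3, 21); v3 ^= v0;
	v2 += v1; v1 = rol64(v1, 17); v1 ^= v2; v2 = rol64(v2, 32);
	return (v0 ^ v1) ^ (v2 ^ v3);
}

/* Folded form: v0 is dropped once it has been XORed into v1. */
uint64_t final_folded(uint64_t v0, uint64_t v1, uint64_t v2, uint64_t v3)
{
	v0 += v1; v1 = rol64(v1, 13); v1 ^= v0;
	v2 += v3; v3 = rol64(v3, 16); v3 ^= v2;
	v3 = rol64(v3, 21);
	v2 += v1; v1 = rol64(v1, 17); v1 ^= v2; v2 = rol64(v2, 32);
	return v1 ^ v2 ^ v3;
}

/* Compare both forms on a few arbitrary internal states. */
void check_final_fold(void)
{
	assert(final_reference(1, 2, 3, 4) == final_folded(1, 2, 3, 4));
	assert(final_reference(0, 0, 0, 0) == final_folded(0, 0, 0, 0));
	assert(final_reference(0x736f6d6570736575ULL, 0x646f72616e646f6dULL,
			       0x6c7967656e657261ULL, 0x7465646279746573ULL) ==
	       final_folded(0x736f6d6570736575ULL, 0x646f72616e646f6dULL,
			    0x6c7967656e657261ULL, 0x7465646279746573ULL));
}
```

Because both forms return the same value for every state, the folded version keeps the published test vectors intact. That is not true of the "return v1 ^ v3;" variant mentioned last, which changes the output and is therefore only an option when the key is never shared with another implementation.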
* Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF 2016-12-15 20:30 ` [PATCH v5 1/4] siphash: add cryptographically secure PRF Jason A. Donenfeld 2016-12-15 22:42 ` George Spelvin @ 2016-12-16 2:14 ` kbuild test robot 2016-12-17 14:55 ` Jeffrey Walton 2 siblings, 0 replies; 82+ messages in thread From: kbuild test robot @ 2016-12-16 2:14 UTC (permalink / raw) To: Jason A. Donenfeld Cc: kbuild-all, Netdev, kernel-hardening, LKML, linux-crypto, David Laight, Ted Tso, Hannes Frederic Sowa, Linus Torvalds, Eric Biggers, Tom Herbert, George Spelvin, Vegard Nossum, ak, davem, luto, Jason A. Donenfeld, Jean-Philippe Aumasson, Daniel J . Bernstein [-- Attachment #1: Type: text/plain, Size: 1530 bytes --] Hi Jason, [auto build test ERROR on linus/master] [also build test ERROR on v4.9 next-20161215] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url: https://github.com/0day-ci/linux/commits/Jason-A-Donenfeld/siphash-add-cryptographically-secure-PRF/20161216-092837 config: ia64-allmodconfig (attached as .config) compiler: ia64-linux-gcc (GCC) 6.2.0 reproduce: wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # save the attached .config to linux build tree make.cross ARCH=ia64 All errors (new ones prefixed by >>): lib/siphash.c: In function 'siphash_unaligned': >> lib/siphash.c:123:15: error: 'bytes' undeclared (first use in this function) case 1: b |= bytes[0]; ^~~~~ lib/siphash.c:123:15: note: each undeclared identifier is reported only once for each function it appears in vim +/bytes +123 lib/siphash.c 117 case 7: b |= ((u64)end[6]) << 48; 118 case 6: b |= ((u64)end[5]) << 40; 119 case 5: b |= ((u64)end[4]) << 32; 120 case 4: b |= get_unaligned_le32(end); break; 121 case 3: b |= ((u64)end[2]) << 16; 122 case 2: b |= get_unaligned_le16(end); break; > 123 case 1: b |= bytes[0]; 124 } 125 #endif 126 v3 ^= b; --- 0-DAY kernel 
test infrastructure Open Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation [-- Attachment #2: .config.gz --] [-- Type: application/gzip, Size: 45664 bytes --] ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF 2016-12-15 20:30 ` [PATCH v5 1/4] siphash: add cryptographically secure PRF Jason A. Donenfeld 2016-12-15 22:42 ` George Spelvin 2016-12-16 2:14 ` kbuild test robot @ 2016-12-17 14:55 ` Jeffrey Walton 2016-12-19 17:08 ` Jason A. Donenfeld 2 siblings, 1 reply; 82+ messages in thread From: Jeffrey Walton @ 2016-12-17 14:55 UTC (permalink / raw) To: Jason A. Donenfeld Cc: Netdev, kernel-hardening, LKML, linux-crypto, David Laight, Ted Tso, Hannes Frederic Sowa, Linus Torvalds, Eric Biggers, Tom Herbert, George Spelvin, Vegard Nossum, ak, davem, luto, Jean-Philippe Aumasson, Daniel J . Bernstein > diff --git a/lib/test_siphash.c b/lib/test_siphash.c > new file mode 100644 > index 000000000000..93549e4e22c5 > --- /dev/null > +++ b/lib/test_siphash.c > @@ -0,0 +1,83 @@ > +/* Test cases for siphash.c > + * > + * Copyright (C) 2016 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved. > + * > + * This file is provided under a dual BSD/GPLv2 license. > + * > + * SipHash: a fast short-input PRF > + * https://131002.net/siphash/ > + * > + * This implementation is specifically for SipHash2-4. 
> + */ > + > +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt > + > +#include <linux/siphash.h> > +#include <linux/kernel.h> > +#include <linux/string.h> > +#include <linux/errno.h> > +#include <linux/module.h> > + > +/* Test vectors taken from official reference source available at: > + * https://131002.net/siphash/siphash24.c > + */ > +static const u64 test_vectors[64] = { > + 0x726fdb47dd0e0e31ULL, 0x74f839c593dc67fdULL, 0x0d6c8009d9a94f5aULL, > + 0x85676696d7fb7e2dULL, 0xcf2794e0277187b7ULL, 0x18765564cd99a68dULL, > + 0xcbc9466e58fee3ceULL, 0xab0200f58b01d137ULL, 0x93f5f5799a932462ULL, > + 0x9e0082df0ba9e4b0ULL, 0x7a5dbbc594ddb9f3ULL, 0xf4b32f46226bada7ULL, > + 0x751e8fbc860ee5fbULL, 0x14ea5627c0843d90ULL, 0xf723ca908e7af2eeULL, > + 0xa129ca6149be45e5ULL, 0x3f2acc7f57c29bdbULL, 0x699ae9f52cbe4794ULL, > + 0x4bc1b3f0968dd39cULL, 0xbb6dc91da77961bdULL, 0xbed65cf21aa2ee98ULL, > + 0xd0f2cbb02e3b67c7ULL, 0x93536795e3a33e88ULL, 0xa80c038ccd5ccec8ULL, > + 0xb8ad50c6f649af94ULL, 0xbce192de8a85b8eaULL, 0x17d835b85bbb15f3ULL, > + 0x2f2e6163076bcfadULL, 0xde4daaaca71dc9a5ULL, 0xa6a2506687956571ULL, > + 0xad87a3535c49ef28ULL, 0x32d892fad841c342ULL, 0x7127512f72f27cceULL, > + 0xa7f32346f95978e3ULL, 0x12e0b01abb051238ULL, 0x15e034d40fa197aeULL, > + 0x314dffbe0815a3b4ULL, 0x027990f029623981ULL, 0xcadcd4e59ef40c4dULL, > + 0x9abfd8766a33735cULL, 0x0e3ea96b5304a7d0ULL, 0xad0c42d6fc585992ULL, > + 0x187306c89bc215a9ULL, 0xd4a60abcf3792b95ULL, 0xf935451de4f21df2ULL, > + 0xa9538f0419755787ULL, 0xdb9acddff56ca510ULL, 0xd06c98cd5c0975ebULL, > + 0xe612a3cb9ecba951ULL, 0xc766e62cfcadaf96ULL, 0xee64435a9752fe72ULL, > + 0xa192d576b245165aULL, 0x0a8787bf8ecb74b2ULL, 0x81b3e73d20b49b6fULL, > + 0x7fa8220ba3b2eceaULL, 0x245731c13ca42499ULL, 0xb78dbfaf3a8d83bdULL, > + 0xea1ad565322a1a0bULL, 0x60e61c23a3795013ULL, 0x6606d7e446282b93ULL, > + 0x6ca4ecb15c5f91e1ULL, 0x9f626da15c9625f3ULL, 0xe51b38608ef25f57ULL, > + 0x958a324ceb064572ULL > +}; > +static const siphash_key_t test_key = > + { 
0x0706050403020100ULL , 0x0f0e0d0c0b0a0908ULL }; > + > +static int __init siphash_test_init(void) > +{ > + u8 in[64] __aligned(SIPHASH_ALIGNMENT); > + u8 in_unaligned[65]; > + u8 i; > + int ret = 0; > + > + for (i = 0; i < 64; ++i) { > + in[i] = i; > + in_unaligned[i + 1] = i; > + if (siphash(in, i, test_key) != test_vectors[i]) { > + pr_info("self-test aligned %u: FAIL\n", i + 1); > + ret = -EINVAL; > + } > + if (siphash_unaligned(in_unaligned + 1, i, test_key) != test_vectors[i]) { > + pr_info("self-test unaligned %u: FAIL\n", i + 1); > + ret = -EINVAL; > + } > + } > + if (!ret) > + pr_info("self-tests: pass\n"); > + return ret; > +} > + > +static void __exit siphash_test_exit(void) > +{ > +} > + > +module_init(siphash_test_init); > +module_exit(siphash_test_exit); > + > +MODULE_AUTHOR("Jason A. Donenfeld <Jason@zx2c4.com>"); > +MODULE_LICENSE("Dual BSD/GPL"); > -- > 2.11.0 > I believe the output of SipHash depends upon endianness. Folks who request a digest through the af_alg interface will likely expect a byte array. I think that means on little endian machines, values like element 0 must be byte reversed: 0x726fdb47dd0e0e31ULL => 31,0e,0e,dd,47,db,6f,72 If I am not mistaken, that value (and other tv's) are returned here: return (v0 ^ v1) ^ (v2 ^ v3); It may be prudent to include the endian reversal in the test to ensure big endian machines produce expected results. Some closely related testing on an old Apple PowerMac G5 revealed that result needed to be reversed before returning it to a caller. Jeff ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF 2016-12-17 14:55 ` Jeffrey Walton @ 2016-12-19 17:08 ` Jason A. Donenfeld 0 siblings, 0 replies; 82+ messages in thread From: Jason A. Donenfeld @ 2016-12-19 17:08 UTC (permalink / raw) To: noloader Cc: Netdev, kernel-hardening, LKML, Linux Crypto Mailing List, David Laight, Ted Tso, Hannes Frederic Sowa, Linus Torvalds, Eric Biggers, Tom Herbert, George Spelvin, Vegard Nossum, Andi Kleen, David Miller, Andy Lutomirski, Jean-Philippe Aumasson, Daniel J . Bernstein On Sat, Dec 17, 2016 at 3:55 PM, Jeffrey Walton <noloader@gmail.com> wrote: > It may be prudent to include the endian reversal in the test to ensure > big endian machines produce expected results. Some closely related > testing on an old Apple PowerMac G5 revealed that result needed to be > reversed before returning it to a caller. The function [1] returns a u64. Originally I had it returning a __le64, but that was considered unnecessary by many prior reviewers on the list. It returns an integer. If you want uniform bytes out of it, then use the endian conversion function, the same as you would do with any other type of integer. Additionally, this function is *not* meant for af_alg or any of the crypto/* code. It's very unlikely to find a use there. > Forgive my ignorance... I did not find reading on using the primitive > in a PRNG. Does anyone know what Aumasson or Bernstein have to say? > Aumasson's site does not seem to discuss the use case: He's on this thread so I suppose he can speak up for himself. But in my conversations with him, the primary take-away was, "seems okay to me!". But please -- JP - correct me if I've misinterpreted. ^ permalink raw reply [flat|nested] 82+ messages in thread
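To make the "use the endian conversion function" point concrete, here is a small userspace sketch of turning the returned integer into a fixed byte string. It serializes with shifts, so the output bytes are the same on little-endian and big-endian hosts. The function names u64_to_le_bytes() and check_le_bytes() are hypothetical; inside the kernel the equivalent would presumably be cpu_to_le64() or put_unaligned_le64() rather than an open-coded loop. The example value and byte order are the ones quoted earlier in this subthread.

```c
#include <assert.h>
#include <stdint.h>

/* Write v as 8 little-endian bytes, independent of host byte order. */
void u64_to_le_bytes(uint64_t v, uint8_t out[8])
{
	unsigned int i;

	for (i = 0; i < 8; i++)
		out[i] = (uint8_t)(v >> (8 * i));
}

/* test_vectors[0] from the patch, serialized as in the quoted example. */
void check_le_bytes(void)
{
	static const uint8_t expected[8] = {
		0x31, 0x0e, 0x0e, 0xdd, 0x47, 0xdb, 0x6f, 0x72
	};
	uint8_t out[8];
	unsigned int i;

	u64_to_le_bytes(0x726fdb47dd0e0e31ULL, out);
	for (i = 0; i < 8; i++)
		assert(out[i] == expected[i]);
}
```

Comparing the u64 return value against a u64 constant, as the self-test does, needs no conversion on any architecture; the conversion only matters once the hash leaves the machine as bytes.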
* [PATCH v5 2/4] siphash: add Nu{32,64} helpers 2016-12-15 20:29 [PATCH v5 0/4] The SipHash Patchset Jason A. Donenfeld 2016-12-15 20:30 ` [PATCH v5 1/4] siphash: add cryptographically secure PRF Jason A. Donenfeld @ 2016-12-15 20:30 ` Jason A. Donenfeld 2016-12-16 10:39 ` David Laight 2016-12-15 20:30 ` [PATCH v5 3/4] secure_seq: use SipHash in place of MD5 Jason A. Donenfeld ` (2 subsequent siblings) 4 siblings, 1 reply; 82+ messages in thread From: Jason A. Donenfeld @ 2016-12-15 20:30 UTC (permalink / raw) To: Netdev, kernel-hardening, LKML, linux-crypto, David Laight, Ted Tso, Hannes Frederic Sowa, Linus Torvalds, Eric Biggers, Tom Herbert, George Spelvin, Vegard Nossum, ak, davem, luto Cc: Jason A. Donenfeld These restore parity with the jhash interface by providing high performance helpers for common input sizes. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Cc: Tom Herbert <tom@herbertland.com> --- include/linux/siphash.h | 33 ++++++++++ lib/siphash.c | 157 +++++++++++++++++++++++++++++++++++++----------- lib/test_siphash.c | 18 ++++++ 3 files changed, 172 insertions(+), 36 deletions(-) diff --git a/include/linux/siphash.h b/include/linux/siphash.h index 145cf5667078..6f5a08a0fc7e 100644 --- a/include/linux/siphash.h +++ b/include/linux/siphash.h @@ -29,4 +29,37 @@ static inline u64 siphash_unaligned(const void *data, size_t len, u64 siphash_unaligned(const void *data, size_t len, const siphash_key_t key); #endif +u64 siphash_1u64(const u64 a, const siphash_key_t key); +u64 siphash_2u64(const u64 a, const u64 b, const siphash_key_t key); +u64 siphash_3u64(const u64 a, const u64 b, const u64 c, + const siphash_key_t key); +u64 siphash_4u64(const u64 a, const u64 b, const u64 c, const u64 d, + const siphash_key_t key); + +static inline u64 siphash_2u32(const u32 a, const u32 b, const siphash_key_t key) +{ + return siphash_1u64((u64)b << 32 | a, key); +} + +static inline u64 siphash_4u32(const u32 a, const u32 b, const u32 c, const u32 d, + const 
siphash_key_t key) +{ + return siphash_2u64((u64)b << 32 | a, (u64)d << 32 | c, key); +} + +static inline u64 siphash_6u32(const u32 a, const u32 b, const u32 c, const u32 d, + const u32 e, const u32 f, const siphash_key_t key) +{ + return siphash_3u64((u64)b << 32 | a, (u64)d << 32 | c, (u64)f << 32 | e, + key); +} + +static inline u64 siphash_8u32(const u32 a, const u32 b, const u32 c, const u32 d, + const u32 e, const u32 f, const u32 g, const u32 h, + const siphash_key_t key) +{ + return siphash_4u64((u64)b << 32 | a, (u64)d << 32 | c, (u64)f << 32 | e, + (u64)h << 32 | g, key); +} + #endif /* _LINUX_SIPHASH_H */ diff --git a/lib/siphash.c b/lib/siphash.c index afc13cbb1b78..970c083ab06a 100644 --- a/lib/siphash.c +++ b/lib/siphash.c @@ -25,6 +25,29 @@ v2 += v1; v1 = rol64(v1, 17); v1 ^= v2; v2 = rol64(v2, 32); \ } while(0) +#define PREAMBLE(len) \ + u64 v0 = 0x736f6d6570736575ULL; \ + u64 v1 = 0x646f72616e646f6dULL; \ + u64 v2 = 0x6c7967656e657261ULL; \ + u64 v3 = 0x7465646279746573ULL; \ + u64 b = ((u64)len) << 56; \ + v3 ^= key[1]; \ + v2 ^= key[0]; \ + v1 ^= key[1]; \ + v0 ^= key[0]; + +#define POSTAMBLE \ + v3 ^= b; \ + SIPROUND; \ + SIPROUND; \ + v0 ^= b; \ + v2 ^= 0xff; \ + SIPROUND; \ + SIPROUND; \ + SIPROUND; \ + SIPROUND; \ + return (v0 ^ v1) ^ (v2 ^ v3); + /** * siphash - compute 64-bit siphash PRF value * @data: buffer to hash, must be aligned to SIPHASH_ALIGNMENT @@ -33,18 +56,10 @@ */ u64 siphash(const void *data, size_t len, const siphash_key_t key) { - u64 v0 = 0x736f6d6570736575ULL; - u64 v1 = 0x646f72616e646f6dULL; - u64 v2 = 0x6c7967656e657261ULL; - u64 v3 = 0x7465646279746573ULL; - u64 b = ((u64)len) << 56; - u64 m; const u8 *end = data + len - (len % sizeof(u64)); const u8 left = len & (sizeof(u64) - 1); - v3 ^= key[1]; - v2 ^= key[0]; - v1 ^= key[1]; - v0 ^= key[0]; + u64 m; + PREAMBLE(len) for (; data != end; data += sizeof(u64)) { m = le64_to_cpup(data); v3 ^= m; @@ -67,16 +82,7 @@ u64 siphash(const void *data, size_t len, const 
siphash_key_t key) case 1: b |= end[0]; } #endif - v3 ^= b; - SIPROUND; - SIPROUND; - v0 ^= b; - v2 ^= 0xff; - SIPROUND; - SIPROUND; - SIPROUND; - SIPROUND; - return (v0 ^ v1) ^ (v2 ^ v3); + POSTAMBLE } EXPORT_SYMBOL(siphash); @@ -89,18 +95,10 @@ EXPORT_SYMBOL(siphash); */ u64 siphash_unaligned(const void *data, size_t len, const siphash_key_t key) { - u64 v0 = 0x736f6d6570736575ULL; - u64 v1 = 0x646f72616e646f6dULL; - u64 v2 = 0x6c7967656e657261ULL; - u64 v3 = 0x7465646279746573ULL; - u64 b = ((u64)len) << 56; - u64 m; const u8 *end = data + len - (len % sizeof(u64)); const u8 left = len & (sizeof(u64) - 1); - v3 ^= key[1]; - v2 ^= key[0]; - v1 ^= key[1]; - v0 ^= key[0]; + u64 m; + PREAMBLE(len) for (; data != end; data += sizeof(u64)) { m = get_unaligned_le64(data); v3 ^= m; @@ -123,16 +121,103 @@ u64 siphash_unaligned(const void *data, size_t len, const siphash_key_t key) case 1: b |= bytes[0]; } #endif - v3 ^= b; + POSTAMBLE +} +EXPORT_SYMBOL(siphash_unaligned); +#endif + +/** + * siphash_1u64 - compute 64-bit siphash PRF value of a u64 + * @first: first u64 + * @key: the siphash key + */ +u64 siphash_1u64(const u64 first, const siphash_key_t key) +{ + PREAMBLE(8) + v3 ^= first; + SIPROUND; + SIPROUND; + v0 ^= first; + POSTAMBLE +} +EXPORT_SYMBOL(siphash_1u64); + +/** + * siphash_2u64 - compute 64-bit siphash PRF value of 2 u64 + * @first: first u64 + * @second: second u64 + * @key: the siphash key + */ +u64 siphash_2u64(const u64 first, const u64 second, const siphash_key_t key) +{ + PREAMBLE(16) + v3 ^= first; SIPROUND; SIPROUND; - v0 ^= b; - v2 ^= 0xff; + v0 ^= first; + v3 ^= second; SIPROUND; SIPROUND; + v0 ^= second; + POSTAMBLE +} +EXPORT_SYMBOL(siphash_2u64); + +/** + * siphash_3u64 - compute 64-bit siphash PRF value of 3 u64 + * @first: first u64 + * @second: second u64 + * @third: third u64 + * @key: the siphash key + */ +u64 siphash_3u64(const u64 first, const u64 second, const u64 third, + const siphash_key_t key) +{ + PREAMBLE(24) + v3 ^= first; 
SIPROUND; SIPROUND; - return (v0 ^ v1) ^ (v2 ^ v3); + v0 ^= first; + v3 ^= second; + SIPROUND; + SIPROUND; + v0 ^= second; + v3 ^= third; + SIPROUND; + SIPROUND; + v0 ^= third; + POSTAMBLE } -EXPORT_SYMBOL(siphash_unaligned); -#endif +EXPORT_SYMBOL(siphash_3u64); + +/** + * siphash_4u64 - compute 64-bit siphash PRF value of 4 u64 + * @first: first u64 + * @second: second u64 + * @third: third u64 + * @forth: forth u64 + * @key: the siphash key + */ +u64 siphash_4u64(const u64 first, const u64 second, const u64 third, + const u64 forth, const siphash_key_t key) +{ + PREAMBLE(32) + v3 ^= first; + SIPROUND; + SIPROUND; + v0 ^= first; + v3 ^= second; + SIPROUND; + SIPROUND; + v0 ^= second; + v3 ^= third; + SIPROUND; + SIPROUND; + v0 ^= third; + v3 ^= forth; + SIPROUND; + SIPROUND; + v0 ^= forth; + POSTAMBLE +} +EXPORT_SYMBOL(siphash_4u64); diff --git a/lib/test_siphash.c b/lib/test_siphash.c index 93549e4e22c5..1635189c171f 100644 --- a/lib/test_siphash.c +++ b/lib/test_siphash.c @@ -67,6 +67,24 @@ static int __init siphash_test_init(void) ret = -EINVAL; } } + if (siphash_1u64(0x0706050403020100ULL, test_key) != test_vectors[8]) { + pr_info("self-test 1u64: FAIL\n"); + ret = -EINVAL; + } + if (siphash_2u64(0x0706050403020100ULL, 0x0f0e0d0c0b0a0908ULL, test_key) != test_vectors[16]) { + pr_info("self-test 2u64: FAIL\n"); + ret = -EINVAL; + } + if (siphash_3u64(0x0706050403020100ULL, 0x0f0e0d0c0b0a0908ULL, + 0x1716151413121110ULL, test_key) != test_vectors[24]) { + pr_info("self-test 3u64: FAIL\n"); + ret = -EINVAL; + } + if (siphash_4u64(0x0706050403020100ULL, 0x0f0e0d0c0b0a0908ULL, + 0x1716151413121110ULL, 0x1f1e1d1c1b1a1918ULL, test_key) != test_vectors[32]) { + pr_info("self-test 4u64: FAIL\n"); + ret = -EINVAL; + } if (!ret) pr_info("self-tests: pass\n"); return ret; -- 2.11.0 ^ permalink raw reply related [flat|nested] 82+ messages in thread
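The self-test hunk above compares each siphash_Nu64 helper against the generic function's test vector for the same bytes, so the helpers are expected to match the generic hash run over the little-endian serialization of their arguments. A minimal userspace sketch of that relationship follows. It assumes a C99 compiler, and every sketch_* and sip_* name is hypothetical and not part of the patch:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's siphash_key_t (assumption). */
typedef uint64_t sketch_key_t[2];

struct sip_state { uint64_t v0, v1, v2, v3; };

static uint64_t rol64(uint64_t x, unsigned int r)
{
	return (x << r) | (x >> (64 - r));
}

/* One SipRound, with the same rotation counts as SIPROUND in the patch. */
static void sipround(struct sip_state *s)
{
	s->v0 += s->v1; s->v1 = rol64(s->v1, 13); s->v1 ^= s->v0; s->v0 = rol64(s->v0, 32);
	s->v2 += s->v3; s->v3 = rol64(s->v3, 16); s->v3 ^= s->v2;
	s->v0 += s->v3; s->v3 = rol64(s->v3, 21); s->v3 ^= s->v0;
	s->v2 += s->v1; s->v1 = rol64(s->v1, 17); s->v1 ^= s->v2; s->v2 = rol64(s->v2, 32);
}

/* Same effect as PREAMBLE: XOR the key halves into the four constants. */
static void sip_init(struct sip_state *s, const sketch_key_t key)
{
	s->v0 = 0x736f6d6570736575ULL ^ key[0];
	s->v1 = 0x646f72616e646f6dULL ^ key[1];
	s->v2 = 0x6c7967656e657261ULL ^ key[0];
	s->v3 = 0x7465646279746573ULL ^ key[1];
}

/* Absorb one 64-bit word: two SipRounds between the v3 and v0 XORs. */
static void sip_absorb(struct sip_state *s, uint64_t m)
{
	s->v3 ^= m;
	sipround(s);
	sipround(s);
	s->v0 ^= m;
}

/* Same effect as POSTAMBLE: absorb the length word b, then four final rounds. */
static uint64_t sip_finish(struct sip_state *s, uint64_t b)
{
	sip_absorb(s, b);
	s->v2 ^= 0xff;
	sipround(s);
	sipround(s);
	sipround(s);
	sipround(s);
	return (s->v0 ^ s->v1) ^ (s->v2 ^ s->v3);
}

/* Generic SipHash-2-4 over bytes, assembling little-endian 64-bit words. */
uint64_t sketch_siphash(const uint8_t *data, size_t len, const sketch_key_t key)
{
	struct sip_state s;
	uint64_t b = (uint64_t)len << 56;
	size_t full = len - (len % 8);
	size_t i, j;

	sip_init(&s, key);
	for (i = 0; i < full; i += 8) {
		uint64_t m = 0;

		for (j = 0; j < 8; j++)
			m |= (uint64_t)data[i + j] << (8 * j);
		sip_absorb(&s, m);
	}
	for (i = full; i < len; i++)
		b |= (uint64_t)data[i] << (8 * (i - full));
	return sip_finish(&s, b);
}

/* Two-word helper: no loop and no tail handling, length fixed at 16. */
uint64_t sketch_siphash_2u64(uint64_t first, uint64_t second, const sketch_key_t key)
{
	struct sip_state s;

	sip_init(&s, key);
	sip_absorb(&s, first);
	sip_absorb(&s, second);
	return sip_finish(&s, (uint64_t)16 << 56);
}

/* Packs pairs of u32 the same way as siphash_4u32 in the header hunk. */
uint64_t sketch_siphash_4u32(uint32_t a, uint32_t b, uint32_t c, uint32_t d,
			     const sketch_key_t key)
{
	return sketch_siphash_2u64((uint64_t)b << 32 | a, (uint64_t)d << 32 | c, key);
}

/* Checks that both helpers agree with the generic path on bytes 0x00..0x0f. */
void sketch_selftest(void)
{
	const sketch_key_t key = { 0x0706050403020100ULL, 0x0f0e0d0c0b0a0908ULL };
	uint8_t in[16];
	uint64_t generic;
	int i;

	for (i = 0; i < 16; i++)
		in[i] = (uint8_t)i;
	generic = sketch_siphash(in, sizeof(in), key);
	assert(generic == sketch_siphash_2u64(0x0706050403020100ULL,
					      0x0f0e0d0c0b0a0908ULL, key));
	assert(generic == sketch_siphash_4u32(0x03020100, 0x07060504,
					      0x0b0a0908, 0x0f0e0d0c, key));
}
```

The first u32 of each pair lands in the low half of the packed word, which is the layout the same two values would have as consecutive little-endian 32-bit fields in memory.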
* RE: [PATCH v5 2/4] siphash: add Nu{32,64} helpers 2016-12-15 20:30 ` [PATCH v5 2/4] siphash: add Nu{32,64} helpers Jason A. Donenfeld @ 2016-12-16 10:39 ` David Laight 2016-12-16 15:44 ` George Spelvin 0 siblings, 1 reply; 82+ messages in thread From: David Laight @ 2016-12-16 10:39 UTC (permalink / raw) To: 'Jason A. Donenfeld', Netdev, kernel-hardening, LKML, linux-crypto, Ted Tso, Hannes Frederic Sowa, Linus Torvalds, Eric Biggers, Tom Herbert, George Spelvin, Vegard Nossum, ak, davem, luto From: Jason A. Donenfeld > Sent: 15 December 2016 20:30 > These restore parity with the jhash interface by providing high > performance helpers for common input sizes. ... > +#define PREAMBLE(len) \ > + u64 v0 = 0x736f6d6570736575ULL; \ > + u64 v1 = 0x646f72616e646f6dULL; \ > + u64 v2 = 0x6c7967656e657261ULL; \ > + u64 v3 = 0x7465646279746573ULL; \ > + u64 b = ((u64)len) << 56; \ > + v3 ^= key[1]; \ > + v2 ^= key[0]; \ > + v1 ^= key[1]; \ > + v0 ^= key[0]; Isn't that equivalent to: v0 = key[0]; v1 = key[1]; v2 = key[0] ^ (0x736f6d6570736575ULL ^ 0x646f72616e646f6dULL); v3 = key[1] ^ (0x646f72616e646f6dULL ^ 0x7465646279746573ULL); Those constants also look like ASCII strings. What cryptographic analysis has been done on the values? David ^ permalink raw reply [flat|nested] 82+ messages in thread
* RE: [PATCH v5 2/4] siphash: add Nu{32,64} helpers 2016-12-16 10:39 ` David Laight @ 2016-12-16 15:44 ` George Spelvin 0 siblings, 0 replies; 82+ messages in thread From: George Spelvin @ 2016-12-16 15:44 UTC (permalink / raw) To: ak, davem, David.Laight, ebiggers3, hannes, Jason, kernel-hardening, linux-crypto, linux-kernel, linux, luto, netdev, tom, torvalds, tytso, vegard.nossum David Laight wrote: > Isn't that equivalent to: > v0 = key[0]; > v1 = key[1]; > v2 = key[0] ^ (0x736f6d6570736575ULL ^ 0x646f72616e646f6dULL); > v3 = key[1] ^ (0x646f72616e646f6dULL ^ 0x7465646279746573ULL); (Pre-XORing key[] with the first two constants which, if the constants are random in the first place, can be a no-op.) Other than the typo in the v2 line, yes. If the key is non-public, then you can xor an arbitrary constant into both halves to slightly speed up the startup. (Nits: There's a typo in the v2 line, you don't need to parenthesize associative operators like xor, and the "ull" suffix is redundant here.) > Those constants also look like ASCII strings. They are. The ASCII is "somepseudorandomlygeneratedbytes". > What cryptographic analysis has been done on the values? They're "nothing up my sleeve numbers". They're arbitrary numbers, and almost any other values would do exactly as well. The main properties are: 1) They're different (particularly v0 != v2 and v1 != v3), and 2) Neither they, nor their xor, is rotationally symmetric like 0x55555555. (Because SipHash is mostly rotationally symmetric, broken only by the interruption of the carry chain at the msbit, it helps slightly to break this up at the beginning.) Those exact values only matter for portability. If you don't need anyone else to be able to compute matching outputs, then you could use any other convenient constants (like the MD5 round constants). ^ permalink raw reply [flat|nested] 82+ messages in thread
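The two claims in this exchange can be checked mechanically. The sketch below assumes the typo in the v2 line is that it XORs the first and second constants where the first and third were intended. It compares the PREAMBLE-style initial state with the pre-XORed form, and decodes the four constants as big-endian ASCII. All function names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* The four SipHash initialization constants, in v0..v3 order. */
static const uint64_t SIP_CONST[4] = {
	0x736f6d6570736575ULL, 0x646f72616e646f6dULL,
	0x6c7967656e657261ULL, 0x7465646279746573ULL,
};

/* Write the constants as big-endian bytes: 32 characters plus a NUL. */
void sip_constants_as_text(char out[33])
{
	int w, i;

	for (w = 0; w < 4; w++)
		for (i = 0; i < 8; i++)
			out[8 * w + i] = (char)((SIP_CONST[w] >> (56 - 8 * i)) & 0xff);
	out[32] = '\0';
}

/* Initial state as PREAMBLE computes it: constant XOR key half. */
void sip_init_reference(uint64_t v[4], const uint64_t key[2])
{
	v[0] = SIP_CONST[0] ^ key[0];
	v[1] = SIP_CONST[1] ^ key[1];
	v[2] = SIP_CONST[2] ^ key[0];
	v[3] = SIP_CONST[3] ^ key[1];
}

/*
 * Pre-XORed form: the caller stores pre[i] = key[i] ^ SIP_CONST[i] once,
 * so v0 and v1 need no XOR at all at hash time.
 */
void sip_init_prexored(uint64_t v[4], const uint64_t pre[2])
{
	v[0] = pre[0];
	v[1] = pre[1];
	v[2] = pre[0] ^ SIP_CONST[0] ^ SIP_CONST[2];
	v[3] = pre[1] ^ SIP_CONST[1] ^ SIP_CONST[3];
}

/* Both forms give the same state, and the constants spell the phrase. */
void sip_constants_selftest(void)
{
	const uint64_t key[2] = { 0x0123456789abcdefULL, 0xfedcba9876543210ULL };
	const uint64_t pre[2] = { key[0] ^ SIP_CONST[0], key[1] ^ SIP_CONST[1] };
	uint64_t ref[4], alt[4];
	char text[33];

	sip_init_reference(ref, key);
	sip_init_prexored(alt, pre);
	assert(memcmp(ref, alt, sizeof(ref)) == 0);

	sip_constants_as_text(text);
	assert(strcmp(text, "somepseudorandomlygeneratedbytes") == 0);
}
```

The saving is two XORs per call, which only matters for the short fixed-size helpers where setup is a visible fraction of the work.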
* [PATCH v5 3/4] secure_seq: use SipHash in place of MD5 2016-12-15 20:29 [PATCH v5 0/4] The SipHash Patchset Jason A. Donenfeld 2016-12-15 20:30 ` [PATCH v5 1/4] siphash: add cryptographically secure PRF Jason A. Donenfeld 2016-12-15 20:30 ` [PATCH v5 2/4] siphash: add Nu{32,64} helpers Jason A. Donenfeld @ 2016-12-15 20:30 ` Jason A. Donenfeld 2016-12-16 9:59 ` David Laight 2016-12-15 20:30 ` [PATCH v5 4/4] random: " Jason A. Donenfeld 2016-12-16 3:03 ` [PATCH v6 0/5] The SipHash Patchset Jason A. Donenfeld 4 siblings, 1 reply; 82+ messages in thread From: Jason A. Donenfeld @ 2016-12-15 20:30 UTC (permalink / raw) To: Netdev, kernel-hardening, LKML, linux-crypto, David Laight, Ted Tso, Hannes Frederic Sowa, Linus Torvalds, Eric Biggers, Tom Herbert, George Spelvin, Vegard Nossum, ak, davem, luto Cc: Jason A. Donenfeld This gives a clear speed and security improvement. Siphash is both faster and is more solid crypto than the aging MD5. Rather than manually filling MD5 buffers, for IPv6, we simply create a layout by a simple anonymous struct, for which gcc generates rather efficient code. For IPv4, we pass the values directly to the short input convenience functions. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Miller <davem@davemloft.net> Cc: David Laight <David.Laight@aculab.com> Cc: Tom Herbert <tom@herbertland.com> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> --- net/core/secure_seq.c | 133 ++++++++++++++++++++------------------------------ 1 file changed, 52 insertions(+), 81 deletions(-) diff --git a/net/core/secure_seq.c b/net/core/secure_seq.c index 88a8e429fc3e..c80583bf3213 100644 --- a/net/core/secure_seq.c +++ b/net/core/secure_seq.c @@ -1,3 +1,5 @@ +/* Copyright (C) 2016 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved. 
*/ + #include <linux/kernel.h> #include <linux/init.h> #include <linux/cryptohash.h> @@ -8,14 +10,14 @@ #include <linux/ktime.h> #include <linux/string.h> #include <linux/net.h> - +#include <linux/siphash.h> #include <net/secure_seq.h> #if IS_ENABLED(CONFIG_IPV6) || IS_ENABLED(CONFIG_INET) +#include <linux/in6.h> #include <net/tcp.h> -#define NET_SECRET_SIZE (MD5_MESSAGE_BYTES / 4) -static u32 net_secret[NET_SECRET_SIZE] ____cacheline_aligned; +static siphash_key_t net_secret; static __always_inline void net_secret_init(void) { @@ -44,44 +46,42 @@ static u32 seq_scale(u32 seq) u32 secure_tcpv6_sequence_number(const __be32 *saddr, const __be32 *daddr, __be16 sport, __be16 dport, u32 *tsoff) { - u32 secret[MD5_MESSAGE_BYTES / 4]; - u32 hash[MD5_DIGEST_WORDS]; - u32 i; - + const struct { + struct in6_addr saddr; + struct in6_addr daddr; + __be16 sport; + __be16 dport; + u32 padding; + } __aligned(SIPHASH_ALIGNMENT) combined = { + .saddr = *(struct in6_addr *)saddr, + .daddr = *(struct in6_addr *)daddr, + .sport = sport, + .dport = dport + }; + u64 hash; net_secret_init(); - memcpy(hash, saddr, 16); - for (i = 0; i < 4; i++) - secret[i] = net_secret[i] + (__force u32)daddr[i]; - secret[4] = net_secret[4] + - (((__force u16)sport << 16) + (__force u16)dport); - for (i = 5; i < MD5_MESSAGE_BYTES / 4; i++) - secret[i] = net_secret[i]; - - md5_transform(hash, secret); - - *tsoff = sysctl_tcp_timestamps == 1 ? hash[1] : 0; - return seq_scale(hash[0]); + hash = siphash(&combined, sizeof(combined), net_secret); + *tsoff = sysctl_tcp_timestamps == 1 ? 
(hash >> 32) : 0; + return seq_scale(hash); } EXPORT_SYMBOL(secure_tcpv6_sequence_number); u32 secure_ipv6_port_ephemeral(const __be32 *saddr, const __be32 *daddr, __be16 dport) { - u32 secret[MD5_MESSAGE_BYTES / 4]; - u32 hash[MD5_DIGEST_WORDS]; - u32 i; - + const struct { + struct in6_addr saddr; + struct in6_addr daddr; + __be16 dport; + u16 padding1; + u32 padding2; + } __aligned(SIPHASH_ALIGNMENT) combined = { + .saddr = *(struct in6_addr *)saddr, + .daddr = *(struct in6_addr *)daddr, + .dport = dport + }; net_secret_init(); - memcpy(hash, saddr, 16); - for (i = 0; i < 4; i++) - secret[i] = net_secret[i] + (__force u32) daddr[i]; - secret[4] = net_secret[4] + (__force u32)dport; - for (i = 5; i < MD5_MESSAGE_BYTES / 4; i++) - secret[i] = net_secret[i]; - - md5_transform(hash, secret); - - return hash[0]; + return siphash(&combined, sizeof(combined), net_secret); } EXPORT_SYMBOL(secure_ipv6_port_ephemeral); #endif @@ -91,33 +91,17 @@ EXPORT_SYMBOL(secure_ipv6_port_ephemeral); u32 secure_tcp_sequence_number(__be32 saddr, __be32 daddr, __be16 sport, __be16 dport, u32 *tsoff) { - u32 hash[MD5_DIGEST_WORDS]; - + u64 hash; net_secret_init(); - hash[0] = (__force u32)saddr; - hash[1] = (__force u32)daddr; - hash[2] = ((__force u16)sport << 16) + (__force u16)dport; - hash[3] = net_secret[15]; - - md5_transform(hash, net_secret); - - *tsoff = sysctl_tcp_timestamps == 1 ? hash[1] : 0; - return seq_scale(hash[0]); + hash = siphash_4u32(saddr, daddr, sport, dport, net_secret); + *tsoff = sysctl_tcp_timestamps == 1 ? 
(hash >> 32) : 0; + return seq_scale(hash); } u32 secure_ipv4_port_ephemeral(__be32 saddr, __be32 daddr, __be16 dport) { - u32 hash[MD5_DIGEST_WORDS]; - net_secret_init(); - hash[0] = (__force u32)saddr; - hash[1] = (__force u32)daddr; - hash[2] = (__force u32)dport ^ net_secret[14]; - hash[3] = net_secret[15]; - - md5_transform(hash, net_secret); - - return hash[0]; + return siphash_4u32(saddr, daddr, dport, 0, net_secret); } EXPORT_SYMBOL_GPL(secure_ipv4_port_ephemeral); #endif @@ -126,21 +110,11 @@ EXPORT_SYMBOL_GPL(secure_ipv4_port_ephemeral); u64 secure_dccp_sequence_number(__be32 saddr, __be32 daddr, __be16 sport, __be16 dport) { - u32 hash[MD5_DIGEST_WORDS]; u64 seq; - net_secret_init(); - hash[0] = (__force u32)saddr; - hash[1] = (__force u32)daddr; - hash[2] = ((__force u16)sport << 16) + (__force u16)dport; - hash[3] = net_secret[15]; - - md5_transform(hash, net_secret); - - seq = hash[0] | (((u64)hash[1]) << 32); + seq = siphash_4u32(saddr, daddr, sport, dport, net_secret); seq += ktime_get_real_ns(); seq &= (1ull << 48) - 1; - return seq; } EXPORT_SYMBOL(secure_dccp_sequence_number); @@ -149,26 +123,23 @@ EXPORT_SYMBOL(secure_dccp_sequence_number); u64 secure_dccpv6_sequence_number(__be32 *saddr, __be32 *daddr, __be16 sport, __be16 dport) { - u32 secret[MD5_MESSAGE_BYTES / 4]; - u32 hash[MD5_DIGEST_WORDS]; + const struct { + struct in6_addr saddr; + struct in6_addr daddr; + __be16 sport; + __be16 dport; + u32 padding; + } __aligned(SIPHASH_ALIGNMENT) combined = { + .saddr = *(struct in6_addr *)saddr, + .daddr = *(struct in6_addr *)daddr, + .sport = sport, + .dport = dport + }; u64 seq; - u32 i; - net_secret_init(); - memcpy(hash, saddr, 16); - for (i = 0; i < 4; i++) - secret[i] = net_secret[i] + (__force u32)daddr[i]; - secret[4] = net_secret[4] + - (((__force u16)sport << 16) + (__force u16)dport); - for (i = 5; i < MD5_MESSAGE_BYTES / 4; i++) - secret[i] = net_secret[i]; - - md5_transform(hash, secret); - - seq = hash[0] | (((u64)hash[1]) << 32); + 
seq = siphash(&combined, sizeof(combined), net_secret); seq += ktime_get_real_ns(); seq &= (1ull << 48) - 1; - return seq; } EXPORT_SYMBOL(secure_dccpv6_sequence_number); -- 2.11.0 ^ permalink raw reply related [flat|nested] 82+ messages in thread
* RE: [PATCH v5 3/4] secure_seq: use SipHash in place of MD5 2016-12-15 20:30 ` [PATCH v5 3/4] secure_seq: use SipHash in place of MD5 Jason A. Donenfeld @ 2016-12-16 9:59 ` David Laight 2016-12-16 15:57 ` Jason A. Donenfeld 0 siblings, 1 reply; 82+ messages in thread From: David Laight @ 2016-12-16 9:59 UTC (permalink / raw) To: 'Jason A. Donenfeld', Netdev, kernel-hardening, LKML, linux-crypto, Ted Tso, Hannes Frederic Sowa, Linus Torvalds, Eric Biggers, Tom Herbert, George Spelvin, Vegard Nossum, ak, davem, luto From: Jason A. Donenfeld > Sent: 15 December 2016 20:30 > This gives a clear speed and security improvement. Siphash is both > faster and is more solid crypto than the aging MD5. > > Rather than manually filling MD5 buffers, for IPv6, we simply create > a layout by a simple anonymous struct, for which gcc generates > rather efficient code. For IPv4, we pass the values directly to the > short input convenience functions. ... > diff --git a/net/core/secure_seq.c b/net/core/secure_seq.c > index 88a8e429fc3e..c80583bf3213 100644 ... > + const struct { > + struct in6_addr saddr; > + struct in6_addr daddr; > + __be16 sport; > + __be16 dport; > + u32 padding; > + } __aligned(SIPHASH_ALIGNMENT) combined = { > + .saddr = *(struct in6_addr *)saddr, > + .daddr = *(struct in6_addr *)daddr, > + .sport = sport, > + .dport = dport > + }; I think you should explicitly initialise the 'padding'. It can do no harm and makes it obvious that it is necessary. You are still putting over-aligned data on stack. You only need to align it to the alignment of u64 (not the size of u64). If an on-stack item has a stronger alignment requirement than the stack the gcc has to generate two stack frames for the function. If you assign to each field (instead of using initialisers) then you can get the alignment by making the first member an anonymous union of in6_addr and u64. Oh - and wait a bit longer between revisions. David ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [PATCH v5 3/4] secure_seq: use SipHash in place of MD5 2016-12-16 9:59 ` David Laight @ 2016-12-16 15:57 ` Jason A. Donenfeld 0 siblings, 0 replies; 82+ messages in thread From: Jason A. Donenfeld @ 2016-12-16 15:57 UTC (permalink / raw) To: David Laight Cc: Netdev, kernel-hardening, LKML, linux-crypto, Ted Tso, Hannes Frederic Sowa, Linus Torvalds, Eric Biggers, Tom Herbert, George Spelvin, Vegard Nossum, ak, davem, luto Hi David, On Fri, Dec 16, 2016 at 10:59 AM, David Laight <David.Laight@aculab.com> wrote: > You are still putting over-aligned data on stack. > You only need to align it to the alignment of u64 (not the size of u64). > If an on-stack item has a stronger alignment requirement than the stack > the gcc has to generate two stack frames for the function. Yesterday, folks were saying that sometimes 32-bit platforms need 8-byte alignment for certain 64-bit operations, so I shouldn't fall back to 4-byte alignment there. But actually, looking at this more closely, I can just make SIPHASH_ALIGNMENT == __alignof__(u64), which will take care of all possible concerns, since gcc knows best which platforms need what alignment. Thanks for making this clear to me with "the alignment of u64 (not the size of u64)". > Oh - and wait a bit longer between revisions. Okay. We can be turtles. Jason ^ permalink raw reply [flat|nested] 82+ messages in thread
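David's suggestion (field-by-field assignment, explicit padding, and alignment taken from u64 through an anonymous union) might look roughly like the following in userspace C11. This is only a sketch: struct addr6 is a hypothetical stand-in for struct in6_addr, and the struct and function names are invented for illustration:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct in6_addr: 16 raw address bytes. */
struct addr6 {
	uint8_t bytes[16];
};

struct tcp6_hash_input {
	union {
		struct addr6 saddr;
		uint64_t force_align;	/* never read; only raises the alignment */
	};
	struct addr6 daddr;
	uint16_t sport;
	uint16_t dport;
	uint32_t padding;		/* explicit, so it can be zeroed by name */
};

/* The struct gets u64's natural alignment, not an over-aligned 8 everywhere. */
_Static_assert(_Alignof(struct tcp6_hash_input) == _Alignof(uint64_t),
	       "alignment should follow u64");
/* 16 + 16 + 2 + 2 + 4 bytes: no hidden padding for the hash to read. */
_Static_assert(sizeof(struct tcp6_hash_input) == 40,
	       "layout should have no implicit padding");

/* Assign every field, including the padding, so all 40 bytes are defined. */
void tcp6_hash_input_fill(struct tcp6_hash_input *in,
			  const uint8_t saddr[16], const uint8_t daddr[16],
			  uint16_t sport, uint16_t dport)
{
	memcpy(in->saddr.bytes, saddr, 16);
	memcpy(in->daddr.bytes, daddr, 16);
	in->sport = sport;
	in->dport = dport;
	in->padding = 0;
}

/* Two buffers with different stale contents must end up byte-identical. */
void tcp6_hash_input_selftest(void)
{
	const uint8_t src[16] = { 0x20, 0x01, 0x0d, 0xb8 };
	const uint8_t dst[16] = { 0x20, 0x01, 0x0d, 0xb8, [15] = 1 };
	struct tcp6_hash_input a, b;

	memset(&a, 0xaa, sizeof(a));
	memset(&b, 0x55, sizeof(b));
	tcp6_hash_input_fill(&a, src, dst, 443, 50000);
	tcp6_hash_input_fill(&b, src, dst, 443, 50000);
	assert(memcmp(&a, &b, sizeof(a)) == 0);
}
```

Because the hash reads the struct as raw bytes, any byte left undefined would make the output depend on stale stack contents. That is why the padding is assigned explicitly here.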
* [PATCH v5 4/4] random: use SipHash in place of MD5 2016-12-15 20:29 [PATCH v5 0/4] The SipHash Patchset Jason A. Donenfeld ` (2 preceding siblings ...) 2016-12-15 20:30 ` [PATCH v5 3/4] secure_seq: use SipHash in place of MD5 Jason A. Donenfeld @ 2016-12-15 20:30 ` Jason A. Donenfeld 2016-12-16 3:03 ` [PATCH v6 0/5] The SipHash Patchset Jason A. Donenfeld 4 siblings, 0 replies; 82+ messages in thread From: Jason A. Donenfeld @ 2016-12-15 20:30 UTC (permalink / raw) To: Netdev, kernel-hardening, LKML, linux-crypto, David Laight, Ted Tso, Hannes Frederic Sowa, Linus Torvalds, Eric Biggers, Tom Herbert, George Spelvin, Vegard Nossum, ak, davem, luto Cc: Jason A. Donenfeld, Jean-Philippe Aumasson This duplicates the current algorithm for get_random_int/long, but uses siphash instead. This comes with several benefits. It's certainly faster and more cryptographically secure than MD5. This patch also separates hashed fields into three values instead of one, in order to increase diffusion. The previous MD5 algorithm used a per-cpu MD5 state, which caused successive calls to the function to chain upon each other. While it's not entirely clear that this kind of chaining is absolutely necessary when using a secure PRF like siphash, it can't hurt, and the timing of the call chain does add a degree of natural entropy. So, in keeping with this design, instead of the massive per-cpu 64-byte MD5 state, there is instead a per-cpu previously returned value for chaining. The speed benefits are substantial: | siphash | md5 | speedup | ------------------------------ get_random_long | 137130 | 415983 | 3.03x | get_random_int | 86384 | 343323 | 3.97x | Signed-off-by: Jason A. 
Donenfeld <Jason@zx2c4.com> Cc: Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com> Cc: Ted Tso <tytso@mit.edu> --- drivers/char/random.c | 32 +++++++++++++------------------- 1 file changed, 13 insertions(+), 19 deletions(-) diff --git a/drivers/char/random.c b/drivers/char/random.c index d6876d506220..a51f0ff43f00 100644 --- a/drivers/char/random.c +++ b/drivers/char/random.c @@ -262,6 +262,7 @@ #include <linux/syscalls.h> #include <linux/completion.h> #include <linux/uuid.h> +#include <linux/siphash.h> #include <crypto/chacha20.h> #include <asm/processor.h> @@ -2042,7 +2043,7 @@ struct ctl_table random_table[] = { }; #endif /* CONFIG_SYSCTL */ -static u32 random_int_secret[MD5_MESSAGE_BYTES / 4] ____cacheline_aligned; +static siphash_key_t random_int_secret; int random_int_secret_init(void) { @@ -2050,8 +2051,7 @@ int random_int_secret_init(void) return 0; } -static DEFINE_PER_CPU(__u32 [MD5_DIGEST_WORDS], get_random_int_hash) - __aligned(sizeof(unsigned long)); +static DEFINE_PER_CPU(u64, get_random_int_chaining); /* * Get a random word for internal kernel use only. 
Similar to urandom but @@ -2061,19 +2061,16 @@ static DEFINE_PER_CPU(__u32 [MD5_DIGEST_WORDS], get_random_int_hash) */ unsigned int get_random_int(void) { - __u32 *hash; unsigned int ret; + u64 *chaining; if (arch_get_random_int(&ret)) return ret; - hash = get_cpu_var(get_random_int_hash); - - hash[0] += current->pid + jiffies + random_get_entropy(); - md5_transform(hash, random_int_secret); - ret = hash[0]; - put_cpu_var(get_random_int_hash); - + chaining = &get_cpu_var(get_random_int_chaining); + ret = *chaining = siphash_3u64(*chaining, jiffies, random_get_entropy() + + current->pid, random_int_secret); + put_cpu_var(get_random_int_chaining); return ret; } EXPORT_SYMBOL(get_random_int); @@ -2083,19 +2080,16 @@ EXPORT_SYMBOL(get_random_int); */ unsigned long get_random_long(void) { - __u32 *hash; unsigned long ret; + u64 *chaining; if (arch_get_random_long(&ret)) return ret; - hash = get_cpu_var(get_random_int_hash); - - hash[0] += current->pid + jiffies + random_get_entropy(); - md5_transform(hash, random_int_secret); - ret = *(unsigned long *)hash; - put_cpu_var(get_random_int_hash); - + chaining = &get_cpu_var(get_random_int_chaining); + ret = *chaining = siphash_3u64(*chaining, jiffies, random_get_entropy() + + current->pid, random_int_secret); + put_cpu_var(get_random_int_chaining); return ret; } EXPORT_SYMBOL(get_random_long); -- 2.11.0 ^ permalink raw reply related [flat|nested] 82+ messages in thread
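The chaining design described in this patch can be sketched in userspace as follows: the previously returned 64-bit value is fed back in as the first input word of the next call. The struct chain_state below stands in for the per-CPU variable, the caller passes in the values the kernel would take from jiffies, random_get_entropy() and current->pid, and all sketch_* names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

#define ROL64(x, r) (((x) << (r)) | ((x) >> (64 - (r))))
#define SIPROUND \
	do { \
		v0 += v1; v1 = ROL64(v1, 13); v1 ^= v0; v0 = ROL64(v0, 32); \
		v2 += v3; v3 = ROL64(v3, 16); v3 ^= v2; \
		v0 += v3; v3 = ROL64(v3, 21); v3 ^= v0; \
		v2 += v1; v1 = ROL64(v1, 17); v1 ^= v2; v2 = ROL64(v2, 32); \
	} while (0)

/* SipHash-2-4 of exactly three 64-bit words (24 bytes of input). */
static uint64_t sketch_siphash_3u64(uint64_t a, uint64_t b, uint64_t c,
				    const uint64_t key[2])
{
	uint64_t v0 = 0x736f6d6570736575ULL ^ key[0];
	uint64_t v1 = 0x646f72616e646f6dULL ^ key[1];
	uint64_t v2 = 0x6c7967656e657261ULL ^ key[0];
	uint64_t v3 = 0x7465646279746573ULL ^ key[1];
	/* The last word carries the input length in its top byte. */
	const uint64_t m[4] = { a, b, c, (uint64_t)24 << 56 };
	int i;

	for (i = 0; i < 4; i++) {
		v3 ^= m[i];
		SIPROUND;
		SIPROUND;
		v0 ^= m[i];
	}
	v2 ^= 0xff;
	SIPROUND;
	SIPROUND;
	SIPROUND;
	SIPROUND;
	return (v0 ^ v1) ^ (v2 ^ v3);
}

/* Stand-in for the per-CPU get_random_int_chaining variable. */
struct chain_state {
	uint64_t chaining;
};

/* Each call hashes the previous output together with the fresh inputs. */
uint32_t sketch_get_random_int(struct chain_state *st, uint64_t jiffies,
			       uint64_t entropy_plus_pid, const uint64_t key[2])
{
	st->chaining = sketch_siphash_3u64(st->chaining, jiffies,
					   entropy_plus_pid, key);
	return (uint32_t)st->chaining;
}

/* Identical inputs give a new value on the second call, yet stay replayable. */
void sketch_chain_selftest(void)
{
	const uint64_t key[2] = { 0x0706050403020100ULL, 0x0f0e0d0c0b0a0908ULL };
	struct chain_state a = { 0 }, b = { 0 };
	uint64_t first;

	sketch_get_random_int(&a, 1000, 42, key);
	first = a.chaining;
	sketch_get_random_int(&a, 1000, 42, key);
	assert(a.chaining != first);	/* chaining changed the result */

	sketch_get_random_int(&b, 1000, 42, key);
	assert(b.chaining == first);	/* same start state, same output */
}
```

The state kept between calls is a single u64 rather than the 16-byte MD5 digest array, and only the low 32 bits of it are returned by the int variant.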
* [PATCH v6 0/5] The SipHash Patchset 2016-12-15 20:29 [PATCH v5 0/4] The SipHash Patchset Jason A. Donenfeld ` (3 preceding siblings ...) 2016-12-15 20:30 ` [PATCH v5 4/4] random: " Jason A. Donenfeld @ 2016-12-16 3:03 ` Jason A. Donenfeld 2016-12-16 3:03 ` [PATCH v6 1/5] siphash: add cryptographically secure PRF Jason A. Donenfeld ` (5 more replies) 4 siblings, 6 replies; 82+ messages in thread From: Jason A. Donenfeld @ 2016-12-16 3:03 UTC (permalink / raw) To: Netdev, kernel-hardening, LKML, linux-crypto, David Laight, Ted Tso, Hannes Frederic Sowa, Linus Torvalds, Eric Biggers, Tom Herbert, George Spelvin, Vegard Nossum, ak, davem, luto Cc: Jason A. Donenfeld Hey again, This keeps getting more ambitious, which is good I suppose. If the frequency of new versioned patchsets is too high for LKML and not customary, please let me know. Otherwise, read on to see what's new this time... With Hannes' suggestion, there is now only one siphash() function, which will use the faster aligned version by compile-time constant folding. Additionally, I now use constant folding to optionally switch to the helper siphash_Nu64 functions that are a bit faster for data of length 8, 16, 24, and 32. So, the result is that you use siphash(data, len, key) if you have a buffer of sorts, and then everything is taken care of for you. Or, if you have a series of integers, you can opt to use siphash_Nu{32,64} functions instead. The basic API is now complete. After replacing MD5 in secure sequence number generation and the RNG, it turned out that md5_transform wasn't used any place else in the tree, so finally -- this is something to rejoice over -- lib/md5.c has been deleted and now that function lives as a static function in crypto/md5.c where it belongs. Meanwhile, it seems that sha_transform is used in places where SipHash would be more fitting, so the IPv4 and IPv6 syncookies implementation now uses SipHash, which should speed up TCP performance. Some BSDs already do this. 
I'd like to replace sha_transform in addrconf, but that code is a bit gnarly, so I don't want to be too meddlesome. I'm not entirely convinced either that SipHash is a good choice for it. But I'm open to discussion here, so if you have an opinion, please speak up. If you've been following the evolution of this patchset, and think that certain patches in it are fine, please do lend me your Reviewed-by to carry into any subsequent versions, so that in case you disappear your useful reviews will still keep the ball moving. Thanks for all the great feedback thus far. Jason Jason A. Donenfeld (5): siphash: add cryptographically secure PRF secure_seq: use SipHash in place of MD5 random: use SipHash in place of MD5 md5: remove from lib and only live in crypto syncookies: use SipHash in place of SHA1 MAINTAINERS | 7 ++ crypto/md5.c | 95 +++++++++++++++++++++- drivers/char/random.c | 32 +++----- include/linux/siphash.h | 86 ++++++++++++++++++++ lib/Kconfig.debug | 6 +- lib/Makefile | 7 +- lib/md5.c | 95 ---------------------- lib/siphash.c | 210 ++++++++++++++++++++++++++++++++++++++++++++++++ lib/test_siphash.c | 101 +++++++++++++++++++++++ net/core/secure_seq.c | 133 ++++++++++++------------------ net/ipv4/syncookies.c | 20 +---- net/ipv6/syncookies.c | 37 ++++----- 12 files changed, 590 insertions(+), 239 deletions(-) create mode 100644 include/linux/siphash.h delete mode 100644 lib/md5.c create mode 100644 lib/siphash.c create mode 100644 lib/test_siphash.c -- 2.11.0 ^ permalink raw reply [flat|nested] 82+ messages in thread
* [PATCH v6 1/5] siphash: add cryptographically secure PRF 2016-12-16 3:03 ` [PATCH v6 0/5] The SipHash Patchset Jason A. Donenfeld @ 2016-12-16 3:03 ` Jason A. Donenfeld 2016-12-16 3:03 ` [PATCH v6 2/5] secure_seq: use SipHash in place of MD5 Jason A. Donenfeld ` (4 subsequent siblings) 5 siblings, 0 replies; 82+ messages in thread From: Jason A. Donenfeld @ 2016-12-16 3:03 UTC (permalink / raw) To: Netdev, kernel-hardening, LKML, linux-crypto, David Laight, Ted Tso, Hannes Frederic Sowa, Linus Torvalds, Eric Biggers, Tom Herbert, George Spelvin, Vegard Nossum, ak, davem, luto Cc: Jason A. Donenfeld, Jean-Philippe Aumasson SipHash is a 64-bit keyed hash function that is actually a cryptographically secure PRF, like HMAC. Except SipHash is super fast, and is meant to be used as a hashtable keyed lookup function, or as a general PRF for short input use cases, such as sequence numbers or RNG chaining. For the first usage: There are a variety of attacks known as "hashtable poisoning" in which an attacker forms some data such that the hash of that data will be the same, and then proceeds to fill up all entries of a hashbucket. This is a realistic and well-known denial-of-service vector. Currently hashtables use jhash, which is fast but not secure, and some kind of rotating key scheme (or none at all, which isn't good). SipHash is meant as a replacement for jhash in these cases. There are a few places in the kernel that are vulnerable to hashtable poisoning attacks, either via userspace vectors or network vectors, and there's not a reliable mechanism inside the kernel at the moment to fix it. The first step toward fixing these issues is actually getting a secure primitive into the kernel for developers to use. Then we can, bit by bit, port things over to it as deemed appropriate. 
While SipHash is extremely fast for a cryptographically secure function, it is likely a bit slower than the insecure jhash, and so replacements will be evaluated on a case-by-case basis based on whether or not the difference in speed is negligible and whether or not the current jhash usage poses a real security risk. For the second usage: A few places in the kernel are using MD5 or SHA1 for creating secure sequence numbers, syn cookies, port numbers, or fast random numbers. SipHash is a faster, more fitting, and more secure replacement for MD5 in those situations. Replacing MD5 and SHA1 with SipHash for these uses is obvious and straightforward, and so is submitted along with this patch series. There shouldn't be much of a debate over its efficacy. Dozens of languages are already using this internally for their hash tables and PRFs. Some of the BSDs already use this in their kernels. SipHash is a widely known high-speed solution to a widely known set of problems, and it's time we catch up. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Cc: Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Eric Biggers <ebiggers3@gmail.com> Cc: David Laight <David.Laight@aculab.com> --- MAINTAINERS | 7 ++ include/linux/siphash.h | 86 ++++++++++++++++++++ lib/Kconfig.debug | 6 +- lib/Makefile | 5 +- lib/siphash.c | 210 ++++++++++++++++++++++++++++++++++++++++++++++++ lib/test_siphash.c | 101 +++++++++++++++++++++++ 6 files changed, 410 insertions(+), 5 deletions(-) create mode 100644 include/linux/siphash.h create mode 100644 lib/siphash.c create mode 100644 lib/test_siphash.c diff --git a/MAINTAINERS b/MAINTAINERS index 59c9895d73d5..5d87a8c1056a 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -11231,6 +11231,13 @@ F: arch/arm/mach-s3c24xx/mach-bast.c F: arch/arm/mach-s3c24xx/bast-ide.c F: arch/arm/mach-s3c24xx/bast-irq.c +SIPHASH PRF ROUTINES +M: Jason A. 
Donenfeld <Jason@zx2c4.com> +S: Maintained +F: lib/siphash.c +F: lib/test_siphash.c +F: include/linux/siphash.h + TI DAVINCI MACHINE SUPPORT M: Sekhar Nori <nsekhar@ti.com> M: Kevin Hilman <khilman@kernel.org> diff --git a/include/linux/siphash.h b/include/linux/siphash.h new file mode 100644 index 000000000000..e82fce48a0f6 --- /dev/null +++ b/include/linux/siphash.h @@ -0,0 +1,86 @@ +/* Copyright (C) 2016 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved. + * + * This file is provided under a dual BSD/GPLv2 license. + * + * SipHash: a fast short-input PRF + * https://131002.net/siphash/ + * + * This implementation is specifically for SipHash2-4. + */ + +#ifndef _LINUX_SIPHASH_H +#define _LINUX_SIPHASH_H + +#include <linux/types.h> +#include <linux/kernel.h> + +#define SIPHASH_ALIGNMENT 8 +typedef u64 siphash_key_t[2]; + +u64 __siphash_aligned(const void *data, size_t len, const siphash_key_t key); +#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS +u64 __siphash_unaligned(const void *data, size_t len, const siphash_key_t key); +#endif + +u64 siphash_1u64(const u64 a, const siphash_key_t key); +u64 siphash_2u64(const u64 a, const u64 b, const siphash_key_t key); +u64 siphash_3u64(const u64 a, const u64 b, const u64 c, + const siphash_key_t key); +u64 siphash_4u64(const u64 a, const u64 b, const u64 c, const u64 d, + const siphash_key_t key); + +static inline u64 ___siphash_aligned(const u64 *data, size_t len, const siphash_key_t key) +{ + if (__builtin_constant_p(len) && len == 8) + return siphash_1u64(data[0], key); + if (__builtin_constant_p(len) && len == 16) + return siphash_2u64(data[0], data[1], key); + if (__builtin_constant_p(len) && len == 24) + return siphash_3u64(data[0], data[1], data[2], key); + if (__builtin_constant_p(len) && len == 32) + return siphash_4u64(data[0], data[1], data[2], data[3], key); + return __siphash_aligned(data, len, key); +} + +/** + * siphash - compute 64-bit siphash PRF value + * @data: buffer to hash + * @size: size of 
@data + * @key: the siphash key + */ +static inline u64 siphash(const void *data, size_t len, const siphash_key_t key) +{ +#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS + if (!IS_ALIGNED((unsigned long)data, SIPHASH_ALIGNMENT)) + return __siphash_unaligned(data, len, key); +#endif + return ___siphash_aligned(data, len, key); +} + +static inline u64 siphash_2u32(const u32 a, const u32 b, const siphash_key_t key) +{ + return siphash_1u64((u64)b << 32 | a, key); +} + +static inline u64 siphash_4u32(const u32 a, const u32 b, const u32 c, const u32 d, + const siphash_key_t key) +{ + return siphash_2u64((u64)b << 32 | a, (u64)d << 32 | c, key); +} + +static inline u64 siphash_6u32(const u32 a, const u32 b, const u32 c, const u32 d, + const u32 e, const u32 f, const siphash_key_t key) +{ + return siphash_3u64((u64)b << 32 | a, (u64)d << 32 | c, (u64)f << 32 | e, + key); +} + +static inline u64 siphash_8u32(const u32 a, const u32 b, const u32 c, const u32 d, + const u32 e, const u32 f, const u32 g, const u32 h, + const siphash_key_t key) +{ + return siphash_4u64((u64)b << 32 | a, (u64)d << 32 | c, (u64)f << 32 | e, + (u64)h << 32 | g, key); +} + +#endif /* _LINUX_SIPHASH_H */ diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug index 7446097f72bd..86254ea99b45 100644 --- a/lib/Kconfig.debug +++ b/lib/Kconfig.debug @@ -1843,9 +1843,9 @@ config TEST_HASH tristate "Perform selftest on hash functions" default n help - Enable this option to test the kernel's integer (<linux/hash,h>) - and string (<linux/stringhash.h>) hash functions on boot - (or module load). + Enable this option to test the kernel's integer (<linux/hash.h>), + string (<linux/stringhash.h>), and siphash (<linux/siphash.h>) + hash functions on boot (or module load). This is intended to help people writing architecture-specific optimized versions. If unsure, say N. 
diff --git a/lib/Makefile b/lib/Makefile index 50144a3aeebd..71d398b04a74 100644 --- a/lib/Makefile +++ b/lib/Makefile @@ -22,7 +22,8 @@ lib-y := ctype.o string.o vsprintf.o cmdline.o \ sha1.o chacha20.o md5.o irq_regs.o argv_split.o \ flex_proportions.o ratelimit.o show_mem.o \ is_single_threaded.o plist.o decompress.o kobject_uevent.o \ - earlycpio.o seq_buf.o nmi_backtrace.o nodemask.o win_minmax.o + earlycpio.o seq_buf.o siphash.o \ + nmi_backtrace.o nodemask.o win_minmax.o lib-$(CONFIG_MMU) += ioremap.o lib-$(CONFIG_SMP) += cpumask.o @@ -44,7 +45,7 @@ obj-$(CONFIG_TEST_HEXDUMP) += test_hexdump.o obj-y += kstrtox.o obj-$(CONFIG_TEST_BPF) += test_bpf.o obj-$(CONFIG_TEST_FIRMWARE) += test_firmware.o -obj-$(CONFIG_TEST_HASH) += test_hash.o +obj-$(CONFIG_TEST_HASH) += test_hash.o test_siphash.o obj-$(CONFIG_TEST_KASAN) += test_kasan.o obj-$(CONFIG_TEST_KSTRTOX) += test-kstrtox.o obj-$(CONFIG_TEST_LKM) += test_module.o diff --git a/lib/siphash.c b/lib/siphash.c new file mode 100644 index 000000000000..7efc273de5d0 --- /dev/null +++ b/lib/siphash.c @@ -0,0 +1,210 @@ +/* Copyright (C) 2016 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved. + * + * This file is provided under a dual BSD/GPLv2 license. + * + * SipHash: a fast short-input PRF + * https://131002.net/siphash/ + * + * This implementation is specifically for SipHash2-4. 
+ */ + +#include <linux/siphash.h> +#include <asm/unaligned.h> + +#if defined(CONFIG_DCACHE_WORD_ACCESS) && BITS_PER_LONG == 64 +#include <linux/dcache.h> +#include <asm/word-at-a-time.h> +#endif + +#define SIPROUND \ + do { \ + v0 += v1; v1 = rol64(v1, 13); v1 ^= v0; v0 = rol64(v0, 32); \ + v2 += v3; v3 = rol64(v3, 16); v3 ^= v2; \ + v0 += v3; v3 = rol64(v3, 21); v3 ^= v0; \ + v2 += v1; v1 = rol64(v1, 17); v1 ^= v2; v2 = rol64(v2, 32); \ + } while(0) + +#define PREAMBLE(len) \ + u64 v0 = 0x736f6d6570736575ULL; \ + u64 v1 = 0x646f72616e646f6dULL; \ + u64 v2 = 0x6c7967656e657261ULL; \ + u64 v3 = 0x7465646279746573ULL; \ + u64 b = ((u64)len) << 56; \ + v3 ^= key[1]; \ + v2 ^= key[0]; \ + v1 ^= key[1]; \ + v0 ^= key[0]; + +#define POSTAMBLE \ + v3 ^= b; \ + SIPROUND; \ + SIPROUND; \ + v0 ^= b; \ + v2 ^= 0xff; \ + SIPROUND; \ + SIPROUND; \ + SIPROUND; \ + SIPROUND; \ + return (v0 ^ v1) ^ (v2 ^ v3); + +u64 __siphash_aligned(const void *data, size_t len, const siphash_key_t key) +{ + const u8 *end = data + len - (len % sizeof(u64)); + const u8 left = len & (sizeof(u64) - 1); + u64 m; + PREAMBLE(len) + for (; data != end; data += sizeof(u64)) { + m = le64_to_cpup(data); + v3 ^= m; + SIPROUND; + SIPROUND; + v0 ^= m; + } +#if defined(CONFIG_DCACHE_WORD_ACCESS) && BITS_PER_LONG == 64 + if (left) + b |= le64_to_cpu((__force __le64)(load_unaligned_zeropad(data) & + bytemask_from_count(left))); +#else + switch (left) { + case 7: b |= ((u64)end[6]) << 48; + case 6: b |= ((u64)end[5]) << 40; + case 5: b |= ((u64)end[4]) << 32; + case 4: b |= le32_to_cpup(data); break; + case 3: b |= ((u64)end[2]) << 16; + case 2: b |= le16_to_cpup(data); break; + case 1: b |= end[0]; + } +#endif + POSTAMBLE +} +EXPORT_SYMBOL(__siphash_aligned); + +#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS +u64 __siphash_unaligned(const void *data, size_t len, const siphash_key_t key) +{ + const u8 *end = data + len - (len % sizeof(u64)); + const u8 left = len & (sizeof(u64) - 1); + u64 m; + PREAMBLE(len) + 
for (; data != end; data += sizeof(u64)) { + m = get_unaligned_le64(data); + v3 ^= m; + SIPROUND; + SIPROUND; + v0 ^= m; + } +#if defined(CONFIG_DCACHE_WORD_ACCESS) && BITS_PER_LONG == 64 + if (left) + b |= le64_to_cpu((__force __le64)(load_unaligned_zeropad(data) & + bytemask_from_count(left))); +#else + switch (left) { + case 7: b |= ((u64)end[6]) << 48; + case 6: b |= ((u64)end[5]) << 40; + case 5: b |= ((u64)end[4]) << 32; + case 4: b |= get_unaligned_le32(end); break; + case 3: b |= ((u64)end[2]) << 16; + case 2: b |= get_unaligned_le16(end); break; + case 1: b |= end[0]; + } +#endif + POSTAMBLE +} +EXPORT_SYMBOL(__siphash_unaligned); +#endif + +/** + * siphash_1u64 - compute 64-bit siphash PRF value of a u64 + * @first: first u64 + * @key: the siphash key + */ +u64 siphash_1u64(const u64 first, const siphash_key_t key) +{ + PREAMBLE(8) + v3 ^= first; + SIPROUND; + SIPROUND; + v0 ^= first; + POSTAMBLE +} +EXPORT_SYMBOL(siphash_1u64); + +/** + * siphash_2u64 - compute 64-bit siphash PRF value of 2 u64 + * @first: first u64 + * @second: second u64 + * @key: the siphash key + */ +u64 siphash_2u64(const u64 first, const u64 second, const siphash_key_t key) +{ + PREAMBLE(16) + v3 ^= first; + SIPROUND; + SIPROUND; + v0 ^= first; + v3 ^= second; + SIPROUND; + SIPROUND; + v0 ^= second; + POSTAMBLE +} +EXPORT_SYMBOL(siphash_2u64); + +/** + * siphash_3u64 - compute 64-bit siphash PRF value of 3 u64 + * @first: first u64 + * @second: second u64 + * @third: third u64 + * @key: the siphash key + */ +u64 siphash_3u64(const u64 first, const u64 second, const u64 third, + const siphash_key_t key) +{ + PREAMBLE(24) + v3 ^= first; + SIPROUND; + SIPROUND; + v0 ^= first; + v3 ^= second; + SIPROUND; + SIPROUND; + v0 ^= second; + v3 ^= third; + SIPROUND; + SIPROUND; + v0 ^= third; + POSTAMBLE +} +EXPORT_SYMBOL(siphash_3u64); + +/** + * siphash_4u64 - compute 64-bit siphash PRF value of 4 u64 + * @first: first u64 + * @second: second u64 + * @third: third u64 + * @forth: forth u64 + 
* @key: the siphash key + */ +u64 siphash_4u64(const u64 first, const u64 second, const u64 third, + const u64 forth, const siphash_key_t key) +{ + PREAMBLE(32) + v3 ^= first; + SIPROUND; + SIPROUND; + v0 ^= first; + v3 ^= second; + SIPROUND; + SIPROUND; + v0 ^= second; + v3 ^= third; + SIPROUND; + SIPROUND; + v0 ^= third; + v3 ^= forth; + SIPROUND; + SIPROUND; + v0 ^= forth; + POSTAMBLE +} +EXPORT_SYMBOL(siphash_4u64); diff --git a/lib/test_siphash.c b/lib/test_siphash.c new file mode 100644 index 000000000000..906e58a2c946 --- /dev/null +++ b/lib/test_siphash.c @@ -0,0 +1,101 @@ +/* Test cases for siphash.c + * + * Copyright (C) 2016 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved. + * + * This file is provided under a dual BSD/GPLv2 license. + * + * SipHash: a fast short-input PRF + * https://131002.net/siphash/ + * + * This implementation is specifically for SipHash2-4. + */ + +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + +#include <linux/siphash.h> +#include <linux/kernel.h> +#include <linux/string.h> +#include <linux/errno.h> +#include <linux/module.h> + +/* Test vectors taken from official reference source available at: + * https://131002.net/siphash/siphash24.c + */ +static const u64 test_vectors[64] = { + 0x726fdb47dd0e0e31ULL, 0x74f839c593dc67fdULL, 0x0d6c8009d9a94f5aULL, + 0x85676696d7fb7e2dULL, 0xcf2794e0277187b7ULL, 0x18765564cd99a68dULL, + 0xcbc9466e58fee3ceULL, 0xab0200f58b01d137ULL, 0x93f5f5799a932462ULL, + 0x9e0082df0ba9e4b0ULL, 0x7a5dbbc594ddb9f3ULL, 0xf4b32f46226bada7ULL, + 0x751e8fbc860ee5fbULL, 0x14ea5627c0843d90ULL, 0xf723ca908e7af2eeULL, + 0xa129ca6149be45e5ULL, 0x3f2acc7f57c29bdbULL, 0x699ae9f52cbe4794ULL, + 0x4bc1b3f0968dd39cULL, 0xbb6dc91da77961bdULL, 0xbed65cf21aa2ee98ULL, + 0xd0f2cbb02e3b67c7ULL, 0x93536795e3a33e88ULL, 0xa80c038ccd5ccec8ULL, + 0xb8ad50c6f649af94ULL, 0xbce192de8a85b8eaULL, 0x17d835b85bbb15f3ULL, + 0x2f2e6163076bcfadULL, 0xde4daaaca71dc9a5ULL, 0xa6a2506687956571ULL, + 0xad87a3535c49ef28ULL, 
0x32d892fad841c342ULL, 0x7127512f72f27cceULL, + 0xa7f32346f95978e3ULL, 0x12e0b01abb051238ULL, 0x15e034d40fa197aeULL, + 0x314dffbe0815a3b4ULL, 0x027990f029623981ULL, 0xcadcd4e59ef40c4dULL, + 0x9abfd8766a33735cULL, 0x0e3ea96b5304a7d0ULL, 0xad0c42d6fc585992ULL, + 0x187306c89bc215a9ULL, 0xd4a60abcf3792b95ULL, 0xf935451de4f21df2ULL, + 0xa9538f0419755787ULL, 0xdb9acddff56ca510ULL, 0xd06c98cd5c0975ebULL, + 0xe612a3cb9ecba951ULL, 0xc766e62cfcadaf96ULL, 0xee64435a9752fe72ULL, + 0xa192d576b245165aULL, 0x0a8787bf8ecb74b2ULL, 0x81b3e73d20b49b6fULL, + 0x7fa8220ba3b2eceaULL, 0x245731c13ca42499ULL, 0xb78dbfaf3a8d83bdULL, + 0xea1ad565322a1a0bULL, 0x60e61c23a3795013ULL, 0x6606d7e446282b93ULL, + 0x6ca4ecb15c5f91e1ULL, 0x9f626da15c9625f3ULL, 0xe51b38608ef25f57ULL, + 0x958a324ceb064572ULL +}; +static const siphash_key_t test_key = + { 0x0706050403020100ULL , 0x0f0e0d0c0b0a0908ULL }; + +static int __init siphash_test_init(void) +{ + u8 in[64] __aligned(SIPHASH_ALIGNMENT); + u8 in_unaligned[65]; + u8 i; + int ret = 0; + + for (i = 0; i < 64; ++i) { + in[i] = i; + in_unaligned[i + 1] = i; + if (siphash(in, i, test_key) != test_vectors[i]) { + pr_info("self-test aligned %u: FAIL\n", i + 1); + ret = -EINVAL; + } + if (siphash(in_unaligned + 1, i, test_key) != test_vectors[i]) { + pr_info("self-test unaligned %u: FAIL\n", i + 1); + ret = -EINVAL; + } + } + if (siphash_1u64(0x0706050403020100ULL, test_key) != test_vectors[8]) { + pr_info("self-test 1u64: FAIL\n"); + ret = -EINVAL; + } + if (siphash_2u64(0x0706050403020100ULL, 0x0f0e0d0c0b0a0908ULL, test_key) != test_vectors[16]) { + pr_info("self-test 2u64: FAIL\n"); + ret = -EINVAL; + } + if (siphash_3u64(0x0706050403020100ULL, 0x0f0e0d0c0b0a0908ULL, + 0x1716151413121110ULL, test_key) != test_vectors[24]) { + pr_info("self-test 3u64: FAIL\n"); + ret = -EINVAL; + } + if (siphash_4u64(0x0706050403020100ULL, 0x0f0e0d0c0b0a0908ULL, + 0x1716151413121110ULL, 0x1f1e1d1c1b1a1918ULL, test_key) != test_vectors[32]) { + pr_info("self-test 4u64: 
FAIL\n"); + ret = -EINVAL; + } + if (!ret) + pr_info("self-tests: pass\n"); + return ret; +} + +static void __exit siphash_test_exit(void) +{ +} + +module_init(siphash_test_init); +module_exit(siphash_test_exit); + +MODULE_AUTHOR("Jason A. Donenfeld <Jason@zx2c4.com>"); +MODULE_LICENSE("Dual BSD/GPL"); -- 2.11.0 ^ permalink raw reply related [flat|nested] 82+ messages in thread
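For readers who want to poke at the algorithm outside the kernel, here is a minimal userspace sketch of SipHash-2-4 that follows the same PREAMBLE / SIPROUND / POSTAMBLE structure as lib/siphash.c above. It is not the patch's code: the names siphash24 and load_le64 are made up for illustration, the input is read bytewise (so alignment and host endianness do not matter), and none of the word-at-a-time optimizations are attempted.

```c
#include <stddef.h>
#include <stdint.h>

/* Rotate left; r is always in 1..63 here, so both shifts are well defined. */
static inline uint64_t rol64(uint64_t x, unsigned int r)
{
	return (x << r) | (x >> (64 - r));
}

/* One SipRound, same operations as the SIPROUND macro in the patch. */
#define SIPROUND \
	do { \
	v0 += v1; v1 = rol64(v1, 13); v1 ^= v0; v0 = rol64(v0, 32); \
	v2 += v3; v3 = rol64(v3, 16); v3 ^= v2; \
	v0 += v3; v3 = rol64(v3, 21); v3 ^= v0; \
	v2 += v1; v1 = rol64(v1, 17); v1 ^= v2; v2 = rol64(v2, 32); \
	} while (0)

/* Hypothetical helper: assemble a little-endian u64 from 8 bytes. */
static uint64_t load_le64(const uint8_t *p)
{
	uint64_t v = 0;
	int i;

	for (i = 7; i >= 0; i--)
		v = (v << 8) | p[i];
	return v;
}

/* Hypothetical userspace SipHash-2-4; key is two u64 words, as in siphash_key_t. */
uint64_t siphash24(const void *data, size_t len, const uint64_t key[2])
{
	const uint8_t *p = data;
	const uint8_t *end = p + (len - len % 8);
	const size_t left = len % 8;
	uint64_t v0 = 0x736f6d6570736575ULL ^ key[0];
	uint64_t v1 = 0x646f72616e646f6dULL ^ key[1];
	uint64_t v2 = 0x6c7967656e657261ULL ^ key[0];
	uint64_t v3 = 0x7465646279746573ULL ^ key[1];
	uint64_t b = (uint64_t)len << 56;	/* length goes in the top byte */
	uint64_t m;
	size_t i;

	/* Compression: two rounds per full 8-byte word. */
	for (; p != end; p += 8) {
		m = load_le64(p);
		v3 ^= m;
		SIPROUND;
		SIPROUND;
		v0 ^= m;
	}

	/* Fold the 0..7 trailing bytes into the final word. */
	for (i = 0; i < left; i++)
		b |= (uint64_t)end[i] << (8 * i);

	v3 ^= b;
	SIPROUND;
	SIPROUND;
	v0 ^= b;

	/* Finalization: four rounds. */
	v2 ^= 0xff;
	SIPROUND;
	SIPROUND;
	SIPROUND;
	SIPROUND;
	return (v0 ^ v1) ^ (v2 ^ v3);
}
```

With the key from test_siphash.c ({ 0x0706050403020100, 0x0f0e0d0c0b0a0908 }) and input bytes 0, 1, 2, ..., the results for lengths 0, 1 and 8 should match test_vectors[0], test_vectors[1] and test_vectors[8] from the patch.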
* [PATCH v6 2/5] secure_seq: use SipHash in place of MD5 2016-12-16 3:03 ` [PATCH v6 0/5] The SipHash Patchset Jason A. Donenfeld 2016-12-16 3:03 ` [PATCH v6 1/5] siphash: add cryptographically secure PRF Jason A. Donenfeld @ 2016-12-16 3:03 ` Jason A. Donenfeld 2016-12-16 3:03 ` [PATCH v6 3/5] random: " Jason A. Donenfeld ` (3 subsequent siblings) 5 siblings, 0 replies; 82+ messages in thread From: Jason A. Donenfeld @ 2016-12-16 3:03 UTC (permalink / raw) To: Netdev, kernel-hardening, LKML, linux-crypto, David Laight, Ted Tso, Hannes Frederic Sowa, Linus Torvalds, Eric Biggers, Tom Herbert, George Spelvin, Vegard Nossum, ak, davem, luto Cc: Jason A. Donenfeld This gives a clear speed and security improvement. SipHash is both faster and more solid crypto than the aging MD5. Rather than manually filling MD5 buffers, for IPv6, we simply lay out the input in an anonymous struct, for which gcc generates rather efficient code. For IPv4, we pass the values directly to the short input convenience functions. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Miller <davem@davemloft.net> Cc: David Laight <David.Laight@aculab.com> Cc: Tom Herbert <tom@herbertland.com> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> --- net/core/secure_seq.c | 133 ++++++++++++++++++++------------------------------ 1 file changed, 52 insertions(+), 81 deletions(-) diff --git a/net/core/secure_seq.c b/net/core/secure_seq.c index 88a8e429fc3e..c80583bf3213 100644 --- a/net/core/secure_seq.c +++ b/net/core/secure_seq.c @@ -1,3 +1,5 @@ +/* Copyright (C) 2016 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
*/ + #include <linux/kernel.h> #include <linux/init.h> #include <linux/cryptohash.h> @@ -8,14 +10,14 @@ #include <linux/ktime.h> #include <linux/string.h> #include <linux/net.h> - +#include <linux/siphash.h> #include <net/secure_seq.h> #if IS_ENABLED(CONFIG_IPV6) || IS_ENABLED(CONFIG_INET) +#include <linux/in6.h> #include <net/tcp.h> -#define NET_SECRET_SIZE (MD5_MESSAGE_BYTES / 4) -static u32 net_secret[NET_SECRET_SIZE] ____cacheline_aligned; +static siphash_key_t net_secret; static __always_inline void net_secret_init(void) { @@ -44,44 +46,42 @@ static u32 seq_scale(u32 seq) u32 secure_tcpv6_sequence_number(const __be32 *saddr, const __be32 *daddr, __be16 sport, __be16 dport, u32 *tsoff) { - u32 secret[MD5_MESSAGE_BYTES / 4]; - u32 hash[MD5_DIGEST_WORDS]; - u32 i; - + const struct { + struct in6_addr saddr; + struct in6_addr daddr; + __be16 sport; + __be16 dport; + u32 padding; + } __aligned(SIPHASH_ALIGNMENT) combined = { + .saddr = *(struct in6_addr *)saddr, + .daddr = *(struct in6_addr *)daddr, + .sport = sport, + .dport = dport + }; + u64 hash; net_secret_init(); - memcpy(hash, saddr, 16); - for (i = 0; i < 4; i++) - secret[i] = net_secret[i] + (__force u32)daddr[i]; - secret[4] = net_secret[4] + - (((__force u16)sport << 16) + (__force u16)dport); - for (i = 5; i < MD5_MESSAGE_BYTES / 4; i++) - secret[i] = net_secret[i]; - - md5_transform(hash, secret); - - *tsoff = sysctl_tcp_timestamps == 1 ? hash[1] : 0; - return seq_scale(hash[0]); + hash = siphash(&combined, sizeof(combined), net_secret); + *tsoff = sysctl_tcp_timestamps == 1 ? 
(hash >> 32) : 0; + return seq_scale(hash); } EXPORT_SYMBOL(secure_tcpv6_sequence_number); u32 secure_ipv6_port_ephemeral(const __be32 *saddr, const __be32 *daddr, __be16 dport) { - u32 secret[MD5_MESSAGE_BYTES / 4]; - u32 hash[MD5_DIGEST_WORDS]; - u32 i; - + const struct { + struct in6_addr saddr; + struct in6_addr daddr; + __be16 dport; + u16 padding1; + u32 padding2; + } __aligned(SIPHASH_ALIGNMENT) combined = { + .saddr = *(struct in6_addr *)saddr, + .daddr = *(struct in6_addr *)daddr, + .dport = dport + }; net_secret_init(); - memcpy(hash, saddr, 16); - for (i = 0; i < 4; i++) - secret[i] = net_secret[i] + (__force u32) daddr[i]; - secret[4] = net_secret[4] + (__force u32)dport; - for (i = 5; i < MD5_MESSAGE_BYTES / 4; i++) - secret[i] = net_secret[i]; - - md5_transform(hash, secret); - - return hash[0]; + return siphash(&combined, sizeof(combined), net_secret); } EXPORT_SYMBOL(secure_ipv6_port_ephemeral); #endif @@ -91,33 +91,17 @@ EXPORT_SYMBOL(secure_ipv6_port_ephemeral); u32 secure_tcp_sequence_number(__be32 saddr, __be32 daddr, __be16 sport, __be16 dport, u32 *tsoff) { - u32 hash[MD5_DIGEST_WORDS]; - + u64 hash; net_secret_init(); - hash[0] = (__force u32)saddr; - hash[1] = (__force u32)daddr; - hash[2] = ((__force u16)sport << 16) + (__force u16)dport; - hash[3] = net_secret[15]; - - md5_transform(hash, net_secret); - - *tsoff = sysctl_tcp_timestamps == 1 ? hash[1] : 0; - return seq_scale(hash[0]); + hash = siphash_4u32(saddr, daddr, sport, dport, net_secret); + *tsoff = sysctl_tcp_timestamps == 1 ? 
(hash >> 32) : 0; + return seq_scale(hash); } u32 secure_ipv4_port_ephemeral(__be32 saddr, __be32 daddr, __be16 dport) { - u32 hash[MD5_DIGEST_WORDS]; - net_secret_init(); - hash[0] = (__force u32)saddr; - hash[1] = (__force u32)daddr; - hash[2] = (__force u32)dport ^ net_secret[14]; - hash[3] = net_secret[15]; - - md5_transform(hash, net_secret); - - return hash[0]; + return siphash_4u32(saddr, daddr, dport, 0, net_secret); } EXPORT_SYMBOL_GPL(secure_ipv4_port_ephemeral); #endif @@ -126,21 +110,11 @@ EXPORT_SYMBOL_GPL(secure_ipv4_port_ephemeral); u64 secure_dccp_sequence_number(__be32 saddr, __be32 daddr, __be16 sport, __be16 dport) { - u32 hash[MD5_DIGEST_WORDS]; u64 seq; - net_secret_init(); - hash[0] = (__force u32)saddr; - hash[1] = (__force u32)daddr; - hash[2] = ((__force u16)sport << 16) + (__force u16)dport; - hash[3] = net_secret[15]; - - md5_transform(hash, net_secret); - - seq = hash[0] | (((u64)hash[1]) << 32); + seq = siphash_4u32(saddr, daddr, sport, dport, net_secret); seq += ktime_get_real_ns(); seq &= (1ull << 48) - 1; - return seq; } EXPORT_SYMBOL(secure_dccp_sequence_number); @@ -149,26 +123,23 @@ EXPORT_SYMBOL(secure_dccp_sequence_number); u64 secure_dccpv6_sequence_number(__be32 *saddr, __be32 *daddr, __be16 sport, __be16 dport) { - u32 secret[MD5_MESSAGE_BYTES / 4]; - u32 hash[MD5_DIGEST_WORDS]; + const struct { + struct in6_addr saddr; + struct in6_addr daddr; + __be16 sport; + __be16 dport; + u32 padding; + } __aligned(SIPHASH_ALIGNMENT) combined = { + .saddr = *(struct in6_addr *)saddr, + .daddr = *(struct in6_addr *)daddr, + .sport = sport, + .dport = dport + }; u64 seq; - u32 i; - net_secret_init(); - memcpy(hash, saddr, 16); - for (i = 0; i < 4; i++) - secret[i] = net_secret[i] + (__force u32)daddr[i]; - secret[4] = net_secret[4] + - (((__force u16)sport << 16) + (__force u16)dport); - for (i = 5; i < MD5_MESSAGE_BYTES / 4; i++) - secret[i] = net_secret[i]; - - md5_transform(hash, secret); - - seq = hash[0] | (((u64)hash[1]) << 32); + 
seq = siphash(&combined, sizeof(combined), net_secret); seq += ktime_get_real_ns(); seq &= (1ull << 48) - 1; - return seq; } EXPORT_SYMBOL(secure_dccpv6_sequence_number); -- 2.11.0 ^ permalink raw reply related [flat|nested] 82+ messages in thread
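The two input styles used in this patch can be sketched in userspace as follows. This is an illustration under stated assumptions, not the kernel code: struct addr6 stands in for struct in6_addr, and seq_input, make_seq_input and pack_2u32 are hypothetical names. The point of the explicit padding member is that every byte handed to the hash is a named field, so a designated initializer zeroes it, and the total size is a multiple of 8.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Userspace stand-in for struct in6_addr: 16 raw bytes. */
struct addr6 {
	uint8_t bytes[16];
};

/* Hypothetical mirror of the anonymous struct hashed for TCP over IPv6. */
struct seq_input {
	struct addr6 saddr;
	struct addr6 daddr;
	uint16_t sport;
	uint16_t dport;
	uint32_t padding;	/* explicit member, so the initializer zeroes it */
};

/* No implicit padding, and the size is a whole number of 8-byte words. */
_Static_assert(sizeof(struct seq_input) == 40, "unexpected struct padding");
_Static_assert(sizeof(struct seq_input) % 8 == 0, "not a multiple of 8");

/* Build the hash input; members not named in the initializer become zero. */
static struct seq_input make_seq_input(const struct addr6 *saddr,
				       const struct addr6 *daddr,
				       uint16_t sport, uint16_t dport)
{
	struct seq_input in = {
		.saddr = *saddr,
		.daddr = *daddr,
		.sport = sport,
		.dport = dport
	};

	return in;
}

/* Same packing as the siphash_Nu32 helpers: a is the low half, b the high half. */
static uint64_t pack_2u32(uint32_t a, uint32_t b)
{
	return (uint64_t)b << 32 | a;
}
```

Because the struct has no hidden padding, two inputs built from the same fields compare equal bytewise, which is what a keyed hash over the raw bytes relies on.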
* [PATCH v6 3/5] random: use SipHash in place of MD5 2016-12-16 3:03 ` [PATCH v6 0/5] The SipHash Patchset Jason A. Donenfeld 2016-12-16 3:03 ` [PATCH v6 1/5] siphash: add cryptographically secure PRF Jason A. Donenfeld 2016-12-16 3:03 ` [PATCH v6 2/5] secure_seq: use SipHash in place of MD5 Jason A. Donenfeld @ 2016-12-16 3:03 ` Jason A. Donenfeld 2016-12-16 21:31 ` Andy Lutomirski 2016-12-16 3:03 ` [PATCH v6 4/5] md5: remove from lib and only live in crypto Jason A. Donenfeld ` (2 subsequent siblings) 5 siblings, 1 reply; 82+ messages in thread From: Jason A. Donenfeld @ 2016-12-16 3:03 UTC (permalink / raw) To: Netdev, kernel-hardening, LKML, linux-crypto, David Laight, Ted Tso, Hannes Frederic Sowa, Linus Torvalds, Eric Biggers, Tom Herbert, George Spelvin, Vegard Nossum, ak, davem, luto Cc: Jason A. Donenfeld, Jean-Philippe Aumasson

This duplicates the current algorithm for get_random_int/long, but uses siphash instead. This comes with several benefits. It's certainly faster and more cryptographically secure than MD5. This patch also separates hashed fields into three values instead of one, in order to increase diffusion.

The previous MD5 algorithm used a per-cpu MD5 state, which caused successive calls to the function to chain upon each other. While it's not entirely clear that this kind of chaining is absolutely necessary when using a secure PRF like siphash, it can't hurt, and the timing of the call chain does add a degree of natural entropy. So, in keeping with this design, instead of the massive per-cpu 64-byte MD5 state, there is instead a per-cpu previously returned value for chaining.

The speed benefits are substantial:

                | siphash | md5    | speedup |
                ------------------------------
get_random_long | 137130  | 415983 | 3.03x   |
get_random_int  | 86384   | 343323 | 3.97x   |

Signed-off-by: Jason A.
Donenfeld <Jason@zx2c4.com> Cc: Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com> Cc: Ted Tso <tytso@mit.edu> --- drivers/char/random.c | 32 +++++++++++++------------------- 1 file changed, 13 insertions(+), 19 deletions(-) diff --git a/drivers/char/random.c b/drivers/char/random.c index d6876d506220..a51f0ff43f00 100644 --- a/drivers/char/random.c +++ b/drivers/char/random.c @@ -262,6 +262,7 @@ #include <linux/syscalls.h> #include <linux/completion.h> #include <linux/uuid.h> +#include <linux/siphash.h> #include <crypto/chacha20.h> #include <asm/processor.h> @@ -2042,7 +2043,7 @@ struct ctl_table random_table[] = { }; #endif /* CONFIG_SYSCTL */ -static u32 random_int_secret[MD5_MESSAGE_BYTES / 4] ____cacheline_aligned; +static siphash_key_t random_int_secret; int random_int_secret_init(void) { @@ -2050,8 +2051,7 @@ int random_int_secret_init(void) return 0; } -static DEFINE_PER_CPU(__u32 [MD5_DIGEST_WORDS], get_random_int_hash) - __aligned(sizeof(unsigned long)); +static DEFINE_PER_CPU(u64, get_random_int_chaining); /* * Get a random word for internal kernel use only. 
Similar to urandom but @@ -2061,19 +2061,16 @@ static DEFINE_PER_CPU(__u32 [MD5_DIGEST_WORDS], get_random_int_hash) */ unsigned int get_random_int(void) { - __u32 *hash; unsigned int ret; + u64 *chaining; if (arch_get_random_int(&ret)) return ret; - hash = get_cpu_var(get_random_int_hash); - - hash[0] += current->pid + jiffies + random_get_entropy(); - md5_transform(hash, random_int_secret); - ret = hash[0]; - put_cpu_var(get_random_int_hash); - + chaining = &get_cpu_var(get_random_int_chaining); + ret = *chaining = siphash_3u64(*chaining, jiffies, random_get_entropy() + + current->pid, random_int_secret); + put_cpu_var(get_random_int_chaining); return ret; } EXPORT_SYMBOL(get_random_int); @@ -2083,19 +2080,16 @@ EXPORT_SYMBOL(get_random_int); */ unsigned long get_random_long(void) { - __u32 *hash; unsigned long ret; + u64 *chaining; if (arch_get_random_long(&ret)) return ret; - hash = get_cpu_var(get_random_int_hash); - - hash[0] += current->pid + jiffies + random_get_entropy(); - md5_transform(hash, random_int_secret); - ret = *(unsigned long *)hash; - put_cpu_var(get_random_int_hash); - + chaining = &get_cpu_var(get_random_int_chaining); + ret = *chaining = siphash_3u64(*chaining, jiffies, random_get_entropy() + + current->pid, random_int_secret); + put_cpu_var(get_random_int_chaining); return ret; } EXPORT_SYMBOL(get_random_long); -- 2.11.0 ^ permalink raw reply related [flat|nested] 82+ messages in thread
* Re: [PATCH v6 3/5] random: use SipHash in place of MD5 2016-12-16 3:03 ` [PATCH v6 3/5] random: " Jason A. Donenfeld @ 2016-12-16 21:31 ` Andy Lutomirski 0 siblings, 0 replies; 82+ messages in thread From: Andy Lutomirski @ 2016-12-16 21:31 UTC (permalink / raw) To: Jason A. Donenfeld Cc: Netdev, kernel-hardening, LKML, linux-crypto, David Laight, Ted Tso, Hannes Frederic Sowa, Linus Torvalds, Eric Biggers, Tom Herbert, George Spelvin, Vegard Nossum, Andi Kleen, David S. Miller, Jean-Philippe Aumasson

On Thu, Dec 15, 2016 at 7:03 PM, Jason A. Donenfeld <Jason@zx2c4.com> wrote:
> -static DEFINE_PER_CPU(__u32 [MD5_DIGEST_WORDS], get_random_int_hash)
> -		__aligned(sizeof(unsigned long));
> +static DEFINE_PER_CPU(u64, get_random_int_chaining);
>
[...]
>  unsigned long get_random_long(void)
>  {
> -	__u32 *hash;
>  	unsigned long ret;
> +	u64 *chaining;
>
>  	if (arch_get_random_long(&ret))
>  		return ret;
>
> -	hash = get_cpu_var(get_random_int_hash);
> -
> -	hash[0] += current->pid + jiffies + random_get_entropy();
> -	md5_transform(hash, random_int_secret);
> -	ret = *(unsigned long *)hash;
> -	put_cpu_var(get_random_int_hash);
> -
> +	chaining = &get_cpu_var(get_random_int_chaining);
> +	ret = *chaining = siphash_3u64(*chaining, jiffies, random_get_entropy() +
> +				       current->pid, random_int_secret);
> +	put_cpu_var(get_random_int_chaining);
>  	return ret;
>  }

I think it would be nice to try to strengthen the PRNG construction. FWIW, I'm not an expert in PRNGs, and there's fairly extensive literature, but I can at least try. Here are some properties I'd like:

1. A one-time leak of memory contents doesn't ruin security until reboot. This is especially valuable across suspend and/or hibernation.

2. An attack with a low work factor (2^64?) shouldn't break the scheme until reboot.

This is effectively doing:

output = H(prev_output, weak "entropy", per-boot secret);

One unfortunate downside is that, if used in a context where an attacker can see a single output, the attacker learns the chaining value. If the attacker can guess the entropy, then, with 2^64 work, they learn the secret, and they can predict future outputs.

I would advocate adding two types of improvements. First, re-seed it every now and then (every 128 calls?) by just replacing both the chaining value and the percpu secret with fresh CSPRNG output. Second, change the mode so that an attacker doesn't learn so much internal state. For example:

output = H(old_chain, entropy, secret);
new_chain = old_chain + entropy + output;

This increases the effort needed to brute-force the internal state from 2^64 to 2^128 (barring any weaknesses in the scheme).

Also, can we not call this get_random_int()? get_random_int() sounds too much like get_random_bytes(), and the latter is intended to be a real CSPRNG. Can we call it get_weak_random_int() or similar?

--Andy

^ permalink raw reply [flat|nested] 82+ messages in thread
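The second suggestion in the message above (keep the chaining value distinct from the returned output) could look roughly like the following userspace sketch. Everything here is an assumption made for illustration: H is taken to be SipHash-2-4 over two 64-bit words with the secret as the key, and weak_rng_state and weak_rng_next are hypothetical names, not anything from the patch. The periodic reseed from a CSPRNG is not shown.

```c
#include <stdint.h>

static inline uint64_t rol64(uint64_t x, unsigned int r)
{
	return (x << r) | (x >> (64 - r));
}

#define SIPROUND \
	do { \
	v0 += v1; v1 = rol64(v1, 13); v1 ^= v0; v0 = rol64(v0, 32); \
	v2 += v3; v3 = rol64(v3, 16); v3 ^= v2; \
	v0 += v3; v3 = rol64(v3, 21); v3 ^= v0; \
	v2 += v1; v1 = rol64(v1, 17); v1 ^= v2; v2 = rol64(v2, 32); \
	} while (0)

/* SipHash-2-4 over exactly two 64-bit words (16 bytes of input). */
static uint64_t siphash_2u64(uint64_t first, uint64_t second,
			     const uint64_t key[2])
{
	uint64_t v0 = 0x736f6d6570736575ULL ^ key[0];
	uint64_t v1 = 0x646f72616e646f6dULL ^ key[1];
	uint64_t v2 = 0x6c7967656e657261ULL ^ key[0];
	uint64_t v3 = 0x7465646279746573ULL ^ key[1];
	const uint64_t b = (uint64_t)16 << 56;	/* input length in the top byte */

	v3 ^= first;  SIPROUND; SIPROUND; v0 ^= first;
	v3 ^= second; SIPROUND; SIPROUND; v0 ^= second;
	v3 ^= b;      SIPROUND; SIPROUND; v0 ^= b;
	v2 ^= 0xff;
	SIPROUND; SIPROUND; SIPROUND; SIPROUND;
	return (v0 ^ v1) ^ (v2 ^ v3);
}

/* Hypothetical per-CPU state: only the chaining value. */
struct weak_rng_state {
	uint64_t chain;
};

/*
 * output    = H(old_chain, entropy, secret)
 * new_chain = old_chain + entropy + output
 *
 * The caller sees output but not old_chain, so a single observed output
 * no longer reveals the next hash input directly.
 */
uint64_t weak_rng_next(struct weak_rng_state *st, uint64_t entropy,
		       const uint64_t secret[2])
{
	const uint64_t output = siphash_2u64(st->chain, entropy, secret);

	st->chain = st->chain + entropy + output;	/* wraps modulo 2^64 */
	return output;
}
```

In the patch as posted, the returned value and the stored chaining value are the same word; the sketch differs only in the line that updates st->chain.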
* [PATCH v6 4/5] md5: remove from lib and only live in crypto 2016-12-16 3:03 ` [PATCH v6 0/5] The SipHash Patchset Jason A. Donenfeld ` (2 preceding siblings ...) 2016-12-16 3:03 ` [PATCH v6 3/5] random: " Jason A. Donenfeld @ 2016-12-16 3:03 ` Jason A. Donenfeld 2016-12-16 3:03 ` [PATCH v6 5/5] syncookies: use SipHash in place of SHA1 Jason A. Donenfeld 2016-12-21 23:02 ` [PATCH v7 0/6] The SipHash Patchset Jason A. Donenfeld 5 siblings, 0 replies; 82+ messages in thread From: Jason A. Donenfeld @ 2016-12-16 3:03 UTC (permalink / raw) To: Netdev, kernel-hardening, LKML, linux-crypto, David Laight, Ted Tso, Hannes Frederic Sowa, Linus Torvalds, Eric Biggers, Tom Herbert, George Spelvin, Vegard Nossum, ak, davem, luto Cc: Jason A. Donenfeld The md5_transform function is no longer used anywhere in the tree, except for the crypto API's actual implementation of md5, so we can drop the function from lib and make it a static function in the crypto file, where it belongs. There should be no new users of md5_transform, anyway, since there are more modern ways of doing what it once achieved. Signed-off-by: Jason A.
Donenfeld <Jason@zx2c4.com> --- crypto/md5.c | 95 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++- lib/Makefile | 2 +- lib/md5.c | 95 ------------------------------------------------------------ 3 files changed, 95 insertions(+), 97 deletions(-) delete mode 100644 lib/md5.c diff --git a/crypto/md5.c b/crypto/md5.c index 2355a7c25c45..f7ae1a48225b 100644 --- a/crypto/md5.c +++ b/crypto/md5.c @@ -21,9 +21,11 @@ #include <linux/module.h> #include <linux/string.h> #include <linux/types.h> -#include <linux/cryptohash.h> #include <asm/byteorder.h> +#define MD5_DIGEST_WORDS 4 +#define MD5_MESSAGE_BYTES 64 + const u8 md5_zero_message_hash[MD5_DIGEST_SIZE] = { 0xd4, 0x1d, 0x8c, 0xd9, 0x8f, 0x00, 0xb2, 0x04, 0xe9, 0x80, 0x09, 0x98, 0xec, 0xf8, 0x42, 0x7e, @@ -47,6 +49,97 @@ static inline void cpu_to_le32_array(u32 *buf, unsigned int words) } } +#define F1(x, y, z) (z ^ (x & (y ^ z))) +#define F2(x, y, z) F1(z, x, y) +#define F3(x, y, z) (x ^ y ^ z) +#define F4(x, y, z) (y ^ (x | ~z)) + +#define MD5STEP(f, w, x, y, z, in, s) \ + (w += f(x, y, z) + in, w = (w<<s | w>>(32-s)) + x) + +static void md5_transform(__u32 *hash, __u32 const *in) +{ + u32 a, b, c, d; + + a = hash[0]; + b = hash[1]; + c = hash[2]; + d = hash[3]; + + MD5STEP(F1, a, b, c, d, in[0] + 0xd76aa478, 7); + MD5STEP(F1, d, a, b, c, in[1] + 0xe8c7b756, 12); + MD5STEP(F1, c, d, a, b, in[2] + 0x242070db, 17); + MD5STEP(F1, b, c, d, a, in[3] + 0xc1bdceee, 22); + MD5STEP(F1, a, b, c, d, in[4] + 0xf57c0faf, 7); + MD5STEP(F1, d, a, b, c, in[5] + 0x4787c62a, 12); + MD5STEP(F1, c, d, a, b, in[6] + 0xa8304613, 17); + MD5STEP(F1, b, c, d, a, in[7] + 0xfd469501, 22); + MD5STEP(F1, a, b, c, d, in[8] + 0x698098d8, 7); + MD5STEP(F1, d, a, b, c, in[9] + 0x8b44f7af, 12); + MD5STEP(F1, c, d, a, b, in[10] + 0xffff5bb1, 17); + MD5STEP(F1, b, c, d, a, in[11] + 0x895cd7be, 22); + MD5STEP(F1, a, b, c, d, in[12] + 0x6b901122, 7); + MD5STEP(F1, d, a, b, c, in[13] + 0xfd987193, 12); + MD5STEP(F1, c, d, a, b, in[14] + 
0xa679438e, 17); + MD5STEP(F1, b, c, d, a, in[15] + 0x49b40821, 22); + + MD5STEP(F2, a, b, c, d, in[1] + 0xf61e2562, 5); + MD5STEP(F2, d, a, b, c, in[6] + 0xc040b340, 9); + MD5STEP(F2, c, d, a, b, in[11] + 0x265e5a51, 14); + MD5STEP(F2, b, c, d, a, in[0] + 0xe9b6c7aa, 20); + MD5STEP(F2, a, b, c, d, in[5] + 0xd62f105d, 5); + MD5STEP(F2, d, a, b, c, in[10] + 0x02441453, 9); + MD5STEP(F2, c, d, a, b, in[15] + 0xd8a1e681, 14); + MD5STEP(F2, b, c, d, a, in[4] + 0xe7d3fbc8, 20); + MD5STEP(F2, a, b, c, d, in[9] + 0x21e1cde6, 5); + MD5STEP(F2, d, a, b, c, in[14] + 0xc33707d6, 9); + MD5STEP(F2, c, d, a, b, in[3] + 0xf4d50d87, 14); + MD5STEP(F2, b, c, d, a, in[8] + 0x455a14ed, 20); + MD5STEP(F2, a, b, c, d, in[13] + 0xa9e3e905, 5); + MD5STEP(F2, d, a, b, c, in[2] + 0xfcefa3f8, 9); + MD5STEP(F2, c, d, a, b, in[7] + 0x676f02d9, 14); + MD5STEP(F2, b, c, d, a, in[12] + 0x8d2a4c8a, 20); + + MD5STEP(F3, a, b, c, d, in[5] + 0xfffa3942, 4); + MD5STEP(F3, d, a, b, c, in[8] + 0x8771f681, 11); + MD5STEP(F3, c, d, a, b, in[11] + 0x6d9d6122, 16); + MD5STEP(F3, b, c, d, a, in[14] + 0xfde5380c, 23); + MD5STEP(F3, a, b, c, d, in[1] + 0xa4beea44, 4); + MD5STEP(F3, d, a, b, c, in[4] + 0x4bdecfa9, 11); + MD5STEP(F3, c, d, a, b, in[7] + 0xf6bb4b60, 16); + MD5STEP(F3, b, c, d, a, in[10] + 0xbebfbc70, 23); + MD5STEP(F3, a, b, c, d, in[13] + 0x289b7ec6, 4); + MD5STEP(F3, d, a, b, c, in[0] + 0xeaa127fa, 11); + MD5STEP(F3, c, d, a, b, in[3] + 0xd4ef3085, 16); + MD5STEP(F3, b, c, d, a, in[6] + 0x04881d05, 23); + MD5STEP(F3, a, b, c, d, in[9] + 0xd9d4d039, 4); + MD5STEP(F3, d, a, b, c, in[12] + 0xe6db99e5, 11); + MD5STEP(F3, c, d, a, b, in[15] + 0x1fa27cf8, 16); + MD5STEP(F3, b, c, d, a, in[2] + 0xc4ac5665, 23); + + MD5STEP(F4, a, b, c, d, in[0] + 0xf4292244, 6); + MD5STEP(F4, d, a, b, c, in[7] + 0x432aff97, 10); + MD5STEP(F4, c, d, a, b, in[14] + 0xab9423a7, 15); + MD5STEP(F4, b, c, d, a, in[5] + 0xfc93a039, 21); + MD5STEP(F4, a, b, c, d, in[12] + 0x655b59c3, 6); + MD5STEP(F4, d, a, b, c, in[3] + 
0x8f0ccc92, 10); + MD5STEP(F4, c, d, a, b, in[10] + 0xffeff47d, 15); + MD5STEP(F4, b, c, d, a, in[1] + 0x85845dd1, 21); + MD5STEP(F4, a, b, c, d, in[8] + 0x6fa87e4f, 6); + MD5STEP(F4, d, a, b, c, in[15] + 0xfe2ce6e0, 10); + MD5STEP(F4, c, d, a, b, in[6] + 0xa3014314, 15); + MD5STEP(F4, b, c, d, a, in[13] + 0x4e0811a1, 21); + MD5STEP(F4, a, b, c, d, in[4] + 0xf7537e82, 6); + MD5STEP(F4, d, a, b, c, in[11] + 0xbd3af235, 10); + MD5STEP(F4, c, d, a, b, in[2] + 0x2ad7d2bb, 15); + MD5STEP(F4, b, c, d, a, in[9] + 0xeb86d391, 21); + + hash[0] += a; + hash[1] += b; + hash[2] += c; + hash[3] += d; +} + static inline void md5_transform_helper(struct md5_state *ctx) { le32_to_cpu_array(ctx->block, sizeof(ctx->block) / sizeof(u32)); diff --git a/lib/Makefile b/lib/Makefile index 71d398b04a74..1079152607e0 100644 --- a/lib/Makefile +++ b/lib/Makefile @@ -19,7 +19,7 @@ KCOV_INSTRUMENT_dynamic_debug.o := n lib-y := ctype.o string.o vsprintf.o cmdline.o \ rbtree.o radix-tree.o dump_stack.o timerqueue.o\ idr.o int_sqrt.o extable.o \ - sha1.o chacha20.o md5.o irq_regs.o argv_split.o \ + sha1.o chacha20.o irq_regs.o argv_split.o \ flex_proportions.o ratelimit.o show_mem.o \ is_single_threaded.o plist.o decompress.o kobject_uevent.o \ earlycpio.o seq_buf.o siphash.o \ diff --git a/lib/md5.c b/lib/md5.c deleted file mode 100644 index bb0cd01d356d..000000000000 --- a/lib/md5.c +++ /dev/null @@ -1,95 +0,0 @@ -#include <linux/compiler.h> -#include <linux/export.h> -#include <linux/cryptohash.h> - -#define F1(x, y, z) (z ^ (x & (y ^ z))) -#define F2(x, y, z) F1(z, x, y) -#define F3(x, y, z) (x ^ y ^ z) -#define F4(x, y, z) (y ^ (x | ~z)) - -#define MD5STEP(f, w, x, y, z, in, s) \ - (w += f(x, y, z) + in, w = (w<<s | w>>(32-s)) + x) - -void md5_transform(__u32 *hash, __u32 const *in) -{ - u32 a, b, c, d; - - a = hash[0]; - b = hash[1]; - c = hash[2]; - d = hash[3]; - - MD5STEP(F1, a, b, c, d, in[0] + 0xd76aa478, 7); - MD5STEP(F1, d, a, b, c, in[1] + 0xe8c7b756, 12); - MD5STEP(F1, c, d, a, b, 
in[2] + 0x242070db, 17); - MD5STEP(F1, b, c, d, a, in[3] + 0xc1bdceee, 22); - MD5STEP(F1, a, b, c, d, in[4] + 0xf57c0faf, 7); - MD5STEP(F1, d, a, b, c, in[5] + 0x4787c62a, 12); - MD5STEP(F1, c, d, a, b, in[6] + 0xa8304613, 17); - MD5STEP(F1, b, c, d, a, in[7] + 0xfd469501, 22); - MD5STEP(F1, a, b, c, d, in[8] + 0x698098d8, 7); - MD5STEP(F1, d, a, b, c, in[9] + 0x8b44f7af, 12); - MD5STEP(F1, c, d, a, b, in[10] + 0xffff5bb1, 17); - MD5STEP(F1, b, c, d, a, in[11] + 0x895cd7be, 22); - MD5STEP(F1, a, b, c, d, in[12] + 0x6b901122, 7); - MD5STEP(F1, d, a, b, c, in[13] + 0xfd987193, 12); - MD5STEP(F1, c, d, a, b, in[14] + 0xa679438e, 17); - MD5STEP(F1, b, c, d, a, in[15] + 0x49b40821, 22); - - MD5STEP(F2, a, b, c, d, in[1] + 0xf61e2562, 5); - MD5STEP(F2, d, a, b, c, in[6] + 0xc040b340, 9); - MD5STEP(F2, c, d, a, b, in[11] + 0x265e5a51, 14); - MD5STEP(F2, b, c, d, a, in[0] + 0xe9b6c7aa, 20); - MD5STEP(F2, a, b, c, d, in[5] + 0xd62f105d, 5); - MD5STEP(F2, d, a, b, c, in[10] + 0x02441453, 9); - MD5STEP(F2, c, d, a, b, in[15] + 0xd8a1e681, 14); - MD5STEP(F2, b, c, d, a, in[4] + 0xe7d3fbc8, 20); - MD5STEP(F2, a, b, c, d, in[9] + 0x21e1cde6, 5); - MD5STEP(F2, d, a, b, c, in[14] + 0xc33707d6, 9); - MD5STEP(F2, c, d, a, b, in[3] + 0xf4d50d87, 14); - MD5STEP(F2, b, c, d, a, in[8] + 0x455a14ed, 20); - MD5STEP(F2, a, b, c, d, in[13] + 0xa9e3e905, 5); - MD5STEP(F2, d, a, b, c, in[2] + 0xfcefa3f8, 9); - MD5STEP(F2, c, d, a, b, in[7] + 0x676f02d9, 14); - MD5STEP(F2, b, c, d, a, in[12] + 0x8d2a4c8a, 20); - - MD5STEP(F3, a, b, c, d, in[5] + 0xfffa3942, 4); - MD5STEP(F3, d, a, b, c, in[8] + 0x8771f681, 11); - MD5STEP(F3, c, d, a, b, in[11] + 0x6d9d6122, 16); - MD5STEP(F3, b, c, d, a, in[14] + 0xfde5380c, 23); - MD5STEP(F3, a, b, c, d, in[1] + 0xa4beea44, 4); - MD5STEP(F3, d, a, b, c, in[4] + 0x4bdecfa9, 11); - MD5STEP(F3, c, d, a, b, in[7] + 0xf6bb4b60, 16); - MD5STEP(F3, b, c, d, a, in[10] + 0xbebfbc70, 23); - MD5STEP(F3, a, b, c, d, in[13] + 0x289b7ec6, 4); - MD5STEP(F3, d, a, b, c, 
in[0] + 0xeaa127fa, 11); - MD5STEP(F3, c, d, a, b, in[3] + 0xd4ef3085, 16); - MD5STEP(F3, b, c, d, a, in[6] + 0x04881d05, 23); - MD5STEP(F3, a, b, c, d, in[9] + 0xd9d4d039, 4); - MD5STEP(F3, d, a, b, c, in[12] + 0xe6db99e5, 11); - MD5STEP(F3, c, d, a, b, in[15] + 0x1fa27cf8, 16); - MD5STEP(F3, b, c, d, a, in[2] + 0xc4ac5665, 23); - - MD5STEP(F4, a, b, c, d, in[0] + 0xf4292244, 6); - MD5STEP(F4, d, a, b, c, in[7] + 0x432aff97, 10); - MD5STEP(F4, c, d, a, b, in[14] + 0xab9423a7, 15); - MD5STEP(F4, b, c, d, a, in[5] + 0xfc93a039, 21); - MD5STEP(F4, a, b, c, d, in[12] + 0x655b59c3, 6); - MD5STEP(F4, d, a, b, c, in[3] + 0x8f0ccc92, 10); - MD5STEP(F4, c, d, a, b, in[10] + 0xffeff47d, 15); - MD5STEP(F4, b, c, d, a, in[1] + 0x85845dd1, 21); - MD5STEP(F4, a, b, c, d, in[8] + 0x6fa87e4f, 6); - MD5STEP(F4, d, a, b, c, in[15] + 0xfe2ce6e0, 10); - MD5STEP(F4, c, d, a, b, in[6] + 0xa3014314, 15); - MD5STEP(F4, b, c, d, a, in[13] + 0x4e0811a1, 21); - MD5STEP(F4, a, b, c, d, in[4] + 0xf7537e82, 6); - MD5STEP(F4, d, a, b, c, in[11] + 0xbd3af235, 10); - MD5STEP(F4, c, d, a, b, in[2] + 0x2ad7d2bb, 15); - MD5STEP(F4, b, c, d, a, in[9] + 0xeb86d391, 21); - - hash[0] += a; - hash[1] += b; - hash[2] += c; - hash[3] += d; -} -EXPORT_SYMBOL(md5_transform); -- 2.11.0 ^ permalink raw reply related [flat|nested] 82+ messages in thread
* [PATCH v6 5/5] syncookies: use SipHash in place of SHA1 2016-12-16 3:03 ` [PATCH v6 0/5] The SipHash Patchset Jason A. Donenfeld ` (3 preceding siblings ...) 2016-12-16 3:03 ` [PATCH v6 4/5] md5: remove from lib and only live in crypto Jason A. Donenfeld @ 2016-12-16 3:03 ` Jason A. Donenfeld 2016-12-21 23:02 ` [PATCH v7 0/6] The SipHash Patchset Jason A. Donenfeld 5 siblings, 0 replies; 82+ messages in thread From: Jason A. Donenfeld @ 2016-12-16 3:03 UTC (permalink / raw) To: Netdev, kernel-hardening, LKML, linux-crypto, David Laight, Ted Tso, Hannes Frederic Sowa, Linus Torvalds, Eric Biggers, Tom Herbert, George Spelvin, Vegard Nossum, ak, davem, luto Cc: Jason A. Donenfeld SHA1 is slower and less secure than SipHash, and so replacing syncookie generation with SipHash makes natural sense. Some BSDs have been doing this for several years in fact. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> --- net/ipv4/syncookies.c | 20 ++++---------------- net/ipv6/syncookies.c | 37 ++++++++++++++++--------------------- 2 files changed, 20 insertions(+), 37 deletions(-) diff --git a/net/ipv4/syncookies.c b/net/ipv4/syncookies.c index 3e88467d70ee..03bb068f8888 100644 --- a/net/ipv4/syncookies.c +++ b/net/ipv4/syncookies.c @@ -13,13 +13,13 @@ #include <linux/tcp.h> #include <linux/slab.h> #include <linux/random.h> -#include <linux/cryptohash.h> +#include <linux/siphash.h> #include <linux/kernel.h> #include <linux/export.h> #include <net/tcp.h> #include <net/route.h> -static u32 syncookie_secret[2][16-4+SHA_DIGEST_WORDS] __read_mostly; +static siphash_key_t syncookie_secret[2] __read_mostly; #define COOKIEBITS 24 /* Upper bits store count */ #define COOKIEMASK (((__u32)1 << COOKIEBITS) - 1) @@ -48,24 +48,12 @@ static u32 syncookie_secret[2][16-4+SHA_DIGEST_WORDS] __read_mostly; #define TSBITS 6 #define TSMASK (((__u32)1 << TSBITS) - 1) -static DEFINE_PER_CPU(__u32 [16 + 5 + SHA_WORKSPACE_WORDS], ipv4_cookie_scratch); - static u32 cookie_hash(__be32 saddr, __be32 daddr, 
__be16 sport, __be16 dport, u32 count, int c) { - __u32 *tmp; - net_get_random_once(syncookie_secret, sizeof(syncookie_secret)); - - tmp = this_cpu_ptr(ipv4_cookie_scratch); - memcpy(tmp + 4, syncookie_secret[c], sizeof(syncookie_secret[c])); - tmp[0] = (__force u32)saddr; - tmp[1] = (__force u32)daddr; - tmp[2] = ((__force u32)sport << 16) + (__force u32)dport; - tmp[3] = count; - sha_transform(tmp + 16, (__u8 *)tmp, tmp + 16 + 5); - - return tmp[17]; + return siphash_4u32(saddr, daddr, (u32)sport << 16 | dport, count, + syncookie_secret[c]); } diff --git a/net/ipv6/syncookies.c b/net/ipv6/syncookies.c index a4d49760bf43..04d19e89a3e0 100644 --- a/net/ipv6/syncookies.c +++ b/net/ipv6/syncookies.c @@ -16,7 +16,7 @@ #include <linux/tcp.h> #include <linux/random.h> -#include <linux/cryptohash.h> +#include <linux/siphash.h> #include <linux/kernel.h> #include <net/ipv6.h> #include <net/tcp.h> @@ -24,7 +24,7 @@ #define COOKIEBITS 24 /* Upper bits store count */ #define COOKIEMASK (((__u32)1 << COOKIEBITS) - 1) -static u32 syncookie6_secret[2][16-4+SHA_DIGEST_WORDS] __read_mostly; +static siphash_key_t syncookie6_secret[2] __read_mostly; /* RFC 2460, Section 8.3: * [ipv6 tcp] MSS must be computed as the maximum packet size minus 60 [..] 
@@ -41,30 +41,25 @@ static __u16 const msstab[] = { 9000 - 60, }; -static DEFINE_PER_CPU(__u32 [16 + 5 + SHA_WORKSPACE_WORDS], ipv6_cookie_scratch); - static u32 cookie_hash(const struct in6_addr *saddr, const struct in6_addr *daddr, __be16 sport, __be16 dport, u32 count, int c) { - __u32 *tmp; + const struct { + struct in6_addr saddr; + struct in6_addr daddr; + u32 count; + u16 sport; + u16 dport; + } __aligned(SIPHASH_ALIGNMENT) combined = { + .saddr = *saddr, + .daddr = *daddr, + .count = count, + .sport = sport, + .dport = dport + }; net_get_random_once(syncookie6_secret, sizeof(syncookie6_secret)); - - tmp = this_cpu_ptr(ipv6_cookie_scratch); - - /* - * we have 320 bits of information to hash, copy in the remaining - * 192 bits required for sha_transform, from the syncookie6_secret - * and overwrite the digest with the secret - */ - memcpy(tmp + 10, syncookie6_secret[c], 44); - memcpy(tmp, saddr, 16); - memcpy(tmp + 4, daddr, 16); - tmp[8] = ((__force u32)sport << 16) + (__force u32)dport; - tmp[9] = count; - sha_transform(tmp + 16, (__u8 *)tmp, tmp + 16 + 5); - - return tmp[17]; + return siphash(&combined, sizeof(combined), syncookie6_secret[c]); } static __u32 secure_tcp_syn_cookie(const struct in6_addr *saddr, -- 2.11.0 ^ permalink raw reply related [flat|nested] 82+ messages in thread
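The IPv4 hunk above folds the address pair, the two ports and the counter into four 32-bit words and hands them to siphash_4u32(). What follows is a rough user-space sketch of that packing, not the kernel code: it assumes SipHash-2-4 and assumes siphash_4u32() joins each pair of 32-bit words into one 64-bit word, low word first, as the siphash.h in this series does. The names siphash24_words, sketch_4u32 and sketch_cookie_hash are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Rotate a 64-bit word left by s bits (0 < s < 64). */
static uint64_t rol64(uint64_t x, unsigned int s)
{
	return (x << s) | (x >> (64 - s));
}

#define SIPROUND \
	do { \
		v0 += v1; v1 = rol64(v1, 13); v1 ^= v0; v0 = rol64(v0, 32); \
		v2 += v3; v3 = rol64(v3, 16); v3 ^= v2; \
		v0 += v3; v3 = rol64(v3, 21); v3 ^= v0; \
		v2 += v1; v1 = rol64(v1, 17); v1 ^= v2; v2 = rol64(v2, 32); \
	} while (0)

/* SipHash-2-4 over n whole 64-bit words, i.e. a message of 8 * n bytes.
 * Hypothetical helper for this sketch only.
 */
uint64_t siphash24_words(const uint64_t *m, size_t n, const uint64_t key[2])
{
	uint64_t v0 = 0x736f6d6570736575ULL ^ key[0];
	uint64_t v1 = 0x646f72616e646f6dULL ^ key[1];
	uint64_t v2 = 0x6c7967656e657261ULL ^ key[0];
	uint64_t v3 = 0x7465646279746573ULL ^ key[1];
	uint64_t b = (uint64_t)(8 * n) << 56;	/* length byte, empty tail */
	size_t i;

	for (i = 0; i < n; i++) {
		v3 ^= m[i];
		SIPROUND;
		SIPROUND;
		v0 ^= m[i];
	}
	v3 ^= b;
	SIPROUND;
	SIPROUND;
	v0 ^= b;
	v2 ^= 0xff;
	SIPROUND;
	SIPROUND;
	SIPROUND;
	SIPROUND;
	return v0 ^ v1 ^ v2 ^ v3;
}

/* Four u32 inputs become two u64 message words, low word first. */
uint64_t sketch_4u32(uint32_t a, uint32_t b, uint32_t c, uint32_t d,
		     const uint64_t key[2])
{
	const uint64_t m[2] = {
		(uint64_t)b << 32 | a,
		(uint64_t)d << 32 | c,
	};

	return siphash24_words(m, 2, key);
}

/* Same argument packing as the IPv4 cookie_hash() in the hunk above.
 * Returning u32 keeps only the low 32 bits of the 64-bit PRF output.
 */
uint32_t sketch_cookie_hash(uint32_t saddr, uint32_t daddr, uint16_t sport,
			    uint16_t dport, uint32_t count,
			    const uint64_t key[2])
{
	return (uint32_t)sketch_4u32(saddr, daddr,
				     (uint32_t)sport << 16 | dport, count, key);
}
```

Since the inputs are fixed-width words, no scratch buffer or memcpy is needed, which is what lets the patch drop the per-CPU SHA1 workspace.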
* [PATCH v7 0/6] The SipHash Patchset 2016-12-16 3:03 ` [PATCH v6 0/5] The SipHash Patchset Jason A. Donenfeld ` (4 preceding siblings ...) 2016-12-16 3:03 ` [PATCH v6 5/5] syncookies: use SipHash in place of SHA1 Jason A. Donenfeld @ 2016-12-21 23:02 ` Jason A. Donenfeld 2016-12-21 23:02 ` [PATCH v7 1/6] siphash: add cryptographically secure PRF Jason A. Donenfeld ` (5 more replies) 5 siblings, 6 replies; 82+ messages in thread From: Jason A. Donenfeld @ 2016-12-21 23:02 UTC (permalink / raw) To: Netdev, kernel-hardening, LKML, linux-crypto, David Laight, Ted Tso, Hannes Frederic Sowa, edumazet, Linus Torvalds, Eric Biggers, Tom Herbert, ak, davem, luto, Jean-Philippe Aumasson Cc: Jason A. Donenfeld Hey folks, Again we've made huge progress, with this latest version now shipping Jean-Philippe Aumasson's HalfSipHash, which should be much more competitive with jhash (in addition to being more secure, of course). There are dozens of little cleanups and improvements right and left throughout this series, so I urge you to take a look at the whole thing. I've tried to take into consideration lots of concerns and suggestions from many of you over the last week. There is also now documentation! And the test suite now has full coverage of all functions. Finally, there's been some significant benchmarking, and now a few commit messages have some performance numbers. Please send your Reviewed-by lines as you see fit. Regards, Jason Jason A.
Donenfeld (6): siphash: add cryptographically secure PRF secure_seq: use SipHash in place of MD5 random: use SipHash in place of MD5 md5: remove from lib and only live in crypto syncookies: use SipHash in place of SHA1 siphash: implement HalfSipHash1-3 for hash tables Documentation/siphash.txt | 154 +++++++++++++ MAINTAINERS | 7 + crypto/md5.c | 95 +++++++- drivers/char/random.c | 84 ++++--- include/linux/random.h | 1 - include/linux/siphash.h | 133 +++++++++++ init/main.c | 1 - lib/Kconfig.debug | 6 +- lib/Makefile | 7 +- lib/md5.c | 95 -------- lib/siphash.c | 548 ++++++++++++++++++++++++++++++++++++++++++++++ lib/test_siphash.c | 208 ++++++++++++++++++ net/core/secure_seq.c | 135 +++++------- net/ipv4/syncookies.c | 20 +- net/ipv6/syncookies.c | 37 ++-- 15 files changed, 1274 insertions(+), 257 deletions(-) create mode 100644 Documentation/siphash.txt create mode 100644 include/linux/siphash.h delete mode 100644 lib/md5.c create mode 100644 lib/siphash.c create mode 100644 lib/test_siphash.c -- 2.11.0 ^ permalink raw reply [flat|nested] 82+ messages in thread
* [PATCH v7 1/6] siphash: add cryptographically secure PRF 2016-12-21 23:02 ` [PATCH v7 0/6] The SipHash Patchset Jason A. Donenfeld @ 2016-12-21 23:02 ` Jason A. Donenfeld 2016-12-22 1:40 ` Stephen Hemminger 2016-12-21 23:02 ` [PATCH v7 2/6] secure_seq: use SipHash in place of MD5 Jason A. Donenfeld ` (4 subsequent siblings) 5 siblings, 1 reply; 82+ messages in thread From: Jason A. Donenfeld @ 2016-12-21 23:02 UTC (permalink / raw) To: Netdev, kernel-hardening, LKML, linux-crypto, David Laight, Ted Tso, Hannes Frederic Sowa, edumazet, Linus Torvalds, Eric Biggers, Tom Herbert, ak, davem, luto, Jean-Philippe Aumasson Cc: Jason A. Donenfeld, Eric Dumazet SipHash is a 64-bit keyed hash function that is actually a cryptographically secure PRF, like HMAC. Except SipHash is super fast, and is meant to be used as a hashtable keyed lookup function, or as a general PRF for short input use cases, such as sequence numbers or RNG chaining. For the first usage: There are a variety of attacks known as "hashtable poisoning" in which an attacker forms some data such that the hash of that data will be the same, and then proceeds to fill up all entries of a hashbucket. This is a realistic and well-known denial-of-service vector. Currently hashtables use jhash, which is fast but not secure, and some kind of rotating key scheme (or none at all, which isn't good). SipHash is meant as a replacement for jhash in these cases. There are a modicum of places in the kernel that are vulnerable to hashtable poisoning attacks, either via userspace vectors or network vectors, and there's not a reliable mechanism inside the kernel at the moment to fix it. The first step toward fixing these issues is actually getting a secure primitive into the kernel for developers to use. Then we can, bit by bit, port things over to it as deemed appropriate.
While SipHash is extremely fast for a cryptographically secure function, it is likely a bit slower than the insecure jhash, and so replacements will be evaluated on a case-by-case basis based on whether or not the difference in speed is negligible and whether or not the current jhash usage poses a real security risk. For the second usage: A few places in the kernel are using MD5 or SHA1 for creating secure sequence numbers, syn cookies, port numbers, or fast random numbers. SipHash is a faster, more fitting, and more secure replacement for MD5 in those situations. Replacing MD5 and SHA1 with SipHash for these uses is obvious and straightforward, and so is submitted along with this patch series. There shouldn't be much of a debate over its efficacy. Dozens of languages are already using this internally for their hash tables and PRFs. Some of the BSDs already use this in their kernels. SipHash is a widely known high-speed solution to a widely known set of problems, and it's time we catch up. Signed-off-by: Jason A.
Donenfeld <Jason@zx2c4.com> Cc: Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Eric Biggers <ebiggers3@gmail.com> Cc: David Laight <David.Laight@aculab.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> --- Documentation/siphash.txt | 79 ++++++++++++++++ MAINTAINERS | 7 ++ include/linux/siphash.h | 79 ++++++++++++++++ lib/Kconfig.debug | 6 +- lib/Makefile | 5 +- lib/siphash.c | 232 ++++++++++++++++++++++++++++++++++++++++++++++ lib/test_siphash.c | 119 ++++++++++++++++++++++++ 7 files changed, 522 insertions(+), 5 deletions(-) create mode 100644 Documentation/siphash.txt create mode 100644 include/linux/siphash.h create mode 100644 lib/siphash.c create mode 100644 lib/test_siphash.c diff --git a/Documentation/siphash.txt b/Documentation/siphash.txt new file mode 100644 index 000000000000..39ff7f0438e7 --- /dev/null +++ b/Documentation/siphash.txt @@ -0,0 +1,79 @@ + SipHash - a short input PRF +----------------------------------------------- +Written by Jason A. Donenfeld <jason@zx2c4.com> + +SipHash is a cryptographically secure PRF -- a keyed hash function -- that +performs very well for short inputs, hence the name. It was designed by +cryptographers Daniel J. Bernstein and Jean-Philippe Aumasson. It is intended +as a replacement for some uses of: `jhash`, `md5_transform`, `sha_transform`, +and so forth. + +SipHash takes a secret key filled with randomly generated numbers and either +an input buffer or several input integers. It spits out an integer that is +indistinguishable from random. You may then use that integer as part of secure +sequence numbers, secure cookies, or mask it off for use in a hash table. + +1. Generating a key + +Keys should always be generated from a cryptographically secure source of +random numbers, either using get_random_bytes or get_random_once: + +siphash_key_t key; +get_random_bytes(key, sizeof(key)); + +If you're not deriving your key from here, you're doing it wrong. 
+ +2. Using the functions + +There are two variants of the function, one that takes a list of integers, and +one that takes a buffer: + +u64 siphash(const void *data, size_t len, siphash_key_t key); + +And: + +u64 siphash_1u64(u64, siphash_key_t key); +u64 siphash_2u64(u64, u64, siphash_key_t key); +u64 siphash_3u64(u64, u64, u64, siphash_key_t key); +u64 siphash_4u64(u64, u64, u64, u64, siphash_key_t key); +u64 siphash_1u32(u32, siphash_key_t key); +u64 siphash_2u32(u32, u32, siphash_key_t key); +u64 siphash_3u32(u32, u32, u32, siphash_key_t key); +u64 siphash_4u32(u32, u32, u32, u32, siphash_key_t key); + +If you pass the generic siphash function something of a constant length, it +will constant fold at compile-time and automatically choose one of the +optimized functions. + +3. Hashtable key function usage: + +struct some_hashtable { + DECLARE_HASHTABLE(hashtable, 8); + siphash_key_t key; +}; + +void init_hashtable(struct some_hashtable *table) +{ + get_random_bytes(table->key, sizeof(table->key)); +} + +static inline struct hlist_head *some_hashtable_bucket(struct some_hashtable *table, struct interesting_input *input) +{ + return &table->hashtable[siphash(input, sizeof(*input), table->key) & (HASH_SIZE(table->hashtable) - 1)]; +} + +You may then iterate like usual over the returned hash bucket. + +4. Security + +SipHash has a very high security margin, with its 128-bit key. So long as the +key is kept secret, it is impossible for an attacker to guess the outputs of +the function, even if being able to observe many outputs, since 2^128 outputs +is significant. + +Linux implements the "2-4" variant of SipHash. + +5. 
Resources + +Read the SipHash paper if you're interested in learning more: +https://131002.net/siphash/siphash.pdf diff --git a/MAINTAINERS b/MAINTAINERS index 59c9895d73d5..5d87a8c1056a 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -11231,6 +11231,13 @@ F: arch/arm/mach-s3c24xx/mach-bast.c F: arch/arm/mach-s3c24xx/bast-ide.c F: arch/arm/mach-s3c24xx/bast-irq.c +SIPHASH PRF ROUTINES +M: Jason A. Donenfeld <Jason@zx2c4.com> +S: Maintained +F: lib/siphash.c +F: lib/test_siphash.c +F: include/linux/siphash.h + TI DAVINCI MACHINE SUPPORT M: Sekhar Nori <nsekhar@ti.com> M: Kevin Hilman <khilman@kernel.org> diff --git a/include/linux/siphash.h b/include/linux/siphash.h new file mode 100644 index 000000000000..7aa666eb00d9 --- /dev/null +++ b/include/linux/siphash.h @@ -0,0 +1,79 @@ +/* Copyright (C) 2016 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved. + * + * This file is provided under a dual BSD/GPLv2 license. + * + * SipHash: a fast short-input PRF + * https://131002.net/siphash/ + * + * This implementation is specifically for SipHash2-4. 
+ */ + +#ifndef _LINUX_SIPHASH_H +#define _LINUX_SIPHASH_H + +#include <linux/types.h> +#include <linux/kernel.h> + +#define SIPHASH_ALIGNMENT __alignof__(u64) +typedef u64 siphash_key_t[2]; + +u64 __siphash_aligned(const void *data, size_t len, const siphash_key_t key); +#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS +u64 __siphash_unaligned(const void *data, size_t len, const siphash_key_t key); +#endif + +u64 siphash_1u64(const u64 a, const siphash_key_t key); +u64 siphash_2u64(const u64 a, const u64 b, const siphash_key_t key); +u64 siphash_3u64(const u64 a, const u64 b, const u64 c, + const siphash_key_t key); +u64 siphash_4u64(const u64 a, const u64 b, const u64 c, const u64 d, + const siphash_key_t key); +u64 siphash_1u32(const u32 a, const siphash_key_t key); +u64 siphash_3u32(const u32 a, const u32 b, const u32 c, const siphash_key_t key); + +static inline u64 siphash_2u32(const u32 a, const u32 b, const siphash_key_t key) +{ + return siphash_1u64((u64)b << 32 | a, key); +} +static inline u64 siphash_4u32(const u32 a, const u32 b, const u32 c, const u32 d, + const siphash_key_t key) +{ + return siphash_2u64((u64)b << 32 | a, (u64)d << 32 | c, key); +} + + +static inline u64 ___siphash_aligned(const __le64 *data, size_t len, const siphash_key_t key) +{ + if (__builtin_constant_p(len) && len == 4) + return siphash_1u32(le32_to_cpu(data[0]), key); + if (__builtin_constant_p(len) && len == 8) + return siphash_1u64(le64_to_cpu(data[0]), key); + if (__builtin_constant_p(len) && len == 16) + return siphash_2u64(le64_to_cpu(data[0]), le64_to_cpu(data[1]), + key); + if (__builtin_constant_p(len) && len == 24) + return siphash_3u64(le64_to_cpu(data[0]), le64_to_cpu(data[1]), + le64_to_cpu(data[2]), key); + if (__builtin_constant_p(len) && len == 32) + return siphash_4u64(le64_to_cpu(data[0]), le64_to_cpu(data[1]), + le64_to_cpu(data[2]), le64_to_cpu(data[3]), + key); + return __siphash_aligned(data, len, key); +} + +/** + * siphash - compute 64-bit siphash PRF value 
+ * @data: buffer to hash + * @len: size of @data + * @key: the siphash key + */ +static inline u64 siphash(const void *data, size_t len, const siphash_key_t key) +{ +#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS + if (!IS_ALIGNED((unsigned long)data, SIPHASH_ALIGNMENT)) + return __siphash_unaligned(data, len, key); +#endif + return ___siphash_aligned(data, len, key); +} + +#endif /* _LINUX_SIPHASH_H */ diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug index 7446097f72bd..86254ea99b45 100644 --- a/lib/Kconfig.debug +++ b/lib/Kconfig.debug @@ -1843,9 +1843,9 @@ config TEST_HASH tristate "Perform selftest on hash functions" default n help - Enable this option to test the kernel's integer (<linux/hash,h>) - and string (<linux/stringhash.h>) hash functions on boot - (or module load). + Enable this option to test the kernel's integer (<linux/hash.h>), + string (<linux/stringhash.h>), and siphash (<linux/siphash.h>) + hash functions on boot (or module load). This is intended to help people writing architecture-specific optimized versions. If unsure, say N. 
diff --git a/lib/Makefile b/lib/Makefile index 50144a3aeebd..71d398b04a74 100644 --- a/lib/Makefile +++ b/lib/Makefile @@ -22,7 +22,8 @@ lib-y := ctype.o string.o vsprintf.o cmdline.o \ sha1.o chacha20.o md5.o irq_regs.o argv_split.o \ flex_proportions.o ratelimit.o show_mem.o \ is_single_threaded.o plist.o decompress.o kobject_uevent.o \ - earlycpio.o seq_buf.o nmi_backtrace.o nodemask.o win_minmax.o + earlycpio.o seq_buf.o siphash.o \ + nmi_backtrace.o nodemask.o win_minmax.o lib-$(CONFIG_MMU) += ioremap.o lib-$(CONFIG_SMP) += cpumask.o @@ -44,7 +45,7 @@ obj-$(CONFIG_TEST_HEXDUMP) += test_hexdump.o obj-y += kstrtox.o obj-$(CONFIG_TEST_BPF) += test_bpf.o obj-$(CONFIG_TEST_FIRMWARE) += test_firmware.o -obj-$(CONFIG_TEST_HASH) += test_hash.o +obj-$(CONFIG_TEST_HASH) += test_hash.o test_siphash.o obj-$(CONFIG_TEST_KASAN) += test_kasan.o obj-$(CONFIG_TEST_KSTRTOX) += test-kstrtox.o obj-$(CONFIG_TEST_LKM) += test_module.o diff --git a/lib/siphash.c b/lib/siphash.c new file mode 100644 index 000000000000..ff2151313667 --- /dev/null +++ b/lib/siphash.c @@ -0,0 +1,232 @@ +/* Copyright (C) 2016 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved. + * + * This file is provided under a dual BSD/GPLv2 license. + * + * SipHash: a fast short-input PRF + * https://131002.net/siphash/ + * + * This implementation is specifically for SipHash2-4. 
+ */ + +#include <linux/siphash.h> +#include <asm/unaligned.h> + +#if defined(CONFIG_DCACHE_WORD_ACCESS) && BITS_PER_LONG == 64 +#include <linux/dcache.h> +#include <asm/word-at-a-time.h> +#endif + +#define SIPROUND \ + do { \ + v0 += v1; v1 = rol64(v1, 13); v1 ^= v0; v0 = rol64(v0, 32); \ + v2 += v3; v3 = rol64(v3, 16); v3 ^= v2; \ + v0 += v3; v3 = rol64(v3, 21); v3 ^= v0; \ + v2 += v1; v1 = rol64(v1, 17); v1 ^= v2; v2 = rol64(v2, 32); \ + } while(0) + +#define PREAMBLE(len) \ + u64 v0 = 0x736f6d6570736575ULL; \ + u64 v1 = 0x646f72616e646f6dULL; \ + u64 v2 = 0x6c7967656e657261ULL; \ + u64 v3 = 0x7465646279746573ULL; \ + u64 b = ((u64)len) << 56; \ + v3 ^= key[1]; \ + v2 ^= key[0]; \ + v1 ^= key[1]; \ + v0 ^= key[0]; + +#define POSTAMBLE \ + v3 ^= b; \ + SIPROUND; \ + SIPROUND; \ + v0 ^= b; \ + v2 ^= 0xff; \ + SIPROUND; \ + SIPROUND; \ + SIPROUND; \ + SIPROUND; \ + return (v0 ^ v1) ^ (v2 ^ v3); + +u64 __siphash_aligned(const void *data, size_t len, const siphash_key_t key) +{ + const u8 *end = data + len - (len % sizeof(u64)); + const u8 left = len & (sizeof(u64) - 1); + u64 m; + PREAMBLE(len) + for (; data != end; data += sizeof(u64)) { + m = le64_to_cpup(data); + v3 ^= m; + SIPROUND; + SIPROUND; + v0 ^= m; + } +#if defined(CONFIG_DCACHE_WORD_ACCESS) && BITS_PER_LONG == 64 + if (left) + b |= le64_to_cpu((__force __le64)(load_unaligned_zeropad(data) & + bytemask_from_count(left))); +#else + switch (left) { + case 7: b |= ((u64)end[6]) << 48; + case 6: b |= ((u64)end[5]) << 40; + case 5: b |= ((u64)end[4]) << 32; + case 4: b |= le32_to_cpup(data); break; + case 3: b |= ((u64)end[2]) << 16; + case 2: b |= le16_to_cpup(data); break; + case 1: b |= end[0]; + } +#endif + POSTAMBLE +} +EXPORT_SYMBOL(__siphash_aligned); + +#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS +u64 __siphash_unaligned(const void *data, size_t len, const siphash_key_t key) +{ + const u8 *end = data + len - (len % sizeof(u64)); + const u8 left = len & (sizeof(u64) - 1); + u64 m; + PREAMBLE(len) + 
for (; data != end; data += sizeof(u64)) { + m = get_unaligned_le64(data); + v3 ^= m; + SIPROUND; + SIPROUND; + v0 ^= m; + } +#if defined(CONFIG_DCACHE_WORD_ACCESS) && BITS_PER_LONG == 64 + if (left) + b |= le64_to_cpu((__force __le64)(load_unaligned_zeropad(data) & + bytemask_from_count(left))); +#else + switch (left) { + case 7: b |= ((u64)end[6]) << 48; + case 6: b |= ((u64)end[5]) << 40; + case 5: b |= ((u64)end[4]) << 32; + case 4: b |= get_unaligned_le32(end); break; + case 3: b |= ((u64)end[2]) << 16; + case 2: b |= get_unaligned_le16(end); break; + case 1: b |= end[0]; + } +#endif + POSTAMBLE +} +EXPORT_SYMBOL(__siphash_unaligned); +#endif + +/** + * siphash_1u64 - compute 64-bit siphash PRF value of a u64 + * @first: first u64 + * @key: the siphash key + */ +u64 siphash_1u64(const u64 first, const siphash_key_t key) +{ + PREAMBLE(8) + v3 ^= first; + SIPROUND; + SIPROUND; + v0 ^= first; + POSTAMBLE +} +EXPORT_SYMBOL(siphash_1u64); + +/** + * siphash_2u64 - compute 64-bit siphash PRF value of 2 u64 + * @first: first u64 + * @second: second u64 + * @key: the siphash key + */ +u64 siphash_2u64(const u64 first, const u64 second, const siphash_key_t key) +{ + PREAMBLE(16) + v3 ^= first; + SIPROUND; + SIPROUND; + v0 ^= first; + v3 ^= second; + SIPROUND; + SIPROUND; + v0 ^= second; + POSTAMBLE +} +EXPORT_SYMBOL(siphash_2u64); + +/** + * siphash_3u64 - compute 64-bit siphash PRF value of 3 u64 + * @first: first u64 + * @second: second u64 + * @third: third u64 + * @key: the siphash key + */ +u64 siphash_3u64(const u64 first, const u64 second, const u64 third, + const siphash_key_t key) +{ + PREAMBLE(24) + v3 ^= first; + SIPROUND; + SIPROUND; + v0 ^= first; + v3 ^= second; + SIPROUND; + SIPROUND; + v0 ^= second; + v3 ^= third; + SIPROUND; + SIPROUND; + v0 ^= third; + POSTAMBLE +} +EXPORT_SYMBOL(siphash_3u64); + +/** + * siphash_4u64 - compute 64-bit siphash PRF value of 4 u64 + * @first: first u64 + * @second: second u64 + * @third: third u64 + * @forth: forth u64 + 
* @key: the siphash key + */ +u64 siphash_4u64(const u64 first, const u64 second, const u64 third, + const u64 forth, const siphash_key_t key) +{ + PREAMBLE(32) + v3 ^= first; + SIPROUND; + SIPROUND; + v0 ^= first; + v3 ^= second; + SIPROUND; + SIPROUND; + v0 ^= second; + v3 ^= third; + SIPROUND; + SIPROUND; + v0 ^= third; + v3 ^= forth; + SIPROUND; + SIPROUND; + v0 ^= forth; + POSTAMBLE +} +EXPORT_SYMBOL(siphash_4u64); + +u64 siphash_1u32(const u32 first, const siphash_key_t key) +{ + PREAMBLE(4) + b |= first; + POSTAMBLE +} +EXPORT_SYMBOL(siphash_1u32); + +u64 siphash_3u32(const u32 first, const u32 second, const u32 third, + const siphash_key_t key) +{ + u64 combined = (u64)second << 32 | first; + PREAMBLE(12) + v3 ^= combined; + SIPROUND; + SIPROUND; + v0 ^= combined; + b |= third; + POSTAMBLE +} +EXPORT_SYMBOL(siphash_3u32); diff --git a/lib/test_siphash.c b/lib/test_siphash.c new file mode 100644 index 000000000000..e0ba2cf8dc67 --- /dev/null +++ b/lib/test_siphash.c @@ -0,0 +1,119 @@ +/* Test cases for siphash.c + * + * Copyright (C) 2016 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved. + * + * This file is provided under a dual BSD/GPLv2 license. + * + * SipHash: a fast short-input PRF + * https://131002.net/siphash/ + * + * This implementation is specifically for SipHash2-4. 
+ */ + +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + +#include <linux/siphash.h> +#include <linux/kernel.h> +#include <linux/string.h> +#include <linux/errno.h> +#include <linux/module.h> + +/* Test vectors taken from official reference source available at: + * https://131002.net/siphash/siphash24.c + */ +static const u64 test_vectors[64] = { + 0x726fdb47dd0e0e31ULL, 0x74f839c593dc67fdULL, 0x0d6c8009d9a94f5aULL, + 0x85676696d7fb7e2dULL, 0xcf2794e0277187b7ULL, 0x18765564cd99a68dULL, + 0xcbc9466e58fee3ceULL, 0xab0200f58b01d137ULL, 0x93f5f5799a932462ULL, + 0x9e0082df0ba9e4b0ULL, 0x7a5dbbc594ddb9f3ULL, 0xf4b32f46226bada7ULL, + 0x751e8fbc860ee5fbULL, 0x14ea5627c0843d90ULL, 0xf723ca908e7af2eeULL, + 0xa129ca6149be45e5ULL, 0x3f2acc7f57c29bdbULL, 0x699ae9f52cbe4794ULL, + 0x4bc1b3f0968dd39cULL, 0xbb6dc91da77961bdULL, 0xbed65cf21aa2ee98ULL, + 0xd0f2cbb02e3b67c7ULL, 0x93536795e3a33e88ULL, 0xa80c038ccd5ccec8ULL, + 0xb8ad50c6f649af94ULL, 0xbce192de8a85b8eaULL, 0x17d835b85bbb15f3ULL, + 0x2f2e6163076bcfadULL, 0xde4daaaca71dc9a5ULL, 0xa6a2506687956571ULL, + 0xad87a3535c49ef28ULL, 0x32d892fad841c342ULL, 0x7127512f72f27cceULL, + 0xa7f32346f95978e3ULL, 0x12e0b01abb051238ULL, 0x15e034d40fa197aeULL, + 0x314dffbe0815a3b4ULL, 0x027990f029623981ULL, 0xcadcd4e59ef40c4dULL, + 0x9abfd8766a33735cULL, 0x0e3ea96b5304a7d0ULL, 0xad0c42d6fc585992ULL, + 0x187306c89bc215a9ULL, 0xd4a60abcf3792b95ULL, 0xf935451de4f21df2ULL, + 0xa9538f0419755787ULL, 0xdb9acddff56ca510ULL, 0xd06c98cd5c0975ebULL, + 0xe612a3cb9ecba951ULL, 0xc766e62cfcadaf96ULL, 0xee64435a9752fe72ULL, + 0xa192d576b245165aULL, 0x0a8787bf8ecb74b2ULL, 0x81b3e73d20b49b6fULL, + 0x7fa8220ba3b2eceaULL, 0x245731c13ca42499ULL, 0xb78dbfaf3a8d83bdULL, + 0xea1ad565322a1a0bULL, 0x60e61c23a3795013ULL, 0x6606d7e446282b93ULL, + 0x6ca4ecb15c5f91e1ULL, 0x9f626da15c9625f3ULL, 0xe51b38608ef25f57ULL, + 0x958a324ceb064572ULL +}; +static const siphash_key_t test_key = + { 0x0706050403020100ULL , 0x0f0e0d0c0b0a0908ULL }; + +static int __init 
siphash_test_init(void) +{ + u8 in[64] __aligned(SIPHASH_ALIGNMENT); + u8 in_unaligned[65]; + u8 i; + int ret = 0; + + for (i = 0; i < 64; ++i) { + in[i] = i; + in_unaligned[i + 1] = i; + if (siphash(in, i, test_key) != test_vectors[i]) { + pr_info("self-test aligned %u: FAIL\n", i + 1); + ret = -EINVAL; + } + if (siphash(in_unaligned + 1, i, test_key) != test_vectors[i]) { + pr_info("self-test unaligned %u: FAIL\n", i + 1); + ret = -EINVAL; + } + } + if (siphash_1u64(0x0706050403020100ULL, test_key) != test_vectors[8]) { + pr_info("self-test 1u64: FAIL\n"); + ret = -EINVAL; + } + if (siphash_2u64(0x0706050403020100ULL, 0x0f0e0d0c0b0a0908ULL, test_key) != test_vectors[16]) { + pr_info("self-test 2u64: FAIL\n"); + ret = -EINVAL; + } + if (siphash_3u64(0x0706050403020100ULL, 0x0f0e0d0c0b0a0908ULL, + 0x1716151413121110ULL, test_key) != test_vectors[24]) { + pr_info("self-test 3u64: FAIL\n"); + ret = -EINVAL; + } + if (siphash_4u64(0x0706050403020100ULL, 0x0f0e0d0c0b0a0908ULL, + 0x1716151413121110ULL, 0x1f1e1d1c1b1a1918ULL, test_key) != test_vectors[32]) { + pr_info("self-test 4u64: FAIL\n"); + ret = -EINVAL; + } + if (siphash_1u32(0x03020100U, test_key) != test_vectors[4]) { + pr_info("self-test 1u32: FAIL\n"); + ret = -EINVAL; + } + if (siphash_2u32(0x03020100U, 0x07060504U, test_key) != test_vectors[8]) { + pr_info("self-test 2u32: FAIL\n"); + ret = -EINVAL; + } + if (siphash_3u32(0x03020100U, 0x07060504U, + 0x0b0a0908U, test_key) != test_vectors[12]) { + pr_info("self-test 3u32: FAIL\n"); + ret = -EINVAL; + } + if (siphash_4u32(0x03020100U, 0x07060504U, + 0x0b0a0908U, 0x0f0e0d0cU, test_key) != test_vectors[16]) { + pr_info("self-test 4u32: FAIL\n"); + ret = -EINVAL; + } + if (!ret) + pr_info("self-tests: pass\n"); + return ret; +} + +static void __exit siphash_test_exit(void) +{ +} + +module_init(siphash_test_init); +module_exit(siphash_test_exit); + +MODULE_AUTHOR("Jason A. 
Donenfeld <Jason@zx2c4.com>"); +MODULE_LICENSE("Dual BSD/GPL"); -- 2.11.0 ^ permalink raw reply related [flat|nested] 82+ messages in thread
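For anyone who wants to poke at the test vectors outside the kernel, here is a minimal user-space sketch of the buffer routine in lib/siphash.c above. It is not the patch's code: siphash24, load_le64 and siphash24_selftest are illustrative names, the tail bytes are gathered with a plain byte loop instead of the switch or load_unaligned_zeropad() paths, and all loads go through a byte-wise little-endian reader, so alignment does not matter here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rotate a 64-bit word left by s bits (0 < s < 64). */
static uint64_t rol64(uint64_t x, unsigned int s)
{
	return (x << s) | (x >> (64 - s));
}

/* Read 8 bytes as a little-endian u64, independent of host endianness. */
static uint64_t load_le64(const uint8_t *p)
{
	uint64_t v = 0;
	int i;

	for (i = 7; i >= 0; i--)
		v = (v << 8) | p[i];
	return v;
}

#define SIPROUND \
	do { \
		v0 += v1; v1 = rol64(v1, 13); v1 ^= v0; v0 = rol64(v0, 32); \
		v2 += v3; v3 = rol64(v3, 16); v3 ^= v2; \
		v0 += v3; v3 = rol64(v3, 21); v3 ^= v0; \
		v2 += v1; v1 = rol64(v1, 17); v1 ^= v2; v2 = rol64(v2, 32); \
	} while (0)

/* SipHash-2-4 of an arbitrary buffer. Hypothetical user-space helper,
 * not the kernel's siphash().
 */
uint64_t siphash24(const void *data, size_t len, const uint64_t key[2])
{
	const uint8_t *p = data;
	const size_t left = len % 8;
	uint64_t v0 = 0x736f6d6570736575ULL ^ key[0];
	uint64_t v1 = 0x646f72616e646f6dULL ^ key[1];
	uint64_t v2 = 0x6c7967656e657261ULL ^ key[0];
	uint64_t v3 = 0x7465646279746573ULL ^ key[1];
	uint64_t b = (uint64_t)len << 56;	/* length goes in the top byte */
	uint64_t m;
	size_t i;

	/* Compression: two rounds per full 8-byte word. */
	for (i = 0; i < len / 8; i++, p += 8) {
		m = load_le64(p);
		v3 ^= m;
		SIPROUND;
		SIPROUND;
		v0 ^= m;
	}

	/* The 0..7 leftover bytes fill the low bytes of the final word. */
	for (i = 0; i < left; i++)
		b |= (uint64_t)p[i] << (8 * i);

	v3 ^= b;
	SIPROUND;
	SIPROUND;
	v0 ^= b;

	/* Finalization: four rounds. */
	v2 ^= 0xff;
	SIPROUND;
	SIPROUND;
	SIPROUND;
	SIPROUND;
	return v0 ^ v1 ^ v2 ^ v3;
}

/* Checks three entries of test_vectors[] from test_siphash.c above:
 * input lengths 0, 8 and 15, with input byte i equal to i.
 */
void siphash24_selftest(void)
{
	static const uint64_t key[2] = {
		0x0706050403020100ULL, 0x0f0e0d0c0b0a0908ULL
	};
	uint8_t in[15];
	size_t i;

	for (i = 0; i < sizeof(in); i++)
		in[i] = (uint8_t)i;

	assert(siphash24(in, 0, key) == 0x726fdb47dd0e0e31ULL);
	assert(siphash24(in, 8, key) == 0x93f5f5799a932462ULL);
	assert(siphash24(in, 15, key) == 0xa129ca6149be45e5ULL);
}
```

Calling siphash24_selftest() from a small main() is enough to confirm the sketch agrees with the vectors. The 8-byte case is the same value the module's siphash_1u64 check compares against.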
* Re: [PATCH v7 1/6] siphash: add cryptographically secure PRF 2016-12-21 23:02 ` [PATCH v7 1/6] siphash: add cryptographically secure PRF Jason A. Donenfeld @ 2016-12-22 1:40 ` Stephen Hemminger 0 siblings, 0 replies; 82+ messages in thread From: Stephen Hemminger @ 2016-12-22 1:40 UTC (permalink / raw) To: Jason A. Donenfeld Cc: Netdev, kernel-hardening, LKML, linux-crypto, David Laight, Ted Tso, Hannes Frederic Sowa, edumazet, Linus Torvalds, Eric Biggers, Tom Herbert, ak, davem, luto, Jean-Philippe Aumasson, Eric Dumazet On Thu, 22 Dec 2016 00:02:11 +0100 "Jason A. Donenfeld" <Jason@zx2c4.com> wrote: > SipHash is a 64-bit keyed hash function that is actually a > cryptographically secure PRF, like HMAC. Except SipHash is super fast, > and is meant to be used as a hashtable keyed lookup function, or as a > general PRF for short input use cases, such as sequence numbers or RNG > chaining. > > For the first usage: > > There are a variety of attacks known as "hashtable poisoning" in which an > attacker forms some data such that the hash of that data will be the > same, and then preceeds to fill up all entries of a hashbucket. This is > a realistic and well-known denial-of-service vector. Currently > hashtables use jhash, which is fast but not secure, and some kind of > rotating key scheme (or none at all, which isn't good). SipHash is meant > as a replacement for jhash in these cases. > > There are a modicum of places in the kernel that are vulnerable to > hashtable poisoning attacks, either via userspace vectors or network > vectors, and there's not a reliable mechanism inside the kernel at the > moment to fix it. The first step toward fixing these issues is actually > getting a secure primitive into the kernel for developers to use. Then > we can, bit by bit, port things over to it as deemed appropriate. 
> > While SipHash is extremely fast for a cryptographically secure function, > it is likely a bit slower than the insecure jhash, and so replacements > will be evaluated on a case-by-case basis based on whether or not the > difference in speed is negligible and whether or not the current jhash usage > poses a real security risk. > > For the second usage: > > A few places in the kernel are using MD5 or SHA1 for creating secure > sequence numbers, syn cookies, port numbers, or fast random numbers. > SipHash is a faster and more fitting, and more secure replacement for MD5 > in those situations. Replacing MD5 and SHA1 with SipHash for these uses is > obvious and straight-forward, and so is submitted along with this patch > series. There shouldn't be much of a debate over its efficacy. > > Dozens of languages are already using this internally for their hash > tables and PRFs. Some of the BSDs already use this in their kernels. > SipHash is a widely known high-speed solution to a widely known set of > problems, and it's time we catch-up. > > Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> > Cc: Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com> > Cc: Linus Torvalds <torvalds@linux-foundation.org> > Cc: Eric Biggers <ebiggers3@gmail.com> > Cc: David Laight <David.Laight@aculab.com> > Cc: Eric Dumazet <eric.dumazet@gmail.com> The networking tree (net-next) which is where you are submitting to is technically closed right now. ^ permalink raw reply [flat|nested] 82+ messages in thread
* [PATCH v7 2/6] secure_seq: use SipHash in place of MD5 2016-12-21 23:02 ` [PATCH v7 0/6] The SipHash Patchset Jason A. Donenfeld 2016-12-21 23:02 ` [PATCH v7 1/6] siphash: add cryptographically secure PRF Jason A. Donenfeld @ 2016-12-21 23:02 ` Jason A. Donenfeld 2016-12-21 23:02 ` [PATCH v7 3/6] random: " Jason A. Donenfeld ` (3 subsequent siblings) 5 siblings, 0 replies; 82+ messages in thread From: Jason A. Donenfeld @ 2016-12-21 23:02 UTC (permalink / raw) To: Netdev, kernel-hardening, LKML, linux-crypto, David Laight, Ted Tso, Hannes Frederic Sowa, edumazet, Linus Torvalds, Eric Biggers, Tom Herbert, ak, davem, luto, Jean-Philippe Aumasson Cc: Jason A. Donenfeld, Eric Dumazet This gives a clear speed and security improvement. Siphash is both faster and is more solid crypto than the aging MD5. Rather than manually filling MD5 buffers, for IPv6, we simply create a layout by a simple anonymous struct, for which gcc generates rather efficient code. For IPv4, we pass the values directly to the short input convenience functions. 64-bit x86_64: [ 1.683628] secure_tcpv6_sequence_number_md5# cycles: 99563527 [ 1.717350] secure_tcp_sequence_number_md5# cycles: 92890502 [ 1.741968] secure_tcpv6_sequence_number_siphash# cycles: 67825362 [ 1.762048] secure_tcp_sequence_number_siphash# cycles: 67485526 32-bit x86: [ 1.600012] secure_tcpv6_sequence_number_md5# cycles: 103227892 [ 1.634219] secure_tcp_sequence_number_md5# cycles: 94732544 [ 1.669102] secure_tcpv6_sequence_number_siphash# cycles: 96299384 [ 1.700165] secure_tcp_sequence_number_siphash# cycles: 86015473 Signed-off-by: Jason A. 
Donenfeld <Jason@zx2c4.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Miller <davem@davemloft.net> Cc: David Laight <David.Laight@aculab.com> Cc: Tom Herbert <tom@herbertland.com> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Cc: Eric Dumazet <eric.dumazet@gmail.com> --- net/core/secure_seq.c | 135 ++++++++++++++++++++------------------------------ 1 file changed, 54 insertions(+), 81 deletions(-) diff --git a/net/core/secure_seq.c b/net/core/secure_seq.c index 88a8e429fc3e..3dc2689bcc64 100644 --- a/net/core/secure_seq.c +++ b/net/core/secure_seq.c @@ -1,3 +1,5 @@ +/* Copyright (C) 2016 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved. */ + #include <linux/kernel.h> #include <linux/init.h> #include <linux/cryptohash.h> @@ -8,14 +10,14 @@ #include <linux/ktime.h> #include <linux/string.h> #include <linux/net.h> - +#include <linux/siphash.h> #include <net/secure_seq.h> #if IS_ENABLED(CONFIG_IPV6) || IS_ENABLED(CONFIG_INET) +#include <linux/in6.h> #include <net/tcp.h> -#define NET_SECRET_SIZE (MD5_MESSAGE_BYTES / 4) -static u32 net_secret[NET_SECRET_SIZE] ____cacheline_aligned; +static siphash_key_t net_secret; static __always_inline void net_secret_init(void) { @@ -44,80 +46,65 @@ static u32 seq_scale(u32 seq) u32 secure_tcpv6_sequence_number(const __be32 *saddr, const __be32 *daddr, __be16 sport, __be16 dport, u32 *tsoff) { - u32 secret[MD5_MESSAGE_BYTES / 4]; - u32 hash[MD5_DIGEST_WORDS]; - u32 i; - + const struct { + struct in6_addr saddr; + struct in6_addr daddr; + __be16 sport; + __be16 dport; + } __aligned(SIPHASH_ALIGNMENT) combined = { + .saddr = *(struct in6_addr *)saddr, + .daddr = *(struct in6_addr *)daddr, + .sport = sport, + .dport = dport + }; + u64 hash; net_secret_init(); - memcpy(hash, saddr, 16); - for (i = 0; i < 4; i++) - secret[i] = net_secret[i] + (__force u32)daddr[i]; - secret[4] = net_secret[4] + - (((__force u16)sport << 16) + (__force u16)dport); - for (i = 5; i < MD5_MESSAGE_BYTES / 4; i++) - secret[i] = 
net_secret[i]; - - md5_transform(hash, secret); - - *tsoff = sysctl_tcp_timestamps == 1 ? hash[1] : 0; - return seq_scale(hash[0]); + hash = siphash(&combined, offsetofend(typeof(combined), dport), net_secret); + *tsoff = sysctl_tcp_timestamps == 1 ? (hash >> 32) : 0; + return seq_scale(hash); } EXPORT_SYMBOL(secure_tcpv6_sequence_number); u32 secure_ipv6_port_ephemeral(const __be32 *saddr, const __be32 *daddr, __be16 dport) { - u32 secret[MD5_MESSAGE_BYTES / 4]; - u32 hash[MD5_DIGEST_WORDS]; - u32 i; - + const struct { + struct in6_addr saddr; + struct in6_addr daddr; + __be16 dport; + } __aligned(SIPHASH_ALIGNMENT) combined = { + .saddr = *(struct in6_addr *)saddr, + .daddr = *(struct in6_addr *)daddr, + .dport = dport + }; net_secret_init(); - memcpy(hash, saddr, 16); - for (i = 0; i < 4; i++) - secret[i] = net_secret[i] + (__force u32) daddr[i]; - secret[4] = net_secret[4] + (__force u32)dport; - for (i = 5; i < MD5_MESSAGE_BYTES / 4; i++) - secret[i] = net_secret[i]; - - md5_transform(hash, secret); - - return hash[0]; + return siphash(&combined, offsetofend(typeof(combined), dport), net_secret); } EXPORT_SYMBOL(secure_ipv6_port_ephemeral); #endif #ifdef CONFIG_INET +/* secure_tcp_sequence_number(a, b, 0, d) == secure_ipv4_port_ephemeral(a, b, d), + * but fortunately, `sport' cannot be 0 in any circumstances. If this changes, + * it would be easy enough to have the former function use siphash_4u32, passing + * the arguments as separate u32. + */ + u32 secure_tcp_sequence_number(__be32 saddr, __be32 daddr, __be16 sport, __be16 dport, u32 *tsoff) { - u32 hash[MD5_DIGEST_WORDS]; - + u64 hash; net_secret_init(); - hash[0] = (__force u32)saddr; - hash[1] = (__force u32)daddr; - hash[2] = ((__force u16)sport << 16) + (__force u16)dport; - hash[3] = net_secret[15]; - - md5_transform(hash, net_secret); - - *tsoff = sysctl_tcp_timestamps == 1 ? 
hash[1] : 0; - return seq_scale(hash[0]); + hash = siphash_3u32(saddr, daddr, (u32)sport << 16 | dport, net_secret); + *tsoff = sysctl_tcp_timestamps == 1 ? (hash >> 32) : 0; + return seq_scale(hash); } u32 secure_ipv4_port_ephemeral(__be32 saddr, __be32 daddr, __be16 dport) { - u32 hash[MD5_DIGEST_WORDS]; - net_secret_init(); - hash[0] = (__force u32)saddr; - hash[1] = (__force u32)daddr; - hash[2] = (__force u32)dport ^ net_secret[14]; - hash[3] = net_secret[15]; - - md5_transform(hash, net_secret); - - return hash[0]; + return siphash_3u32(saddr, daddr, dport, net_secret); } EXPORT_SYMBOL_GPL(secure_ipv4_port_ephemeral); #endif @@ -126,21 +113,11 @@ EXPORT_SYMBOL_GPL(secure_ipv4_port_ephemeral); u64 secure_dccp_sequence_number(__be32 saddr, __be32 daddr, __be16 sport, __be16 dport) { - u32 hash[MD5_DIGEST_WORDS]; u64 seq; - net_secret_init(); - hash[0] = (__force u32)saddr; - hash[1] = (__force u32)daddr; - hash[2] = ((__force u16)sport << 16) + (__force u16)dport; - hash[3] = net_secret[15]; - - md5_transform(hash, net_secret); - - seq = hash[0] | (((u64)hash[1]) << 32); + seq = siphash_3u32(saddr, daddr, (u32)sport << 16 | dport, net_secret); seq += ktime_get_real_ns(); seq &= (1ull << 48) - 1; - return seq; } EXPORT_SYMBOL(secure_dccp_sequence_number); @@ -149,26 +126,22 @@ EXPORT_SYMBOL(secure_dccp_sequence_number); u64 secure_dccpv6_sequence_number(__be32 *saddr, __be32 *daddr, __be16 sport, __be16 dport) { - u32 secret[MD5_MESSAGE_BYTES / 4]; - u32 hash[MD5_DIGEST_WORDS]; + const struct { + struct in6_addr saddr; + struct in6_addr daddr; + __be16 sport; + __be16 dport; + } __aligned(SIPHASH_ALIGNMENT) combined = { + .saddr = *(struct in6_addr *)saddr, + .daddr = *(struct in6_addr *)daddr, + .sport = sport, + .dport = dport + }; u64 seq; - u32 i; - net_secret_init(); - memcpy(hash, saddr, 16); - for (i = 0; i < 4; i++) - secret[i] = net_secret[i] + (__force u32)daddr[i]; - secret[4] = net_secret[4] + - (((__force u16)sport << 16) + (__force u16)dport); - 
for (i = 5; i < MD5_MESSAGE_BYTES / 4; i++) - secret[i] = net_secret[i]; - - md5_transform(hash, secret); - - seq = hash[0] | (((u64)hash[1]) << 32); + seq = siphash(&combined, offsetofend(typeof(combined), dport), net_secret); seq += ktime_get_real_ns(); seq &= (1ull << 48) - 1; - return seq; } EXPORT_SYMBOL(secure_dccpv6_sequence_number); -- 2.11.0 ^ permalink raw reply related [flat|nested] 82+ messages in thread
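Three details of the patch above can be checked in isolation: hashing only up to offsetofend() of the last field so that trailing struct padding never reaches the hash, packing both ports into one u32, and splitting the 64-bit result into a sequence half and a timestamp-offset half. The following is a userspace sketch under stated assumptions: struct addr16 stands in for struct in6_addr, OFFSETOFEND stands in for the kernel's offsetofend(), and the alignment of 8 is an assumed value for SIPHASH_ALIGNMENT.

```c
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's offsetofend(): offset just past MEMBER. */
#define OFFSETOFEND(TYPE, MEMBER) \
	(offsetof(TYPE, MEMBER) + sizeof(((TYPE *)0)->MEMBER))

/* Stand-in for struct in6_addr: 16 raw bytes. */
struct addr16 {
	uint8_t bytes[16];
};

/*
 * Same field order as the anonymous struct in the patch.
 * The alignment value 8 is an assumption standing in for SIPHASH_ALIGNMENT.
 */
struct v6_tuple {
	struct addr16 saddr;
	struct addr16 daddr;
	uint16_t sport;
	uint16_t dport;
} __attribute__((aligned(8)));

/* Same expression as the patch: sport in the high 16 bits, dport in the low. */
static uint32_t pack_ports(uint16_t sport, uint16_t dport)
{
	return (uint32_t)sport << 16 | dport;
}

/* Low 32 bits: what an implicit u64 -> u32 conversion keeps. */
static uint32_t seq_half(uint64_t hash)
{
	return (uint32_t)hash;
}

/* High 32 bits: what the patch assigns to *tsoff. */
static uint32_t tsoff_half(uint64_t hash)
{
	return (uint32_t)(hash >> 32);
}
```

With these definitions the struct occupies 40 bytes but only the first 36 carry data, so passing sizeof() instead of offsetofend() would hash 4 bytes of padding. The packing also shows why the comment in the patch matters: with sport equal to 0 the packed word is just dport, which is the third input secure_ipv4_port_ephemeral() uses.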
* [PATCH v7 3/6] random: use SipHash in place of MD5 2016-12-21 23:02 ` [PATCH v7 0/6] The SipHash Patchset Jason A. Donenfeld 2016-12-21 23:02 ` [PATCH v7 1/6] siphash: add cryptographically secure PRF Jason A. Donenfeld 2016-12-21 23:02 ` [PATCH v7 2/6] secure_seq: use SipHash in place of MD5 Jason A. Donenfeld @ 2016-12-21 23:02 ` Jason A. Donenfeld 2016-12-21 23:13 ` Jason A. Donenfeld 2016-12-21 23:42 ` Andy Lutomirski 2016-12-21 23:02 ` [PATCH v7 4/6] md5: remove from lib and only live in crypto Jason A. Donenfeld ` (2 subsequent siblings) 5 siblings, 2 replies; 82+ messages in thread From: Jason A. Donenfeld @ 2016-12-21 23:02 UTC (permalink / raw) To: Netdev, kernel-hardening, LKML, linux-crypto, David Laight, Ted Tso, Hannes Frederic Sowa, edumazet, Linus Torvalds, Eric Biggers, Tom Herbert, ak, davem, luto, Jean-Philippe Aumasson Cc: Jason A. Donenfeld This duplicates the current algorithm for get_random_int/long, but uses siphash instead. This comes with several benefits. It's certainly faster and more cryptographically secure than MD5. This patch also separates hashed fields into three values instead of one, in order to increase diffusion. The previous MD5 algorithm used a per-cpu MD5 state, which caused successive calls to the function to chain upon each other. While it's not entirely clear that this kind of chaining is absolutely necessary when using a secure PRF like siphash, it can't hurt, and the timing of the call chain does add a degree of natural entropy. So, in keeping with this design, instead of the massive per-cpu 64-byte MD5 state, there is instead a per-cpu previously returned value for chaining. The speed benefits are substantial: | siphash | md5 | speedup | ------------------------------ get_random_long | 137130 | 415983 | 3.03x | get_random_int | 86384 | 343323 | 3.97x | Signed-off-by: Jason A. 
Donenfeld <Jason@zx2c4.com> Cc: Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com> Cc: Ted Tso <tytso@mit.edu> --- drivers/char/random.c | 84 +++++++++++++++++++++++++++++--------------------- include/linux/random.h | 1 - init/main.c | 1 - 3 files changed, 49 insertions(+), 37 deletions(-) diff --git a/drivers/char/random.c b/drivers/char/random.c index d6876d506220..ea9858d9d8b4 100644 --- a/drivers/char/random.c +++ b/drivers/char/random.c @@ -262,6 +262,7 @@ #include <linux/syscalls.h> #include <linux/completion.h> #include <linux/uuid.h> +#include <linux/siphash.h> #include <crypto/chacha20.h> #include <asm/processor.h> @@ -2042,17 +2043,31 @@ struct ctl_table random_table[] = { }; #endif /* CONFIG_SYSCTL */ -static u32 random_int_secret[MD5_MESSAGE_BYTES / 4] ____cacheline_aligned; -int random_int_secret_init(void) +struct random_int_secret { + siphash_key_t secret; + u64 chaining; + unsigned long birthdate; + bool initialized; +}; +static DEFINE_PER_CPU(struct random_int_secret, random_int_secret); + +enum { + SECRET_ROTATION_TIME = HZ * 60 * 5 +}; + +static struct random_int_secret *get_random_int_secret(void) { - get_random_bytes(random_int_secret, sizeof(random_int_secret)); - return 0; + struct random_int_secret *secret = &get_cpu_var(random_int_secret); + if (unlikely(!secret->initialized || + !time_is_after_jiffies(secret->birthdate + SECRET_ROTATION_TIME))) { + secret->initialized = true; + secret->birthdate = jiffies; + get_random_bytes(secret->secret, sizeof(secret->secret)); + } + return secret; } -static DEFINE_PER_CPU(__u32 [MD5_DIGEST_WORDS], get_random_int_hash) - __aligned(sizeof(unsigned long)); - /* * Get a random word for internal kernel use only. Similar to urandom but * with the goal of minimal entropy pool depletion. 
As a result, the random @@ -2061,20 +2076,20 @@ static DEFINE_PER_CPU(__u32 [MD5_DIGEST_WORDS], get_random_int_hash) */ unsigned int get_random_int(void) { - __u32 *hash; - unsigned int ret; - - if (arch_get_random_int(&ret)) - return ret; - - hash = get_cpu_var(get_random_int_hash); - - hash[0] += current->pid + jiffies + random_get_entropy(); - md5_transform(hash, random_int_secret); - ret = hash[0]; - put_cpu_var(get_random_int_hash); - - return ret; + unsigned int arch_result; + u64 result; + struct random_int_secret *secret; + + if (arch_get_random_int(&arch_result)) + return arch_result; + + secret = get_random_int_secret(); + result = siphash_3u64(secret->chaining, jiffies, + (u64)random_get_entropy() + current->pid, + secret->secret); + secret->chaining += result; + put_cpu_var(secret); + return result; } EXPORT_SYMBOL(get_random_int); @@ -2083,20 +2098,19 @@ EXPORT_SYMBOL(get_random_int); */ unsigned long get_random_long(void) { - __u32 *hash; - unsigned long ret; - - if (arch_get_random_long(&ret)) - return ret; - - hash = get_cpu_var(get_random_int_hash); - - hash[0] += current->pid + jiffies + random_get_entropy(); - md5_transform(hash, random_int_secret); - ret = *(unsigned long *)hash; - put_cpu_var(get_random_int_hash); - - return ret; + unsigned long arch_result; + u64 result; + struct random_int_secret *secret; + + if (arch_get_random_long(&arch_result)) + return arch_result; + + secret = get_random_int_secret(); + result = siphash_3u64(secret->chaining, jiffies, random_get_entropy() + + current->pid, secret->secret); + secret->chaining += result; + put_cpu_var(secret); + return result; } EXPORT_SYMBOL(get_random_long); diff --git a/include/linux/random.h b/include/linux/random.h index 7bd2403e4fef..16ab429735a7 100644 --- a/include/linux/random.h +++ b/include/linux/random.h @@ -37,7 +37,6 @@ extern void get_random_bytes(void *buf, int nbytes); extern int add_random_ready_callback(struct random_ready_callback *rdy); extern void 
del_random_ready_callback(struct random_ready_callback *rdy); extern void get_random_bytes_arch(void *buf, int nbytes); -extern int random_int_secret_init(void); #ifndef MODULE extern const struct file_operations random_fops, urandom_fops; diff --git a/init/main.c b/init/main.c index 23c275cca73a..a3af9296cafd 100644 --- a/init/main.c +++ b/init/main.c @@ -879,7 +879,6 @@ static void __init do_basic_setup(void) do_ctors(); usermodehelper_enable(); do_initcalls(); - random_int_secret_init(); } static void __init do_pre_smp_initcalls(void) -- 2.11.0 ^ permalink raw reply related [flat|nested] 82+ messages in thread
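The key rotation in get_random_int_secret() above hinges on a wrap-safe tick comparison: the key is regenerated on first use or once jiffies has reached birthdate + SECRET_ROTATION_TIME (five minutes in the patch). The check can be modelled in userspace as follows. This is a hypothetical model, not kernel code: `now` plays the role of jiffies, tick_after_eq mimics the kernel's time_after_eq(), and the `generation` counter stands in for the call to get_random_bytes().

```c
#include <stdbool.h>

/* Hypothetical userspace model of the per-cpu secret's bookkeeping. */
struct rotating_key {
	unsigned long birthdate;	/* tick at which the key was generated */
	bool initialized;
	unsigned long generation;	/* bumped on every (re)key */
};

/*
 * Wrap-safe "a is at or after b". The subtraction is done on unsigned
 * values and the result reinterpreted as signed, so the comparison keeps
 * working when the tick counter wraps (relies on the usual two's
 * complement conversion, as gcc and clang provide).
 */
static bool tick_after_eq(unsigned long a, unsigned long b)
{
	return (long)(a - b) >= 0;
}

/* Returns true if the key was (re)generated on this call. */
static bool maybe_rotate(struct rotating_key *k, unsigned long now,
			 unsigned long lifetime)
{
	if (!k->initialized || tick_after_eq(now, k->birthdate + lifetime)) {
		k->initialized = true;
		k->birthdate = now;
		k->generation++;
		return true;
	}
	return false;
}
```

A plain `now >= birthdate + lifetime` would misfire when the sum wraps past the top of the counter, which is why the kernel macros compare the signed difference instead.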
* Re: [PATCH v7 3/6] random: use SipHash in place of MD5
  2016-12-21 23:02 ` [PATCH v7 3/6] random: " Jason A. Donenfeld
@ 2016-12-21 23:13   ` Jason A. Donenfeld
  2016-12-21 23:42   ` Andy Lutomirski
  1 sibling, 0 replies; 82+ messages in thread
From: Jason A. Donenfeld @ 2016-12-21 23:13 UTC (permalink / raw)
  To: Netdev, kernel-hardening, LKML, Linux Crypto Mailing List,
      David Laight, Ted Tso, Hannes Frederic Sowa, Eric Dumazet,
      Linus Torvalds, Eric Biggers, Tom Herbert, Andi Kleen,
      David Miller, Andy Lutomirski, Jean-Philippe Aumasson
  Cc: Jason A. Donenfeld

Hi Ted,

On Thu, Dec 22, 2016 at 12:02 AM, Jason A. Donenfeld <Jason@zx2c4.com> wrote:
> This duplicates the current algorithm for get_random_int/long

I should have mentioned this directly in the commit message, which I
forgot to update: this v7 adds the time-based key rotation, which,
while not strictly necessary for ensuring the security of the RNG,
might help alleviate some concerns, as we talked about. Performance is
quite good on both 32-bit and 64-bit -- better than MD5 in both cases.

If you like this, terrific. If not, I'm happy to take this in whatever
direction you prefer, and implement whatever construction you think
best. There's been a lot of noise on this list about it; we can
continue to discuss more, or you can just tell me whatever you want to
do, and I'll implement it and that'll be the end of it. As you said,
we can always get something decent now and improve it later.

Alternatively, if you've decided in the end you prefer your batched
entropy approach using chacha, I'm happy to implement a polished
version of that here in this patch series (so that we can keep the `rm
lib/md5.c` commit.)

Just let me know how you'd like to proceed.

Thanks,
Jason

^ permalink raw reply	[flat|nested] 82+ messages in thread
* Re: [PATCH v7 3/6] random: use SipHash in place of MD5
  2016-12-21 23:02 ` [PATCH v7 3/6] random: " Jason A. Donenfeld
  2016-12-21 23:13   ` Jason A. Donenfeld
@ 2016-12-21 23:42   ` Andy Lutomirski
  2016-12-22  2:07     ` Hannes Frederic Sowa
  2016-12-22  2:31     ` Jason A. Donenfeld
  1 sibling, 2 replies; 82+ messages in thread
From: Andy Lutomirski @ 2016-12-21 23:42 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: Netdev, kernel-hardening, LKML, Linux Crypto Mailing List,
      David Laight, Ted Tso, Hannes Frederic Sowa, Eric Dumazet,
      Linus Torvalds, Eric Biggers, Tom Herbert, Andi Kleen,
      David S. Miller, Jean-Philippe Aumasson

On Wed, Dec 21, 2016 at 3:02 PM, Jason A. Donenfeld <Jason@zx2c4.com> wrote:
>  unsigned int get_random_int(void)
>  {
> -       __u32 *hash;
> -       unsigned int ret;
> -
> -       if (arch_get_random_int(&ret))
> -               return ret;
> -
> -       hash = get_cpu_var(get_random_int_hash);
> -
> -       hash[0] += current->pid + jiffies + random_get_entropy();
> -       md5_transform(hash, random_int_secret);
> -       ret = hash[0];
> -       put_cpu_var(get_random_int_hash);
> -
> -       return ret;
> +       unsigned int arch_result;
> +       u64 result;
> +       struct random_int_secret *secret;
> +
> +       if (arch_get_random_int(&arch_result))
> +               return arch_result;
> +
> +       secret = get_random_int_secret();
> +       result = siphash_3u64(secret->chaining, jiffies,
> +                             (u64)random_get_entropy() + current->pid,
> +                             secret->secret);
> +       secret->chaining += result;
> +       put_cpu_var(secret);
> +       return result;
>  }
>  EXPORT_SYMBOL(get_random_int);

Hmm. I haven't tried to prove anything for real. But here goes (in
the random oracle model):

Suppose I'm an attacker and I don't know the secret or the chaining
value. Then, regardless of what the entropy is, I can't predict the
numbers.

Now suppose I do know the secret and the chaining value due to some
leak. If I want to deduce prior outputs, I think I'm stuck: I'd need
to find a value "result" such that prev_chaining + result = chaining
and result = H(prev_chaining, ..., secret). I don't think this can
be done efficiently in the random oracle model regardless of what the
"..." is.

But, if I know the secret and chaining value, I can predict the next
output assuming I can guess the entropy. What's worse is that, even
if I can't guess the entropy, if I *observe* the next output then I
can calculate the next chaining value.

So this is probably good enough, and making it better is hard. Changing it to:

        u64 entropy = (u64)random_get_entropy() + current->pid;
        result = siphash(..., entropy, ...);
        secret->chaining += result + entropy;

would reduce this problem by forcing an attacker to brute-force the
entropy on each iteration, which is probably an improvement.

To fully fix it, something like "catastrophic reseeding" would be
needed, but that's hard to get right.

(An aside: on x86 at least, using two percpu variables is faster
because direct percpu access is essentially free, whereas getting the
address of a percpu variable is not free.)

^ permalink raw reply	[flat|nested] 82+ messages in thread
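Andy's observation about the chaining update can be made concrete with a toy model. In the sketch below, toy_prf is a hypothetical stand-in mixer: it is NOT SipHash and has no security value, it only gives the two update rules something deterministic to chain on. The model also folds jiffies and the other inputs into a single `entropy` argument, which is a simplification of the patch.

```c
#include <stdint.h>

/* Toy 64-bit mixer standing in for the keyed PRF. NOT SipHash, NOT secure. */
static uint64_t toy_prf(uint64_t chaining, uint64_t entropy, uint64_t secret)
{
	uint64_t x = chaining ^ (entropy * 0x9e3779b97f4a7c15ULL) ^ secret;

	x ^= x >> 33;
	x *= 0xff51afd7ed558ccdULL;
	x ^= x >> 33;
	return x;
}

struct rng_state {
	uint64_t secret;
	uint64_t chaining;
};

/* Update rule from the v7 patch: chaining += result. */
static uint64_t step_v7(struct rng_state *s, uint64_t entropy)
{
	uint64_t result = toy_prf(s->chaining, entropy, s->secret);

	s->chaining += result;
	return result;
}

/* Update rule suggested in the message: chaining += result + entropy. */
static uint64_t step_with_entropy(struct rng_state *s, uint64_t entropy)
{
	uint64_t result = toy_prf(s->chaining, entropy, s->secret);

	s->chaining += result + entropy;
	return result;
}
```

With the first rule, anyone who once learned the chaining value and then sees an output can keep tracking the state by adding the output, without ever knowing the entropy. With the second rule the observer's running sum is off by the entropy term, so the entropy has to be guessed on every step.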
* Re: [PATCH v7 3/6] random: use SipHash in place of MD5
  2016-12-21 23:42   ` Andy Lutomirski
@ 2016-12-22  2:07     ` Hannes Frederic Sowa
  2016-12-22  2:09       ` Andy Lutomirski
  2016-12-22  2:49       ` Jason A. Donenfeld
  2016-12-22  2:31     ` Jason A. Donenfeld
  1 sibling, 2 replies; 82+ messages in thread
From: Hannes Frederic Sowa @ 2016-12-22  2:07 UTC (permalink / raw)
  To: Andy Lutomirski, Jason A. Donenfeld
  Cc: Netdev, kernel-hardening, LKML, Linux Crypto Mailing List,
      David Laight, Ted Tso, Eric Dumazet, Linus Torvalds,
      Eric Biggers, Tom Herbert, Andi Kleen, David S. Miller,
      Jean-Philippe Aumasson

On 22.12.2016 00:42, Andy Lutomirski wrote:
> On Wed, Dec 21, 2016 at 3:02 PM, Jason A. Donenfeld <Jason@zx2c4.com> wrote:
>>  unsigned int get_random_int(void)
>>  {
>> -       __u32 *hash;
>> -       unsigned int ret;
>> -
>> -       if (arch_get_random_int(&ret))
>> -               return ret;
>> -
>> -       hash = get_cpu_var(get_random_int_hash);
>> -
>> -       hash[0] += current->pid + jiffies + random_get_entropy();
>> -       md5_transform(hash, random_int_secret);
>> -       ret = hash[0];
>> -       put_cpu_var(get_random_int_hash);
>> -
>> -       return ret;
>> +       unsigned int arch_result;
>> +       u64 result;
>> +       struct random_int_secret *secret;
>> +
>> +       if (arch_get_random_int(&arch_result))
>> +               return arch_result;
>> +
>> +       secret = get_random_int_secret();
>> +       result = siphash_3u64(secret->chaining, jiffies,
>> +                             (u64)random_get_entropy() + current->pid,
>> +                             secret->secret);
>> +       secret->chaining += result;
>> +       put_cpu_var(secret);
>> +       return result;
>>  }
>>  EXPORT_SYMBOL(get_random_int);
>
> Hmm. I haven't tried to prove anything for real. But here goes (in
> the random oracle model):
>
> Suppose I'm an attacker and I don't know the secret or the chaining
> value. Then, regardless of what the entropy is, I can't predict the
> numbers.
>
> Now suppose I do know the secret and the chaining value due to some
> leak. If I want to deduce prior outputs, I think I'm stuck: I'd need
> to find a value "result" such that prev_chaining + result = chaining
> and result = H(prev_chaining, ..., secret). I don't think this can
> be done efficiently in the random oracle model regardless of what the
> "..." is.
>
> But, if I know the secret and chaining value, I can predict the next
> output assuming I can guess the entropy. What's worse is that, even
> if I can't guess the entropy, if I *observe* the next output then I
> can calculate the next chaining value.
>
> So this is probably good enough, and making it better is hard. Changing it to:
>
>         u64 entropy = (u64)random_get_entropy() + current->pid;
>         result = siphash(..., entropy, ...);
>         secret->chaining += result + entropy;
>
> would reduce this problem by forcing an attacker to brute-force the
> entropy on each iteration, which is probably an improvement.
>
> To fully fix it, something like "catastrophic reseeding" would be
> needed, but that's hard to get right.

I wonder if Ted's proposal was analyzed further in terms of performance
if get_random_int should provide CPRNG-like properties?

For reference: https://lkml.org/lkml/2016/12/14/351

The proposal made sense to me and would completely solve the
above-mentioned problem at the cost of repeatedly reseeding from the crng.

Bye,
Hannes

^ permalink raw reply	[flat|nested] 82+ messages in thread
* Re: [PATCH v7 3/6] random: use SipHash in place of MD5
  2016-12-22  2:07     ` Hannes Frederic Sowa
@ 2016-12-22  2:09       ` Andy Lutomirski
  2016-12-22  2:49       ` Jason A. Donenfeld
  1 sibling, 0 replies; 82+ messages in thread
From: Andy Lutomirski @ 2016-12-22  2:09 UTC (permalink / raw)
  To: Hannes Frederic Sowa
  Cc: Jason A. Donenfeld, Netdev, kernel-hardening, LKML,
      Linux Crypto Mailing List, David Laight, Ted Tso, Eric Dumazet,
      Linus Torvalds, Eric Biggers, Tom Herbert, Andi Kleen,
      David S. Miller, Jean-Philippe Aumasson

On Wed, Dec 21, 2016 at 6:07 PM, Hannes Frederic Sowa
<hannes@stressinduktion.org> wrote:
> On 22.12.2016 00:42, Andy Lutomirski wrote:
>> On Wed, Dec 21, 2016 at 3:02 PM, Jason A. Donenfeld <Jason@zx2c4.com> wrote:
>>>  unsigned int get_random_int(void)
>>>  {
>>> -       __u32 *hash;
>>> -       unsigned int ret;
>>> -
>>> -       if (arch_get_random_int(&ret))
>>> -               return ret;
>>> -
>>> -       hash = get_cpu_var(get_random_int_hash);
>>> -
>>> -       hash[0] += current->pid + jiffies + random_get_entropy();
>>> -       md5_transform(hash, random_int_secret);
>>> -       ret = hash[0];
>>> -       put_cpu_var(get_random_int_hash);
>>> -
>>> -       return ret;
>>> +       unsigned int arch_result;
>>> +       u64 result;
>>> +       struct random_int_secret *secret;
>>> +
>>> +       if (arch_get_random_int(&arch_result))
>>> +               return arch_result;
>>> +
>>> +       secret = get_random_int_secret();
>>> +       result = siphash_3u64(secret->chaining, jiffies,
>>> +                             (u64)random_get_entropy() + current->pid,
>>> +                             secret->secret);
>>> +       secret->chaining += result;
>>> +       put_cpu_var(secret);
>>> +       return result;
>>>  }
>>>  EXPORT_SYMBOL(get_random_int);
>>
>> Hmm. I haven't tried to prove anything for real. But here goes (in
>> the random oracle model):
>>
>> Suppose I'm an attacker and I don't know the secret or the chaining
>> value. Then, regardless of what the entropy is, I can't predict the
>> numbers.
>>
>> Now suppose I do know the secret and the chaining value due to some
>> leak. If I want to deduce prior outputs, I think I'm stuck: I'd need
>> to find a value "result" such that prev_chaining + result = chaining
>> and result = H(prev_chaining, ..., secret). I don't think this can
>> be done efficiently in the random oracle model regardless of what the
>> "..." is.
>>
>> But, if I know the secret and chaining value, I can predict the next
>> output assuming I can guess the entropy. What's worse is that, even
>> if I can't guess the entropy, if I *observe* the next output then I
>> can calculate the next chaining value.
>>
>> So this is probably good enough, and making it better is hard. Changing it to:
>>
>>         u64 entropy = (u64)random_get_entropy() + current->pid;
>>         result = siphash(..., entropy, ...);
>>         secret->chaining += result + entropy;
>>
>> would reduce this problem by forcing an attacker to brute-force the
>> entropy on each iteration, which is probably an improvement.
>>
>> To fully fix it, something like "catastrophic reseeding" would be
>> needed, but that's hard to get right.
>
> I wonder if Ted's proposal was analyzed further in terms of performance
> if get_random_int should provide CPRNG-like properties?
>
> For reference: https://lkml.org/lkml/2016/12/14/351
>
> The proposal made sense to me and would completely solve the
> above-mentioned problem at the cost of repeatedly reseeding from the crng.
>

Unless I've misunderstood it, Ted's proposal causes get_random_int()
to return bytes straight from urandom (effectively), which should make
it very strong. And if urandom is competitively fast now, I don't see
the problem. ChaCha20 is designed for speed, after all.

^ permalink raw reply	[flat|nested] 82+ messages in thread
* Re: [PATCH v7 3/6] random: use SipHash in place of MD5
  2016-12-22  2:07     ` Hannes Frederic Sowa
  2016-12-22  2:09       ` Andy Lutomirski
@ 2016-12-22  2:49       ` Jason A. Donenfeld
  2016-12-22  3:12         ` Jason A. Donenfeld
  2016-12-22  5:41         ` [kernel-hardening] " Theodore Ts'o
  1 sibling, 2 replies; 82+ messages in thread
From: Jason A. Donenfeld @ 2016-12-22  2:49 UTC (permalink / raw)
  To: Hannes Frederic Sowa
  Cc: Andy Lutomirski, Netdev, kernel-hardening, LKML,
      Linux Crypto Mailing List, David Laight, Ted Tso, Eric Dumazet,
      Linus Torvalds, Eric Biggers, Tom Herbert, Andi Kleen,
      David S. Miller, Jean-Philippe Aumasson

Hi Andy & Hannes,

On Thu, Dec 22, 2016 at 3:07 AM, Hannes Frederic Sowa
<hannes@stressinduktion.org> wrote:
> I wonder if Ted's proposal was analyzed further in terms of performance
> if get_random_int should provide CPRNG-like properties?
>
> For reference: https://lkml.org/lkml/2016/12/14/351
>
> The proposal made sense to me and would completely solve the
> above-mentioned problem at the cost of repeatedly reseeding from the crng.

On Thu, Dec 22, 2016 at 3:09 AM, Andy Lutomirski <luto@amacapital.net> wrote:
> Unless I've misunderstood it, Ted's proposal causes get_random_int()
> to return bytes straight from urandom (effectively), which should make
> it very strong. And if urandom is competitively fast now, I don't see
> the problem. ChaCha20 is designed for speed, after all.

Funny -- while you guys were sending this back & forth, I was writing
my reply to Andy which essentially arrives at the same conclusion.
Given that we're all arriving at the same thing, and that Ted shot in
this direction long before we all did, I'm leaning toward abandoning
SipHash for the de-MD5-ification of get_random_int/long, and working
on polishing Ted's idea into something shiny for this patchset.

I did have two objections to this. The first was that my SipHash
construction is faster. But in any case, they're both faster than the
current MD5, so it's just extra rice. The second, and the more
important one, was that batching entropy up like this means that 32
calls will be really fast, and then the 33rd will be slow, since it
has to do a whole ChaCha round, because get_random_bytes must be
called to refill the batch. Since get_random_long is called for every
process startup, I didn't really like there being inconsistent
performance on process startup. And I'm pretty sure that one ChaCha
whole block is slower than computing MD5, even though it lasts 32
times as long, though I need to measure this. But maybe that's dumb in
the end? Are these concerns that should point us toward the
determinism (and speed) of SipHash? Are these concerns that don't
matter and so we should roll with the simplicity of reusing ChaCha?

Jason

^ permalink raw reply	[flat|nested] 82+ messages in thread
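The latency pattern Jason describes (a run of cheap calls followed by one expensive refill) follows directly from how a batched generator is structured. The sketch below models only that structure. It is not Ted's implementation: the refill here fills the buffer from a plain counter, which is a hypothetical stand-in for the expensive bulk generator and is not random at all, and BATCH_WORDS = 32 is taken from the "32 calls" figure in the message.

```c
#include <stddef.h>
#include <stdint.h>

#define BATCH_WORDS 32	/* batch size, from the "32 calls" figure above */

struct batch {
	uint64_t words[BATCH_WORDS];
	size_t pos;		/* next unused word; BATCH_WORDS means empty */
	unsigned int refills;	/* how many times the slow path ran */
	uint64_t counter;	/* state of the stand-in generator */
};

/* Hypothetical stand-in for the expensive bulk generator. NOT random. */
static void batch_refill(struct batch *b)
{
	for (size_t i = 0; i < BATCH_WORDS; i++)
		b->words[i] = b->counter++;
	b->pos = 0;
	b->refills++;
}

/* Start with an empty batch so the first call takes the slow path. */
static void batch_init(struct batch *b)
{
	b->pos = BATCH_WORDS;
	b->refills = 0;
	b->counter = 0;
}

/* Fast path: hand out the next buffered word. Slow path: refill first. */
static uint64_t batch_next(struct batch *b)
{
	if (b->pos == BATCH_WORDS)
		batch_refill(b);
	return b->words[b->pos++];
}
```

Every call other than the 1st, 33rd, 65th and so on only reads one word from the buffer, so the cost of the bulk generator is paid once per batch rather than once per call. In the kernel the buffer would additionally need per-cpu placement and protection against preemption, which this sketch leaves out.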
* Re: [PATCH v7 3/6] random: use SipHash in place of MD5
  2016-12-22  2:49       ` Jason A. Donenfeld
@ 2016-12-22  3:12         ` Jason A. Donenfeld
  2016-12-22  5:41         ` [kernel-hardening] " Theodore Ts'o
  1 sibling, 0 replies; 82+ messages in thread
From: Jason A. Donenfeld @ 2016-12-22  3:12 UTC (permalink / raw)
  To: Hannes Frederic Sowa
  Cc: Andy Lutomirski, Netdev, kernel-hardening, LKML,
      Linux Crypto Mailing List, David Laight, Ted Tso, Eric Dumazet,
      Linus Torvalds, Eric Biggers, Tom Herbert, Andi Kleen,
      David S. Miller, Jean-Philippe Aumasson

On Thu, Dec 22, 2016 at 3:49 AM, Jason A. Donenfeld <Jason@zx2c4.com> wrote:
> I did have two objections to this. The first was that my SipHash
> construction is faster. But in any case, they're both faster than the
> current MD5, so it's just extra rice. The second, and the more
> important one, was that batching entropy up like this means that 32
> calls will be really fast, and then the 33rd will be slow, since it
> has to do a whole ChaCha round, because get_random_bytes must be
> called to refill the batch. Since get_random_long is called for every
> process startup, I didn't really like there being inconsistent
> performance on process startup. And I'm pretty sure that one ChaCha
> whole block is slower than computing MD5, even though it lasts 32
> times as long, though I need to measure this. But maybe that's dumb in
> the end? Are these concerns that should point us toward the
> determinism (and speed) of SipHash? Are these concerns that don't
> matter and so we should roll with the simplicity of reusing ChaCha?

I ran some measurements in order to quantify what I'm talking about.
Repeatedly running md5_transform is about 2.3 times faster than
repeatedly running extract_crng. What does this mean?

One call to extract_crng gives us 32 times as many longs as one call
to md5_transform. This means that spread over 32 process creations,
chacha will be 13.9 times faster. However, every 32nd process will
take 2.3 times as long to generate its ASLR value as it would with the
old md5_transform code.

Personally, I don't think that 2.3 is a big deal. And I really like
how much this simplifies the analysis. But if it's a big deal to you,
then we can continue to discuss my SipHash construction, which gives
faster and more consistent performance, at the cost of a more
complicated and probably less impressive security analysis.

Jason

^ permalink raw reply	[flat|nested] 82+ messages in thread
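The 13.9x figure in the message above is the batch size divided by the per-call slowdown. A small helper makes the arithmetic explicit. The function name is hypothetical, and the inputs 2.3 and 32 are taken from the message as given rather than re-measured.

```c
/*
 * Amortized speedup of a batched generator over a per-call one.
 *
 * slowdown_per_call: how many times slower one bulk call is than one
 *                    per-call transform (2.3 in the message)
 * outputs_per_call:  how many outputs one bulk call yields relative to
 *                    one per-call transform (32 in the message)
 */
static double amortized_speedup(double slowdown_per_call,
				double outputs_per_call)
{
	return outputs_per_call / slowdown_per_call;
}
```

With the quoted inputs this is 32 / 2.3, roughly 13.9, while the worst single call remains 2.3 times slower than one md5_transform.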
* Re: [kernel-hardening] Re: [PATCH v7 3/6] random: use SipHash in place of MD5 2016-12-22 2:49 ` Jason A. Donenfeld 2016-12-22 3:12 ` Jason A. Donenfeld @ 2016-12-22 5:41 ` Theodore Ts'o 2016-12-22 6:03 ` Jason A. Donenfeld 2016-12-22 12:47 ` Hannes Frederic Sowa 1 sibling, 2 replies; 82+ messages in thread From: Theodore Ts'o @ 2016-12-22 5:41 UTC (permalink / raw) To: kernel-hardening Cc: Hannes Frederic Sowa, Andy Lutomirski, Netdev, LKML, Linux Crypto Mailing List, David Laight, Eric Dumazet, Linus Torvalds, Eric Biggers, Tom Herbert, Andi Kleen, David S. Miller, Jean-Philippe Aumasson On Thu, Dec 22, 2016 at 03:49:39AM +0100, Jason A. Donenfeld wrote: > > Funny -- while you guys were sending this back & forth, I was writing > my reply to Andy which essentially arrives at the same conclusion. > Given that we're all arriving to the same thing, and that Ted shot in > this direction long before we all did, I'm leaning toward abandoning > SipHash for the de-MD5-ification of get_random_int/long, and working > on polishing Ted's idea into something shiny for this patchset. here are my numbers comparing siphash (using the first three patches of the v7 siphash patches) with my batched chacha20 implementation. The results are taken by running get_random_* 10000 times, and then dividing the numbers by 10000 to get the average number of cycles for the call. I compiled 32-bit and 64-bit kernels, and ran the results using kvm: siphash batched chacha20 get_random_int get_random_long get_random_int get_random_long 32-bit 270 278 114 146 64-bit 75 75 106 186 > I did have two objections to this. The first was that my SipHash > construction is faster. Well, it's faster on everything except 32-bit x86. :-P > The second, and the more > important one, was that batching entropy up like this means that 32 > calls will be really fast, and then the 33rd will be slow, since it > has to do a whole ChaCha round, because get_random_bytes must be > called to refill the batch. ... 
and this will take 2121 cycles on 64-bit x86, and 2315 cycles on a 32-bit x86. Which, on a 2.3 GHz processor, is roughly a microsecond. As far as being inconsistent on process startup, I very much doubt a microsecond is really going to be visible to the user. :-) The bottom line is that I think we're really "pixel peeping" at this point --- which is what obsessed digital photographers will do when debating the quality of a Canon vs Nikon DSLR by blowing up a photo by a thousand times, and then trying to claim that this is visible to the human eye. Or people who obsess over the frequency response curves of TH-X00 headphones with Mahogany vs Purpleheart wood, when it's likely that in a blind head-to-head comparison, most people wouldn't be able to tell the difference.... I think the main argument for using the batched getrandom approach is that it is, I would argue, simpler than introducing siphash into the picture. On 64-bit platforms siphash is faster and more consistent, so it's basically that versus the complexity of having to add siphash to the things that people have to analyze when considering random number security on Linux. But it's a close call either way, I think. - Ted P.S. My benchmarking code.... 
diff --git a/drivers/char/random.c b/drivers/char/random.c
index a51f0ff43f00..41860864b775 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -1682,6 +1682,55 @@ static int rand_initialize(void)
 }
 early_initcall(rand_initialize);
 
+static unsigned int get_random_int_new(void);
+static unsigned long get_random_long_new(void);
+
+#define NUM_CYCLES 10000
+#define AVG(finish, start) ((unsigned int)(finish - start + NUM_CYCLES/2) / NUM_CYCLES)
+
+static int rand_benchmark(void)
+{
+	cycles_t start,finish;
+	int i, out;
+
+	pr_crit("random benchmark!!\n");
+	start = get_cycles();
+	for (i = 0; i < NUM_CYCLES; i++) {
+		get_random_int();}
+	finish = get_cycles();
+	pr_err("get_random_int # cycles: %u\n", AVG(finish, start));
+
+	start = get_cycles();
+	for (i = 0; i < NUM_CYCLES; i++) {
+		get_random_int_new();
+	}
+	finish = get_cycles();
+	pr_err("get_random_int_new (batched chacha20) # cycles: %u\n", AVG(finish, start));
+
+	start = get_cycles();
+	for (i = 0; i < NUM_CYCLES; i++) {
+		get_random_long();
+	}
+	finish = get_cycles();
+	pr_err("get_random_long # cycles: %u\n", AVG(finish, start));
+
+	start = get_cycles();
+	for (i = 0; i < NUM_CYCLES; i++) {
+		get_random_long_new();
+	}
+	finish = get_cycles();
+	pr_err("get_random_long_new (batched chacha20) # cycles: %u\n", AVG(finish, start));
+
+	start = get_cycles();
+	for (i = 0; i < NUM_CYCLES; i++) {
+		get_random_bytes(&out, sizeof(out));
+	}
+	finish = get_cycles();
+	pr_err("get_random_bytes # cycles: %u\n", AVG(finish, start));
+	return 0;
+}
+device_initcall(rand_benchmark);
+
 #ifdef CONFIG_BLOCK
 void rand_initialize_disk(struct gendisk *disk)
 {
@@ -2064,8 +2113,10 @@ unsigned int get_random_int(void)
 	unsigned int ret;
 	u64 *chaining;
 
+#if 0	// force slow path
 	if (arch_get_random_int(&ret))
 		return ret;
+#endif
 
 	chaining = &get_cpu_var(get_random_int_chaining);
 	ret = *chaining = siphash_3u64(*chaining, jiffies, random_get_entropy() +
@@ -2083,8 +2134,10 @@ unsigned long get_random_long(void)
 	unsigned long ret;
 	u64 *chaining;
 
+#if 0	// force slow path
 	if (arch_get_random_long(&ret))
 		return ret;
+#endif
 
 	chaining = &get_cpu_var(get_random_int_chaining);
 	ret = *chaining = siphash_3u64(*chaining, jiffies, random_get_entropy() +
@@ -2094,6 +2147,47 @@ unsigned long get_random_long(void)
 }
 EXPORT_SYMBOL(get_random_long);
 
+struct random_buf {
+	__u8 buf[CHACHA20_BLOCK_SIZE];
+	int ptr;
+};
+
+static DEFINE_PER_CPU(struct random_buf, batched_entropy);
+
+static void get_batched_entropy(void *buf, int n)
+{
+	struct random_buf *p;
+
+	p = &get_cpu_var(batched_entropy);
+
+	if ((p->ptr == 0) ||
+	    (p->ptr + n >= CHACHA20_BLOCK_SIZE)) {
+		extract_crng(p->buf);
+		p->ptr = 0;
+	}
+	BUG_ON(n > CHACHA20_BLOCK_SIZE);
+	memcpy(buf, p->buf, n);
+	p->ptr += n;
+	put_cpu_var(batched_entropy);
+}
+
+static unsigned int get_random_int_new(void)
+{
+	unsigned int ret;
+
+	get_batched_entropy(&ret, sizeof(ret));
+	return ret;
+}
+
+static unsigned long get_random_long_new(void)
+{
+	unsigned long ret;
+
+	get_batched_entropy(&ret, sizeof(ret));
+	return ret;
+}
+
+
 /**
  * randomize_page - Generate a random, page aligned address
  * @start: The smallest acceptable address the caller will take.
^ permalink raw reply related [flat|nested] 82+ messages in thread
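A note on the batching logic in the benchmark patch above: as posted, get_batched_entropy() copies from p->buf rather than from p->buf + p->ptr, so calls between refills would hand back the same leading bytes. That should not change the timing much, but it matters for any real use. What follows is a minimal userspace sketch of the batching idea, not the kernel code. refill_block() is a hypothetical stand-in for extract_crng() and fills the buffer with a counter pattern, NOT random data; the "primed" field, refill_count, and get_batched() are likewise names made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 64	/* assumed: the size of one ChaCha20 block */

struct random_buf {
	uint8_t buf[BLOCK_SIZE];
	size_t ptr;	/* offset of the next unused byte in buf */
	int primed;	/* 0 until the first refill has happened */
};

static unsigned long refill_count;	/* how often the slow path ran */

/*
 * Hypothetical stand-in for extract_crng(). It writes a running
 * counter so that the example is deterministic; it is NOT a CSPRNG.
 */
static void refill_block(uint8_t out[BLOCK_SIZE])
{
	static uint8_t counter;
	size_t i;

	for (i = 0; i < BLOCK_SIZE; i++)
		out[i] = counter++;
	refill_count++;
}

/* Hand out n bytes from the batch, refilling only when it runs dry. */
static void get_batched(struct random_buf *p, void *out, size_t n)
{
	assert(n <= BLOCK_SIZE);

	if (!p->primed || p->ptr + n > BLOCK_SIZE) {
		refill_block(p->buf);	/* the occasional slow call */
		p->ptr = 0;
		p->primed = 1;
	}
	/* Copy from the current offset so bytes are never handed out twice. */
	memcpy(out, p->buf + p->ptr, n);
	p->ptr += n;
}
```

With a 64-byte block, sixteen 4-byte requests are served from one refill and the seventeenth triggers the next one, which is the "mostly fast, occasionally slow" pattern being discussed in this subthread.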
* Re: [kernel-hardening] Re: [PATCH v7 3/6] random: use SipHash in place of MD5 2016-12-22 5:41 ` [kernel-hardening] " Theodore Ts'o @ 2016-12-22 6:03 ` Jason A. Donenfeld 2016-12-22 15:58 ` Theodore Ts'o 2016-12-22 12:47 ` Hannes Frederic Sowa 1 sibling, 1 reply; 82+ messages in thread From: Jason A. Donenfeld @ 2016-12-22 6:03 UTC (permalink / raw) To: kernel-hardening, Theodore Ts'o, Hannes Frederic Sowa, Andy Lutomirski, Netdev, LKML, Linux Crypto Mailing List, David Laight, Eric Dumazet, Linus Torvalds, Eric Biggers, Tom Herbert, Andi Kleen, David S. Miller, Jean-Philippe Aumasson Hi Ted, On Thu, Dec 22, 2016 at 6:41 AM, Theodore Ts'o <tytso@mit.edu> wrote: > The bottom line is that I think we're really "pixel peeping" at this > point --- which is what obsessed digital photographers will do when > debating the quality of a Canon vs Nikon DSLR by blowing up a photo by > a thousand times, and then trying to claim that this is visible to the > human eye. Or people who obsess over the frequency response curves > of TH-X00 headphones with Mahogany vs Purpleheart wood, when it's > likely that in a blind head-to-head comparison, most people wouldn't > be able to tell the difference.... This is hilarious, thanks for the laugh. I believe you're right about this... > > I think the main argument for using the batched getrandom approach is > that it is, I would argue, simpler than introducing siphash into the > picture. On 64-bit platforms siphash is faster and more consistent, so > it's basically that versus the complexity of having to add siphash to > the things that people have to analyze when considering random number > security on Linux. But it's a close call either way, I think. I find this compelling. 
We'll have one csprng for both get_random_int/long and for /dev/urandom, and we'll be able to update that silly warning on the comment of get_random_int/long to read "gives output of either rdrand quality or of /dev/urandom quality", which makes it more useful for other things. It introduces less error-prone code, and it lets the RNG analysis be spent on just one RNG, not two. So, with your blessing, I'm going to move ahead with implementing a pretty version of this for v8. Regards, Jason ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [kernel-hardening] Re: [PATCH v7 3/6] random: use SipHash in place of MD5 2016-12-22 6:03 ` Jason A. Donenfeld @ 2016-12-22 15:58 ` Theodore Ts'o 2016-12-22 16:16 ` Jason A. Donenfeld 0 siblings, 1 reply; 82+ messages in thread From: Theodore Ts'o @ 2016-12-22 15:58 UTC (permalink / raw) To: Jason A. Donenfeld Cc: kernel-hardening, Hannes Frederic Sowa, Andy Lutomirski, Netdev, LKML, Linux Crypto Mailing List, David Laight, Eric Dumazet, Linus Torvalds, Eric Biggers, Tom Herbert, Andi Kleen, David S. Miller, Jean-Philippe Aumasson On Thu, Dec 22, 2016 at 07:03:29AM +0100, Jason A. Donenfeld wrote: > I find this compelling. We'll have one csprng for both > get_random_int/long and for /dev/urandom, and we'll be able to update > that silly warning on the comment of get_random_int/long to read > "gives output of either rdrand quality or of /dev/urandom quality", > which makes it more useful for other things. It introduces less > error-prone code, and it lets the RNG analysis be spent on just one RNG, not > two. > > So, with your blessing, I'm going to move ahead with implementing a > pretty version of this for v8. Can we do this as a separate series, please? At this point, it's a completely separate change from a logical perspective, and we can take in the change through the random.git tree. Changes that touch files that are normally changed in several different git trees lead to the potential for merge conflicts during the linux-next integration and merge window processes. Which is why it's generally best to try to isolate changes as much as possible. Cheers, - Ted ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [kernel-hardening] Re: [PATCH v7 3/6] random: use SipHash in place of MD5 2016-12-22 15:58 ` Theodore Ts'o @ 2016-12-22 16:16 ` Jason A. Donenfeld 2016-12-22 16:30 ` Theodore Ts'o 0 siblings, 1 reply; 82+ messages in thread From: Jason A. Donenfeld @ 2016-12-22 16:16 UTC (permalink / raw) To: Theodore Ts'o, Jason A. Donenfeld, kernel-hardening, Hannes Frederic Sowa, Andy Lutomirski, Netdev, LKML, Linux Crypto Mailing List, David Laight, Eric Dumazet, Linus Torvalds, Eric Biggers, Tom Herbert, Andi Kleen, David S. Miller, Jean-Philippe Aumasson Hi Ted, On Thu, Dec 22, 2016 at 4:58 PM, Theodore Ts'o <tytso@mit.edu> wrote: > Can we do this as a separate series, please? At this point, it's a > completely separate change from a logical perspective, and we can take > in the change through the random.git tree. > > Changes that touch files that are normally changed in several > different git trees leads to the potential for merge conflicts during > the linux-next integration and merge window processes. Which is why > it's generally best to try to isolate changes as much as possible. Sure, I can separate things out. Could you offer a bit of advice on how to manage dependencies between patchsets during merge windows? I'm a bit new to the process.

Specifically, we now have 4 parts:
1. add siphash, and use it for some networking code. to: david miller's net-next
2. convert char/random to use siphash. to: ted ts'o's random-next
3. move lib/md5.c to static function in crypto/md5.c, remove entry inside of linux/cryptohash.h. to: ??'s ??-next
4. move lib/halfmd4.c to static function in fs/ext4/hash.c, remove entry inside of linux/cryptohash.h. to: ted ts'o's ext-next

Problem: 2 depends on 1, 3 depends on 1 & 2.

But this can be simplified into 3 parts:
1. add siphash, and use it for some networking code. to: david miller's net-next
2. convert char/random to use siphash, move lib/md5.c to static function in crypto/md5.c, remove entry inside of linux/cryptohash.h. to: ted ts'o's random-next
3. move lib/halfmd4.c to static function in fs/ext4/hash.c, remove entry inside of linux/cryptohash.h. to: ted ts'o's ext-next

Problem: 2 depends on 1.

Is that okay with you? Also, would you like me to merge (3) and (2) of the second list into one series for you? Jason ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [kernel-hardening] Re: [PATCH v7 3/6] random: use SipHash in place of MD5 2016-12-22 16:16 ` Jason A. Donenfeld @ 2016-12-22 16:30 ` Theodore Ts'o 2016-12-22 16:36 ` Jason A. Donenfeld 0 siblings, 1 reply; 82+ messages in thread From: Theodore Ts'o @ 2016-12-22 16:30 UTC (permalink / raw) To: kernel-hardening Cc: Jason A. Donenfeld, Hannes Frederic Sowa, Andy Lutomirski, Netdev, LKML, Linux Crypto Mailing List, David Laight, Eric Dumazet, Linus Torvalds, Eric Biggers, Tom Herbert, Andi Kleen, David S. Miller, Jean-Philippe Aumasson On Thu, Dec 22, 2016 at 05:16:47PM +0100, Jason A. Donenfeld wrote: > Could you offer a bit of advice on how to manage dependencies between > patchsets during merge windows? I'm a bit new to the process. > > Specifically, we now have 4 parts: > 1. add siphash, and use it for some networking code. to: david miller's net-next I'd do this first, as one set. Adding a new file to crypto is unlikely to cause merge conflicts. > 2. convert char/random to use siphash. to: ted ts'o's random-next I'm confused, I thought you had agreed to the batched chacha20 approach? > 3. move lib/md5.c to static function in crypto/md5.c, remove entry > inside of linux/cryptohash.h. to: ??'s ??-next This is cleanup, so it doesn't matter that much when it happens. md5 changes to crypto are also unlikely to cause conflicts, so we could do this at the same time as (2), if Herbert (the crypto maintainer) agrees. > 4. move lib/halfmd4.c to static function in fs/ext4/hash.c, remove > entry inside of linux/cryptohash.h. to: ted ts'o's ext-next This is definitely separate. One more thing. Can you add some test cases to lib/siphash.h? Triggered off of a CONFIG_SIPHASH_REGRESSION_TEST config flag, with some test inputs and known outputs? I'm going to need to add a version of siphash to e2fsprogs, and I want to make sure the userspace version is implementing the same algorithm as the kernel siphash. - Ted ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [kernel-hardening] Re: [PATCH v7 3/6] random: use SipHash in place of MD5 2016-12-22 16:30 ` Theodore Ts'o @ 2016-12-22 16:36 ` Jason A. Donenfeld 0 siblings, 0 replies; 82+ messages in thread From: Jason A. Donenfeld @ 2016-12-22 16:36 UTC (permalink / raw) To: kernel-hardening, Theodore Ts'o, Jason A. Donenfeld, Hannes Frederic Sowa, Andy Lutomirski, Netdev, LKML, Linux Crypto Mailing List, David Laight, Eric Dumazet, Linus Torvalds, Eric Biggers, Tom Herbert, Andi Kleen, David S. Miller, Jean-Philippe Aumasson On Thu, Dec 22, 2016 at 5:30 PM, Theodore Ts'o <tytso@mit.edu> wrote: > I'd do this first, as one set. Adding a new file to crypto is > unlikely to cause merge conflicts. Ack. > >> 2. convert char/random to use siphash. to: ted ts'o's random-next > > I'm confused, I thought you had agreed to the batched chacha20 > approach? Sorry, I meant to write this. Long day, little sleep. Yes, of course. Batched entropy. >> 3. move lib/md5.c to static function in crypto/md5.c, remove entry >> inside of linux/cryptohash.h. to: ??'s ??-next > > This is cleanup, so it doesn't matter that much when it happens. md5 > changes to crypto are also unlikely to cause conflicts, so we could do > this at the same time as (2), if Herbert (the crypto maintainer) agrees. Alright, sure. > >> 4. move lib/halfmd4.c to static function in fs/ext4/hash.c, remove >> entry inside of linux/cryptohash.h. to: ted ts'o's ext-next > > This is definitely separate. Okay, I'll submit it to you separately. > One more thing. Can you add some test cases to lib/siphash.h? > Triggered off of a CONFIG_SIPHASH_REGRESSION_TEST config flag, with > some test inputs and known outputs? I'm going to need to add a > version of siphash to e2fsprogs, and I want to make sure the userspace > version is implementing the same algorithm as the kernel siphash. I've already written these. They're behind TEST_HASH. They currently test every single line of code of all implementations of siphash. 
I spent a long time on these. The test vectors themselves were taken from the SipHash creators' reference publication. Check out lib/test_siphash.c in my tree. Jason ^ permalink raw reply [flat|nested] 82+ messages in thread
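For anyone wanting to cross-check a userspace port against known answers, as Ted asks above, here is a minimal sketch of SipHash-2-4 in portable C. It is written from the published algorithm description and is not the code in lib/siphash.c or lib/test_siphash.c; the names siphash24() and load_le64() are made up for this illustration, and the byte-array key is an assumption of the sketch (the patchset itself takes the key as u64[2]).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ROTL64(x, b) (((x) << (b)) | ((x) >> (64 - (b))))

/* One SipRound over the four 64-bit state words. */
#define SIPROUND do { \
	v0 += v1; v1 = ROTL64(v1, 13); v1 ^= v0; v0 = ROTL64(v0, 32); \
	v2 += v3; v3 = ROTL64(v3, 16); v3 ^= v2; \
	v0 += v3; v3 = ROTL64(v3, 21); v3 ^= v0; \
	v2 += v1; v1 = ROTL64(v1, 17); v1 ^= v2; v2 = ROTL64(v2, 32); \
} while (0)

/* Read 8 bytes as a little-endian 64-bit word, independent of host order. */
static uint64_t load_le64(const uint8_t *p)
{
	uint64_t v = 0;
	int i;

	for (i = 7; i >= 0; i--)
		v = (v << 8) | p[i];
	return v;
}

/* SipHash-2-4: 2 compression rounds per word, 4 finalization rounds. */
static uint64_t siphash24(const uint8_t *in, size_t len, const uint8_t key[16])
{
	uint64_t k0, k1, v0, v1, v2, v3, b;
	size_t i, full = len & ~(size_t)7;

	assert(key != NULL);
	k0 = load_le64(key);
	k1 = load_le64(key + 8);
	v0 = k0 ^ 0x736f6d6570736575ULL;
	v1 = k1 ^ 0x646f72616e646f6dULL;
	v2 = k0 ^ 0x6c7967656e657261ULL;
	v3 = k1 ^ 0x7465646279746573ULL;

	/* Absorb the full 8-byte words. */
	for (i = 0; i < full; i += 8) {
		uint64_t m = load_le64(in + i);

		v3 ^= m;
		SIPROUND;
		SIPROUND;
		v0 ^= m;
	}

	/* Last word: leftover bytes, with the length in the top byte. */
	b = (uint64_t)len << 56;
	for (i = full; i < len; i++)
		b |= (uint64_t)in[i] << (8 * (i - full));
	v3 ^= b;
	SIPROUND;
	SIPROUND;
	v0 ^= b;

	/* Finalization. */
	v2 ^= 0xff;
	SIPROUND;
	SIPROUND;
	SIPROUND;
	SIPROUND;
	return v0 ^ v1 ^ v2 ^ v3;
}
```

The worked example in the SipHash paper uses key bytes 00 through 0f and the 15-byte message 00 through 0e, with result a129ca6149be45e5; a port that disagrees on that input most likely has a byte-order or padding mistake.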
* Re: [kernel-hardening] Re: [PATCH v7 3/6] random: use SipHash in place of MD5 2016-12-22 5:41 ` [kernel-hardening] " Theodore Ts'o 2016-12-22 6:03 ` Jason A. Donenfeld @ 2016-12-22 12:47 ` Hannes Frederic Sowa 2016-12-22 13:10 ` Jason A. Donenfeld 1 sibling, 1 reply; 82+ messages in thread From: Hannes Frederic Sowa @ 2016-12-22 12:47 UTC (permalink / raw) To: Theodore Ts'o, kernel-hardening Cc: Andy Lutomirski, Netdev, LKML, Linux Crypto Mailing List, David Laight, Eric Dumazet, Linus Torvalds, Eric Biggers, Tom Herbert, Andi Kleen, David S. Miller, Jean-Philippe Aumasson Hi Ted, On Thu, 2016-12-22 at 00:41 -0500, Theodore Ts'o wrote: > On Thu, Dec 22, 2016 at 03:49:39AM +0100, Jason A. Donenfeld wrote: > > > > Funny -- while you guys were sending this back & forth, I was writing > > my reply to Andy which essentially arrives at the same conclusion. > > Given that we're all arriving to the same thing, and that Ted shot in > > this direction long before we all did, I'm leaning toward abandoning > > SipHash for the de-MD5-ification of get_random_int/long, and working > > on polishing Ted's idea into something shiny for this patchset. > > here are my numbers comparing siphash (using the first three patches > of the v7 siphash patches) with my batched chacha20 implementation. > The results are taken by running get_random_* 10000 times, and then > dividing the numbers by 10000 to get the average number of cycles for > the call. I compiled 32-bit and 64-bit kernels, and ran the results > using kvm:
>
>                      siphash                        batched chacha20
>           get_random_int  get_random_long   get_random_int  get_random_long
>
> 32-bit         270             278               114             146
> 64-bit          75              75               106             186
>
> > I did have two objections to this. The first was that my SipHash > > construction is faster. > > Well, it's faster on everything except 32-bit x86. 
:-P > > > The second, and the more > > important one, was that batching entropy up like this means that 32 > > calls will be really fast, and then the 33rd will be slow, since it > > has to do a whole ChaCha round, because get_random_bytes must be > > called to refill the batch. > > ... and this will take 2121 cycles on 64-bit x86, and 2315 cycles on a > 32-bit x86. Which, on a 2.3 GHz processor, is roughly a > microsecond. As far as being inconsistent on process startup, I very > much doubt a microsecond is really going to be visible to the user. :-) > > The bottom line is that I think we're really "pixel peeping" at this > point --- which is what obsessed digital photographers will do when > debating the quality of a Canon vs Nikon DSLR by blowing up a photo by > a thousand times, and then trying to claim that this is visible to the > human eye. Or people who obsess over the frequency response curves > of TH-X00 headphones with Mahogany vs Purpleheart wood, when it's > likely that in a blind head-to-head comparison, most people wouldn't > be able to tell the difference.... > > I think the main argument for using the batched getrandom approach is > that it is, I would argue, simpler than introducing siphash into the > picture. On 64-bit platforms siphash is faster and more consistent, so > it's basically that versus the complexity of having to add siphash to > the things that people have to analyze when considering random number > security on Linux. But it's a close call either way, I think. following up on what appears to be a random subject: ;) IIRC, ext4 code by default still uses half_md4 for hashing of filenames in the htree. siphash seems to fit this use case pretty good. xfs could also need an update, as they don't seed the directory hash tables at all (but not sure if they are vulnerable). I should improve [1] a bit. 
[1] http://oss.sgi.com/cgi-bin/gitweb.cgi?p=xfs/cmds/xfstests.git;a=blob;f=src/dirhash_collide.c;h=55cec872d5061ac2ca0f56d1f11e6bf349d5bb97;hb=HEAD Bye, Hannes ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [kernel-hardening] Re: [PATCH v7 3/6] random: use SipHash in place of MD5 2016-12-22 12:47 ` Hannes Frederic Sowa @ 2016-12-22 13:10 ` Jason A. Donenfeld 2016-12-22 15:05 ` Hannes Frederic Sowa 2016-12-22 15:54 ` Theodore Ts'o 0 siblings, 2 replies; 82+ messages in thread From: Jason A. Donenfeld @ 2016-12-22 13:10 UTC (permalink / raw) To: kernel-hardening Cc: Theodore Ts'o, Andy Lutomirski, Netdev, LKML, Linux Crypto Mailing List, David Laight, Eric Dumazet, Linus Torvalds, Eric Biggers, Tom Herbert, Andi Kleen, David S. Miller, Jean-Philippe Aumasson On Thu, Dec 22, 2016 at 1:47 PM, Hannes Frederic Sowa <hannes@stressinduktion.org> wrote: > following up on what appears to be a random subject: ;) > > IIRC, ext4 code by default still uses half_md4 for hashing of filenames > in the htree. siphash seems to fit this use case pretty good. I saw this too. I'll try to address it in v8 of this series. ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [kernel-hardening] Re: [PATCH v7 3/6] random: use SipHash in place of MD5 2016-12-22 13:10 ` Jason A. Donenfeld @ 2016-12-22 15:05 ` Hannes Frederic Sowa 2016-12-22 15:12 ` Jason A. Donenfeld 2016-12-22 15:54 ` Theodore Ts'o 1 sibling, 1 reply; 82+ messages in thread From: Hannes Frederic Sowa @ 2016-12-22 15:05 UTC (permalink / raw) To: Jason A. Donenfeld, kernel-hardening Cc: Theodore Ts'o, Andy Lutomirski, Netdev, LKML, Linux Crypto Mailing List, David Laight, Eric Dumazet, Linus Torvalds, Eric Biggers, Tom Herbert, Andi Kleen, David S. Miller, Jean-Philippe Aumasson On 22.12.2016 14:10, Jason A. Donenfeld wrote: > On Thu, Dec 22, 2016 at 1:47 PM, Hannes Frederic Sowa > <hannes@stressinduktion.org> wrote: >> following up on what appears to be a random subject: ;) >> >> IIRC, ext4 code by default still uses half_md4 for hashing of filenames >> in the htree. siphash seems to fit this use case pretty good. > > I saw this too. I'll try to address it in v8 of this series. This change would need a new version of the ext4 super block, because you should not change existing file systems. ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [kernel-hardening] Re: [PATCH v7 3/6] random: use SipHash in place of MD5 2016-12-22 15:05 ` Hannes Frederic Sowa @ 2016-12-22 15:12 ` Jason A. Donenfeld 2016-12-22 15:29 ` Jason A. Donenfeld 0 siblings, 1 reply; 82+ messages in thread From: Jason A. Donenfeld @ 2016-12-22 15:12 UTC (permalink / raw) To: Hannes Frederic Sowa Cc: kernel-hardening, Theodore Ts'o, Andy Lutomirski, Netdev, LKML, Linux Crypto Mailing List, David Laight, Eric Dumazet, Linus Torvalds, Eric Biggers, Tom Herbert, Andi Kleen, David S. Miller, Jean-Philippe Aumasson On Thu, Dec 22, 2016 at 4:05 PM, Hannes Frederic Sowa <hannes@stressinduktion.org> wrote: > This change would need a new version of the ext4 super block, because > you should not change existing file systems. Right. As a first step, I'm considering adding a patch to move halfmd4.c inside the ext4 domain, or at the very least, simply remove it from linux/cryptohash.h. That'll then leave the handful of bizarre sha1 usages to consider. ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [kernel-hardening] Re: [PATCH v7 3/6] random: use SipHash in place of MD5 2016-12-22 15:12 ` Jason A. Donenfeld @ 2016-12-22 15:29 ` Jason A. Donenfeld 2016-12-22 15:33 ` Hannes Frederic Sowa 0 siblings, 1 reply; 82+ messages in thread From: Jason A. Donenfeld @ 2016-12-22 15:29 UTC (permalink / raw) To: Hannes Frederic Sowa Cc: kernel-hardening, Theodore Ts'o, Andy Lutomirski, Netdev, LKML, Linux Crypto Mailing List, David Laight, Eric Dumazet, Linus Torvalds, Eric Biggers, Tom Herbert, Andi Kleen, David S. Miller, Jean-Philippe Aumasson On Thu, Dec 22, 2016 at 4:12 PM, Jason A. Donenfeld <Jason@zx2c4.com> wrote: > As a first step, I'm considering adding a patch to move halfmd4.c > inside the ext4 domain, or at the very least, simply remove it from > linux/cryptohash.h. That'll then leave the handful of bizarre sha1 > usages to consider. Specifically something like this: https://git.zx2c4.com/linux-dev/commit/?h=siphash&id=978213351f9633bd1e3d1fdc3f19d28e36eeac90 That only leaves two more uses of "cryptohash" to consider, but they require a bit of help. First, sha_transform in net/ipv6/addrconf.c. That might be a straight-forward conversion to SipHash, but perhaps not; I need to look closely and think about it. The next is sha_transform in kernel/bpf/core.c. I really have no idea what's going on with the eBPF stuff, so that will take a bit longer to study. Maybe sha1 is fine in the end there? I'm not sure yet. ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [kernel-hardening] Re: [PATCH v7 3/6] random: use SipHash in place of MD5 2016-12-22 15:29 ` Jason A. Donenfeld @ 2016-12-22 15:33 ` Hannes Frederic Sowa 2016-12-22 15:41 ` Jason A. Donenfeld 0 siblings, 1 reply; 82+ messages in thread From: Hannes Frederic Sowa @ 2016-12-22 15:33 UTC (permalink / raw) To: Jason A. Donenfeld Cc: kernel-hardening, Theodore Ts'o, Andy Lutomirski, Netdev, LKML, Linux Crypto Mailing List, David Laight, Eric Dumazet, Linus Torvalds, Eric Biggers, Tom Herbert, Andi Kleen, David S. Miller, Jean-Philippe Aumasson On Thu, 2016-12-22 at 16:29 +0100, Jason A. Donenfeld wrote: > On Thu, Dec 22, 2016 at 4:12 PM, Jason A. Donenfeld <Jason@zx2c4.com> wrote: > > As a first step, I'm considering adding a patch to move halfmd4.c > > inside the ext4 domain, or at the very least, simply remove it from > > linux/cryptohash.h. That'll then leave the handful of bizarre sha1 > > usages to consider. > > Specifically something like this: > > https://git.zx2c4.com/linux-dev/commit/?h=siphash&id=978213351f9633bd1e3d1fdc3f19d28e36eeac90 > > That only leaves two more uses of "cryptohash" to consider, but they > require a bit of help. First, sha_transform in net/ipv6/addrconf.c. > That might be a straight-forward conversion to SipHash, but perhaps > not; I need to look closely and think about it. The next is > sha_transform in kernel/bpf/core.c. I really have no idea what's going > on with the eBPF stuff, so that will take a bit longer to study. Maybe > sha1 is fine in the end there? I'm not sure yet. IPv6 you cannot touch anymore. The hashing algorithm is part of uAPI. You don't want to give people new IPv6 addresses with the same stable secret (across reboots) after a kernel upgrade. Maybe they lose connectivity then and it is extra work? The bpf hash stuff can be changed during this merge window, as it is not yet in a released kernel. 
Albeit I would probably have preferred something like sha256 here, which can be easily replicated by user space tools (minus the problem of patching out references to not hashable data, which must be zeroed). Bye, Hannes ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [kernel-hardening] Re: [PATCH v7 3/6] random: use SipHash in place of MD5 2016-12-22 15:33 ` Hannes Frederic Sowa @ 2016-12-22 15:41 ` Jason A. Donenfeld 2016-12-22 15:51 ` Hannes Frederic Sowa 0 siblings, 1 reply; 82+ messages in thread From: Jason A. Donenfeld @ 2016-12-22 15:41 UTC (permalink / raw) To: Hannes Frederic Sowa Cc: kernel-hardening, Theodore Ts'o, Andy Lutomirski, Netdev, LKML, Linux Crypto Mailing List, David Laight, Eric Dumazet, Linus Torvalds, Eric Biggers, Tom Herbert, Andi Kleen, David S. Miller, Jean-Philippe Aumasson Hi Hannes, On Thu, Dec 22, 2016 at 4:33 PM, Hannes Frederic Sowa <hannes@stressinduktion.org> wrote: > IPv6 you cannot touch anymore. The hashing algorithm is part of uAPI. > You don't want to give people new IPv6 addresses with the same stable > secret (across reboots) after a kernel upgrade. Maybe they lose > connectivity then and it is extra work? Ahh, too bad. So it goes. > The bpf hash stuff can be changed during this merge window, as it is > not yet in a released kernel. Albeit I would probably have preferred > something like sha256 here, which can be easily replicated by user > space tools (minus the problem of patching out references to not > hashable data, which must be zeroed). Oh, interesting, so time is of the essence then. Do you want to handle changing the new eBPF code to something not-SHA1 before it's too late, as part of a new patchset that can fast track itself to David? And then I can preserve my large series for the next merge window. Jason ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [kernel-hardening] Re: [PATCH v7 3/6] random: use SipHash in place of MD5 2016-12-22 15:41 ` Jason A. Donenfeld @ 2016-12-22 15:51 ` Hannes Frederic Sowa 2016-12-22 15:53 ` Jason A. Donenfeld 0 siblings, 1 reply; 82+ messages in thread From: Hannes Frederic Sowa @ 2016-12-22 15:51 UTC (permalink / raw) To: Jason A. Donenfeld Cc: kernel-hardening, Theodore Ts'o, Andy Lutomirski, Netdev, LKML, Linux Crypto Mailing List, David Laight, Eric Dumazet, Linus Torvalds, Eric Biggers, Tom Herbert, Andi Kleen, David S. Miller, Jean-Philippe Aumasson On Thu, 2016-12-22 at 16:41 +0100, Jason A. Donenfeld wrote: > Hi Hannes, > > On Thu, Dec 22, 2016 at 4:33 PM, Hannes Frederic Sowa > <hannes@stressinduktion.org> wrote: > > IPv6 you cannot touch anymore. The hashing algorithm is part of uAPI. > > You don't want to give people new IPv6 addresses with the same stable > > secret (across reboots) after a kernel upgrade. Maybe they lose > > connectivity then and it is extra work? > > Ahh, too bad. So it goes. If no other users survive we can put it into the ipv6 module. > > The bpf hash stuff can be changed during this merge window, as it is > > not yet in a released kernel. Albeit I would probably have preferred > > something like sha256 here, which can be easily replicated by user > > space tools (minus the problem of patching out references to not > > hashable data, which must be zeroed). > > Oh, interesting, so time is of the essence then. Do you want to handle > changing the new eBPF code to something not-SHA1 before it's too late, > as part of a new patchset that can fast track itself to David? And > then I can preserve my large series for the next merge window. This algorithm should be a non-seeded algorithm, because the hashes should be stable and verifiable by user space tooling. Thus this would need a hashing algorithm that is hardened against pre-image attacks/collision resistance, which siphash is not. I would prefer some higher order SHA algorithm for that actually. 
Bye, Hannes ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [kernel-hardening] Re: [PATCH v7 3/6] random: use SipHash in place of MD5 2016-12-22 15:51 ` Hannes Frederic Sowa @ 2016-12-22 15:53 ` Jason A. Donenfeld 0 siblings, 0 replies; 82+ messages in thread From: Jason A. Donenfeld @ 2016-12-22 15:53 UTC (permalink / raw) To: Hannes Frederic Sowa Cc: kernel-hardening, Theodore Ts'o, Andy Lutomirski, Netdev, LKML, Linux Crypto Mailing List, David Laight, Eric Dumazet, Linus Torvalds, Eric Biggers, Tom Herbert, Andi Kleen, David S. Miller, Jean-Philippe Aumasson On Thu, Dec 22, 2016 at 4:51 PM, Hannes Frederic Sowa <hannes@stressinduktion.org> wrote: > This algorithm should be a non-seeded algorithm, because the hashes > should be stable and verifiable by user space tooling. Thus this would > need a hashing algorithm that is hardened against pre-image > attacks/collision resistance, which siphash is not. I would prefer some > higher order SHA algorithm for that actually. Right. SHA-256, SHA-512/256, Blake2s, or Blake2b would probably be good candidates for this. ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [kernel-hardening] Re: [PATCH v7 3/6] random: use SipHash in place of MD5
  2016-12-22 13:10 ` Jason A. Donenfeld
  2016-12-22 15:05 ` Hannes Frederic Sowa
@ 2016-12-22 15:54 ` Theodore Ts'o
  2016-12-22 18:08 ` Hannes Frederic Sowa
  1 sibling, 1 reply; 82+ messages in thread
From: Theodore Ts'o @ 2016-12-22 15:54 UTC
To: Jason A. Donenfeld
Cc: kernel-hardening, Andy Lutomirski, Netdev, LKML,
  Linux Crypto Mailing List, David Laight, Eric Dumazet, Linus Torvalds,
  Eric Biggers, Tom Herbert, Andi Kleen, David S. Miller,
  Jean-Philippe Aumasson

On Thu, Dec 22, 2016 at 02:10:33PM +0100, Jason A. Donenfeld wrote:
> On Thu, Dec 22, 2016 at 1:47 PM, Hannes Frederic Sowa
> <hannes@stressinduktion.org> wrote:
> > following up on what appears to be a random subject: ;)
> >
> > IIRC, ext4 code by default still uses half_md4 for hashing of filenames
> > in the htree. siphash seems to fit this use case pretty good.
>
> I saw this too. I'll try to address it in v8 of this series.

This is a separate issue, and this series is getting a bit too
complex. So I'd suggest pushing this off to a separate change.

Changing the htree hash algorithm is an on-disk format change, and so
we couldn't roll it out until e2fsprogs gets updated and rolled out
pretty broadly. In fact George sent me patches to add siphash as a
hash algorithm for htree a while back (for both the kernel and
e2fsprogs), but I never got around to testing and applying them,
mainly because while it's technically faster, I had other higher
priority issues to work on --- and see previous comments regarding
pixel peeping. Improving the hash algorithm by tens or even hundreds
of nanoseconds isn't really going to matter since we only do a htree
lookup on a file creation or cold cache lookup, and the SSD or HDD I/O
times will dominate. And from the power perspective, saving
microwatts of CPU power isn't going to matter if you're going to be
spinning up the storage device....

- Ted
* Re: [kernel-hardening] Re: [PATCH v7 3/6] random: use SipHash in place of MD5
  2016-12-22 15:54 ` Theodore Ts'o
@ 2016-12-22 18:08 ` Hannes Frederic Sowa
  2016-12-22 18:13 ` Jason A. Donenfeld
  2016-12-22 19:50 ` Theodore Ts'o
  0 siblings, 2 replies; 82+ messages in thread
From: Hannes Frederic Sowa @ 2016-12-22 18:08 UTC
To: Theodore Ts'o, Jason A. Donenfeld, kernel-hardening, Andy Lutomirski,
  Netdev, LKML, Linux Crypto Mailing List, David Laight, Eric Dumazet,
  Linus Torvalds, Eric Biggers, Tom Herbert, Andi Kleen, David S. Miller,
  Jean-Philippe Aumasson

On 22.12.2016 16:54, Theodore Ts'o wrote:
> On Thu, Dec 22, 2016 at 02:10:33PM +0100, Jason A. Donenfeld wrote:
>> On Thu, Dec 22, 2016 at 1:47 PM, Hannes Frederic Sowa
>> <hannes@stressinduktion.org> wrote:
>>> following up on what appears to be a random subject: ;)
>>>
>>> IIRC, ext4 code by default still uses half_md4 for hashing of filenames
>>> in the htree. siphash seems to fit this use case pretty good.
>>
>> I saw this too. I'll try to address it in v8 of this series.
>
> This is a separate issue, and this series is getting a bit too
> complex. So I'd suggest pushing this off to a separate change.
>
> Changing the htree hash algorithm is an on-disk format change, and so
> we couldn't roll it out until e2fsprogs gets updated and rolled out
> pretty broadly. In fact George sent me patches to add siphash as a
> hash algorithm for htree a while back (for both the kernel and
> e2fsprogs), but I never got around to testing and applying them,
> mainly because while it's technically faster, I had other higher
> priority issues to work on --- and see previous comments regarding
> pixel peeping. Improving the hash algorithm by tens or even hundreds
> of nanoseconds isn't really going to matter since we only do a htree
> lookup on a file creation or cold cache lookup, and the SSD or HDD I/O
> times will dominate. And from the power perspective, saving
> microwatts of CPU power isn't going to matter if you're going to be
> spinning up the storage device....

I wasn't concerned about performance but more about DoS resilience. I
wonder how safe half md4 actually is in terms of allowing users to
generate long hash chains in the filesystem (in terms of length
extension attacks against half_md4).

In ext4, is it actually possible that a "disrupter" learns about the
hashing secret in the way how the inodes are returned during getdents?

Thanks,
Hannes
* Re: [kernel-hardening] Re: [PATCH v7 3/6] random: use SipHash in place of MD5
  2016-12-22 18:08 ` Hannes Frederic Sowa
@ 2016-12-22 18:13 ` Jason A. Donenfeld
  2016-12-22 19:50 ` Theodore Ts'o
  1 sibling, 0 replies; 82+ messages in thread
From: Jason A. Donenfeld @ 2016-12-22 18:13 UTC
To: Hannes Frederic Sowa
Cc: Theodore Ts'o, kernel-hardening, Andy Lutomirski, Netdev, LKML,
  Linux Crypto Mailing List, David Laight, Eric Dumazet, Linus Torvalds,
  Eric Biggers, Tom Herbert, Andi Kleen, David S. Miller,
  Jean-Philippe Aumasson

On Thu, Dec 22, 2016 at 7:08 PM, Hannes Frederic Sowa
<hannes@stressinduktion.org> wrote:
> I wasn't concerned about performance but more about DoS resilience. I
> wonder how safe half md4 actually is in terms of allowing users to
> generate long hash chains in the filesystem (in terms of length
> extension attacks against half_md4).

AFAIK, this is a real vulnerability that needs to be addressed.

Judging by Ted's inquiry about my siphash testing suite, I assume he's
probably tinkering around with it as we speak. :)

Meanwhile I've separated things into several trees:

1. chacha20 rng, already submitted:
   https://git.zx2c4.com/linux-dev/log/?h=random-next

2. md5 cleanup, not yet submitted:
   https://git.zx2c4.com/linux-dev/log/?h=md5-cleanup

3. md4 cleanup, already submitted:
   https://git.zx2c4.com/linux-dev/log/?h=ext4-next-md4-cleanup

4. siphash and networking, not yet submitted as a x/4 series:
   https://git.zx2c4.com/linux-dev/log/?h=net-next-siphash

I'll submit (4) in a couple of days, waiting for any comments on the
existing patch-set.

Jason
* Re: [kernel-hardening] Re: [PATCH v7 3/6] random: use SipHash in place of MD5
  2016-12-22 18:08 ` Hannes Frederic Sowa
  2016-12-22 18:13 ` Jason A. Donenfeld
@ 2016-12-22 19:50 ` Theodore Ts'o
  1 sibling, 0 replies; 82+ messages in thread
From: Theodore Ts'o @ 2016-12-22 19:50 UTC
To: Hannes Frederic Sowa
Cc: Jason A. Donenfeld, kernel-hardening, Andy Lutomirski, Netdev, LKML,
  Linux Crypto Mailing List, David Laight, Eric Dumazet, Linus Torvalds,
  Eric Biggers, Tom Herbert, Andi Kleen, David S. Miller,
  Jean-Philippe Aumasson

On Thu, Dec 22, 2016 at 07:08:37PM +0100, Hannes Frederic Sowa wrote:
> I wasn't concerned about performance but more about DoS resilience. I
> wonder how safe half md4 actually is in terms of allowing users to
> generate long hash chains in the filesystem (in terms of length
> extension attacks against half_md4).
>
> In ext4, is it actually possible that a "disrupter" learns about the
> hashing secret in the way how the inodes are returned during getdents?

They'd have to be a local user, who can execute telldir(3) --- in
which case there are plenty of other denial of service attacks one
could carry out that would be far more devastating.

It might also be an issue if the file system is exposed via NFS, but
again, there are so many other ways an attacker could DoS an NFS server
that I don't think of it as much of a concern.

Keep in mind that the worst someone can do is cause directory inserts to
fail with an ENOSPC, and there are plenty of other ways of doing that
--- such as consuming all of the blocks and inodes in the file system,
for example.

So it's a threat, but not a high priority one as far as I'm concerned.
And if this was a problem in actual practice, users could switch to
the TEA based hash, which should be far harder to attack, and is
available today.

- Ted
* Re: [PATCH v7 3/6] random: use SipHash in place of MD5
  2016-12-21 23:42 ` Andy Lutomirski
  2016-12-22 2:07 ` Hannes Frederic Sowa
@ 2016-12-22 2:31 ` Jason A. Donenfeld
  1 sibling, 0 replies; 82+ messages in thread
From: Jason A. Donenfeld @ 2016-12-22 2:31 UTC
To: Andy Lutomirski
Cc: Netdev, kernel-hardening, LKML, Linux Crypto Mailing List,
  David Laight, Ted Tso, Hannes Frederic Sowa, Eric Dumazet,
  Linus Torvalds, Eric Biggers, Tom Herbert, Andi Kleen, David S. Miller,
  Jean-Philippe Aumasson

Hi Andy,

On Thu, Dec 22, 2016 at 12:42 AM, Andy Lutomirski <luto@amacapital.net> wrote:
> So this is probably good enough, and making it better is hard. Changing it to:
>
>     u64 entropy = (u64)random_get_entropy() + current->pid;
>     result = siphash(..., entropy, ...);
>     secret->chaining += result + entropy;
>
> would reduce this problem by forcing an attacker to brute-force the
> entropy on each iteration, which is probably an improvement.

Ahh, so that's the reasoning behind a similar suggestion of yours in a
previous email. Makes sense to me. I'll include this in the next merge
if we don't come up with a different idea before then. Your reasoning
seems good for it.

Part of what makes this process a bit goofy is that it's not
altogether clear what the design goals are. Right now we're going for
"not worse than before", which we've nearly achieved. How good of an
RNG do we want? I'm willing to examine and analyze the security and
performance of all constructions we can come up with. One thing I
don't want to do, however, is start tweaking the primitives themselves
in ways not endorsed by the designers. So, I believe that precludes
things like carrying over SipHash's internal state (like what was done
with MD5), because there hasn't been a formal security analysis of
this like there has with other uses of SipHash. I also don't want to
change any internals of how SipHash actually works. I mention that
because of some of the suggestions on other threads, which make me
rather uneasy.

So with that said, while writing this reply to you, I was
simultaneously reading some other crypto code and was reminded that
there's a variant of SipHash which outputs an additional 64 bits; it's
part of the siphash reference code, which they call the "128-bit
mode". It has the benefit that we can return 64 bits to the caller and
save 64 bits for the chaining key. That way there's no correlation
between the returned secret and the chaining key, which I think would
completely alleviate all of your concerns, and simplify the analysis a
bit.

Here's what it looks like:
https://git.zx2c4.com/linux-dev/commit/?h=siphash&id=46fbe5b408e66b2d16b4447860f8083480e1c08d

The downside is that it takes 4 extra Sip rounds. This puts the
performance still better than MD5, though, and likely still better
than the other batched entropy solution. We could optimize this, I
suppose, by giving it only two parameters -- chaining,
jiffies+entropy+pid -- instead of the current three -- chaining,
jiffies, entropy+pid -- which would then shave off 2 Sip rounds. But I
liked the idea of having a bit more spread in the entropy input field.

Anyway, with this in mind, we now have three possibilities:

1. result = siphash(chaining, entropy, key); chaining += result + entropy
2. result = siphash_extra_output(chaining, entropy, key, &chaining);
3. Ted's batched entropy idea using chacha20

The more I think about this, the more I suspect that we should just
use chacha20. It will still be faster than MD5. I don't like the
non-determinism of it (some processes will start slower than others,
if the batched entropy has run out and ASLR demands more), but I guess
I can live with that. But, most importantly, it greatly simplifies
both the security analysis and what we can promise to callers about
the function. Right now in the comment documentation, we're coy with
callers about the security of the RNG. If we moved to a known
construction like chacha20/get_random_bytes_batched, then we could
just be straight up with a promise that the numbers it returns are
high quality.

Thoughts on 2 and 3, and on 1 vs 2 vs 3?

Jason
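
[Illustration, not part of the thread] Option 1 in the message above can be sketched in user space roughly as follows. This is not code from the series: siphash24() is a plain re-implementation of SipHash-2-4 from the public specification, and struct chain_state, chain_next() and the little-endian packing of the two input words are hypothetical names and choices made only for this sketch.

```c
#include <stddef.h>
#include <stdint.h>

#define ROTL64(x, b) (((x) << (b)) | ((x) >> (64 - (b))))

/* One SipRound over the local state words v0..v3. */
#define SIPROUND do { \
	v0 += v1; v1 = ROTL64(v1, 13); v1 ^= v0; v0 = ROTL64(v0, 32); \
	v2 += v3; v3 = ROTL64(v3, 16); v3 ^= v2; \
	v0 += v3; v3 = ROTL64(v3, 21); v3 ^= v0; \
	v2 += v1; v1 = ROTL64(v1, 17); v1 ^= v2; v2 = ROTL64(v2, 32); \
} while (0)

/* SipHash-2-4 over a byte buffer, with a 128-bit key given as two words. */
static uint64_t siphash24(const void *data, size_t len, const uint64_t key[2])
{
	const uint8_t *in = data;
	uint64_t v0 = 0x736f6d6570736575ULL ^ key[0];
	uint64_t v1 = 0x646f72616e646f6dULL ^ key[1];
	uint64_t v2 = 0x6c7967656e657261ULL ^ key[0];
	uint64_t v3 = 0x7465646279746573ULL ^ key[1];
	uint64_t b = (uint64_t)len << 56;
	uint64_t m;
	size_t i, j;

	/* Compression: two rounds per full little-endian 64-bit word. */
	for (i = 0; i + 8 <= len; i += 8) {
		m = 0;
		for (j = 0; j < 8; j++)
			m |= (uint64_t)in[i + j] << (8 * j);
		v3 ^= m;
		SIPROUND;
		SIPROUND;
		v0 ^= m;
	}
	/* The final word carries the leftover bytes plus the length byte. */
	for (j = 0; i + j < len; j++)
		b |= (uint64_t)in[i + j] << (8 * j);
	v3 ^= b;
	SIPROUND;
	SIPROUND;
	v0 ^= b;

	/* Finalization: four rounds. */
	v2 ^= 0xff;
	SIPROUND;
	SIPROUND;
	SIPROUND;
	SIPROUND;
	return v0 ^ v1 ^ v2 ^ v3;
}

/* Hypothetical per-caller state holding the chaining value. */
struct chain_state {
	uint64_t chaining;
};

/*
 * Option 1: result = siphash(chaining, entropy, key);
 *           chaining += result + entropy;
 */
static uint64_t chain_next(struct chain_state *s, uint64_t entropy,
			   const uint64_t key[2])
{
	uint8_t buf[16];
	uint64_t result;
	int i;

	/* Pack both words little-endian so the result is host-independent. */
	for (i = 0; i < 8; i++) {
		buf[i] = (uint8_t)(s->chaining >> (8 * i));
		buf[8 + i] = (uint8_t)(entropy >> (8 * i));
	}
	result = siphash24(buf, sizeof(buf), key);
	s->chaining += result + entropy;
	return result;
}
```

In this sketch the value handed back to the caller is also folded into the next chaining value, which is the correlation that option 2 (a second 64-bit output kept private as the new chaining value) is meant to remove.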
* [PATCH v7 4/6] md5: remove from lib and only live in crypto 2016-12-21 23:02 ` [PATCH v7 0/6] The SipHash Patchset Jason A. Donenfeld ` (2 preceding siblings ...) 2016-12-21 23:02 ` [PATCH v7 3/6] random: " Jason A. Donenfeld @ 2016-12-21 23:02 ` Jason A. Donenfeld 2016-12-21 23:02 ` [PATCH v7 5/6] syncookies: use SipHash in place of SHA1 Jason A. Donenfeld 2016-12-21 23:02 ` [PATCH v7 6/6] siphash: implement HalfSipHash1-3 for hash tables Jason A. Donenfeld 5 siblings, 0 replies; 82+ messages in thread From: Jason A. Donenfeld @ 2016-12-21 23:02 UTC (permalink / raw) To: Netdev, kernel-hardening, LKML, linux-crypto, David Laight, Ted Tso, Hannes Frederic Sowa, edumazet, Linus Torvalds, Eric Biggers, Tom Herbert, ak, davem, luto, Jean-Philippe Aumasson Cc: Jason A. Donenfeld The md5_transform function is no longer used any where in the tree, except for the crypto api's actual implementation of md5, so we can drop the function from lib and put it as a static function of the crypto file, where it belongs. There should be no new users of md5_transform, anyway, since there are more modern ways of doing what it once achieved. Signed-off-by: Jason A. 
Donenfeld <Jason@zx2c4.com> --- crypto/md5.c | 95 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++- lib/Makefile | 2 +- lib/md5.c | 95 ------------------------------------------------------------ 3 files changed, 95 insertions(+), 97 deletions(-) delete mode 100644 lib/md5.c diff --git a/crypto/md5.c b/crypto/md5.c index 2355a7c25c45..f7ae1a48225b 100644 --- a/crypto/md5.c +++ b/crypto/md5.c @@ -21,9 +21,11 @@ #include <linux/module.h> #include <linux/string.h> #include <linux/types.h> -#include <linux/cryptohash.h> #include <asm/byteorder.h> +#define MD5_DIGEST_WORDS 4 +#define MD5_MESSAGE_BYTES 64 + const u8 md5_zero_message_hash[MD5_DIGEST_SIZE] = { 0xd4, 0x1d, 0x8c, 0xd9, 0x8f, 0x00, 0xb2, 0x04, 0xe9, 0x80, 0x09, 0x98, 0xec, 0xf8, 0x42, 0x7e, @@ -47,6 +49,97 @@ static inline void cpu_to_le32_array(u32 *buf, unsigned int words) } } +#define F1(x, y, z) (z ^ (x & (y ^ z))) +#define F2(x, y, z) F1(z, x, y) +#define F3(x, y, z) (x ^ y ^ z) +#define F4(x, y, z) (y ^ (x | ~z)) + +#define MD5STEP(f, w, x, y, z, in, s) \ + (w += f(x, y, z) + in, w = (w<<s | w>>(32-s)) + x) + +static void md5_transform(__u32 *hash, __u32 const *in) +{ + u32 a, b, c, d; + + a = hash[0]; + b = hash[1]; + c = hash[2]; + d = hash[3]; + + MD5STEP(F1, a, b, c, d, in[0] + 0xd76aa478, 7); + MD5STEP(F1, d, a, b, c, in[1] + 0xe8c7b756, 12); + MD5STEP(F1, c, d, a, b, in[2] + 0x242070db, 17); + MD5STEP(F1, b, c, d, a, in[3] + 0xc1bdceee, 22); + MD5STEP(F1, a, b, c, d, in[4] + 0xf57c0faf, 7); + MD5STEP(F1, d, a, b, c, in[5] + 0x4787c62a, 12); + MD5STEP(F1, c, d, a, b, in[6] + 0xa8304613, 17); + MD5STEP(F1, b, c, d, a, in[7] + 0xfd469501, 22); + MD5STEP(F1, a, b, c, d, in[8] + 0x698098d8, 7); + MD5STEP(F1, d, a, b, c, in[9] + 0x8b44f7af, 12); + MD5STEP(F1, c, d, a, b, in[10] + 0xffff5bb1, 17); + MD5STEP(F1, b, c, d, a, in[11] + 0x895cd7be, 22); + MD5STEP(F1, a, b, c, d, in[12] + 0x6b901122, 7); + MD5STEP(F1, d, a, b, c, in[13] + 0xfd987193, 12); + MD5STEP(F1, c, d, a, b, in[14] + 
0xa679438e, 17); + MD5STEP(F1, b, c, d, a, in[15] + 0x49b40821, 22); + + MD5STEP(F2, a, b, c, d, in[1] + 0xf61e2562, 5); + MD5STEP(F2, d, a, b, c, in[6] + 0xc040b340, 9); + MD5STEP(F2, c, d, a, b, in[11] + 0x265e5a51, 14); + MD5STEP(F2, b, c, d, a, in[0] + 0xe9b6c7aa, 20); + MD5STEP(F2, a, b, c, d, in[5] + 0xd62f105d, 5); + MD5STEP(F2, d, a, b, c, in[10] + 0x02441453, 9); + MD5STEP(F2, c, d, a, b, in[15] + 0xd8a1e681, 14); + MD5STEP(F2, b, c, d, a, in[4] + 0xe7d3fbc8, 20); + MD5STEP(F2, a, b, c, d, in[9] + 0x21e1cde6, 5); + MD5STEP(F2, d, a, b, c, in[14] + 0xc33707d6, 9); + MD5STEP(F2, c, d, a, b, in[3] + 0xf4d50d87, 14); + MD5STEP(F2, b, c, d, a, in[8] + 0x455a14ed, 20); + MD5STEP(F2, a, b, c, d, in[13] + 0xa9e3e905, 5); + MD5STEP(F2, d, a, b, c, in[2] + 0xfcefa3f8, 9); + MD5STEP(F2, c, d, a, b, in[7] + 0x676f02d9, 14); + MD5STEP(F2, b, c, d, a, in[12] + 0x8d2a4c8a, 20); + + MD5STEP(F3, a, b, c, d, in[5] + 0xfffa3942, 4); + MD5STEP(F3, d, a, b, c, in[8] + 0x8771f681, 11); + MD5STEP(F3, c, d, a, b, in[11] + 0x6d9d6122, 16); + MD5STEP(F3, b, c, d, a, in[14] + 0xfde5380c, 23); + MD5STEP(F3, a, b, c, d, in[1] + 0xa4beea44, 4); + MD5STEP(F3, d, a, b, c, in[4] + 0x4bdecfa9, 11); + MD5STEP(F3, c, d, a, b, in[7] + 0xf6bb4b60, 16); + MD5STEP(F3, b, c, d, a, in[10] + 0xbebfbc70, 23); + MD5STEP(F3, a, b, c, d, in[13] + 0x289b7ec6, 4); + MD5STEP(F3, d, a, b, c, in[0] + 0xeaa127fa, 11); + MD5STEP(F3, c, d, a, b, in[3] + 0xd4ef3085, 16); + MD5STEP(F3, b, c, d, a, in[6] + 0x04881d05, 23); + MD5STEP(F3, a, b, c, d, in[9] + 0xd9d4d039, 4); + MD5STEP(F3, d, a, b, c, in[12] + 0xe6db99e5, 11); + MD5STEP(F3, c, d, a, b, in[15] + 0x1fa27cf8, 16); + MD5STEP(F3, b, c, d, a, in[2] + 0xc4ac5665, 23); + + MD5STEP(F4, a, b, c, d, in[0] + 0xf4292244, 6); + MD5STEP(F4, d, a, b, c, in[7] + 0x432aff97, 10); + MD5STEP(F4, c, d, a, b, in[14] + 0xab9423a7, 15); + MD5STEP(F4, b, c, d, a, in[5] + 0xfc93a039, 21); + MD5STEP(F4, a, b, c, d, in[12] + 0x655b59c3, 6); + MD5STEP(F4, d, a, b, c, in[3] + 
0x8f0ccc92, 10); + MD5STEP(F4, c, d, a, b, in[10] + 0xffeff47d, 15); + MD5STEP(F4, b, c, d, a, in[1] + 0x85845dd1, 21); + MD5STEP(F4, a, b, c, d, in[8] + 0x6fa87e4f, 6); + MD5STEP(F4, d, a, b, c, in[15] + 0xfe2ce6e0, 10); + MD5STEP(F4, c, d, a, b, in[6] + 0xa3014314, 15); + MD5STEP(F4, b, c, d, a, in[13] + 0x4e0811a1, 21); + MD5STEP(F4, a, b, c, d, in[4] + 0xf7537e82, 6); + MD5STEP(F4, d, a, b, c, in[11] + 0xbd3af235, 10); + MD5STEP(F4, c, d, a, b, in[2] + 0x2ad7d2bb, 15); + MD5STEP(F4, b, c, d, a, in[9] + 0xeb86d391, 21); + + hash[0] += a; + hash[1] += b; + hash[2] += c; + hash[3] += d; +} + static inline void md5_transform_helper(struct md5_state *ctx) { le32_to_cpu_array(ctx->block, sizeof(ctx->block) / sizeof(u32)); diff --git a/lib/Makefile b/lib/Makefile index 71d398b04a74..1079152607e0 100644 --- a/lib/Makefile +++ b/lib/Makefile @@ -19,7 +19,7 @@ KCOV_INSTRUMENT_dynamic_debug.o := n lib-y := ctype.o string.o vsprintf.o cmdline.o \ rbtree.o radix-tree.o dump_stack.o timerqueue.o\ idr.o int_sqrt.o extable.o \ - sha1.o chacha20.o md5.o irq_regs.o argv_split.o \ + sha1.o chacha20.o irq_regs.o argv_split.o \ flex_proportions.o ratelimit.o show_mem.o \ is_single_threaded.o plist.o decompress.o kobject_uevent.o \ earlycpio.o seq_buf.o siphash.o \ diff --git a/lib/md5.c b/lib/md5.c deleted file mode 100644 index bb0cd01d356d..000000000000 --- a/lib/md5.c +++ /dev/null @@ -1,95 +0,0 @@ -#include <linux/compiler.h> -#include <linux/export.h> -#include <linux/cryptohash.h> - -#define F1(x, y, z) (z ^ (x & (y ^ z))) -#define F2(x, y, z) F1(z, x, y) -#define F3(x, y, z) (x ^ y ^ z) -#define F4(x, y, z) (y ^ (x | ~z)) - -#define MD5STEP(f, w, x, y, z, in, s) \ - (w += f(x, y, z) + in, w = (w<<s | w>>(32-s)) + x) - -void md5_transform(__u32 *hash, __u32 const *in) -{ - u32 a, b, c, d; - - a = hash[0]; - b = hash[1]; - c = hash[2]; - d = hash[3]; - - MD5STEP(F1, a, b, c, d, in[0] + 0xd76aa478, 7); - MD5STEP(F1, d, a, b, c, in[1] + 0xe8c7b756, 12); - MD5STEP(F1, c, d, a, b, 
in[2] + 0x242070db, 17); - MD5STEP(F1, b, c, d, a, in[3] + 0xc1bdceee, 22); - MD5STEP(F1, a, b, c, d, in[4] + 0xf57c0faf, 7); - MD5STEP(F1, d, a, b, c, in[5] + 0x4787c62a, 12); - MD5STEP(F1, c, d, a, b, in[6] + 0xa8304613, 17); - MD5STEP(F1, b, c, d, a, in[7] + 0xfd469501, 22); - MD5STEP(F1, a, b, c, d, in[8] + 0x698098d8, 7); - MD5STEP(F1, d, a, b, c, in[9] + 0x8b44f7af, 12); - MD5STEP(F1, c, d, a, b, in[10] + 0xffff5bb1, 17); - MD5STEP(F1, b, c, d, a, in[11] + 0x895cd7be, 22); - MD5STEP(F1, a, b, c, d, in[12] + 0x6b901122, 7); - MD5STEP(F1, d, a, b, c, in[13] + 0xfd987193, 12); - MD5STEP(F1, c, d, a, b, in[14] + 0xa679438e, 17); - MD5STEP(F1, b, c, d, a, in[15] + 0x49b40821, 22); - - MD5STEP(F2, a, b, c, d, in[1] + 0xf61e2562, 5); - MD5STEP(F2, d, a, b, c, in[6] + 0xc040b340, 9); - MD5STEP(F2, c, d, a, b, in[11] + 0x265e5a51, 14); - MD5STEP(F2, b, c, d, a, in[0] + 0xe9b6c7aa, 20); - MD5STEP(F2, a, b, c, d, in[5] + 0xd62f105d, 5); - MD5STEP(F2, d, a, b, c, in[10] + 0x02441453, 9); - MD5STEP(F2, c, d, a, b, in[15] + 0xd8a1e681, 14); - MD5STEP(F2, b, c, d, a, in[4] + 0xe7d3fbc8, 20); - MD5STEP(F2, a, b, c, d, in[9] + 0x21e1cde6, 5); - MD5STEP(F2, d, a, b, c, in[14] + 0xc33707d6, 9); - MD5STEP(F2, c, d, a, b, in[3] + 0xf4d50d87, 14); - MD5STEP(F2, b, c, d, a, in[8] + 0x455a14ed, 20); - MD5STEP(F2, a, b, c, d, in[13] + 0xa9e3e905, 5); - MD5STEP(F2, d, a, b, c, in[2] + 0xfcefa3f8, 9); - MD5STEP(F2, c, d, a, b, in[7] + 0x676f02d9, 14); - MD5STEP(F2, b, c, d, a, in[12] + 0x8d2a4c8a, 20); - - MD5STEP(F3, a, b, c, d, in[5] + 0xfffa3942, 4); - MD5STEP(F3, d, a, b, c, in[8] + 0x8771f681, 11); - MD5STEP(F3, c, d, a, b, in[11] + 0x6d9d6122, 16); - MD5STEP(F3, b, c, d, a, in[14] + 0xfde5380c, 23); - MD5STEP(F3, a, b, c, d, in[1] + 0xa4beea44, 4); - MD5STEP(F3, d, a, b, c, in[4] + 0x4bdecfa9, 11); - MD5STEP(F3, c, d, a, b, in[7] + 0xf6bb4b60, 16); - MD5STEP(F3, b, c, d, a, in[10] + 0xbebfbc70, 23); - MD5STEP(F3, a, b, c, d, in[13] + 0x289b7ec6, 4); - MD5STEP(F3, d, a, b, c, 
in[0] + 0xeaa127fa, 11); - MD5STEP(F3, c, d, a, b, in[3] + 0xd4ef3085, 16); - MD5STEP(F3, b, c, d, a, in[6] + 0x04881d05, 23); - MD5STEP(F3, a, b, c, d, in[9] + 0xd9d4d039, 4); - MD5STEP(F3, d, a, b, c, in[12] + 0xe6db99e5, 11); - MD5STEP(F3, c, d, a, b, in[15] + 0x1fa27cf8, 16); - MD5STEP(F3, b, c, d, a, in[2] + 0xc4ac5665, 23); - - MD5STEP(F4, a, b, c, d, in[0] + 0xf4292244, 6); - MD5STEP(F4, d, a, b, c, in[7] + 0x432aff97, 10); - MD5STEP(F4, c, d, a, b, in[14] + 0xab9423a7, 15); - MD5STEP(F4, b, c, d, a, in[5] + 0xfc93a039, 21); - MD5STEP(F4, a, b, c, d, in[12] + 0x655b59c3, 6); - MD5STEP(F4, d, a, b, c, in[3] + 0x8f0ccc92, 10); - MD5STEP(F4, c, d, a, b, in[10] + 0xffeff47d, 15); - MD5STEP(F4, b, c, d, a, in[1] + 0x85845dd1, 21); - MD5STEP(F4, a, b, c, d, in[8] + 0x6fa87e4f, 6); - MD5STEP(F4, d, a, b, c, in[15] + 0xfe2ce6e0, 10); - MD5STEP(F4, c, d, a, b, in[6] + 0xa3014314, 15); - MD5STEP(F4, b, c, d, a, in[13] + 0x4e0811a1, 21); - MD5STEP(F4, a, b, c, d, in[4] + 0xf7537e82, 6); - MD5STEP(F4, d, a, b, c, in[11] + 0xbd3af235, 10); - MD5STEP(F4, c, d, a, b, in[2] + 0x2ad7d2bb, 15); - MD5STEP(F4, b, c, d, a, in[9] + 0xeb86d391, 21); - - hash[0] += a; - hash[1] += b; - hash[2] += c; - hash[3] += d; -} -EXPORT_SYMBOL(md5_transform); -- 2.11.0 ^ permalink raw reply related [flat|nested] 82+ messages in thread
* [PATCH v7 5/6] syncookies: use SipHash in place of SHA1
  2016-12-21 23:02 ` [PATCH v7 0/6] The SipHash Patchset Jason A. Donenfeld
  ` (3 preceding siblings ...)
  2016-12-21 23:02 ` [PATCH v7 4/6] md5: remove from lib and only live in crypto Jason A. Donenfeld
@ 2016-12-21 23:02 ` Jason A. Donenfeld
  2016-12-21 23:02 ` [PATCH v7 6/6] siphash: implement HalfSipHash1-3 for hash tables Jason A. Donenfeld
  5 siblings, 0 replies; 82+ messages in thread
From: Jason A. Donenfeld @ 2016-12-21 23:02 UTC
To: Netdev, kernel-hardening, LKML, linux-crypto, David Laight, Ted Tso,
  Hannes Frederic Sowa, edumazet, Linus Torvalds, Eric Biggers,
  Tom Herbert, ak, davem, luto, Jean-Philippe Aumasson
Cc: Jason A. Donenfeld, Eric Dumazet

SHA1 is slower and less secure than SipHash, and so replacing syncookie
generation with SipHash makes natural sense. Some BSDs have been doing
this for several years in fact.

The speedup should be similar -- and even more impressive -- to the
speedup from the sequence number fix in this series.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David Miller <davem@davemloft.net>
---
 net/ipv4/syncookies.c | 20 ++++----------------
 net/ipv6/syncookies.c | 37 ++++++++++++++++---------------------
 2 files changed, 20 insertions(+), 37 deletions(-)

diff --git a/net/ipv4/syncookies.c b/net/ipv4/syncookies.c
index 3e88467d70ee..03bb068f8888 100644
--- a/net/ipv4/syncookies.c
+++ b/net/ipv4/syncookies.c
@@ -13,13 +13,13 @@
 #include <linux/tcp.h>
 #include <linux/slab.h>
 #include <linux/random.h>
-#include <linux/cryptohash.h>
+#include <linux/siphash.h>
 #include <linux/kernel.h>
 #include <linux/export.h>
 #include <net/tcp.h>
 #include <net/route.h>
 
-static u32 syncookie_secret[2][16-4+SHA_DIGEST_WORDS] __read_mostly;
+static siphash_key_t syncookie_secret[2] __read_mostly;
 
 #define COOKIEBITS 24 /* Upper bits store count */
 #define COOKIEMASK (((__u32)1 << COOKIEBITS) - 1)
@@ -48,24 +48,12 @@ static u32 syncookie_secret[2][16-4+SHA_DIGEST_WORDS] __read_mostly;
 #define TSBITS 6
 #define TSMASK (((__u32)1 << TSBITS) - 1)
 
-static DEFINE_PER_CPU(__u32 [16 + 5 + SHA_WORKSPACE_WORDS], ipv4_cookie_scratch);
-
 static u32 cookie_hash(__be32 saddr, __be32 daddr, __be16 sport, __be16 dport,
 		       u32 count, int c)
 {
-	__u32 *tmp;
-
 	net_get_random_once(syncookie_secret, sizeof(syncookie_secret));
-
-	tmp = this_cpu_ptr(ipv4_cookie_scratch);
-	memcpy(tmp + 4, syncookie_secret[c], sizeof(syncookie_secret[c]));
-	tmp[0] = (__force u32)saddr;
-	tmp[1] = (__force u32)daddr;
-	tmp[2] = ((__force u32)sport << 16) + (__force u32)dport;
-	tmp[3] = count;
-	sha_transform(tmp + 16, (__u8 *)tmp, tmp + 16 + 5);
-
-	return tmp[17];
+	return siphash_4u32(saddr, daddr, (u32)sport << 16 | dport, count,
+			    syncookie_secret[c]);
 }
 
 
diff --git a/net/ipv6/syncookies.c b/net/ipv6/syncookies.c
index a4d49760bf43..be51fc0d99ad 100644
--- a/net/ipv6/syncookies.c
+++ b/net/ipv6/syncookies.c
@@ -16,7 +16,7 @@
 
 #include <linux/tcp.h>
 #include <linux/random.h>
-#include <linux/cryptohash.h>
+#include <linux/siphash.h>
 #include <linux/kernel.h>
 #include <net/ipv6.h>
 #include <net/tcp.h>
@@ -24,7 +24,7 @@
 #define COOKIEBITS 24 /* Upper bits store count */
 #define COOKIEMASK (((__u32)1 << COOKIEBITS) - 1)
 
-static u32 syncookie6_secret[2][16-4+SHA_DIGEST_WORDS] __read_mostly;
+static siphash_key_t syncookie6_secret[2] __read_mostly;
 
 /* RFC 2460, Section 8.3:
  * [ipv6 tcp] MSS must be computed as the maximum packet size minus 60 [..]
@@ -41,30 +41,25 @@ static __u16 const msstab[] = {
 	9000 - 60,
 };
 
-static DEFINE_PER_CPU(__u32 [16 + 5 + SHA_WORKSPACE_WORDS], ipv6_cookie_scratch);
-
 static u32 cookie_hash(const struct in6_addr *saddr, const struct in6_addr *daddr,
 		       __be16 sport, __be16 dport, u32 count, int c)
 {
-	__u32 *tmp;
+	const struct {
+		struct in6_addr saddr;
+		struct in6_addr daddr;
+		u32 count;
+		u16 sport;
+		u16 dport;
+	} __aligned(SIPHASH_ALIGNMENT) combined = {
+		.saddr = *saddr,
+		.daddr = *daddr,
+		.count = count,
+		.sport = sport,
+		.dport = dport
+	};
 
 	net_get_random_once(syncookie6_secret, sizeof(syncookie6_secret));
-
-	tmp = this_cpu_ptr(ipv6_cookie_scratch);
-
-	/*
-	 * we have 320 bits of information to hash, copy in the remaining
-	 * 192 bits required for sha_transform, from the syncookie6_secret
-	 * and overwrite the digest with the secret
-	 */
-	memcpy(tmp + 10, syncookie6_secret[c], 44);
-	memcpy(tmp, saddr, 16);
-	memcpy(tmp + 4, daddr, 16);
-	tmp[8] = ((__force u32)sport << 16) + (__force u32)dport;
-	tmp[9] = count;
-	sha_transform(tmp + 16, (__u8 *)tmp, tmp + 16 + 5);
-
-	return tmp[17];
+	return siphash(&combined, offsetofend(typeof(combined), dport), syncookie6_secret[c]);
 }
 
 static __u32 secure_tcp_syn_cookie(const struct in6_addr *saddr,
-- 
2.11.0
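
[Illustration, not part of the patch] The IPv6 cookie_hash() above passes offsetofend(typeof(combined), dport) as the length rather than sizeof(combined). A small user-space sketch of what that idiom guards against follows. It is only an illustration under stated assumptions: the offsetofend() macro is re-created here for user space, struct addr6 is a hypothetical stand-in for struct in6_addr, the 8-byte alignment stands in for SIPHASH_ALIGNMENT, and struct padded_input is an invented counter-example that does not appear in the series.

```c
#include <stddef.h>
#include <stdint.h>

/* User-space re-creation of the kernel's offsetofend() helper. */
#define offsetofend(TYPE, MEMBER) \
	(offsetof(TYPE, MEMBER) + sizeof(((TYPE *)0)->MEMBER))

/* Hypothetical stand-in for struct in6_addr: 16 raw bytes. */
struct addr6 {
	uint8_t bytes[16];
};

/* Same member layout as the anonymous struct in the IPv6 hunk. */
struct cookie_input {
	struct addr6 saddr;
	struct addr6 daddr;
	uint32_t count;
	uint16_t sport;
	uint16_t dport;
} __attribute__((aligned(8)));

/* Invented counter-example: 10 bytes of members, padded out to 16. */
struct padded_input {
	uint64_t id;
	uint16_t port;
} __attribute__((aligned(8)));

/* Number of bytes that actually carry member data in each layout. */
static size_t cookie_hash_len(void)
{
	return offsetofend(struct cookie_input, dport);
}

static size_t padded_hash_len(void)
{
	return offsetofend(struct padded_input, port);
}
```

With this layout the cookie struct has no trailing padding (16 + 16 + 4 + 2 + 2 = 40 bytes, already a multiple of 8), so offsetofend() and sizeof() agree. In the counter-example they differ (10 versus 16), and hashing sizeof() bytes would feed six padding bytes of unspecified content into the hash.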
* [PATCH v7 6/6] siphash: implement HalfSipHash1-3 for hash tables 2016-12-21 23:02 ` [PATCH v7 0/6] The SipHash Patchset Jason A. Donenfeld ` (4 preceding siblings ...) 2016-12-21 23:02 ` [PATCH v7 5/6] syncookies: use SipHash in place of SHA1 Jason A. Donenfeld @ 2016-12-21 23:02 ` Jason A. Donenfeld 2016-12-22 0:46 ` Andi Kleen 5 siblings, 1 reply; 82+ messages in thread From: Jason A. Donenfeld @ 2016-12-21 23:02 UTC (permalink / raw) To: Netdev, kernel-hardening, LKML, linux-crypto, David Laight, Ted Tso, Hannes Frederic Sowa, edumazet, Linus Torvalds, Eric Biggers, Tom Herbert, ak, davem, luto, Jean-Philippe Aumasson Cc: Jason A. Donenfeld HalfSipHash, or hsiphash, is a shortened version of SipHash, which generates 32-bit outputs using a weaker 64-bit key. It has *much* lower security margins, and shouldn't be used for anything too sensitive, but it could be used as a hashtable key function replacement, if the output is never exposed, and if the security requirement is not too high. The goal is to make this something that performance-critical jhash users would be willing to use. On 64-bit machines, HalfSipHash1-3 is slower than SipHash1-3, so we alias SipHash1-3 to HalfSipHash1-3 on those systems. 64-bit x86_64: [ 0.509409] test_siphash: SipHash2-4 cycles: 4049181 [ 0.510650] test_siphash: SipHash1-3 cycles: 2512884 [ 0.512205] test_siphash: HalfSipHash1-3 cycles: 3429920 [ 0.512904] test_siphash: JenkinsHash cycles: 978267 So, we map hsiphash() -> SipHash1-3 32-bit x86: [ 0.509868] test_siphash: SipHash2-4 cycles: 14812892 [ 0.513601] test_siphash: SipHash1-3 cycles: 9510710 [ 0.515263] test_siphash: HalfSipHash1-3 cycles: 3856157 [ 0.515952] test_siphash: JenkinsHash cycles: 1148567 So, we map hsiphash() -> HalfSipHash1-3 hsiphash() is roughly 3 times slower than jhash(), but comes with a considerable security improvement. Signed-off-by: Jason A. 
Donenfeld <Jason@zx2c4.com> Cc: Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com> --- Documentation/siphash.txt | 75 +++++++++++ include/linux/siphash.h | 56 +++++++- lib/siphash.c | 318 +++++++++++++++++++++++++++++++++++++++++++++- lib/test_siphash.c | 139 ++++++++++++++++---- 4 files changed, 561 insertions(+), 27 deletions(-) diff --git a/Documentation/siphash.txt b/Documentation/siphash.txt index 39ff7f0438e7..f93c1d7104c4 100644 --- a/Documentation/siphash.txt +++ b/Documentation/siphash.txt @@ -77,3 +77,78 @@ Linux implements the "2-4" variant of SipHash. Read the SipHash paper if you're interested in learning more: https://131002.net/siphash/siphash.pdf + + +~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~ + +HalfSipHash - SipHash's insecure younger cousin +----------------------------------------------- +Written by Jason A. Donenfeld <jason@zx2c4.com> + +On the off-chance that SipHash is not fast enough for your needs, you might be +able to justify using HalfSipHash, a terrifying but potentially useful +possibility. HalfSipHash cuts SipHash's rounds down from "2-4" to "1-3" and, +even scarier, uses an easily brute-forcable 64-bit key (with a 32-bit output) +instead of SipHash's 128-bit key. However, this may appeal to some +high-performance `jhash` users. + +Danger! + +Do not ever use HalfSipHash except for as a hashtable key function, and only +then when you can be absolutely certain that the outputs will never be +transmitted out of the kernel. This is only remotely useful over `jhash` as a +means of mitigating hashtable flooding denial of service attacks. + +1. Generating a key + +Keys should always be generated from a cryptographically secure source of +random numbers, either using get_random_bytes or get_random_once: + +hsiphash_key_t key; +get_random_bytes(key, sizeof(key)); + +If you're not deriving your key from here, you're doing it wrong. + +2. 
Using the functions + +There are two variants of the function, one that takes a list of integers, and +one that takes a buffer: + +u32 hsiphash(const void *data, size_t len, siphash_key_t key); + +And: + +u32 hsiphash_1u32(u32, hsiphash_key_t key); +u32 hsiphash_2u32(u32, u32, hsiphash_key_t key); +u32 hsiphash_3u32(u32, u32, u32, hsiphash_key_t key); +u32 hsiphash_4u32(u32, u32, u32, u32, hsiphash_key_t key); + +If you pass the generic hsiphash function something of a constant length, it +will constant fold at compile-time and automatically choose one of the +optimized functions. + +3. Hashtable key function usage: + +struct some_hashtable { + DECLARE_HASHTABLE(hashtable, 8); + hsiphash_key_t key; +}; + +void init_hashtable(struct some_hashtable *table) +{ + get_random_bytes(table->key, sizeof(table->key)); +} + +static inline hlist_head *some_hashtable_bucket(struct some_hashtable *table, struct interesting_input *input) +{ + return &table->hashtable[hsiphash(input, sizeof(*input), table->key) & (HASH_SIZE(table->hashtable) - 1)]; +} + +You may then iterate like usual over the returned hash bucket. + +4. Performance + +HalfSipHash is roughly 3 times slower than JenkinsHash. For many replacements, +this will not be a problem, as the hashtable lookup isn't the bottleneck. And +in general, this is probably a good sacrifice to make for the security and DoS +resistance of HalfSipHash. diff --git a/include/linux/siphash.h b/include/linux/siphash.h index 7aa666eb00d9..efab44c654f3 100644 --- a/include/linux/siphash.h +++ b/include/linux/siphash.h @@ -5,7 +5,9 @@ * SipHash: a fast short-input PRF * https://131002.net/siphash/ * - * This implementation is specifically for SipHash2-4. + * This implementation is specifically for SipHash2-4 for a secure PRF + * and HalfSipHash1-3/SipHash1-3 for an insecure PRF only suitable for + * hashtables. 
*/ #ifndef _LINUX_SIPHASH_H @@ -76,4 +78,56 @@ static inline u64 siphash(const void *data, size_t len, const siphash_key_t key) return ___siphash_aligned(data, len, key); } +#if BITS_PER_LONG == 64 +typedef siphash_key_t hsiphash_key_t; +#define HSIPHASH_ALIGNMENT SIPHASH_ALIGNMENT +#else +typedef u32 hsiphash_key_t[2]; +#define HSIPHASH_ALIGNMENT __alignof__(u32) +#endif + +u32 __hsiphash_aligned(const void *data, size_t len, const hsiphash_key_t key); +#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS +u32 __hsiphash_unaligned(const void *data, size_t len, const hsiphash_key_t key); +#endif + +u32 hsiphash_1u32(const u32 a, const hsiphash_key_t key); +u32 hsiphash_2u32(const u32 a, const u32 b, const hsiphash_key_t key); +u32 hsiphash_3u32(const u32 a, const u32 b, const u32 c, + const hsiphash_key_t key); +u32 hsiphash_4u32(const u32 a, const u32 b, const u32 c, const u32 d, + const hsiphash_key_t key); + +static inline u32 ___hsiphash_aligned(const __le32 *data, size_t len, const hsiphash_key_t key) +{ + if (__builtin_constant_p(len) && len == 4) + return hsiphash_1u32(le32_to_cpu(data[0]), key); + if (__builtin_constant_p(len) && len == 8) + return hsiphash_2u32(le32_to_cpu(data[0]), le32_to_cpu(data[1]), + key); + if (__builtin_constant_p(len) && len == 12) + return hsiphash_3u32(le32_to_cpu(data[0]), le32_to_cpu(data[1]), + le32_to_cpu(data[2]), key); + if (__builtin_constant_p(len) && len == 16) + return hsiphash_4u32(le32_to_cpu(data[0]), le32_to_cpu(data[1]), + le32_to_cpu(data[2]), le32_to_cpu(data[3]), + key); + return __hsiphash_aligned(data, len, key); +} + +/** + * hsiphash - compute 32-bit hsiphash PRF value + * @data: buffer to hash + * @size: size of @data + * @key: the hsiphash key + */ +static inline u32 hsiphash(const void *data, size_t len, const hsiphash_key_t key) +{ +#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS + if (!IS_ALIGNED((unsigned long)data, HSIPHASH_ALIGNMENT)) + return __hsiphash_unaligned(data, len, key); +#endif + return 
___hsiphash_aligned(data, len, key); +} + #endif /* _LINUX_SIPHASH_H */ diff --git a/lib/siphash.c b/lib/siphash.c index ff2151313667..e2481226d96c 100644 --- a/lib/siphash.c +++ b/lib/siphash.c @@ -5,7 +5,9 @@ * SipHash: a fast short-input PRF * https://131002.net/siphash/ * - * This implementation is specifically for SipHash2-4. + * This implementation is specifically for SipHash2-4 for a secure PRF + * and HalfSipHash1-3/SipHash1-3 for an insecure PRF only suitable for + * hashtables. */ #include <linux/siphash.h> @@ -230,3 +232,317 @@ u64 siphash_3u32(const u32 first, const u32 second, const u32 third, POSTAMBLE } EXPORT_SYMBOL(siphash_3u32); + +#if BITS_PER_LONG == 64 +/* Note that this HalfSipHash1-3 implementation on 64-bit + * isn't actually HalfSipHash1-3 but rather SipHash1-3. */ + +#define HSIPROUND SIPROUND +#define HPREAMBLE(len) PREAMBLE(len) +#define HPOSTAMBLE \ + v3 ^= b; \ + HSIPROUND; \ + v0 ^= b; \ + v2 ^= 0xff; \ + HSIPROUND; \ + HSIPROUND; \ + HSIPROUND; \ + return (v0 ^ v1) ^ (v2 ^ v3); + +u32 __hsiphash_aligned(const void *data, size_t len, const hsiphash_key_t key) +{ + const u8 *end = data + len - (len % sizeof(u64)); + const u8 left = len & (sizeof(u64) - 1); + u64 m; + HPREAMBLE(len) + for (; data != end; data += sizeof(u64)) { + m = le64_to_cpup(data); + v3 ^= m; + HSIPROUND; + v0 ^= m; + } +#if defined(CONFIG_DCACHE_WORD_ACCESS) && BITS_PER_LONG == 64 + if (left) + b |= le64_to_cpu((__force __le64)(load_unaligned_zeropad(data) & + bytemask_from_count(left))); +#else + switch (left) { + case 7: b |= ((u64)end[6]) << 48; + case 6: b |= ((u64)end[5]) << 40; + case 5: b |= ((u64)end[4]) << 32; + case 4: b |= le32_to_cpup(data); break; + case 3: b |= ((u64)end[2]) << 16; + case 2: b |= le16_to_cpup(data); break; + case 1: b |= end[0]; + } +#endif + HPOSTAMBLE +} +EXPORT_SYMBOL(__hsiphash_aligned); + +#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS +u32 __hsiphash_unaligned(const void *data, size_t len, const hsiphash_key_t key) +{ + const u8 
*end = data + len - (len % sizeof(u64)); + const u8 left = len & (sizeof(u64) - 1); + u64 m; + HPREAMBLE(len) + for (; data != end; data += sizeof(u64)) { + m = get_unaligned_le64(data); + v3 ^= m; + HSIPROUND; + v0 ^= m; + } +#if defined(CONFIG_DCACHE_WORD_ACCESS) && BITS_PER_LONG == 64 + if (left) + b |= le64_to_cpu((__force __le64)(load_unaligned_zeropad(data) & + bytemask_from_count(left))); +#else + switch (left) { + case 7: b |= ((u64)end[6]) << 48; + case 6: b |= ((u64)end[5]) << 40; + case 5: b |= ((u64)end[4]) << 32; + case 4: b |= get_unaligned_le32(end); break; + case 3: b |= ((u64)end[2]) << 16; + case 2: b |= get_unaligned_le16(end); break; + case 1: b |= end[0]; + } +#endif + HPOSTAMBLE +} +EXPORT_SYMBOL(__hsiphash_unaligned); +#endif + +/** + * hsiphash_1u32 - compute 64-bit hsiphash PRF value of a u32 + * @first: first u32 + * @key: the hsiphash key + */ +u32 hsiphash_1u32(const u32 first, const hsiphash_key_t key) +{ + HPREAMBLE(4) + b |= first; + HPOSTAMBLE +} +EXPORT_SYMBOL(hsiphash_1u32); + +/** + * hsiphash_2u32 - compute 32-bit hsiphash PRF value of 2 u32 + * @first: first u32 + * @second: second u32 + * @key: the hsiphash key + */ +u32 hsiphash_2u32(const u32 first, const u32 second, const hsiphash_key_t key) +{ + u64 combined = (u64)second << 32 | first; + HPREAMBLE(8) + v3 ^= combined; + HSIPROUND; + v0 ^= combined; + HPOSTAMBLE +} +EXPORT_SYMBOL(hsiphash_2u32); + +/** + * hsiphash_3u32 - compute 32-bit hsiphash PRF value of 3 u32 + * @first: first u32 + * @second: second u32 + * @third: third u32 + * @key: the hsiphash key + */ +u32 hsiphash_3u32(const u32 first, const u32 second, const u32 third, + const hsiphash_key_t key) +{ + u64 combined = (u64)second << 32 | first; + HPREAMBLE(12) + v3 ^= combined; + HSIPROUND; + v0 ^= combined; + b |= third; + HPOSTAMBLE +} +EXPORT_SYMBOL(hsiphash_3u32); + +/** + * hsiphash_4u32 - compute 32-bit hsiphash PRF value of 4 u32 + * @first: first u32 + * @second: second u32 + * @third: third u32 + * 
@forth: forth u32 + * @key: the hsiphash key + */ +u32 hsiphash_4u32(const u32 first, const u32 second, const u32 third, + const u32 forth, const hsiphash_key_t key) +{ + u64 combined = (u64)second << 32 | first; + HPREAMBLE(16) + v3 ^= combined; + HSIPROUND; + v0 ^= combined; + combined = (u64)forth << 32 | third; + v3 ^= combined; + HSIPROUND; + v0 ^= combined; + HPOSTAMBLE +} +EXPORT_SYMBOL(hsiphash_4u32); +#else +#define HSIPROUND \ + do { \ + v0 += v1; v1 = rol32(v1, 5); v1 ^= v0; v0 = rol32(v0, 16); \ + v2 += v3; v3 = rol32(v3, 8); v3 ^= v2; \ + v0 += v3; v3 = rol32(v3, 7); v3 ^= v0; \ + v2 += v1; v1 = rol32(v1, 13); v1 ^= v2; v2 = rol32(v2, 16); \ + } while(0) + +#define HPREAMBLE(len) \ + u32 v0 = 0; \ + u32 v1 = 0; \ + u32 v2 = 0x6c796765U; \ + u32 v3 = 0x74656462U; \ + u32 b = ((u32)len) << 24; \ + v3 ^= key[1]; \ + v2 ^= key[0]; \ + v1 ^= key[1]; \ + v0 ^= key[0]; + +#define HPOSTAMBLE \ + v3 ^= b; \ + HSIPROUND; \ + v0 ^= b; \ + v2 ^= 0xff; \ + HSIPROUND; \ + HSIPROUND; \ + HSIPROUND; \ + return v1 ^ v3; + +u32 __hsiphash_aligned(const void *data, size_t len, const hsiphash_key_t key) +{ + const u8 *end = data + len - (len % sizeof(u32)); + const u8 left = len & (sizeof(u32) - 1); + u32 m; + HPREAMBLE(len) + for (; data != end; data += sizeof(u32)) { + m = le32_to_cpup(data); + v3 ^= m; + HSIPROUND; + v0 ^= m; + } + switch (left) { + case 3: b |= ((u32)end[2]) << 16; + case 2: b |= le16_to_cpup(data); break; + case 1: b |= end[0]; + } + HPOSTAMBLE +} +EXPORT_SYMBOL(__hsiphash_aligned); + +#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS +u32 __hsiphash_unaligned(const void *data, size_t len, const hsiphash_key_t key) +{ + const u8 *end = data + len - (len % sizeof(u32)); + const u8 left = len & (sizeof(u32) - 1); + u32 m; + HPREAMBLE(len) + for (; data != end; data += sizeof(u32)) { + m = get_unaligned_le32(data); + v3 ^= m; + HSIPROUND; + v0 ^= m; + } + switch (left) { + case 3: b |= ((u32)end[2]) << 16; + case 2: b |= get_unaligned_le16(end); break; + 
case 1: b |= end[0]; + } + HPOSTAMBLE +} +EXPORT_SYMBOL(__hsiphash_unaligned); +#endif + +/** + * hsiphash_1u32 - compute 32-bit hsiphash PRF value of a u32 + * @first: first u32 + * @key: the hsiphash key + */ +u32 hsiphash_1u32(const u32 first, const hsiphash_key_t key) +{ + HPREAMBLE(4) + v3 ^= first; + HSIPROUND; + v0 ^= first; + HPOSTAMBLE +} +EXPORT_SYMBOL(hsiphash_1u32); + +/** + * hsiphash_2u32 - compute 32-bit hsiphash PRF value of 2 u32 + * @first: first u32 + * @second: second u32 + * @key: the hsiphash key + */ +u32 hsiphash_2u32(const u32 first, const u32 second, const hsiphash_key_t key) +{ + HPREAMBLE(8) + v3 ^= first; + HSIPROUND; + v0 ^= first; + v3 ^= second; + HSIPROUND; + v0 ^= second; + HPOSTAMBLE +} +EXPORT_SYMBOL(hsiphash_2u32); + +/** + * hsiphash_3u32 - compute 32-bit hsiphash PRF value of 3 u32 + * @first: first u32 + * @second: second u32 + * @third: third u32 + * @key: the hsiphash key + */ +u32 hsiphash_3u32(const u32 first, const u32 second, const u32 third, + const hsiphash_key_t key) +{ + HPREAMBLE(12) + v3 ^= first; + HSIPROUND; + v0 ^= first; + v3 ^= second; + HSIPROUND; + v0 ^= second; + v3 ^= third; + HSIPROUND; + v0 ^= third; + HPOSTAMBLE +} +EXPORT_SYMBOL(hsiphash_3u32); + +/** + * hsiphash_4u32 - compute 32-bit hsiphash PRF value of 4 u32 + * @first: first u32 + * @second: second u32 + * @third: third u32 + * @forth: forth u32 + * @key: the hsiphash key + */ +u32 hsiphash_4u32(const u32 first, const u32 second, const u32 third, + const u32 forth, const hsiphash_key_t key) +{ + HPREAMBLE(16) + v3 ^= first; + HSIPROUND; + v0 ^= first; + v3 ^= second; + HSIPROUND; + v0 ^= second; + v3 ^= third; + HSIPROUND; + v0 ^= third; + v3 ^= forth; + HSIPROUND; + v0 ^= forth; + HPOSTAMBLE +} +EXPORT_SYMBOL(hsiphash_4u32); +#endif diff --git a/lib/test_siphash.c b/lib/test_siphash.c index e0ba2cf8dc67..ac291ec27fb6 100644 --- a/lib/test_siphash.c +++ b/lib/test_siphash.c @@ -7,7 +7,9 @@ * SipHash: a fast short-input PRF * 
https://131002.net/siphash/ * - * This implementation is specifically for SipHash2-4. + * This implementation is specifically for SipHash2-4 for a secure PRF + * and HalfSipHash1-3/SipHash1-3 for an insecure PRF only suitable for + * hashtables. */ #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt @@ -18,10 +20,16 @@ #include <linux/errno.h> #include <linux/module.h> -/* Test vectors taken from official reference source available at: - * https://131002.net/siphash/siphash24.c +/* Test vectors taken from reference source available at: + * https://github.com/veorq/SipHash */ -static const u64 test_vectors[64] = { + + + +static const siphash_key_t test_key_siphash = + { 0x0706050403020100ULL , 0x0f0e0d0c0b0a0908ULL }; + +static const u64 test_vectors_siphash[64] = { 0x726fdb47dd0e0e31ULL, 0x74f839c593dc67fdULL, 0x0d6c8009d9a94f5aULL, 0x85676696d7fb7e2dULL, 0xcf2794e0277187b7ULL, 0x18765564cd99a68dULL, 0xcbc9466e58fee3ceULL, 0xab0200f58b01d137ULL, 0x93f5f5799a932462ULL, @@ -45,9 +53,64 @@ static const u64 test_vectors[64] = { 0x6ca4ecb15c5f91e1ULL, 0x9f626da15c9625f3ULL, 0xe51b38608ef25f57ULL, 0x958a324ceb064572ULL }; -static const siphash_key_t test_key = +#if BITS_PER_LONG == 64 +static const hsiphash_key_t test_key_hsiphash = { 0x0706050403020100ULL , 0x0f0e0d0c0b0a0908ULL }; +static const u32 test_vectors_hsiphash[64] = { + 0x050fc4dcU, 0x7d57ca93U, 0x4dc7d44dU, + 0xe7ddf7fbU, 0x88d38328U, 0x49533b67U, + 0xc59f22a7U, 0x9bb11140U, 0x8d299a8eU, + 0x6c063de4U, 0x92ff097fU, 0xf94dc352U, + 0x57b4d9a2U, 0x1229ffa7U, 0xc0f95d34U, + 0x2a519956U, 0x7d908b66U, 0x63dbd80cU, + 0xb473e63eU, 0x8d297d1cU, 0xa6cce040U, + 0x2b45f844U, 0xa320872eU, 0xdae6c123U, + 0x67349c8cU, 0x705b0979U, 0xca9913a5U, + 0x4ade3b35U, 0xef6cd00dU, 0x4ab1e1f4U, + 0x43c5e663U, 0x8c21d1bcU, 0x16a7b60dU, + 0x7a8ff9bfU, 0x1f2a753eU, 0xbf186b91U, + 0xada26206U, 0xa3c33057U, 0xae3a36a1U, + 0x7b108392U, 0x99e41531U, 0x3f1ad944U, + 0xc8138825U, 0xc28949a6U, 0xfaf8876bU, + 0x9f042196U, 0x68b1d623U, 0x8b5114fdU, + 
0xdf074c46U, 0x12cc86b3U, 0x0a52098fU, + 0x9d292f9aU, 0xa2f41f12U, 0x43a71ed0U, + 0x73f0bce6U, 0x70a7e980U, 0x243c6d75U, + 0xfdb71513U, 0xa67d8a08U, 0xb7e8f148U, + 0xf7a644eeU, 0x0f1837f2U, 0x4b6694e0U, + 0xb7bbb3a8U +}; +#else +static const hsiphash_key_t test_key_hsiphash = + { 0x03020100U, 0x07060504U }; + +static const u32 test_vectors_hsiphash[64] = { + 0x5814c896U, 0xe7e864caU, 0xbc4b0e30U, + 0x01539939U, 0x7e059ea6U, 0x88e3d89bU, + 0xa0080b65U, 0x9d38d9d6U, 0x577999b1U, + 0xc839caedU, 0xe4fa32cfU, 0x959246eeU, + 0x6b28096cU, 0x66dd9cd6U, 0x16658a7cU, + 0xd0257b04U, 0x8b31d501U, 0x2b1cd04bU, + 0x06712339U, 0x522aca67U, 0x911bb605U, + 0x90a65f0eU, 0xf826ef7bU, 0x62512debU, + 0x57150ad7U, 0x5d473507U, 0x1ec47442U, + 0xab64afd3U, 0x0a4100d0U, 0x6d2ce652U, + 0x2331b6a3U, 0x08d8791aU, 0xbc6dda8dU, + 0xe0f6c934U, 0xb0652033U, 0x9b9851ccU, + 0x7c46fb7fU, 0x732ba8cbU, 0xf142997aU, + 0xfcc9aa1bU, 0x05327eb2U, 0xe110131cU, + 0xf9e5e7c0U, 0xa7d708a6U, 0x11795ab1U, + 0x65671619U, 0x9f5fff91U, 0xd89c5267U, + 0x007783ebU, 0x95766243U, 0xab639262U, + 0x9c7e1390U, 0xc368dda6U, 0x38ddc455U, + 0xfa13d379U, 0x979ea4e8U, 0x53ecd77eU, + 0x2ee80657U, 0x33dbb66aU, 0xae3f0577U, + 0x88b4c4ccU, 0x3e7f480bU, 0x74c1ebf8U, + 0x87178304U +}; +#endif + static int __init siphash_test_init(void) { u8 in[64] __aligned(SIPHASH_ALIGNMENT); @@ -58,49 +121,75 @@ static int __init siphash_test_init(void) for (i = 0; i < 64; ++i) { in[i] = i; in_unaligned[i + 1] = i; - if (siphash(in, i, test_key) != test_vectors[i]) { - pr_info("self-test aligned %u: FAIL\n", i + 1); + if (siphash(in, i, test_key_siphash) != test_vectors_siphash[i]) { + pr_info("siphash self-test aligned %u: FAIL\n", i + 1); + ret = -EINVAL; + } + if (siphash(in_unaligned + 1, i, test_key_siphash) != test_vectors_siphash[i]) { + pr_info("siphash self-test unaligned %u: FAIL\n", i + 1); ret = -EINVAL; } - if (siphash(in_unaligned + 1, i, test_key) != test_vectors[i]) { - pr_info("self-test unaligned %u: FAIL\n", i + 1); + if 
(hsiphash(in, i, test_key_hsiphash) != test_vectors_hsiphash[i]) { + pr_info("hsiphash self-test aligned %u: FAIL\n", i + 1); + ret = -EINVAL; + } + if (hsiphash(in_unaligned + 1, i, test_key_hsiphash) != test_vectors_hsiphash[i]) { + pr_info("hsiphash self-test unaligned %u: FAIL\n", i + 1); ret = -EINVAL; } } - if (siphash_1u64(0x0706050403020100ULL, test_key) != test_vectors[8]) { - pr_info("self-test 1u64: FAIL\n"); + if (siphash_1u64(0x0706050403020100ULL, test_key_siphash) != test_vectors_siphash[8]) { + pr_info("siphash self-test 1u64: FAIL\n"); ret = -EINVAL; } - if (siphash_2u64(0x0706050403020100ULL, 0x0f0e0d0c0b0a0908ULL, test_key) != test_vectors[16]) { - pr_info("self-test 2u64: FAIL\n"); + if (siphash_2u64(0x0706050403020100ULL, 0x0f0e0d0c0b0a0908ULL, test_key_siphash) != test_vectors_siphash[16]) { + pr_info("siphash self-test 2u64: FAIL\n"); ret = -EINVAL; } if (siphash_3u64(0x0706050403020100ULL, 0x0f0e0d0c0b0a0908ULL, - 0x1716151413121110ULL, test_key) != test_vectors[24]) { - pr_info("self-test 3u64: FAIL\n"); + 0x1716151413121110ULL, test_key_siphash) != test_vectors_siphash[24]) { + pr_info("siphash self-test 3u64: FAIL\n"); ret = -EINVAL; } if (siphash_4u64(0x0706050403020100ULL, 0x0f0e0d0c0b0a0908ULL, - 0x1716151413121110ULL, 0x1f1e1d1c1b1a1918ULL, test_key) != test_vectors[32]) { - pr_info("self-test 4u64: FAIL\n"); + 0x1716151413121110ULL, 0x1f1e1d1c1b1a1918ULL, test_key_siphash) != test_vectors_siphash[32]) { + pr_info("siphash self-test 4u64: FAIL\n"); ret = -EINVAL; } - if (siphash_1u32(0x03020100U, test_key) != test_vectors[4]) { - pr_info("self-test 1u32: FAIL\n"); + if (siphash_1u32(0x03020100U, test_key_siphash) != test_vectors_siphash[4]) { + pr_info("siphash self-test 1u32: FAIL\n"); ret = -EINVAL; } - if (siphash_2u32(0x03020100U, 0x07060504U, test_key) != test_vectors[8]) { - pr_info("self-test 2u32: FAIL\n"); + if (siphash_2u32(0x03020100U, 0x07060504U, test_key_siphash) != test_vectors_siphash[8]) { + pr_info("siphash self-test 
2u32: FAIL\n"); ret = -EINVAL; } if (siphash_3u32(0x03020100U, 0x07060504U, - 0x0b0a0908U, test_key) != test_vectors[12]) { - pr_info("self-test 3u32: FAIL\n"); + 0x0b0a0908U, test_key_siphash) != test_vectors_siphash[12]) { + pr_info("siphash self-test 3u32: FAIL\n"); ret = -EINVAL; } if (siphash_4u32(0x03020100U, 0x07060504U, - 0x0b0a0908U, 0x0f0e0d0cU, test_key) != test_vectors[16]) { - pr_info("self-test 4u32: FAIL\n"); + 0x0b0a0908U, 0x0f0e0d0cU, test_key_siphash) != test_vectors_siphash[16]) { + pr_info("siphash self-test 4u32: FAIL\n"); + ret = -EINVAL; + } + if (hsiphash_1u32(0x03020100U, test_key_hsiphash) != test_vectors_hsiphash[4]) { + pr_info("hsiphash self-test 1u32: FAIL\n"); + ret = -EINVAL; + } + if (hsiphash_2u32(0x03020100U, 0x07060504U, test_key_hsiphash) != test_vectors_hsiphash[8]) { + pr_info("hsiphash self-test 2u32: FAIL\n"); + ret = -EINVAL; + } + if (hsiphash_3u32(0x03020100U, 0x07060504U, + 0x0b0a0908U, test_key_hsiphash) != test_vectors_hsiphash[12]) { + pr_info("hsiphash self-test 3u32: FAIL\n"); + ret = -EINVAL; + } + if (hsiphash_4u32(0x03020100U, 0x07060504U, + 0x0b0a0908U, 0x0f0e0d0cU, test_key_hsiphash) != test_vectors_hsiphash[16]) { + pr_info("hsiphash self-test 4u32: FAIL\n"); ret = -EINVAL; } if (!ret) -- 2.11.0 ^ permalink raw reply related [flat|nested] 82+ messages in thread
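For readers who want to experiment outside the kernel, the 32-bit branch of the patch above can be approximated in plain C. This is only a sketch under stated assumptions: the names halfsiphash13, rol32 and load_le32 are local to this example and are not kernel symbols, the input is read byte-wise as little-endian instead of through le32_to_cpup(), and the constant-length helper variants are left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Rotate a 32-bit word left by r bits (0 < r < 32). */
static inline uint32_t rol32(uint32_t x, unsigned int r)
{
	return (x << r) | (x >> (32 - r));
}

/* Read 4 bytes as a little-endian word, independent of host byte order. */
static inline uint32_t load_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
	       (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* One round, using the same rotation constants as HSIPROUND in the patch. */
#define HSIPROUND \
	do { \
	v0 += v1; v1 = rol32(v1, 5); v1 ^= v0; v0 = rol32(v0, 16); \
	v2 += v3; v3 = rol32(v3, 8); v3 ^= v2; \
	v0 += v3; v3 = rol32(v3, 7); v3 ^= v0; \
	v2 += v1; v1 = rol32(v1, 13); v1 ^= v2; v2 = rol32(v2, 16); \
	} while (0)

/* HalfSipHash1-3 over a byte buffer: 64-bit key in, 32-bit value out. */
uint32_t halfsiphash13(const void *data, size_t len, const uint32_t key[2])
{
	const uint8_t *p = data;
	const uint8_t *end = p + len - (len % 4);
	uint32_t v0 = key[0];
	uint32_t v1 = key[1];
	uint32_t v2 = 0x6c796765U ^ key[0];
	uint32_t v3 = 0x74656462U ^ key[1];
	uint32_t b = (uint32_t)len << 24;	/* length goes in the top byte */
	uint32_t m;

	/* Compression: one round per full 32-bit word. */
	for (; p != end; p += 4) {
		m = load_le32(p);
		v3 ^= m;
		HSIPROUND;
		v0 ^= m;
	}

	/* Fold the 0 to 3 trailing bytes into the last word. */
	switch (len & 3) {
	case 3: b |= (uint32_t)end[2] << 16;	/* fall through */
	case 2: b |= (uint32_t)end[1] << 8;	/* fall through */
	case 1: b |= end[0];
	}
	v3 ^= b;
	HSIPROUND;
	v0 ^= b;

	/* Finalization: three rounds, then fold the state to 32 bits. */
	v2 ^= 0xff;
	HSIPROUND;
	HSIPROUND;
	HSIPROUND;
	return v1 ^ v3;
}
```

With the 32-bit test key from the patch, { 0x03020100U, 0x07060504U }, and input bytes 0, 1, 2, ..., this sketch is meant to reproduce the test_vectors_hsiphash table quoted above, for example 0x5814c896 for the empty input. It does not cover the 64-bit build, where the patch substitutes SipHash1-3.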
* Re: [PATCH v7 6/6] siphash: implement HalfSipHash1-3 for hash tables
  2016-12-21 23:02 ` [PATCH v7 6/6] siphash: implement HalfSipHash1-3 for hash tables Jason A. Donenfeld
@ 2016-12-22 0:46 ` Andi Kleen
  0 siblings, 0 replies; 82+ messages in thread
From: Andi Kleen @ 2016-12-22 0:46 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: Netdev, kernel-hardening, LKML, linux-crypto, David Laight, Ted Tso, Hannes Frederic Sowa, edumazet, Linus Torvalds, Eric Biggers, Tom Herbert, davem, luto, Jean-Philippe Aumasson

> 64-bit x86_64:
> [ 0.509409] test_siphash: SipHash2-4 cycles: 4049181
> [ 0.510650] test_siphash: SipHash1-3 cycles: 2512884
> [ 0.512205] test_siphash: HalfSipHash1-3 cycles: 3429920
> [ 0.512904] test_siphash: JenkinsHash cycles: 978267

I'm not sure what these numbers mean. Surely a single siphash2-4 does not take 4+ million cycles? If you run them in a loop please divide by the iterations.

But generally running small code in a loop is often an unrealistic benchmark strategy because it hides cache misses, primes predictors, changes frequencies and changes memory costs, but also can overload pipelines and oversubscribe resources. [see also page 46+ in http://halobates.de/applicative-mental-models.pdf]

So the numbers you get there are at least somewhat dubious. It would be good to have at least some test which is not just a tiny micro benchmark to compare before making conclusions.

-Andi
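The normalization asked for here is simple to sketch. The helpers below are hypothetical (they are not part of test_siphash.c), and the iteration count is an assumption, since the quoted output does not state how many times each loop ran.

```c
#include <stdint.h>

/* Average cycles per call, given a loop total and its iteration count. */
static inline uint64_t cycles_per_op(uint64_t total_cycles, uint64_t iterations)
{
	return iterations ? total_cycles / iterations : 0;
}

/* Cost relative to a baseline, in percent (100 means equal cost). */
static inline uint64_t relative_cost_percent(uint64_t cycles, uint64_t baseline)
{
	return baseline ? cycles * 100 / baseline : 0;
}
```

If, purely for illustration, each loop ran 100000 times, cycles_per_op(4049181, 100000) would give about 40 cycles per SipHash2-4 call. The ratio relative_cost_percent(4049181, 978267), roughly 413, does not depend on that guess as long as every loop ran the same number of iterations, but it remains subject to the microbenchmark caveats raised above.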
[parent not found: <CAGiyFdfmiCMyHvAg=5sGh8KjBBrF0Wb4Qf=JLzJqUAx4yFSS3Q@mail.gmail.com>]
* Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
  [not found] <CAGiyFdfmiCMyHvAg=5sGh8KjBBrF0Wb4Qf=JLzJqUAx4yFSS3Q@mail.gmail.com>
@ 2016-12-15 23:28 ` George Spelvin
  2016-12-16 17:06 ` David Laight
  2016-12-16 3:46 ` George Spelvin
  1 sibling, 1 reply; 82+ messages in thread
From: George Spelvin @ 2016-12-15 23:28 UTC (permalink / raw)
  To: ak, davem, David.Laight, ebiggers3, hannes, Jason, jeanphilippe.aumasson, kernel-hardening, linux-crypto, linux-kernel, linux, luto, netdev, tom, torvalds, tytso, vegard.nossum
  Cc: djb

> If a halved version of SipHash can bring significant performance boost
> (with 32b words instead of 64b words) with an acceptable security level
> (64-bit enough?) then we may design such a version.

I was thinking if the key could be pushed to 80 bits, that would be nice, but honestly 64 bits is fine. This is DoS protection, and while it's possible to brute-force a 64 bit secret, there are more effective (DDoS) attacks possible for the same cost.

(I'd suggest a name of "HalfSipHash" to convey the reduced security effectively.)

> Regarding output size, are 64 bits sufficient?

As a replacement for jhash, 32 bits are sufficient. It's for indexing an in-memory hash table on a 32-bit machine.

(When you're done thinking about this, as a matter of personal interest I'd love a hash expert's opinion on
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=2a18da7a9c7886f1c7307f8d3f23f24318583f03
and
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=8387ff2577eb9ed245df9a39947f66976c6bcd02
which is a non-cryptographic hash function of novel design that's inspired by SipHash.)
* RE: [PATCH v5 1/4] siphash: add cryptographically secure PRF
  2016-12-15 23:28 ` [PATCH v5 1/4] siphash: add cryptographically secure PRF George Spelvin
@ 2016-12-16 17:06 ` David Laight
  2016-12-16 17:09 ` Jason A. Donenfeld
  0 siblings, 1 reply; 82+ messages in thread
From: David Laight @ 2016-12-16 17:06 UTC (permalink / raw)
  To: 'George Spelvin', ak, davem, ebiggers3, hannes, Jason, jeanphilippe.aumasson, kernel-hardening, linux-crypto, linux-kernel, luto, netdev, tom, torvalds, tytso, vegard.nossum
  Cc: djb

From: George Spelvin
> Sent: 15 December 2016 23:29
> > If a halved version of SipHash can bring significant performance boost
> > (with 32b words instead of 64b words) with an acceptable security level
> > (64-bit enough?) then we may design such a version.
>
> I was thinking if the key could be pushed to 80 bits, that would be nice,
> but honestly 64 bits is fine. This is DoS protection, and while it's
> possible to brute-force a 64 bit secret, there are more effective (DDoS)
> attacks possible for the same cost.

A 32bit hash would also remove all the issues about the alignment of IP addresses (etc) on 64bit systems.

> (I'd suggest a name of "HalfSipHash" to convey the reduced security
> effectively.)
>
> > Regarding output size, are 64 bits sufficient?
>
> As a replacement for jhash, 32 bits are sufficient. It's for
> indexing an in-memory hash table on a 32-bit machine.

It is also worth remembering that if the intent is to generate a hash table index (not a unique fingerprint) you will always get collisions on the final value. Randomness could still give overlong hash chains - which might still need rehashing with a different key.

	David
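The point about index collisions and overlong chains can be illustrated with a small sketch. Everything here is hypothetical: the table size, the MAX_CHAIN threshold and the chain_stats structure are made up for the example and do not come from the patch or from any kernel hashtable API.

```c
#include <stdbool.h>
#include <stdint.h>

#define TABLE_BITS 8
#define TABLE_SIZE (1u << TABLE_BITS)
#define MAX_CHAIN  16	/* hypothetical limit before a rekey is considered */

/* Per-bucket chain lengths, tracked separately from the entries themselves. */
struct chain_stats {
	unsigned int len[TABLE_SIZE];
};

/* Only the low TABLE_BITS bits of the 32-bit hash select the bucket. */
static inline unsigned int bucket_index(uint32_t hash)
{
	return hash & (TABLE_SIZE - 1);
}

/*
 * Record one insertion. Returns true once the chain has grown past
 * MAX_CHAIN, which a caller could treat as a hint to pick a fresh key
 * and rehash the table.
 */
static inline bool note_insert(struct chain_stats *stats, uint32_t hash)
{
	return ++stats->len[bucket_index(hash)] > MAX_CHAIN;
}
```

Two hash values that differ only above bit 7, such as 0x00000012 and 0xabcdef12, land in the same bucket here, so distinct 32-bit outputs do not guarantee distinct indices. A chain-length check of this kind catches both unlucky keys and deliberate flooding, whichever hash function is in use.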
* Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
  2016-12-16 17:06 ` David Laight
@ 2016-12-16 17:09 ` Jason A. Donenfeld
  0 siblings, 0 replies; 82+ messages in thread
From: Jason A. Donenfeld @ 2016-12-16 17:09 UTC (permalink / raw)
  To: David Laight
  Cc: George Spelvin, ak, davem, ebiggers3, hannes, jeanphilippe.aumasson, kernel-hardening, linux-crypto, linux-kernel, luto, netdev, tom, torvalds, tytso, vegard.nossum, djb

Hi David,

On Fri, Dec 16, 2016 at 6:06 PM, David Laight <David.Laight@aculab.com> wrote:
> A 32bit hash would also remove all the issues about the alignment
> of IP addresses (etc) on 64bit systems.

The current replacements of md5_transform with siphash in the v6 patch series will continue to use the original siphash, since the 128-bit key is rather important for these kinds of secrets. Additionally, 64-bit siphash is already faster than the md5_transform that it replaces. So the alignment concerns (now, non-issues; problems have been solved, I believe) still remain.

Jason
* Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
  [not found] <CAGiyFdfmiCMyHvAg=5sGh8KjBBrF0Wb4Qf=JLzJqUAx4yFSS3Q@mail.gmail.com>
  2016-12-15 23:28 ` [PATCH v5 1/4] siphash: add cryptographically secure PRF George Spelvin
@ 2016-12-16 3:46 ` George Spelvin
  [not found] ` <CAGiyFdd6_LVzUUfFcaqMyub1c2WPvWUzAQDCH+Aza-_t6mvmXg@mail.gmail.com>
  1 sibling, 1 reply; 82+ messages in thread
From: George Spelvin @ 2016-12-16 3:46 UTC (permalink / raw)
  To: ak, davem, David.Laight, ebiggers3, hannes, Jason, jeanphilippe.aumasson, kernel-hardening, linux-crypto, linux-kernel, linux, luto, netdev, tom, torvalds, tytso, vegard.nossum
  Cc: djb

Jean-Philippe Aumasson wrote:
> If a halved version of SipHash can bring significant performance boost
> (with 32b words instead of 64b words) with an acceptable security level
> (64-bit enough?) then we may design such a version.

It would be fairly significant, a 2x speed benefit on a lot of 32-bit machines.

First is the fact that a 64-bit SipHash round on a generic 32-bit machine requires not twice as many instructions, but more than three.

Consider the core SipHash quarter-round operation:

	a += b;
	b = rotate_left(b, k)
	b ^= a

The add and xor are equivalent between 32- and 64-bit rounds; twice the instructions do twice the work. (There's a dependency via the carry bit between the two halves of the add, but it ends up not being on the critical path even in a superscalar implementation.)

The problem is the rotates. Although some particularly nice code is possible on 32-bit ARM due to its support for shift-and-xor operations, on a generic 32-bit CPU the rotate grows to 6 instructions with a 2-cycle dependency chain (more in practice because barrel shifters are large and even quad-issue CPUs can't do 4 shifts per cycle):

	temp_lo = b_lo >> (32-k)
	temp_hi = b_hi >> (32-k)
	b_lo <<= k
	b_hi <<= k
	b_lo ^= temp_hi
	b_hi ^= temp_lo

The resultant instruction counts and (assuming wide issue) latencies are:

	64-bit SipHash    "Half" SipHash
	Inst.  Latency    Inst.  Latency
	  10       3         3       2     Quarter round
	  40       6        12       4     Full round
	  80      12        24       8     Two rounds
	  82      13        26       9     Mix in one word
	  82      13        52      18     Mix in 64 bits
	 166      26        61      18     Four round finalization + final XOR
	 248      39       113      36     Hash 64 bits
	 330      52       165      54     Hash 128 bits
	 412      65       217      72     Hash 192 bits

While the ideal latencies are actually better for the 64-bit algorithm, that requires an unrealistic 6+-wide superscalar implementation that's more than twice as wide as the 64-bit code requires (which is already optimized for quad-issue). For a 1- or 2-wide processor, the instruction counts dominate, and not only does the 64-bit algorithm take 60% more time to mix in the same number of bytes, but the finalization rounds bring the ratio to 2:1 for small inputs.

(And I haven't included the possible savings if the input size is an odd number of 32-bit words, such as networking applications which include the source/dest port numbers.)

Notes on particular processors:

- x86 can do a 64-bit rotate in 3 instructions and 2 cycles using the SHLD/SHRD instructions instead:

	movl	%b_hi, %temp
	shldl	$k, %b_lo, %b_hi
	shldl	$k, %temp, %b_lo

  ... but as I mentioned the problem is registers. SipHash needs 8 32-bit words plus at least one temporary, and 32-bit x86 has only 7 available. (And compilers can rarely manage to keep more than 6 of them busy.)
- 64-bit SipHash is particularly efficient on 32-bit ARM due to its support for shift-and-op instructions. The 64-bit shift and following xor can be done in 4 instructions. So the only benefit is from the reduced finalization.
- Double-width adds cost a little more on CPUs like MIPS and RISC-V without condition codes.
- Certain particularly crappy uClinux processors with slow shifts (68000, anyone?) really suffer from extra shifts.

One *weakly* requested feature: It might simplify some programming interfaces if we could use the same key for multiple hash tables with a 1-word "tweak" (e.g. pointer to the hash table, so it could be assumed non-zero if that helped) to make distinct functions. That would let us more safely use a global key for multiple small hash tables without the need to add code to generate and store key material for each place that an unkeyed hash is replaced.
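The six-operation rotate described above can be written out in C and checked against a native 64-bit rotate. This is only an illustrative sketch: struct u64_halves, rol64_halves and rol64_ref are names invented here, and the rotation count is assumed to satisfy 0 < k < 32.

```c
#include <stdint.h>

/* A 64-bit value held as two 32-bit halves, as on a generic 32-bit CPU. */
struct u64_halves {
	uint32_t lo;
	uint32_t hi;
};

/*
 * Rotate left by k (0 < k < 32) using only 32-bit operations:
 * four shifts plus two combining xors, six operations in total.
 */
static inline struct u64_halves rol64_halves(struct u64_halves b, unsigned int k)
{
	uint32_t temp_lo = b.lo >> (32 - k);
	uint32_t temp_hi = b.hi >> (32 - k);

	b.lo <<= k;
	b.hi <<= k;
	b.lo ^= temp_hi;	/* bits leaving the top of hi wrap into lo */
	b.hi ^= temp_lo;	/* bits leaving the top of lo carry into hi */
	return b;
}

/* Native 64-bit rotate, used only as a reference for comparison. */
static inline uint64_t rol64_ref(uint64_t x, unsigned int k)
{
	return (x << k) | (x >> (64 - k));
}
```

A 32-bit HalfSipHash rotate is a single instruction on most CPUs, which is where the 10-versus-3 instruction count for the quarter round in the table comes from. Rotations by 32 or more would swap the halves first; the sketch leaves that case out.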
[parent not found: <CAGiyFdd6_LVzUUfFcaqMyub1c2WPvWUzAQDCH+Aza-_t6mvmXg@mail.gmail.com>]
* Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
  [not found] ` <CAGiyFdd6_LVzUUfFcaqMyub1c2WPvWUzAQDCH+Aza-_t6mvmXg@mail.gmail.com>
@ 2016-12-16 12:39 ` Jason A. Donenfeld
  2016-12-16 19:47 ` Tom Herbert
  [not found] ` <CAGiyFddB_HT3H2yhYQ5rprYZ487rJ4iCaH9uPJQD57hiPbn9ng@mail.gmail.com>
  0 siblings, 2 replies; 82+ messages in thread
From: Jason A. Donenfeld @ 2016-12-16 12:39 UTC (permalink / raw)
  To: Jean-Philippe Aumasson
  Cc: George Spelvin, Andi Kleen, David Miller, David Laight, Eric Biggers, Hannes Frederic Sowa, kernel-hardening, Linux Crypto Mailing List, LKML, Andy Lutomirski, Netdev, Tom Herbert, Linus Torvalds, Theodore Ts'o, vegard.nossum, Daniel J . Bernstein

Hey JP,

On Fri, Dec 16, 2016 at 9:08 AM, Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com> wrote:
> Here's a tentative HalfSipHash:
> https://github.com/veorq/SipHash/blob/halfsiphash/halfsiphash.c
>
> Haven't computed the cycle count nor measured its speed.

This is incredible. Really. Wow!

I'll integrate this into my patchset and will write up some documentation about when one should be used over the other.

Thanks again. Quite exciting.

Jason
* Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
  2016-12-16 12:39 ` Jason A. Donenfeld
@ 2016-12-16 19:47 ` Tom Herbert
  2016-12-16 20:41 ` George Spelvin
  2016-12-17 15:21 ` George Spelvin
  [not found] ` <CAGiyFddB_HT3H2yhYQ5rprYZ487rJ4iCaH9uPJQD57hiPbn9ng@mail.gmail.com>
  1 sibling, 2 replies; 82+ messages in thread
From: Tom Herbert @ 2016-12-16 19:47 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: Jean-Philippe Aumasson, George Spelvin, Andi Kleen, David Miller, David Laight, Eric Biggers, Hannes Frederic Sowa, kernel-hardening, Linux Crypto Mailing List, LKML, Andy Lutomirski, Netdev, Linus Torvalds, Theodore Ts'o, vegard.nossum, Daniel J . Bernstein

On Fri, Dec 16, 2016 at 4:39 AM, Jason A. Donenfeld <Jason@zx2c4.com> wrote:
> Hey JP,
>
> On Fri, Dec 16, 2016 at 9:08 AM, Jean-Philippe Aumasson
> <jeanphilippe.aumasson@gmail.com> wrote:
>> Here's a tentative HalfSipHash:
>> https://github.com/veorq/SipHash/blob/halfsiphash/halfsiphash.c
>>
>> Haven't computed the cycle count nor measured its speed.
>
Tested this. Distribution and avalanche effect are still good. Speed wise I see about a 33% improvement over siphash (20 nsecs/op versus 32 nsecs). That's about 3x of jhash speed (7 nsecs). So that might be closer to a more palatable replacement for jhash. Do we lose any security advantages with halfsiphash?

Tom

> This is incredible. Really. Wow!
>
> I'll integrate this into my patchset and will write up some
> documentation about when one should be used over the other.
>
> Thanks again. Quite exciting.
>
> Jason
* Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
  2016-12-16 19:47 ` Tom Herbert
@ 2016-12-16 20:41 ` George Spelvin
  2016-12-16 20:57 ` Tom Herbert
  2016-12-17 15:21 ` George Spelvin
  1 sibling, 1 reply; 82+ messages in thread
From: George Spelvin @ 2016-12-16 20:41 UTC (permalink / raw)
  To: Jason, tom
  Cc: ak, davem, David.Laight, djb, ebiggers3, hannes, jeanphilippe.aumasson, kernel-hardening, linux-crypto, linux-kernel, linux, luto, netdev, torvalds, tytso, vegard.nossum

Tom Herbert wrote:
> Tested this. Distribution and avalanche effect are still good. Speed
> wise I see about a 33% improvement over siphash (20 nsecs/op versus 32
> nsecs). That's about 3x of jhash speed (7 nsecs). So that might closer
> to a more palatable replacement for jhash. Do we lose any security
> advantages with halfsiphash?

What are you testing on? And what input size? And does "33% improvement" mean 4/3 the rate and 3/4 the time? Or 2/3 the time and 3/2 the rate?

These are very odd results. On a 64-bit machine, SipHash should be the same speed per round, and faster because it hashes more data per round. (Unless you're hitting some unexpected cache/decode effect due to REX prefixes.)

On a 32-bit machine (other than ARM, where your results might make sense, or maybe if you're hashing large amounts of data), the difference should be larger.

And yes, there is a *significant* security loss. SipHash is 128 bits ("don't worry about it"). hsiphash is 64 bits, which is known breakable ("worry about it"), so we have to do a careful analysis of the cost of a successful attack.

As mentioned in the e-mails that just flew by, hsiphash is intended *only* for 32-bit machines which bog down on full SipHash. On all 64-bit machines, it will be implemented as an alias for SipHash and the security concerns will Just Go Away.

The place where hsiphash is expected to make a big difference is 32-bit x86. If you only see 33% difference with "gcc -m32", I'm going to be very confused.
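The ambiguity in "33% improvement" comes down to which quantity the percentage is taken of. A small sketch with hypothetical helper names makes the two readings concrete; results are in tenths of a percent so that integer arithmetic suffices, and both helpers assume 0 < new_ns <= old_ns.

```c
#include <stdint.h>

/* Reduction in time per operation, in tenths of a percent. */
static inline uint32_t time_saved_permille(uint32_t old_ns, uint32_t new_ns)
{
	return (old_ns - new_ns) * 1000 / old_ns;
}

/* Increase in operations per second, in tenths of a percent. */
static inline uint32_t rate_gain_permille(uint32_t old_ns, uint32_t new_ns)
{
	return (old_ns - new_ns) * 1000 / new_ns;
}
```

Going from 4 time units to 3 saves 25% of the time and raises the rate by 33.3%; going from 3 to 2 saves 33.3% of the time and raises the rate by 50%. Applied to the quoted, presumably rounded, figures of 32 ns and 20 ns per operation, the helpers give 375 and 600, that is 37.5% less time or 60% more throughput.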
* Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF 2016-12-16 20:41 ` George Spelvin @ 2016-12-16 20:57 ` Tom Herbert 0 siblings, 0 replies; 82+ messages in thread From: Tom Herbert @ 2016-12-16 20:57 UTC (permalink / raw) To: George Spelvin Cc: Jason A. Donenfeld, Andi Kleen, David S. Miller, David Laight, Daniel J . Bernstein, Eric Biggers, Hannes Frederic Sowa, Jean-Philippe Aumasson, kernel-hardening, Linux Crypto Mailing List, LKML, Andy Lutomirski, Linux Kernel Network Developers, Linus Torvalds, Theodore Ts'o, vegard.nossum On Fri, Dec 16, 2016 at 12:41 PM, George Spelvin <linux@sciencehorizons.net> wrote: > Tom Herbert wrote: >> Tested this. Distribution and avalanche effect are still good. Speed >> wise I see about a 33% improvement over siphash (20 nsecs/op versus 32 >> nsecs). That's about 3x of jhash speed (7 nsecs). So that might closer >> to a more palatable replacement for jhash. Do we lose any security >> advantages with halfsiphash? > > What are you testing on? And what input size? And does "33% improvement" > mean 4/3 the rate and 3/4 the time? Or 2/3 the time and 3/2 the rate? > Sorry, that is over an IPv4 tuple. Intel(R) Xeon(R) CPU E5-2660 0 @ 2.20GHz. Recoded the function I was using to look like more like 64 bit version and yes it is indeed slower. > These are very odd results. On a 64-bit machine, SipHash should be the > same speed per round, and faster because it hashes more data per round. > (Unless you're hitting some unexpected cache/decode effect due to REX > prefixes.) > > On a 32-bit machine (other than ARM, where your results might make sense, > or maybe if you're hashing large amounts of data), the difference should > be larger. > > And yes, there is a *significant* security loss. SipHash is 128 bits > ("don't worry about it"). hsiphash is 64 bits, which is known breakable > ("worry about it"), so we have to do a careful analysis of the cost of > a successful attack. 
> > As mentioned in the e-mails that just flew by, hsiphash is intended > *only* for 32-bit machines which bog down on full SipHash. On all 64-bit > machines, it will be implemented as an alias for SipHash and the security > concerns will Just Go Away. > > The place where hsiphash is expected to make a big difference is 32-bit > x86. If you only see 33% difference with "gcc -m32", I'm going to be > very confused.
* Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF 2016-12-16 19:47 ` Tom Herbert 2016-12-16 20:41 ` George Spelvin @ 2016-12-17 15:21 ` George Spelvin 2016-12-19 14:14 ` David Laight 1 sibling, 1 reply; 82+ messages in thread From: George Spelvin @ 2016-12-17 15:21 UTC (permalink / raw) To: tom Cc: ak, davem, David.Laight, djb, ebiggers3, hannes, Jason, jeanphilippe.aumasson, kernel-hardening, linux-crypto, linux-kernel, linux, luto, netdev, torvalds, tytso, vegard.nossum

To follow up on my comments that your benchmark results were peculiar,
here's my benchmark code.

It just computes the hash of all n*(n+1)/2 possible non-empty substrings
of a buffer of n (called "max" below) bytes.  "cpb" is "cycles per byte".

(The average length is (n+2)/3, c.f. https://oeis.org/A000292)

On x86-32, HSipHash is asymptotically twice the speed of SipHash,
rising to 2.5x for short strings:

SipHash/HSipHash benchmark, sizeof(long) = 4
 SipHash: max=   4 cycles=     10495 cpb=524.7500 (sum=47a4f5554869fa97)
HSipHash: max=   4 cycles=      3400 cpb=170.0000 (sum=146a863e)
 SipHash: max=   8 cycles=     24468 cpb=203.9000 (sum=21c41a86355affcc)
HSipHash: max=   8 cycles=      9237 cpb= 76.9750 (sum=d3b5e0cd)
 SipHash: max=  16 cycles=     94622 cpb=115.9583 (sum=26d816b72721e48f)
HSipHash: max=  16 cycles=     34499 cpb= 42.2782 (sum=16bb7475)
 SipHash: max=  32 cycles=    418767 cpb= 69.9811 (sum=dd5a97694b8a832d)
HSipHash: max=  32 cycles=    156695 cpb= 26.1857 (sum=eed00fcb)
 SipHash: max=  64 cycles=   2119152 cpb= 46.3101 (sum=a2a725aecc09ed00)
HSipHash: max=  64 cycles=   1008678 cpb= 22.0428 (sum=99b9f4f)
 SipHash: max= 128 cycles=  12728659 cpb= 35.5788 (sum=420878cd20272817)
HSipHash: max= 128 cycles=   5452931 cpb= 15.2419 (sum=f1f4ad18)
 SipHash: max= 256 cycles=  38931946 cpb= 13.7615 (sum=e05dfb28b90dfd98)
HSipHash: max= 256 cycles=  13807312 cpb=  4.8805 (sum=ceeafcc1)
 SipHash: max= 512 cycles= 205537380 cpb=  9.1346 (sum=7d129d4de145fbea)
HSipHash: max= 512 cycles= 103420960 cpb=  4.5963 (sum=7f15a313)
 SipHash: max=1024 cycles=1540259472 cpb=  8.5817 (sum=cca7cbdc778ca8af)
HSipHash: max=1024 cycles= 796090824 cpb=  4.4355 (sum=d8f3374f)

On x86-64, SipHash is consistently faster, asymptotically approaching
2x for long strings:

SipHash/HSipHash benchmark, sizeof(long) = 8
 SipHash: max=   4 cycles=      2642 cpb=132.1000 (sum=47a4f5554869fa97)
HSipHash: max=   4 cycles=      2498 cpb=124.9000 (sum=146a863e)
 SipHash: max=   8 cycles=      5270 cpb= 43.9167 (sum=21c41a86355affcc)
HSipHash: max=   8 cycles=      7140 cpb= 59.5000 (sum=d3b5e0cd)
 SipHash: max=  16 cycles=     19950 cpb= 24.4485 (sum=26d816b72721e48f)
HSipHash: max=  16 cycles=     23546 cpb= 28.8554 (sum=16bb7475)
 SipHash: max=  32 cycles=     80188 cpb= 13.4004 (sum=dd5a97694b8a832d)
HSipHash: max=  32 cycles=    101218 cpb= 16.9148 (sum=eed00fcb)
 SipHash: max=  64 cycles=    373286 cpb=  8.1575 (sum=a2a725aecc09ed00)
HSipHash: max=  64 cycles=    535568 cpb= 11.7038 (sum=99b9f4f)
 SipHash: max= 128 cycles=   2075224 cpb=  5.8006 (sum=420878cd20272817)
HSipHash: max= 128 cycles=   3336820 cpb=  9.3270 (sum=f1f4ad18)
 SipHash: max= 256 cycles=  14276278 cpb=  5.0463 (sum=e05dfb28b90dfd98)
HSipHash: max= 256 cycles=  28847880 cpb= 10.1970 (sum=ceeafcc1)
 SipHash: max= 512 cycles=  50135180 cpb=  2.2281 (sum=7d129d4de145fbea)
HSipHash: max= 512 cycles=  86145916 cpb=  3.8286 (sum=7f15a313)
 SipHash: max=1024 cycles= 334111900 cpb=  1.8615 (sum=cca7cbdc778ca8af)
HSipHash: max=1024 cycles= 640432452 cpb=  3.5682 (sum=d8f3374f)

Here's the code; compile with -DSELFTEST.  (The main purpose of
printing the sum is to prevent dead code elimination.)

#if SELFTEST
#include <stdint.h>
#include <stdlib.h>

static inline uint64_t rol64(uint64_t word, unsigned int shift)
{
	return word << shift | word >> (64 - shift);
}

static inline uint32_t rol32(uint32_t word, unsigned int shift)
{
	return word << shift | word >> (32 - shift);
}

static inline uint64_t get_unaligned_le64(void const *p)
{
	return *(uint64_t const *)p;
}

static inline uint32_t get_unaligned_le32(void const *p)
{
	return *(uint32_t const *)p;
}

static inline uint64_t le64_to_cpup(uint64_t const *p)
{
	return *p;
}

static inline uint32_t le32_to_cpup(uint32_t const *p)
{
	return *p;
}

#else

#include <linux/bitops.h>	/* For rol64 */
#include <linux/cryptohash.h>
#include <asm/byteorder.h>
#include <asm/unaligned.h>

#endif

/* The basic ARX mixing function, taken from Skein */
#define SIP_MIX(a, b, s) ((a) += (b), (b) = rol64(b, s), (b) ^= (a))

/*
 * The complete SipRound.  Note that, when unrolled twice like below,
 * the 32-bit rotates drop out on 32-bit machines.
 */
#define SIP_ROUND(a, b, c, d) \
	(SIP_MIX(a, b, 13), SIP_MIX(c, d, 16), (a) = rol64(a, 32), \
	 SIP_MIX(c, b, 17), SIP_MIX(a, d, 21), (c) = rol64(c, 32))

/*
 * This is rolled up more than most implementations, resulting in about
 * 55% the code size.  Speed is a few percent slower.  A crude benchmark
 * (for (i=1; i <= max; i++) for (j = 0; j < 4096-i; j++) hash(buf+j, i);)
 * produces the following timings (in usec):
 *
 *             i386    i386     i386  x86_64  x86_64   x86_64   x86_64
 * Length     small  unroll  halfmd4   small  unroll  halfmd4  teahash
 * 1..4        1069    1029     1608     195     160      399      690
 * 1..8        2483    2381     3851     410     360      988     1659
 * 1..12       4303    4152     6207     690     618     1642     2690
 * 1..16       6122    5931     8668     968     876     2363     3786
 * 1..20       8348    8137    11245    1323    1185     3162     5567
 * 1..24      10580   10327    13935    1657    1504     4066     7635
 * 1..28      13211   12956    16803    2069    1871     5028     9759
 * 1..32      15843   15572    19725    2470    2260     6084    11932
 * 1..36      18864   18609    24259    2934    2678     7566    14794
 * 1..1024  5890194 6130242 10264816  881933  881244  3617392  7589036
 *
 * The performance penalty is quite minor, decreasing for long strings,
 * and it's significantly faster than half_md4, so I'm going for the
 * I-cache win.
 */
uint64_t
siphash24(char const *in, size_t len, uint32_t const seed[4])
{
	uint64_t a = 0x736f6d6570736575;	/* somepseu */
	uint64_t b = 0x646f72616e646f6d;	/* dorandom */
	uint64_t c = 0x6c7967656e657261;	/* lygenera */
	uint64_t d = 0x7465646279746573;	/* tedbytes */
	uint64_t m = 0;
	uint8_t padbyte = len;

	m = seed[2] | (uint64_t)seed[3] << 32;
	b ^= m;
	d ^= m;
	m = seed[0] | (uint64_t)seed[1] << 32;
	/* a ^= m; is done in loop below */
	c ^= m;

	/*
	 * By using the same SipRound code for all iterations, we
	 * save space, at the expense of some branch prediction.  But
	 * branch prediction is hard because of variable length anyway.
	 */
	len = len/8 + 3;	/* Now number of rounds to perform */
	do {
		a ^= m;

		switch (--len) {
			unsigned bytes;

		default:	/* Full words */
			d ^= m = get_unaligned_le64(in);
			in += 8;
			break;
		case 2:		/* Final partial word */
			/*
			 * We'd like to do one 64-bit fetch rather than
			 * mess around with bytes, but reading past the end
			 * might hit a protection boundary.  Fortunately,
			 * we know that protection boundaries are aligned,
			 * so we can consider only three cases:
			 * - The remainder occupies zero words
			 * - The remainder fits into one word
			 * - The remainder straddles two words
			 */
			bytes = padbyte & 7;

			if (bytes == 0) {
				m = 0;
			} else {
				unsigned offset = (unsigned)(uintptr_t)in & 7;

				if (offset + bytes <= 8) {
					m = le64_to_cpup((uint64_t const *)
								(in - offset));
					m >>= 8*offset;
				} else {
					m = get_unaligned_le64(in);
				}
				m &= ((uint64_t)1 << 8*bytes) - 1;
			}
			/* Could use | or +, but ^ allows associativity */
			d ^= m ^= (uint64_t)padbyte << 56;
			break;
		case 1:		/* Beginning of finalization */
			m = 0;
			c ^= 0xff;
			/*FALLTHROUGH*/
		case 0:		/* Second half of finalization */
			break;
		}

		SIP_ROUND(a, b, c, d);
		SIP_ROUND(a, b, c, d);
	} while (len);

	return a ^ b ^ c ^ d;
}

#undef SIP_ROUND
#undef SIP_MIX

#define HSIP_MIX(a, b, s) ((a) += (b), (b) = rol32(b, s), (b) ^= (a))

/*
 * These are the PRELIMINARY rotate constants suggested by
 * Jean-Philippe Aumasson.  Update to final when available.
 */
#define HSIP_ROUND(a, b, c, d) \
	(HSIP_MIX(a, b, 5), HSIP_MIX(c, d, 8), (a) = rol32(a, 16), \
	 HSIP_MIX(c, b, 7), HSIP_MIX(a, d, 13), (c) = rol32(c, 16))

uint32_t
hsiphash24(char const *in, size_t len, uint32_t const key[2])
{
	uint32_t c = key[0];
	uint32_t d = key[1];
	uint32_t a = 0x6c796765 ^ 0x736f6d65;
	uint32_t b = d ^ 0x74656462 ^ 0x646f7261;
	uint32_t m = c;
	uint8_t padbyte = len;

	/*
	 * By using the same SipRound code for all iterations, we
	 * save space, at the expense of some branch prediction.  But
	 * branch prediction is hard because of variable length anyway.
	 */
	len = len/sizeof(m) + 3;	/* Now number of rounds to perform */
	do {
		a ^= m;

		switch (--len) {
			unsigned bytes;

		default:	/* Full words */
			d ^= m = get_unaligned_le32(in);
			in += sizeof(m);
			break;
		case 2:		/* Final partial word */
			/*
			 * We'd like to do one 32-bit fetch rather than
			 * mess around with bytes, but reading past the end
			 * might hit a protection boundary.  Fortunately,
			 * we know that protection boundaries are aligned,
			 * so we can consider only three cases:
			 * - The remainder occupies zero words
			 * - The remainder fits into one word
			 * - The remainder straddles two words
			 */
			bytes = padbyte & 3;

			if (bytes == 0) {
				m = 0;
			} else {
				unsigned offset = (unsigned)(uintptr_t)in & 3;

				if (offset + bytes <= 4) {
					m = le32_to_cpup((uint32_t const *)
								(in - offset));
					m >>= 8*offset;
				} else {
					m = get_unaligned_le32(in);
				}
				m &= ((uint32_t)1 << 8*bytes) - 1;
			}
			/* Could use | or +, but ^ allows associativity */
			d ^= m ^= (uint32_t)padbyte << 24;
			break;
		case 1:		/* Beginning of finalization */
			m = 0;
			c ^= 0xff;
			/*FALLTHROUGH*/
		case 0:		/* Second half of finalization */
			break;
		}

		HSIP_ROUND(a, b, c, d);
		HSIP_ROUND(a, b, c, d);
	} while (len);

	return a ^ b ^ c ^ d;
//	return c + d;
}

#undef HSIP_ROUND
#undef HSIP_MIX

/*
 * No objection to EXPORT_SYMBOL, but we should probably figure out
 * how the seed[] array should work first.  Homework for the first
 * person to want to call it from a module!
 */

#if SELFTEST
#include <stdio.h>

static uint64_t rdtsc(void)
{
	uint32_t eax, edx;

	asm volatile ("rdtsc" : "=a" (eax), "=d" (edx));
	return (uint64_t)edx << 32 | eax;
}

int main(void)
{
	static char const buf[1024] = { 0 };
	static const uint32_t key[4] = { 1, 2, 3, 4 };

	printf("SipHash/HSipHash benchmark, sizeof(long) = %u\n",
		(unsigned)sizeof(long));

	for (unsigned max = 4; max <= 1024; max *= 2) {
		uint64_t sum1 = 0;
		uint32_t sum2 = 0;
		uint64_t cycles;
		uint32_t bytes = 0;

		/* A less lazy person could figure out the closed form */
		for (unsigned i = 1; i <= max; i++)
			bytes += i * (max + 1 - i);

		cycles = rdtsc();
		for (unsigned i = 1; i <= max; i++)
			for (unsigned j = 0; j <= max-i; j++)
				sum1 += siphash24(buf+j, i, key);
		cycles = rdtsc() - cycles;
		/* Casts keep the varargs types in step with the format string */
		printf(" SipHash: max=%4u cycles=%10llu cpb=%8.4f (sum=%llx)\n",
			max, (unsigned long long)cycles,
			(double)cycles/bytes, (unsigned long long)sum1);

		cycles = rdtsc();
		for (unsigned i = 1; i <= max; i++)
			for (unsigned j = 0; j <= max-i; j++)
				sum2 += hsiphash24(buf+j, i, key);
		cycles = rdtsc() - cycles;
		printf("HSipHash: max=%4u cycles=%10llu cpb=%8.4f (sum=%lx)\n",
			max, (unsigned long long)cycles,
			(double)cycles/bytes, (unsigned long)sum2);
	}
	return 0;
}
#endif
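The benchmark's comment leaves the closed form for the byte count open. Summing i*(n+1-i) over i = 1..n gives n(n+1)(n+2)/6, the tetrahedral numbers of the OEIS entry cited above. A small sketch that checks this against the loop; the function names are invented for this illustration:

```c
#include <stdint.h>

/* Total bytes hashed, summed the way the benchmark loop does it. */
static uint32_t total_bytes_loop(uint32_t max)
{
	uint32_t bytes = 0;

	/* There are (max + 1 - i) substrings of length i. */
	for (uint32_t i = 1; i <= max; i++)
		bytes += i * (max + 1 - i);
	return bytes;
}

/* Closed form: sum over i = 1..n of i*(n+1-i) equals n(n+1)(n+2)/6. */
static uint32_t total_bytes_closed(uint32_t n)
{
	/* Multiply in 64 bits so the intermediate product cannot wrap. */
	return (uint32_t)((uint64_t)n * (n + 1) * (n + 2) / 6);
}
```

For max = 4 both give 20 bytes, which matches the first result line (10495 cycles / 20 bytes = 524.75 cpb). Dividing by the n(n+1)/2 substrings gives the average length of (n+2)/3.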
* RE: [PATCH v5 1/4] siphash: add cryptographically secure PRF 2016-12-17 15:21 ` George Spelvin @ 2016-12-19 14:14 ` David Laight 2016-12-19 18:10 ` George Spelvin 0 siblings, 1 reply; 82+ messages in thread From: David Laight @ 2016-12-19 14:14 UTC (permalink / raw) To: 'George Spelvin', tom Cc: ak, davem, djb, ebiggers3, hannes, Jason, jeanphilippe.aumasson, kernel-hardening, linux-crypto, linux-kernel, luto, netdev, torvalds, tytso, vegard.nossum From: George Spelvin > Sent: 17 December 2016 15:21 ... > uint32_t > hsiphash24(char const *in, size_t len, uint32_t const key[2]) > { > uint32_t c = key[0]; > uint32_t d = key[1]; > uint32_t a = 0x6c796765 ^ 0x736f6d65; > uint32_t b = d ^ 0x74656462 ^ 0x646f7261; I've not looked closely, but is that (in some sense) duplicating the key length? So you could set a = key[2] and b = key[3] and still have a working hash - albeit not exactly the one specified. I'll add another comment here... Is it worth using the 32bit hash for IP addresses on 64bit systems that can't do misaligned accesses? David
* RE: [PATCH v5 1/4] siphash: add cryptographically secure PRF 2016-12-19 14:14 ` David Laight @ 2016-12-19 18:10 ` George Spelvin 0 siblings, 0 replies; 82+ messages in thread From: George Spelvin @ 2016-12-19 18:10 UTC (permalink / raw) To: David.Laight, linux, tom Cc: ak, davem, djb, ebiggers3, hannes, Jason, jeanphilippe.aumasson, kernel-hardening, linux-crypto, linux-kernel, luto, netdev, torvalds, tytso, vegard.nossum David Laight wrote: > From: George Spelvin ... >> uint32_t >> hsiphash24(char const *in, size_t len, uint32_t const key[2]) >> { >> uint32_t c = key[0]; >> uint32_t d = key[1]; >> uint32_t a = 0x6c796765 ^ 0x736f6d65; >> uint32_t b = d ^ 0x74656462 ^ 0x646f7261; > I've not looked closely, but is that (in some sense) duplicating > the key length? > So you could set a = key[2] and b = key[3] and still have a > working hash - albeit not exactly the one specified. That's tempting, but not necessarily effective. (A similar unsuccessful idea can be found in discussions of "DES with independent round keys". Or see the design discussion of Salsa20 and the constants in its input.) You can increase the key size, but that might not increase the *security* any. The big issue is that there are a *lot* of square root attacks in cryptanalysis. Because SipHash's state is twice the size of the key, such an attack will have the same complexity as key exhaustion and need not be considered. To make a stronger security claim, you need to start working through them all and show that they don't apply. For SipHash in particular, an important property is asymmetry of the internal state. That's what duplicating the key with XORs guarantees. If the two halves of the state end up identical, the mixing is much weaker.
Now the probability of ending up in a "mirror state" is the square root of the state size (1 in 2^64 for HalfSipHash's 128-bit state), which is the same probability as guessing a key, so it's not a problem that has to be considered when making a 64-bit security claim. But if you want a higher security level, you have to think about what can happen. That said, I have been thinking very hard about a = c ^ 0x48536970; /* 'HSip' */ d = key[2]; By guaranteeing that a and c are different, we get the desired asymmetry, and the XOR of b and d is determined by the first word of the message anyway, so this isn't weakening anything. 96 bits is far beyond the reach of any brute-force attack, and if a more sophisticated 64-bit attack exists, it's at least out of the reach of the script kiddies, and will almost certainly have a non-negligible constant factor and more limits in when it can be applied. > Is it worth using the 32bit hash for IP addresses on 64bit systems that > can't do misaligned accesses? Not a good idea. To hash 64 bits of input: * Full SipHash has to do two loads, a shift, an or, and two rounds of mixing. * HalfSipHash has to do a load, two rounds, another load, and two more rounds. In other words, in addition to being less secure, it's half the speed. Also, what systems are you thinking about? x86, ARMv8, PowerPC, and S390 (and ia64, if anyone cares) all handle unaligned loads. MIPS has efficient support. Alpha and HPPA are for retrocomputing fans, not people who care about performance. So you're down to SPARC. Which conveniently has the same maintainer as the networking code, so I figure DaveM can take care of that himself. :-)
[parent not found: <CAGiyFddB_HT3H2yhYQ5rprYZ487rJ4iCaH9uPJQD57hiPbn9ng@mail.gmail.com>]
* Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF [not found] ` <CAGiyFddB_HT3H2yhYQ5rprYZ487rJ4iCaH9uPJQD57hiPbn9ng@mail.gmail.com> @ 2016-12-16 15:51 ` Jason A. Donenfeld 2016-12-16 17:36 ` George Spelvin 2016-12-17 12:42 ` George Spelvin 2016-12-16 20:39 ` Jason A. Donenfeld 1 sibling, 2 replies; 82+ messages in thread From: Jason A. Donenfeld @ 2016-12-16 15:51 UTC (permalink / raw) To: Jean-Philippe Aumasson Cc: George Spelvin, Andi Kleen, David Miller, David Laight, Eric Biggers, Hannes Frederic Sowa, kernel-hardening, Linux Crypto Mailing List, LKML, Andy Lutomirski, Netdev, Tom Herbert, Linus Torvalds, Theodore Ts'o, vegard.nossum, Daniel J . Bernstein Hi JP & George, My function names: - SipHash -> siphash - HalfSipHash -> hsiphash It appears that hsiphash can produce either 32-bit output or 64-bit output, with the output length parameter as part of the hash algorithm in there. When I code this for my kernel patchset, I very likely will only implement one output length size. Right now I'm leaning toward 32-bit. Questions: - Is this a reasonable choice? - When hsiphash is desired due to its faster speed, are there any circumstances in which producing a 64-bit output would actually be useful? Namely, are there any hashtables that could benefit from a 64-bit functions? - Are there reasons why hsiphash with 64-bit output would be reasonable? Or will we be fine sticking with 32-bit output only? With both hsiphash and siphash, the division of usage will probably become: - Use 64-bit output 128-bit key siphash for keyed RNG-like things, such as syncookies and sequence numbers - Use 64-bit output 128-bit key siphash for hashtables that must absolutely be secure to an extremely high bandwidth attacker, such as userspace directly DoSing a kernel hashtable - Use 32-bit output 64-bit key hsiphash for quick hashtable functions that still must be secure but do not require as large of a security margin Sound good? 
Jason
* Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF 2016-12-16 15:51 ` Jason A. Donenfeld @ 2016-12-16 17:36 ` George Spelvin 2016-12-16 18:00 ` Jason A. Donenfeld 2016-12-17 12:42 ` George Spelvin 1 sibling, 1 reply; 82+ messages in thread From: George Spelvin @ 2016-12-16 17:36 UTC (permalink / raw) To: Jason, jeanphilippe.aumasson Cc: ak, davem, David.Laight, djb, ebiggers3, hannes, kernel-hardening, linux-crypto, linux-kernel, linux, luto, netdev, tom, torvalds, tytso, vegard.nossum > It appears that hsiphash can produce either 32-bit output or 64-bit > output, with the output length parameter as part of the hash algorithm > in there. When I code this for my kernel patchset, I very likely will > only implement one output length size. Right now I'm leaning toward > 32-bit. A 128-bit output option was added to SipHash after the initial publication; this is just the equivalent in 32-bit. > - Is this a reasonable choice? Yes. > - Are there reasons why hsiphash with 64-bit output would be > reasonable? Or will we be fine sticking with 32-bit output only? Personally, I'd put in a comment saying that "there's a 64-bit output variant that's not implemented" and punt until someone find a need. > With both hsiphash and siphash, the division of usage will probably become: > - Use 64-bit output 128-bit key siphash for keyed RNG-like things, > such as syncookies and sequence numbers > - Use 64-bit output 128-bit key siphash for hashtables that must > absolutely be secure to an extremely high bandwidth attacker, such as > userspace directly DoSing a kernel hashtable > - Use 32-bit output 64-bit key hsiphash for quick hashtable functions > that still must be secure but do not require as large of a security > margin. On a 64-bit machine, 64-bit SipHash is *always* faster than 32-bit, and should be used always. Don't even compile the 32-bit code, to prevent anyone accidentally using it, and make hsiphash an alias for siphash. 
On a 32-bit machine, it's a much trickier case. I'd be tempted to use the 32-bit code always, but it needs examination. Fortunately, the cost of brute-forcing hash functions can be fairly exactly quantified, thanks to bitcoin miners. It currently takes 2^70 hashes to create one bitcoin block, worth 25 bitcoins ($19,500). Thus, 2^63 hashes cost $152. Now, there are two factors that must be considered: - That's a very very "wholesale" rate. That's assuming you're doing large numbers of these and can put in the up-front effort designing silicon ASICs to do the attack. - That's for a more difficult hash (double sha-256) than SipHash. That's a constant factor, but a pretty significant one. If the wholesale assumption holds, that might bring the cost down another 6 or 7 bits, to $1-2 per break. If you're not the NSA and limited to general-purpose silicon, let's assume a state of the art GPU (Radeon HD 7970; AMD GPUs seem to do better than nVidia). The bitcoin mining rate for those is about 700M/second, 29.4 bits. So 63 bits is 152502 GPU-days, divided by some factor to account for SipHash's high speed compared to two rounds of SHA-2. Call it 1000 GPU-days. It's very doable, but also very non-trivial. The question is, wouldn't it be cheaper and easier just to do a brute-force flooding DDoS? (This is why I wish the key size could be tweaked up to 80 bits. That would take all these numbers out of the reasonable range.) Let me consider your second example above, "secure against local users". I should dig through your patchset and find the details, but what exactly are the consequences of such an attack? Hasn't a local user already got much better ways to DoS the system? The thing to remember is that we're worried only about the combination of a *new* Linux kernel (new build or under active maintenance) and a 32-bit host. You'd be hard-pressed to find a *single* machine fitting that description which is hosting multiple users or VMs and is not 64-bit. 
These days, 32-bit CPUs are for embedded applications: network appliances, TVs, etc. That means basically single-user. Even phones are 64 bit. Is this really a threat that needs to be defended against? For your first case, network applications, the additional security is definitely attractive. Syncookies are only a DoS, but sequence numbers are a real security issue; they can let you inject data into a TCP connection. Hash tables are much harder to attack. The information you get back from timing probes is statistical, and thus testing a key is more expensive. With sequence numbers, a large amount (32 bits) of the hash output is directly observable. I wish we could get away with 64-bit security, but given that the modern internet involves attacks from NSA/Spetssvyaz/3PLA, I agree it's just not enough.
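The brute-force figures in the message above follow from two divisions. The sketch below only redoes that arithmetic with the numbers quoted there (2^70 hashes per $19,500 block, 700M hashes/second per GPU); the function names are made up here, and the results are the raw double-SHA-256 figures, before the downward adjustment for SipHash's speed:

```c
#include <stdint.h>

/*
 * Dollar cost of 2^bits hashes, given that 2^block_bits hashes earn
 * block_usd.  Assumes bits <= block_bits and block_bits - bits < 64.
 */
static double cost_usd(unsigned bits, unsigned block_bits, double block_usd)
{
	return block_usd / (double)(1ULL << (block_bits - bits));
}

/* Days for one device doing hashes_per_sec to perform 2^bits hashes. */
static double device_days(unsigned bits, double hashes_per_sec)
{
	/* Assumes bits < 64; 86400 seconds per day. */
	return (double)(1ULL << bits) / hashes_per_sec / 86400.0;
}
```

`cost_usd(63, 70, 19500.0)` is 19500/128, about $152, and `device_days(63, 700e6)` is a little over 152502, matching the quoted GPU-days.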
* Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF 2016-12-16 17:36 ` George Spelvin @ 2016-12-16 18:00 ` Jason A. Donenfeld 2016-12-16 20:17 ` George Spelvin 0 siblings, 1 reply; 82+ messages in thread From: Jason A. Donenfeld @ 2016-12-16 18:00 UTC (permalink / raw) To: George Spelvin Cc: Jean-Philippe Aumasson, Andi Kleen, David Miller, David Laight, Daniel J . Bernstein, Eric Biggers, Hannes Frederic Sowa, kernel-hardening, Linux Crypto Mailing List, LKML, Andy Lutomirski, Netdev, Tom Herbert, Linus Torvalds, Theodore Ts'o, Vegard Nossum Hi George, On Fri, Dec 16, 2016 at 6:36 PM, George Spelvin <linux@sciencehorizons.net> wrote: > A 128-bit output option was added to SipHash after the initial publication; > this is just the equivalent in 32-bit. > Personally, I'd put in a comment saying that "there's a 64-bit output > variant that's not implemented" and punt until someone find a need. That's a good way to think about it. Okay, I'll do precisely that. > On a 64-bit machine, 64-bit SipHash is *always* faster than 32-bit, and > should be used always. Don't even compile the 32-bit code, to prevent > anyone accidentally using it, and make hsiphash an alias for siphash. Fascinating! Okay. So I'll alias hsiphash to siphash on 64-bit then. I like this arrangement. > Fortunately, the cost of brute-forcing hash functions can be fairly > exactly quantified, thanks to bitcoin miners. It currently takes 2^70 > hashes to create one bitcoin block, worth 25 bitcoins ($19,500). Thus, > 2^63 hashes cost $152. > > Now, there are two factors that must be considered: > - That's a very very "wholesale" rate. That's assuming you're doing > large numbers of these and can put in the up-front effort designing > silicon ASICs to do the attack. > - That's for a more difficult hash (double sha-256) than SipHash. > That's a constant fator, but a pretty significant one. If the wholesale > assumption holds, that might bring the cost down another 6 or 7 bits, > to $1-2 per break. 
> > If you're not the NSA and limited to general-purpose silicon, let's > assume a state of the art GPU (Radeon HD 7970; AMD GPUs seem do to better > than nVidia). The bitcoin mining rate for those is about 700M/second, > 29.4 bits. So 63 bits is 152502 GPU-days, divided by some factor > to account for SipHash's high speed compared to two rounds of SHA-2. > Call it 1000 GPU-days. > > It's very doable, but also very non-trivial. The question is, wouldn't > it be cheaper and easier just to do a brute-force flooding DDoS? > > (This is why I wish the key size could be tweaked up to 80 bits. > That would take all these numbers out of the reasonable range.) That's a nice analysis. Might one conclude from that that hsiphash is not useful for our purposes? Or does it still remain useful for network facing code? > Let me consider your second example above, "secure against local users". > I should dig through your patchset and find the details, but what exactly > are the consequences of such an attack? Hasn't a local user already > got much better ways to DoS the system? For example, an unpriv'd user putting lots of entries in one hash bucket for a shared resource that's used by root, like filesystems or other lookup tables. If he can cause root to use more of root's cpu schedule budget than otherwise in a directed way, then that's a bad DoS. > The thing to remember is that we're worried only about the combination > of a *new* Linux kernel (new build or under active maintenance) and a > 32-bit host. You'd be hard-pressed to find a *single* machine fitting > that description which is hosting multiple users or VMs and is not 64-bit. > > These days, 32-bit CPUs are for embedded applications: network appliances, > TVs, etc. That means basically single-user. Even phones are 64 bit. > Is this really a threat that needs to be defended against? 
I interpret this to indicate all the more reason to alias hsiphash to siphash on 64-bit, and then the problem space collapses in a clear way. > For your first case, network applications, the additional security > is definitely attractive. Syncookies are only a DoS, but sequence > numbers are a real security issue; they can let you inject data into a > TCP connection. > With sequence numbers, large amounts (32 bits) the hash output is > directly observable. Right. Hence the need for always using full siphash and not hsiphash for sequence numbers, per my earlier email to David. > > I wish we could get away with 64-bit security, but given that the > modern internet involves attacks from NSA/Spetssvyaz/3PLA, I agree > it's just not enough. I take this comment to be relevant for the sequence number case. For hashtables and hashtable flooding, is it still your opinion that we will benefit from hsiphash? Or is this final conclusion a rejection of hsiphash for that too? We're talking about two different use cases, and your email kind of interleaved both into your analysis, so I'm not certain as to precisely what your conclusion is for each use case. Can you clear up the ambiguity? Jason
* Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
2016-12-16 18:00 ` Jason A. Donenfeld
@ 2016-12-16 20:17 ` George Spelvin
2016-12-16 20:43 ` Theodore Ts'o
0 siblings, 1 reply; 82+ messages in thread
From: George Spelvin @ 2016-12-16 20:17 UTC (permalink / raw)
To: Jason, linux
Cc: ak, davem, David.Laight, djb, ebiggers3, hannes,
jeanphilippe.aumasson, kernel-hardening, linux-crypto, linux-kernel,
luto, netdev, tom, torvalds, tytso, vegard.nossum

>> On a 64-bit machine, 64-bit SipHash is *always* faster than 32-bit, and
>> should be used always. Don't even compile the 32-bit code, to prevent
>> anyone accidentally using it, and make hsiphash an alias for siphash.

> Fascinating! Okay. So I'll alias hsiphash to siphash on 64-bit then. I
> like this arrangement.

This is a basic assumption I make in the security analysis below:
on most machines, it's 128-bit-key SipHash everywhere and we can
consider security solved.

Our analysis *only* has to consider 32-bit machines. My big concern
is home routers, with IoT appliances coming second. The routers have
severe hardware cost constraints (so limited CPU power), but see a lot
of traffic and need to process (NAT) it.

> That's a nice analysis. Might one conclude from that that hsiphash is
> not useful for our purposes? Or does it still remain useful for
> network facing code?

I think for attacks where the threat is a DoS, it's usable. The point
is you only have to raise the cost to equal that of a packet flood.
(Just like in electronic warfare, the best you can possibly do is force
the enemy to use broadband jamming.)

Hash collision attacks just aren't that powerful. The original PoC
was against an application that implemented a hard limit on hash chain
length as a DoS defense, which the attack then exploited to turn it into
a hard DoS.

>> Let me consider your second example above, "secure against local users".
>> I should dig through your patchset and find the details, but what exactly
>> are the consequences of such an attack? Hasn't a local user already
>> got much better ways to DoS the system?

> For example, an unpriv'd user putting lots of entries in one hash
> bucket for a shared resource that's used by root, like filesystems or
> other lookup tables. If he can cause root to use more of root's cpu
> schedule budget than otherwise in a directed way, then that's a bad
> DoS.

This issue was recently discussed when we redesigned the dcache hash.
Even a successful attack doesn't slow things down all *that* much.

Before you overkill every hash table in the kernel, think about whether
it's a bigger problem than the dcache. (Hint: it's probably not.)
There's no point armor-plating the side door when the front door was
just upgraded from screen to wood.

>> These days, 32-bit CPUs are for embedded applications: network appliances,
>> TVs, etc. That means basically single-user. Even phones are 64 bit.
>> Is this really a threat that needs to be defended against?

> I interpret this to indicate all the more reason to alias hsiphash to
> siphash on 64-bit, and then the problem space collapses in a clear
> way.

Yes, exactly.

> Right. Hence the need for always using full siphash and not hsiphash
> for sequence numbers, per my earlier email to David.
>
>> I wish we could get away with 64-bit security, but given that the
>> modern internet involves attacks from NSA/Spetssvyaz/3PLA, I agree
>> it's just not enough.
>
> I take this comment to be relevant for the sequence number case.

Yes.

> For hashtables and hashtable flooding, is it still your opinion that
> we will benefit from hsiphash? Or is this final conclusion a rejection
> of hsiphash for that too? We're talking about two different use cases,
> and your email kind of interleaved both into your analysis, so I'm not
> certain so to precisely what your conclusion is for each use case. Can
> you clear up the ambiguity?

My (speaking generally; I should walk through every hash table you've
converted) opinion is that:

- Hash tables, even network-facing ones, can all use hsiphash as long
  as an attacker can only see collisions, i.e. ((H(x) ^ H(y)) & bits) ==
  0, and the consequence of a successful attack is only more collisions
  (timing). While the attack is only 2x the cost (two hashes rather than
  one to test a key), the knowledge of the collision is statistical,
  especially for network attackers, which raises the cost of guessing
  beyond an even more brute-force attack.
- When the hash value is directly visible (e.g. included in a network
  packet), full SipHash should be the default.
- Syncookies *could* use hsiphash, especially as there are
  two keys in there. Not sure if we need the performance.
- For TCP ISNs, I'd prefer to use full SipHash. I know this is
  a very hot path, and if that's a performance bottleneck,
  we can work harder on it.

In particular, TCP ISNs *used* to rotate the key periodically,
limiting the time available to an attacker to perform an
attack before the secret goes stale and is useless. commit
6e5714eaf77d79ae1c8b47e3e040ff5411b717ec upgraded to md5 and dropped
the key rotation.

If 2x hsiphash is faster than siphash, we could use a double-hashing
system like syncookies. One 32-bit hash with a permanent key, summed
with a k-bit counter and a (32-k)-bit hash, where the key is rotated
(and the counter incremented) periodically.

The requirement is that the increment rate of the counter hash doesn't
shorten the sequence number wraparound too much. The old code used an
8-bit counter and 24-bit hash, with the counter bumped every 5 minutes.

Current code uses a 64 ns tick for the ISN, so it counts 2^24 per second.
(32 bits wraps every 4.6 minutes.) A 4-bit counter and 28-bit hash
(or even 3+29) would work as long as the key is regenerated no more
than once per minute. (Just using the 4.6-minute ISN wrap time is the
obvious simple implementation.)

(Of course, I defer to DaveM's judgement on all network-related issues.)

^ permalink raw reply [flat|nested] 82+ messages in thread
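The counter-plus-hash ISN layout and the wrap-time arithmetic in the message above can be sketched in a few lines of userspace C. This is only an illustration under stated assumptions: the function names, the choice to put the counter in the top k bits, and the use of plain integers in place of real hash outputs are all hypothetical and are not taken from the patchset or from the old kernel code.

```c
#include <stdint.h>

#define ISN_TICK_NS 64u	/* tick length quoted in the message above */

/* Seconds until a 32-bit value advancing once per tick wraps around. */
static double isn_wrap_seconds(void)
{
	return 4294967296.0 * ISN_TICK_NS / 1e9;	/* 2^32 ticks * 64 ns */
}

/*
 * Hypothetical layout: the top k bits carry the rekey counter and the
 * low (32 - k) bits carry the hash made with the rotating key.
 * Requires 1 <= k <= 31 so that neither shift is out of range.
 */
static uint32_t isn_rotating_part(uint32_t counter, uint32_t rot_hash,
				  unsigned k)
{
	uint32_t counter_bits = counter & ((1u << k) - 1u);
	uint32_t hash_bits = rot_hash & (0xffffffffu >> k);

	return (counter_bits << (32 - k)) | hash_bits;
}

/* Sum of permanent-key hash, rotating part and the 64 ns clock. */
static uint32_t isn_compose(uint32_t perm_hash, uint32_t rotating,
			    uint64_t now_ns)
{
	return perm_hash + rotating + (uint32_t)(now_ns / ISN_TICK_NS);
}
```

With these definitions isn_wrap_seconds() evaluates to about 274.9 seconds, i.e. the "4.6 minutes" quoted above, and a 4-bit counter leaves 28 bits for the rotating hash.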
* Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
2016-12-16 20:17 ` George Spelvin
@ 2016-12-16 20:43 ` Theodore Ts'o
2016-12-16 22:13 ` George Spelvin
0 siblings, 1 reply; 82+ messages in thread
From: Theodore Ts'o @ 2016-12-16 20:43 UTC (permalink / raw)
To: George Spelvin
Cc: Jason, ak, davem, David.Laight, djb, ebiggers3, hannes,
jeanphilippe.aumasson, kernel-hardening, linux-crypto, linux-kernel,
luto, netdev, tom, torvalds, vegard.nossum

On Fri, Dec 16, 2016 at 03:17:39PM -0500, George Spelvin wrote:
> > That's a nice analysis. Might one conclude from that that hsiphash is
> > not useful for our purposes? Or does it still remain useful for
> > network facing code?
>
> I think for attacks where the threat is a DoS, it's usable. The point
> is you only have to raise the cost to equal that of a packet flood.
> (Just like in electronic warfare, the best you can possibly do is force
> the enemy to use broadband jamming.)
>
> Hash collision attacks just aren't that powerful. The original PoC
> was against an application that implemented a hard limit on hash chain
> length as a DoS defense, which the attack then exploited to turn it into
> a hard DoS.

What should we do with get_random_int() and get_random_long()? In
some cases it's being used in performance sensitive areas, and where
anti-DoS protection might be enough. In others, maybe not so much.

If we rekeyed the secret used by get_random_int() and
get_random_long() frequently (say, every minute or every 5 minutes),
would that be sufficient for current and future users of these
interfaces?

- Ted

P.S. I'll note that my performance figures when testing changes to
get_random_int() were done on a 32-bit x86; Jason, I'm guessing your
figures were using a 64-bit x86 system? I haven't tried 32-bit ARM
or smaller CPUs (e.g., mips, et al.) that might be more likely to be
used on IoT devices, but I'm worried about those too, of course.

^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
2016-12-16 20:43 ` Theodore Ts'o
@ 2016-12-16 22:13 ` George Spelvin
2016-12-16 22:15 ` Andy Lutomirski
2016-12-16 22:18 ` Jason A. Donenfeld
0 siblings, 2 replies; 82+ messages in thread
From: George Spelvin @ 2016-12-16 22:13 UTC (permalink / raw)
To: linux, tytso
Cc: ak, davem, David.Laight, djb, ebiggers3, hannes, Jason,
jeanphilippe.aumasson, kernel-hardening, linux-crypto, linux-kernel,
luto, netdev, tom, torvalds, vegard.nossum

> What should we do with get_random_int() and get_random_long()? In
> some cases it's being used in performance sensitive areas, and where
> anti-DoS protection might be enough. In others, maybe not so much.

This is tricky. The entire get_random_int() structure is an abuse of
the hash function and will need to be thoroughly rethought to convert
it to SipHash. Remember, SipHash's security goals are very different
from MD5, so there's no obvious way to do the conversion.

(It's *documented* as "not cryptographically secure", but we know
where that goes.)

> If we rekeyed the secret used by get_random_int() and
> get_random_long() frequently (say, every minute or every 5 minutes),
> would that be sufficient for current and future users of these
> interfaces?

Remembering that on "real" machines it's full SipHash, then I'd say that
64-bit security + rekeying seems reasonable.

The question is, the idea has recently been floated to make hsiphash =
SipHash-1-3 on 64-bit machines. Is *that* okay?

The annoying thing about the currently proposed patch is that the *only*
chaining is the returned value. What I'd *like* to do is the same
pattern as we do with md5, and remember v[0..3] between invocations.
But there's no partial SipHash primitive; we only get one word back.

Even
	*chaining += ret = siphash_3u64(...)

would be an improvement.

Although we could do something like

	c0 = chaining[0];
	chaining[0] = c1 = chaining[1];
	ret = hsiphash(c0, c1, ...);
	chaining[1] = c0 + ret;

^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
2016-12-16 22:13 ` George Spelvin
@ 2016-12-16 22:15 ` Andy Lutomirski
2016-12-16 22:18 ` Jason A. Donenfeld
1 sibling, 0 replies; 82+ messages in thread
From: Andy Lutomirski @ 2016-12-16 22:15 UTC (permalink / raw)
To: George Spelvin
Cc: Ted Ts'o, Andi Kleen, David S. Miller, David Laight,
D. J. Bernstein, Eric Biggers, Hannes Frederic Sowa,
Jason A. Donenfeld, Jean-Philippe Aumasson, kernel-hardening,
Linux Crypto Mailing List, linux-kernel, Network Development,
Tom Herbert, Linus Torvalds, Vegard Nossum

On Fri, Dec 16, 2016 at 2:13 PM, George Spelvin
<linux@sciencehorizons.net> wrote:
>> What should we do with get_random_int() and get_random_long()? In
>> some cases it's being used in performance sensitive areas, and where
>> anti-DoS protection might be enough. In others, maybe not so much.
>
> This is tricky. The entire get_random_int() structure is an abuse of
> the hash function and will need to be thoroughly rethought to convert
> it to SipHash. Remember, SipHash's security goals are very different
> from MD5, so there's no obvious way to do the conversion.
>
> (It's *documented* as "not cryptographically secure", but we know
> where that goes.)
>
>> If we rekeyed the secret used by get_random_int() and
>> get_random_long() frequently (say, every minute or every 5 minutes),
>> would that be sufficient for current and future users of these
>> interfaces?
>
> Remembering that on "real" machines it's full SipHash, then I'd say that
> 64-bit security + rekeying seems reasonable.
>
> The question is, the idea has recently been floated to make hsiphash =
> SipHash-1-3 on 64-bit machines. Is *that* okay?
>
>
> The annoying thing about the currently proposed patch is that the *only*
> chaining is the returned value. What I'd *like* to do is the same
> pattern as we do with md5, and remember v[0..3] between invocations.
> But there's no partial SipHash primitive; we only get one word back.
>
> Even
> *chaining += ret = siphash_3u64(...)
>
> would be an improvement.

This is almost exactly what I suggested in my email on the other
thread from a few seconds ago :)

--Andy

^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
2016-12-16 22:13 ` George Spelvin
2016-12-16 22:15 ` Andy Lutomirski
@ 2016-12-16 22:18 ` Jason A. Donenfeld
2016-12-16 23:44 ` George Spelvin
1 sibling, 1 reply; 82+ messages in thread
From: Jason A. Donenfeld @ 2016-12-16 22:18 UTC (permalink / raw)
To: George Spelvin
Cc: Theodore Ts'o, Andi Kleen, David Miller, David Laight,
Daniel J . Bernstein, Eric Biggers, Hannes Frederic Sowa,
Jean-Philippe Aumasson, kernel-hardening,
Linux Crypto Mailing List, LKML, Andy Lutomirski, Netdev,
Tom Herbert, Linus Torvalds, Vegard Nossum

On Fri, Dec 16, 2016 at 11:13 PM, George Spelvin
<linux@sciencehorizons.net> wrote:
> Remembering that on "real" machines it's full SipHash, then I'd say that
> 64-bit security + rekeying seems reasonable.

64-bit security for an RNG is not reasonable even with rekeying. No no
no. Considering we already have a massive speed-up here with the
secure version, there's zero reason to start weakening the security
because we're trigger happy with our benchmarks. No no no.

^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
2016-12-16 22:18 ` Jason A. Donenfeld
@ 2016-12-16 23:44 ` George Spelvin
2016-12-17 1:39 ` Jason A. Donenfeld
0 siblings, 1 reply; 82+ messages in thread
From: George Spelvin @ 2016-12-16 23:44 UTC (permalink / raw)
To: Jason, linux
Cc: ak, davem, David.Laight, djb, ebiggers3, hannes,
jeanphilippe.aumasson, kernel-hardening, linux-crypto, linux-kernel,
luto, netdev, tom, torvalds, tytso, vegard.nossum

> 64-bit security for an RNG is not reasonable even with rekeying. No no
> no. Considering we already have a massive speed-up here with the
> secure version, there's zero reason to start weakening the security
> because we're trigger happy with our benchmarks. No no no.

Just to clarify, I was discussing the idea with Ted (who's in charge of
the whole thing, not me), not trying to make any sort of final decision
on the subject.

I need to look at the various users (46 non-trivial ones for
get_random_int, 15 for get_random_long) and see what their security
requirements actually are.

I'm also trying to see if HalfSipHash can be used in a way that gives
slightly more than 64 bits of effective security.

The problem is that the old MD5-based transform had unclear, but
obviously ample, security. There were 64 bytes of global secret and
16 chaining bytes per CPU. Adapting SipHash (even the full version)
takes more thinking.

An actual HalfSipHash-based equivalent to the existing code would be:

#define RANDOM_INT_WORDS (64 / sizeof(long))	/* 16 or 8 */

static u32 random_int_secret[RANDOM_INT_WORDS]
	____cacheline_aligned __read_mostly;
static DEFINE_PER_CPU(unsigned long[4], get_random_int_hash)
	__aligned(sizeof(unsigned long));

unsigned long get_random_long(void)
{
	unsigned long *hash = get_cpu_var(get_random_int_hash);
	unsigned long v0 = hash[0], v1 = hash[1], v2 = hash[2], v3 = hash[3];
	int i;

	/* This could be improved, but it's equivalent */
	v0 += current->pid + jiffies + random_get_entropy();

	for (i = 0; i < RANDOM_INT_WORDS; i++) {
		v3 ^= random_int_secret[i];
		HSIPROUND;
		HSIPROUND;
		v0 ^= random_int_secret[i];
	}
	/* To be equivalent, we *don't* finalize the transform */

	hash[0] = v0; hash[1] = v1; hash[2] = v2; hash[3] = v3;
	put_cpu_var(get_random_int_hash);

	return v0 ^ v1 ^ v2 ^ v3;
}

I don't think there's a 2^64 attack on that.

But 64 bytes of global secret is ridiculous if the hash function
doesn't require that minimum block size. It'll take some thinking.

The advice I'd give now is:
- Implement
unsigned long hsiphash(const void *data, size_t len, const unsigned long key[2])
  .. as SipHash on 64-bit (maybe SipHash-1-3, still being discussed) and
  HalfSipHash on 32-bit.
- Document when it may or may not be used carefully.
- #define get_random_int (unsigned)get_random_long
- Ted, Andy Lutomirski and I will try to figure out a construction of
  get_random_long() that we all like.

('scuse me for a few hours, I have some unrelated things I really *should*
be working on...)

^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
2016-12-16 23:44 ` George Spelvin
@ 2016-12-17 1:39 ` Jason A. Donenfeld
2016-12-17 2:15 ` George Spelvin
0 siblings, 1 reply; 82+ messages in thread
From: Jason A. Donenfeld @ 2016-12-17 1:39 UTC (permalink / raw)
To: George Spelvin
Cc: Andi Kleen, David Miller, David Laight, Daniel J . Bernstein,
Eric Biggers, Hannes Frederic Sowa, Jean-Philippe Aumasson,
kernel-hardening, Linux Crypto Mailing List, LKML,
Andy Lutomirski, Netdev, Tom Herbert, Linus Torvalds,
Theodore Ts'o, Vegard Nossum

On Sat, Dec 17, 2016 at 12:44 AM, George Spelvin
<linux@sciencehorizons.net> wrote:
> The advice I'd give now is:
> - Implement
> unsigned long hsiphash(const void *data, size_t len, const unsigned long key[2])
> .. as SipHash on 64-bit (maybe SipHash-1-3, still being discussed) and
> HalfSipHash on 32-bit.

I already did this. Check my branch.

> - Document when it may or may not be used carefully.

Good idea. I'll write up some extensive documentation about all of
this, detailing use cases and our various conclusions.

> - #define get_random_int (unsigned)get_random_long

That's a good idea, since ultimately the other just casts in the
return value. I wonder if this could also lead to a similar aliasing
with arch_get_random_int, since I'm pretty sure all rdrand-like
instructions return native word size anyway.

> - Ted, Andy Lutomirski and I will try to figure out a construction of
> get_random_long() that we all like.

And me, I hope... No need to make this exclusive.

Jason

^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
2016-12-17 1:39 ` Jason A. Donenfeld
@ 2016-12-17 2:15 ` George Spelvin
0 siblings, 0 replies; 82+ messages in thread
From: George Spelvin @ 2016-12-17 2:15 UTC (permalink / raw)
To: Jason, linux
Cc: ak, davem, David.Laight, djb, ebiggers3, hannes,
jeanphilippe.aumasson, kernel-hardening, linux-crypto, linux-kernel,
luto, netdev, tom, torvalds, tytso, vegard.nossum

> I already did this. Check my branch.

Do you think it should return "u32" (as you currently have it) or
"unsigned long"? I thought the latter, since it doesn't cost any
more and makes more

> I wonder if this could also lead to a similar aliasing
> with arch_get_random_int, since I'm pretty sure all rdrand-like
> instructions return native word size anyway.

Well, Intel's can return 16, 32 or 64 bits, and it makes a
small difference with reseed scheduling.

>> - Ted, Andy Lutomirski and I will try to figure out a construction of
>> get_random_long() that we all like.

> And me, I hope... No need to make this exclusive.

Gaah, engage brain before fingers. That was so obvious I didn't say
it, and the result came out sounding extremely rude.

A better (but longer) way to write it would be "I'm sorry that I, Ted,
and Andy are all arguing with you and each other about how to do this
and we can't finalize this part yet".

^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
2016-12-16 15:51 ` Jason A. Donenfeld
2016-12-16 17:36 ` George Spelvin
@ 2016-12-17 12:42 ` George Spelvin
1 sibling, 0 replies; 82+ messages in thread
From: George Spelvin @ 2016-12-17 12:42 UTC (permalink / raw)
To: Jason
Cc: ak, davem, David.Laight, djb, ebiggers3, hannes,
jeanphilippe.aumasson, kernel-hardening, linux-crypto, linux-kernel,
linux, luto, netdev, tom, torvalds, tytso, vegard.nossum

BTW, here's some SipHash code I wrote for Linux a while ago.

My target application was ext4 directory hashing, resulting in different
implementation choices, although I still think that a rolled-up
implementation like this is reasonable. Reducing I-cache impact speeds
up the calling code.

One thing I'd like to suggest you steal is the way it handles the
fetch of the final partial word. It's a lot smaller and faster than
an 8-way case statement.

#include <linux/bitops.h>	/* For rol64 */
#include <linux/cryptohash.h>
#include <asm/byteorder.h>
#include <asm/unaligned.h>

/* The basic ARX mixing function, taken from Skein */
#define SIP_MIX(a, b, s) ((a) += (b), (b) = rol64(b, s), (b) ^= (a))

/*
 * The complete SipRound. Note that, when unrolled twice like below,
 * the 32-bit rotates drop out on 32-bit machines.
 */
#define SIP_ROUND(a, b, c, d) \
	(SIP_MIX(a, b, 13), SIP_MIX(c, d, 16), (a) = rol64(a, 32), \
	 SIP_MIX(c, b, 17), SIP_MIX(a, d, 21), (c) = rol64(c, 32))

/*
 * This is rolled up more than most implementations, resulting in about
 * 55% the code size. Speed is a few percent slower. A crude benchmark
 * (for (i=1; i <= max; i++) for (j = 0; j < 4096-i; j++) hash(buf+j, i);)
 * produces the following timings (in usec):
 *
 *               i386      i386      i386    x86_64    x86_64    x86_64    x86_64
 * Length       small    unroll   halfmd4     small    unroll   halfmd4   teahash
 * 1..4          1069      1029      1608       195       160       399       690
 * 1..8          2483      2381      3851       410       360       988      1659
 * 1..12         4303      4152      6207       690       618      1642      2690
 * 1..16         6122      5931      8668       968       876      2363      3786
 * 1..20         8348      8137     11245      1323      1185      3162      5567
 * 1..24        10580     10327     13935      1657      1504      4066      7635
 * 1..28        13211     12956     16803      2069      1871      5028      9759
 * 1..32        15843     15572     19725      2470      2260      6084     11932
 * 1..36        18864     18609     24259      2934      2678      7566     14794
 * 1..1024    5890194   6130242  10264816    881933    881244   3617392   7589036
 *
 * The performance penalty is quite minor, decreasing for long strings,
 * and it's significantly faster than half_md4, so I'm going for the
 * I-cache win.
 */
uint64_t
siphash24(char const *in, size_t len, uint32_t const seed[4])
{
	uint64_t a = 0x736f6d6570736575;	/* somepseu */
	uint64_t b = 0x646f72616e646f6d;	/* dorandom */
	uint64_t c = 0x6c7967656e657261;	/* lygenera */
	uint64_t d = 0x7465646279746573;	/* tedbytes */
	uint64_t m = 0;
	uint8_t padbyte = len;

	/*
	 * Mix in the 128-bit hash seed. This is in a format convenient
	 * to the ext3/ext4 code. Please feel free to adapt the
	 * */
	if (seed) {
		m = seed[2] | (uint64_t)seed[3] << 32;
		b ^= m;
		d ^= m;
		m = seed[0] | (uint64_t)seed[1] << 32;
		/* a ^= m; is done in loop below */
		c ^= m;
	}

	/*
	 * By using the same SipRound code for all iterations, we
	 * save space, at the expense of some branch prediction. But
	 * branch prediction is hard because of variable length anyway.
	 */
	len = len/8 + 3;	/* Now number of rounds to perform */
	do {
		a ^= m;

		switch (--len) {
			unsigned bytes;

		default:	/* Full words */
			d ^= m = get_unaligned_le64(in);
			in += 8;
			break;
		case 2:		/* Final partial word */
			/*
			 * We'd like to do one 64-bit fetch rather than
			 * mess around with bytes, but reading past the end
			 * might hit a protection boundary. Fortunately,
			 * we know that protection boundaries are aligned,
			 * so we can consider only three cases:
			 * - The remainder occupies zero words
			 * - The remainder fits into one word
			 * - The remainder straddles two words
			 */
			bytes = padbyte & 7;

			if (bytes == 0) {
				m = 0;
			} else {
				unsigned offset = (unsigned)(uintptr_t)in & 7;

				if (offset + bytes <= 8) {
					m = le64_to_cpup((uint64_t const *)
								(in - offset));
					m >>= 8*offset;
				} else {
					m = get_unaligned_le64(in);
				}
				m &= ((uint64_t)1 << 8*bytes) - 1;
			}
			/* Could use | or +, but ^ allows associativity */
			d ^= m ^= (uint64_t)padbyte << 56;
			break;
		case 1:		/* Beginning of finalization */
			m = 0;
			c ^= 0xff;
			/*FALLTHROUGH*/
		case 0:		/* Second half of finalization */
			break;
		}

		SIP_ROUND(a, b, c, d);
		SIP_ROUND(a, b, c, d);
	} while (len);

	return a ^ b ^ c ^ d;
}

#undef SIP_ROUND
#undef SIP_MIX

/*
 * No objection to EXPORT_SYMBOL, but we should probably figure out
 * how the seed[] array should work first. Homework for the first
 * person to want to call it from a module!
 */

^ permalink raw reply [flat|nested] 82+ messages in thread
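For cross-checking a rolled-up variant like the one above, a plain straight-line SipHash-2-4 in portable userspace C can serve as a reference. The sketch below is not from the thread: the name siphash24_ref, the (k0, k1) key parameters and the byte-at-a-time loads are assumptions made for illustration, and it deliberately assembles the final partial word byte by byte rather than using the aligned over-read trick described in the message.

```c
#include <stddef.h>
#include <stdint.h>

static uint64_t rotl64(uint64_t x, unsigned b)
{
	return (x << b) | (x >> (64 - b));
}

/* Portable little-endian load; no alignment or endianness assumptions. */
static uint64_t load_le64(const uint8_t *p)
{
	uint64_t v = 0;

	for (int i = 0; i < 8; i++)
		v |= (uint64_t)p[i] << (8 * i);
	return v;
}

#define SIPROUND do { \
	v0 += v1; v1 = rotl64(v1, 13); v1 ^= v0; v0 = rotl64(v0, 32); \
	v2 += v3; v3 = rotl64(v3, 16); v3 ^= v2; \
	v0 += v3; v3 = rotl64(v3, 21); v3 ^= v0; \
	v2 += v1; v1 = rotl64(v1, 17); v1 ^= v2; v2 = rotl64(v2, 32); \
} while (0)

/* Hypothetical reference: SipHash-2-4 with a 128-bit key (k0, k1). */
uint64_t siphash24_ref(const uint8_t *in, size_t len,
		       uint64_t k0, uint64_t k1)
{
	uint64_t v0 = k0 ^ 0x736f6d6570736575ULL;	/* somepseu */
	uint64_t v1 = k1 ^ 0x646f72616e646f6dULL;	/* dorandom */
	uint64_t v2 = k0 ^ 0x6c7967656e657261ULL;	/* lygenera */
	uint64_t v3 = k1 ^ 0x7465646279746573ULL;	/* tedbytes */
	size_t full = len / 8;
	uint64_t m;

	/* Full 8-byte words: two rounds each. */
	for (size_t i = 0; i < full; i++) {
		m = load_le64(in + 8 * i);
		v3 ^= m;
		SIPROUND; SIPROUND;
		v0 ^= m;
	}

	/* Final word: leftover bytes in the low end, len mod 256 on top. */
	m = (uint64_t)len << 56;
	for (size_t i = 0; i < (len & 7); i++)
		m |= (uint64_t)in[8 * full + i] << (8 * i);
	v3 ^= m;
	SIPROUND; SIPROUND;
	v0 ^= m;

	/* Finalization: four rounds. */
	v2 ^= 0xff;
	SIPROUND; SIPROUND; SIPROUND; SIPROUND;

	return v0 ^ v1 ^ v2 ^ v3;
}
```

With the key bytes 00 01 ... 0f (k0 = 0x0706050403020100, k1 = 0x0f0e0d0c0b0a0908) and the 15-byte input 00 01 ... 0e, the published SipHash-2-4 reference value is 0xa129ca6149be45e5; the empty input gives 0x726fdb47dd0e0e31. Feeding the same key and inputs to the rolled-up version (with the key split into its seed[4] layout) would be one way to compare the two.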
* Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
[not found] ` <CAGiyFddB_HT3H2yhYQ5rprYZ487rJ4iCaH9uPJQD57hiPbn9ng@mail.gmail.com>
2016-12-16 15:51 ` Jason A. Donenfeld
@ 2016-12-16 20:39 ` Jason A. Donenfeld
1 sibling, 0 replies; 82+ messages in thread
From: Jason A. Donenfeld @ 2016-12-16 20:39 UTC (permalink / raw)
To: Jean-Philippe Aumasson
Cc: George Spelvin, Andi Kleen, David Miller, David Laight,
Eric Biggers, Hannes Frederic Sowa, kernel-hardening,
Linux Crypto Mailing List, LKML, Andy Lutomirski, Netdev,
Tom Herbert, Linus Torvalds, Theodore Ts'o, Vegard Nossum,
Daniel J . Bernstein

Hi JP,

On Fri, Dec 16, 2016 at 2:22 PM, Jean-Philippe Aumasson
<jeanphilippe.aumasson@gmail.com> wrote:
> It needs some basic security review, which I'll try do next week (check for
> security margin, optimality of rotation counts, etc.). But after a lot of
> experience with this kind of construction (BLAKE, SipHash, NORX), I'm
> confident it will be safe as it is.

I've implemented it in my siphash kernel branch:

https://git.zx2c4.com/linux-dev/log/?h=siphash

It's the commit that has "HalfSipHash" in the log message. As the
structure is nearly identical to SipHash, there wasn't a lot to
change, and so the same implementation strategy exists for each.

When you've finished your security review and feel good about it, some
test vectors using the same formula (key={0x03020100, 0x07060504},
input={0x0, 0x1, 0x2, 0x3...}, output=test_vectors) would be nice for
verification.

Jason

^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
@ 2016-12-16 20:43 Jason A. Donenfeld
0 siblings, 0 replies; 82+ messages in thread
From: Jason A. Donenfeld @ 2016-12-16 20:43 UTC (permalink / raw)
To: George Spelvin
Cc: Tom Herbert, Andi Kleen, David Miller, David Laight,
Daniel J . Bernstein, Eric Biggers, Hannes Frederic Sowa,
Jean-Philippe Aumasson, kernel-hardening,
Linux Crypto Mailing List, LKML, Andy Lutomirski, Netdev,
Linus Torvalds, Theodore Ts'o, Vegard Nossum
On Fri, Dec 16, 2016 at 9:41 PM, George Spelvin
<linux@sciencehorizons.net> wrote:
> What are you testing on? And what input size? And does "33% improvement"
> mean 4/3 the rate and 3/4 the time? Or 2/3 the time and 3/2 the rate?
Now that I've published my hsiphash implementation to my tree, it
should be possible to conduct the tests back to back with nearly
identical implementation strategies, to remove a potential source of
error.
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
@ 2016-12-16 20:49 Jason A. Donenfeld
2016-12-16 21:25 ` George Spelvin
0 siblings, 1 reply; 82+ messages in thread
From: Jason A. Donenfeld @ 2016-12-16 20:49 UTC (permalink / raw)
To: George Spelvin
Cc: Andi Kleen, David Miller, David Laight, Daniel J . Bernstein,
Eric Biggers, Hannes Frederic Sowa, Jean-Philippe Aumasson,
kernel-hardening, Linux Crypto Mailing List, LKML,
Andy Lutomirski, Netdev, Tom Herbert, Linus Torvalds,
Theodore Ts'o, Vegard Nossum

On Fri, Dec 16, 2016 at 9:17 PM, George Spelvin
<linux@sciencehorizons.net> wrote:
> My (speaking generally; I should walk through every hash table you've
> converted) opinion is that:
>
> - Hash tables, even network-facing ones, can all use hsiphash as long
> as an attacker can only see collisions, i.e. ((H(x) ^ H(y)) & bits) ==
> 0, and the consequences of a successful attack is only more collisions
> (timing). While the attack is only 2x the cost (two hashes rather than
> one to test a key), the knowledge of the collision is statistical,
> especially for network attackers, which raises the cost of guessing
> beyond an even more brute-force attack.
> - When the hash value directly visible (e.g. included in a network
> packet), full SipHash should be the default.
> - Syncookies *could* use hsiphash, especially as there are
> two keys in there. Not sure if we need the performance.
> - For TCP ISNs, I'd prefer to use full SipHash. I know this is
> a very hot path, and if that's a performance bottleneck,
> we can work harder on it.
>
> In particular, TCP ISNs *used* to rotate the key periodically,
> limiting the time available to an attacker to perform an
> attack before the secret goes stale and is useless. commit
> 6e5714eaf77d79ae1c8b47e3e040ff5411b717ec upgraded to md5 and dropped
> the key rotation.

While I agree with most of this analysis, I do think we should use
SipHash and not HalfSipHash for syncookies. Although the security
risk is lower than with sequence numbers, it previously used full MD5
for this, which means performance is not generally a bottleneck and
we'll get massive speedups no matter what, whether using SipHash or
HalfSipHash. In addition, using SipHash means that the 128-bit key
gives a larger margin and can be safe long term. So, I think we should
err on the side of caution and stick with SipHash in all cases in
which we're upgrading from MD5.

In other words, only current jhash users should be potentially
eligible for hsiphash.

> Current code uses a 64 ns tick for the ISN, so it counts 2^24 per second.
> (32 bits wraps every 4.6 minutes.) A 4-bit counter and 28-bit hash
> (or even 3+29) would work as long as the key is regenerated no more
> than once per minute. (Just using the 4.6-minute ISN wrap time is the
> obvious simple implementation.)
>
> (Of course, I defer to DaveM's judgement on all network-related issues.)

I saw that jiffies addition in there and was wondering what it was all
about. It's currently added _after_ the siphash input, not before, to
keep with how the old algorithm worked. I'm not sure if this is
correct or if there's something wrong with that, as I haven't studied
how it works. If that jiffies should be part of the siphash input and
not added to the result, please tell me. Otherwise I'll keep things
how they are to avoid breaking something that seems to be working.

^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
2016-12-16 20:49 Jason A. Donenfeld
@ 2016-12-16 21:25 ` George Spelvin
0 siblings, 0 replies; 82+ messages in thread
From: George Spelvin @ 2016-12-16 21:25 UTC (permalink / raw)
To: Jason, linux
Cc: ak, davem, David.Laight, djb, ebiggers3, hannes,
jeanphilippe.aumasson, kernel-hardening, linux-crypto, linux-kernel,
luto, netdev, tom, torvalds, tytso, vegard.nossum

Jason A. Donenfeld wrote:
> I saw that jiffies addition in there and was wondering what it was all
> about. It's currently added _after_ the siphash input, not before, to
> keep with how the old algorithm worked. I'm not sure if this is
> correct or if there's something wrong with that, as I haven't studied
> how it works. If that jiffies should be part of the siphash input and
> not added to the result, please tell me. Otherwise I'll keep things
> how they are to avoid breaking something that seems to be working.

Oh, geez, I didn't realize you didn't understand this code.

Full details at
https://en.wikipedia.org/wiki/TCP_sequence_prediction_attack

But yes, the sequence number is supposed to be (random base) + (timestamp).
In the old days before Canter & Siegel when the internet was a nice place,
people just used a counter that started at boot time.

But then someone observed that I can start a connection to host X,
see the sequence number it gives back to me, and thereby learn the
sequence number it's using on its connections to host Y.

And I can use that to inject forged data into an X-to-Y connection,
without ever seeing a single byte of the traffic! (If I *can* observe
the traffic, of course, none of this makes the slightest difference.)

So the random base was made a keyed hash of the endpoint identifiers.
(Practically only the hosts matter, but generally the ports are thrown
in for good measure.) That way, the ISN that host X sends to me
tells me nothing about the ISN it's using to talk to host Y. Now the
only way to inject forged data into the X-to-Y connection is to
send 2^32 bytes, which is a little less practical.

^ permalink raw reply [flat|nested] 82+ messages in thread
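The "(random base) + (timestamp)" construction described above can be sketched as follows. Everything here is an assumption made for illustration: struct conn_id, prf_fn, tcp_isn_sketch and toy_prf are hypothetical names, and toy_prf is a keyed FNV-1a style mixer that is NOT a secure PRF. It only stands in for a real keyed hash such as SipHash so that the example compiles on its own.

```c
#include <stddef.h>
#include <stdint.h>

/* Endpoint identifiers; 12 bytes with no padding on common ABIs. */
struct conn_id {
	uint32_t saddr, daddr;
	uint16_t sport, dport;
};

/* Keyed hash interface; a real caller would plug in SipHash here. */
typedef uint64_t (*prf_fn)(const void *data, size_t len,
			   const uint64_t key[2]);

/* PLACEHOLDER, NOT SECURE: keyed FNV-1a style mixer for the demo only. */
static uint64_t toy_prf(const void *data, size_t len, const uint64_t key[2])
{
	const uint8_t *p = data;
	uint64_t h = 0xcbf29ce484222325ULL ^ key[0];

	for (size_t i = 0; i < len; i++) {
		h ^= p[i];
		h *= 0x100000001b3ULL;
	}
	return h ^ key[1];
}

/*
 * ISN = keyed hash of the endpoints (the per-connection random base)
 * plus a clock counted in 64 ns ticks.  The clock is added to the hash
 * output rather than hashed, so the ISN keeps advancing over time while
 * the base stays fixed for a given connection and key.
 */
static uint32_t tcp_isn_sketch(prf_fn prf, const uint64_t key[2],
			       const struct conn_id *id, uint64_t now_ns)
{
	uint32_t base = (uint32_t)prf(id, sizeof(*id), key);

	return base + (uint32_t)(now_ns / 64);
}
```

Because the base depends on the key and on both endpoints, observing the ISN for one connection reveals nothing useful about another connection's ISN as long as the hash is a real PRF; with the toy mixer above that property does not hold.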