From mboxrd@z Thu Jan 1 00:00:00 1970 From: Andy Lutomirski Subject: Re: [PATCH v7 3/6] random: use SipHash in place of MD5 Date: Wed, 21 Dec 2016 15:42:38 -0800 Message-ID: References: <20161216030328.11602-1-Jason@zx2c4.com> <20161221230216.25341-1-Jason@zx2c4.com> <20161221230216.25341-4-Jason@zx2c4.com> Reply-To: kernel-hardening@lists.openwall.com Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Cc: Netdev , "kernel-hardening@lists.openwall.com" , LKML , Linux Crypto Mailing List , David Laight , Ted Tso , Hannes Frederic Sowa , Eric Dumazet , Linus Torvalds , Eric Biggers , Tom Herbert , Andi Kleen , "David S. Miller" , Jean-Philippe Aumasson To: "Jason A. Donenfeld" Return-path: List-Post: List-Help: List-Unsubscribe: List-Subscribe: In-Reply-To: <20161221230216.25341-4-Jason@zx2c4.com> List-Id: linux-crypto.vger.kernel.org On Wed, Dec 21, 2016 at 3:02 PM, Jason A. Donenfeld wrote: > unsigned int get_random_int(void) > { > - __u32 *hash; > - unsigned int ret; > - > - if (arch_get_random_int(&ret)) > - return ret; > - > - hash = get_cpu_var(get_random_int_hash); > - > - hash[0] += current->pid + jiffies + random_get_entropy(); > - md5_transform(hash, random_int_secret); > - ret = hash[0]; > - put_cpu_var(get_random_int_hash); > - > - return ret; > + unsigned int arch_result; > + u64 result; > + struct random_int_secret *secret; > + > + if (arch_get_random_int(&arch_result)) > + return arch_result; > + > + secret = get_random_int_secret(); > + result = siphash_3u64(secret->chaining, jiffies, > + (u64)random_get_entropy() + current->pid, > + secret->secret); > + secret->chaining += result; > + put_cpu_var(secret); > + return result; > } > EXPORT_SYMBOL(get_random_int); Hmm. I haven't tried to prove anything for real. But here goes (in the random oracle model): Suppose I'm an attacker and I don't know the secret or the chaining value. Then, regardless of what the entropy is, I can't predict the numbers. Now suppose I do know the secret and the chaining value due to some leak. If I want to deduce prior outputs, I think I'm stuck: I'd need to find a value "result" such that prev_chaining + result = chaining and result = H(prev_chaining, ..., secret);. I don't think this can be done efficiently in the random oracle model regardless of what the "..." is. But, if I know the secret and chaining value, I can predict the next output assuming I can guess the entropy. What's worse is that, even if I can't guess the entropy, if I *observe* the next output then I can calculate the next chaining value. So this is probably good enough, and making it better is hard. Changing it to: u64 entropy = (u64)random_get_entropy() + current->pid; result = siphash(..., entropy, ...); secret->chaining += result + entropy; would reduce this problem by forcing an attacker to brute-force the entropy on each iteration, which is probably an improvement. To fully fix it, something like "catastrophic reseeding" would be needed, but that's hard to get right. (An aside: on x86 at least, using two percpu variables is faster because directly percpu access is essentially free, whereas getting the address of a percpu variable is not free.) From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S935874AbcLUXus (ORCPT ); Wed, 21 Dec 2016 18:50:48 -0500 Received: from mail-ua0-f176.google.com ([209.85.217.176]:35203 "EHLO mail-ua0-f176.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753755AbcLUXup (ORCPT ); Wed, 21 Dec 2016 18:50:45 -0500 MIME-Version: 1.0 In-Reply-To: <20161221230216.25341-4-Jason@zx2c4.com> References: <20161216030328.11602-1-Jason@zx2c4.com> <20161221230216.25341-1-Jason@zx2c4.com> <20161221230216.25341-4-Jason@zx2c4.com> From: Andy Lutomirski Date: Wed, 21 Dec 2016 15:42:38 -0800 Message-ID: Subject: Re: [PATCH v7 3/6] random: use SipHash in place of MD5 To: "Jason A. Donenfeld" Cc: Netdev , "kernel-hardening@lists.openwall.com" , LKML , Linux Crypto Mailing List , David Laight , Ted Tso , Hannes Frederic Sowa , Eric Dumazet , Linus Torvalds , Eric Biggers , Tom Herbert , Andi Kleen , "David S. Miller" , Jean-Philippe Aumasson Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Dec 21, 2016 at 3:02 PM, Jason A. Donenfeld wrote: > unsigned int get_random_int(void) > { > - __u32 *hash; > - unsigned int ret; > - > - if (arch_get_random_int(&ret)) > - return ret; > - > - hash = get_cpu_var(get_random_int_hash); > - > - hash[0] += current->pid + jiffies + random_get_entropy(); > - md5_transform(hash, random_int_secret); > - ret = hash[0]; > - put_cpu_var(get_random_int_hash); > - > - return ret; > + unsigned int arch_result; > + u64 result; > + struct random_int_secret *secret; > + > + if (arch_get_random_int(&arch_result)) > + return arch_result; > + > + secret = get_random_int_secret(); > + result = siphash_3u64(secret->chaining, jiffies, > + (u64)random_get_entropy() + current->pid, > + secret->secret); > + secret->chaining += result; > + put_cpu_var(secret); > + return result; > } > EXPORT_SYMBOL(get_random_int); Hmm. I haven't tried to prove anything for real. But here goes (in the random oracle model): Suppose I'm an attacker and I don't know the secret or the chaining value. Then, regardless of what the entropy is, I can't predict the numbers. Now suppose I do know the secret and the chaining value due to some leak. If I want to deduce prior outputs, I think I'm stuck: I'd need to find a value "result" such that prev_chaining + result = chaining and result = H(prev_chaining, ..., secret);. I don't think this can be done efficiently in the random oracle model regardless of what the "..." is. But, if I know the secret and chaining value, I can predict the next output assuming I can guess the entropy. What's worse is that, even if I can't guess the entropy, if I *observe* the next output then I can calculate the next chaining value. So this is probably good enough, and making it better is hard. Changing it to: u64 entropy = (u64)random_get_entropy() + current->pid; result = siphash(..., entropy, ...); secret->chaining += result + entropy; would reduce this problem by forcing an attacker to brute-force the entropy on each iteration, which is probably an improvement. To fully fix it, something like "catastrophic reseeding" would be needed, but that's hard to get right. (An aside: on x86 at least, using two percpu variables is faster because directly percpu access is essentially free, whereas getting the address of a percpu variable is not free.) From mboxrd@z Thu Jan 1 00:00:00 1970 Reply-To: kernel-hardening@lists.openwall.com MIME-Version: 1.0 In-Reply-To: <20161221230216.25341-4-Jason@zx2c4.com> References: <20161216030328.11602-1-Jason@zx2c4.com> <20161221230216.25341-1-Jason@zx2c4.com> <20161221230216.25341-4-Jason@zx2c4.com> From: Andy Lutomirski Date: Wed, 21 Dec 2016 15:42:38 -0800 Message-ID: Content-Type: text/plain; charset=UTF-8 Subject: [kernel-hardening] Re: [PATCH v7 3/6] random: use SipHash in place of MD5 To: "Jason A. Donenfeld" Cc: Netdev , "kernel-hardening@lists.openwall.com" , LKML , Linux Crypto Mailing List , David Laight , Ted Tso , Hannes Frederic Sowa , Eric Dumazet , Linus Torvalds , Eric Biggers , Tom Herbert , Andi Kleen , "David S. Miller" , Jean-Philippe Aumasson List-ID: On Wed, Dec 21, 2016 at 3:02 PM, Jason A. Donenfeld wrote: > unsigned int get_random_int(void) > { > - __u32 *hash; > - unsigned int ret; > - > - if (arch_get_random_int(&ret)) > - return ret; > - > - hash = get_cpu_var(get_random_int_hash); > - > - hash[0] += current->pid + jiffies + random_get_entropy(); > - md5_transform(hash, random_int_secret); > - ret = hash[0]; > - put_cpu_var(get_random_int_hash); > - > - return ret; > + unsigned int arch_result; > + u64 result; > + struct random_int_secret *secret; > + > + if (arch_get_random_int(&arch_result)) > + return arch_result; > + > + secret = get_random_int_secret(); > + result = siphash_3u64(secret->chaining, jiffies, > + (u64)random_get_entropy() + current->pid, > + secret->secret); > + secret->chaining += result; > + put_cpu_var(secret); > + return result; > } > EXPORT_SYMBOL(get_random_int); Hmm. I haven't tried to prove anything for real. But here goes (in the random oracle model): Suppose I'm an attacker and I don't know the secret or the chaining value. Then, regardless of what the entropy is, I can't predict the numbers. Now suppose I do know the secret and the chaining value due to some leak. If I want to deduce prior outputs, I think I'm stuck: I'd need to find a value "result" such that prev_chaining + result = chaining and result = H(prev_chaining, ..., secret);. I don't think this can be done efficiently in the random oracle model regardless of what the "..." is. But, if I know the secret and chaining value, I can predict the next output assuming I can guess the entropy. What's worse is that, even if I can't guess the entropy, if I *observe* the next output then I can calculate the next chaining value. So this is probably good enough, and making it better is hard. Changing it to: u64 entropy = (u64)random_get_entropy() + current->pid; result = siphash(..., entropy, ...); secret->chaining += result + entropy; would reduce this problem by forcing an attacker to brute-force the entropy on each iteration, which is probably an improvement. To fully fix it, something like "catastrophic reseeding" would be needed, but that's hard to get right. (An aside: on x86 at least, using two percpu variables is faster because directly percpu access is essentially free, whereas getting the address of a percpu variable is not free.)