linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: [PATCH v6 3/5] random: use SipHash in place of MD5
@ 2016-12-16 21:45 Jason A. Donenfeld
  2016-12-16 22:12 ` Jason A. Donenfeld
  2016-12-16 22:13 ` Andy Lutomirski
  0 siblings, 2 replies; 6+ messages in thread
From: Jason A. Donenfeld @ 2016-12-16 21:45 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Netdev, kernel-hardening, LKML, Linux Crypto Mailing List,
	David Laight, Ted Tso, Hannes Frederic Sowa, Linus Torvalds,
	Eric Biggers, Tom Herbert, George Spelvin, Vegard Nossum,
	Andi Kleen, David S. Miller, Jean-Philippe Aumasson

Hi Andy,

On Fri, Dec 16, 2016 at 10:31 PM, Andy Lutomirski <luto@amacapital.net> wrote:
> I think it would be nice to try to strenghen the PRNG construction.
> FWIW, I'm not an expert in PRNGs, and there's fairly extensive
> literature, but I can at least try.

In an effort to keep this patchset as initially as uncontroversial as
possible, I kept the same same construction as before and just swapped
out slow MD5 for fast Siphash. Additionally, the function
documentation says that it isn't cryptographically secure. But in the
end I certainly agree with you; we should most definitely improve
things, and seeing the eyeballs now on this series, I think we now
might have a mandate to do so.

> 1. A one-time leak of memory contents doesn't ruin security until
> reboot.  This is especially value across suspend and/or hibernation.

Ted and I were discussing this in another thread, and it sounds like
he wants the same thing. I'll add re-generation of the secret every
once in a while. Perhaps time-based makes more sense than
counter-based for rekeying frequency?

> 2. An attack with a low work factor (2^64?) shouldn't break the scheme
> until reboot.

It won't. The key is 128-bits.

> This is effectively doing:
>
> output = H(prev_output, weak "entropy", per-boot secret);
>
> One unfortunately downside is that, if used in a context where an
> attacker can see a single output, the attacker learns the chaining
> value.  If the attacker can guess the entropy, then, with 2^64 work,
> they learn the secret, and they can predict future outputs.

No, the secret is 128-bits, which isn't feasibly guessable. The secret
also isn't part of the hash, but rather is the generator of the hash
function. A keyed hash (a PRF) is a bit of a different construction
than just hashing a secret value into a hash function.

> Second, change the mode so that an attacker doesn't learn so much
> internal state.  For example:
>
> output = H(old_chain, entropy, secret);
> new_chain = old_chain + entropy + output;

Happy to make this change, with making the chaining value additive
rather than a replacement.

However, I'm not sure adding entropy to the new_chain makes a
different. That entropy is basically just the cycle count plus the
jiffies count. If an attacker can already guess them, then adding them
again to the chaining value doesn't really add much.

Jason

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v6 3/5] random: use SipHash in place of MD5
  2016-12-16 21:45 [PATCH v6 3/5] random: use SipHash in place of MD5 Jason A. Donenfeld
@ 2016-12-16 22:12 ` Jason A. Donenfeld
  2016-12-16 22:13 ` Andy Lutomirski
  1 sibling, 0 replies; 6+ messages in thread
From: Jason A. Donenfeld @ 2016-12-16 22:12 UTC (permalink / raw)
  To: Andy Lutomirski, Ted Tso
  Cc: Netdev, kernel-hardening, LKML, Linux Crypto Mailing List,
	David Laight, Hannes Frederic Sowa, Linus Torvalds, Eric Biggers,
	Tom Herbert, George Spelvin, Vegard Nossum, Andi Kleen,
	David S. Miller, Jean-Philippe Aumasson

Hi Andy, Ted,

I've made the requested changes. Keys now rotate and are per-CPU
based. The chaining value is now additive instead of replacing.

DavidL suggested I lower the velocity of `git-send-email` triggers, so
if you'd like to take a look at this before I post v7, you can follow
along at my git tree here:

    https://git.zx2c4.com/linux-dev/log/?h=siphash

Choose the commit entitled "random: use SipHash in place of MD5"

Thanks,
Jason

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v6 3/5] random: use SipHash in place of MD5
  2016-12-16 21:45 [PATCH v6 3/5] random: use SipHash in place of MD5 Jason A. Donenfeld
  2016-12-16 22:12 ` Jason A. Donenfeld
@ 2016-12-16 22:13 ` Andy Lutomirski
  2016-12-16 22:23   ` Jason A. Donenfeld
  1 sibling, 1 reply; 6+ messages in thread
From: Andy Lutomirski @ 2016-12-16 22:13 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: Netdev, kernel-hardening, LKML, Linux Crypto Mailing List,
	David Laight, Ted Tso, Hannes Frederic Sowa, Linus Torvalds,
	Eric Biggers, Tom Herbert, George Spelvin, Vegard Nossum,
	Andi Kleen, David S. Miller, Jean-Philippe Aumasson

On Fri, Dec 16, 2016 at 1:45 PM, Jason A. Donenfeld <Jason@zx2c4.com> wrote:
> Hi Andy,
>
> On Fri, Dec 16, 2016 at 10:31 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>> I think it would be nice to try to strenghen the PRNG construction.
>> FWIW, I'm not an expert in PRNGs, and there's fairly extensive
>> literature, but I can at least try.
>
> In an effort to keep this patchset as initially as uncontroversial as
> possible, I kept the same same construction as before and just swapped
> out slow MD5 for fast Siphash. Additionally, the function
> documentation says that it isn't cryptographically secure. But in the
> end I certainly agree with you; we should most definitely improve
> things, and seeing the eyeballs now on this series, I think we now
> might have a mandate to do so.
>
>> 1. A one-time leak of memory contents doesn't ruin security until
>> reboot.  This is especially value across suspend and/or hibernation.
>
> Ted and I were discussing this in another thread, and it sounds like
> he wants the same thing. I'll add re-generation of the secret every
> once in a while. Perhaps time-based makes more sense than
> counter-based for rekeying frequency?

Counter may be faster -- you don't need to read a timer.  Lots of CPUs
are surprisingly slow at timing.  OTOH jiffies are fast.

>
>> 2. An attack with a low work factor (2^64?) shouldn't break the scheme
>> until reboot.
>
> It won't. The key is 128-bits.

Whoops, I thought the key was 64-bit...

>
>> This is effectively doing:
>>
>> output = H(prev_output, weak "entropy", per-boot secret);
>>
>> One unfortunately downside is that, if used in a context where an
>> attacker can see a single output, the attacker learns the chaining
>> value.  If the attacker can guess the entropy, then, with 2^64 work,
>> they learn the secret, and they can predict future outputs.
>
> No, the secret is 128-bits, which isn't feasibly guessable. The secret
> also isn't part of the hash, but rather is the generator of the hash
> function. A keyed hash (a PRF) is a bit of a different construction
> than just hashing a secret value into a hash function.

I was thinking in the random oracle model, in which case the whole
function is just a PRF that takes a few input parameters, one of which
is a key.

>
>> Second, change the mode so that an attacker doesn't learn so much
>> internal state.  For example:
>>
>> output = H(old_chain, entropy, secret);
>> new_chain = old_chain + entropy + output;
>
> Happy to make this change, with making the chaining value additive
> rather than a replacement.
>
> However, I'm not sure adding entropy to the new_chain makes a
> different. That entropy is basically just the cycle count plus the
> jiffies count. If an attacker can already guess them, then adding them
> again to the chaining value doesn't really add much.

Agreed.  A simpler contruction would be:

chaining++;
output = H(chaining, secret);

And this looks a whole lot like Ted's ChaCha20 construction.

The benefit of my construction is that (in the random oracle model,
assuming my intuition is right), if an attacker misses ~128 samples
and entropy has at least one bit of independent min-entropy per
sample, then the attacker needs ~2^128 work to brute-force the
chaining value even fi the attacker knew both the original chaining
value and the secret.

--Andy

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v6 3/5] random: use SipHash in place of MD5
  2016-12-16 22:13 ` Andy Lutomirski
@ 2016-12-16 22:23   ` Jason A. Donenfeld
  0 siblings, 0 replies; 6+ messages in thread
From: Jason A. Donenfeld @ 2016-12-16 22:23 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Netdev, kernel-hardening, LKML, Linux Crypto Mailing List,
	David Laight, Ted Tso, Hannes Frederic Sowa, Linus Torvalds,
	Eric Biggers, Tom Herbert, George Spelvin, Vegard Nossum,
	Andi Kleen, David S. Miller, Jean-Philippe Aumasson

Hi Andy,

> Agreed.  A simpler contruction would be:
>
> chaining++;
> output = H(chaining, secret);
>
> And this looks a whole lot like Ted's ChaCha20 construction.

In that simpler construction with counter-based secret rekeying and in
Ted's ChaCha20 construction, the issue is that every X hits, there's a
call to get_random_bytes, which has variable performance and entropy
issues. Doing it my way with it being time based, in the event that
somebody runs ` :(){ :|:& };:`, system performance doesn't suffer
because ASLR is making repeated calls to get_random_bytes every 128 or
so process creations. In the time based way, the system performance
will not suffer.

Jason

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v6 3/5] random: use SipHash in place of MD5
  2016-12-16  3:03   ` [PATCH v6 3/5] random: use SipHash in place of MD5 Jason A. Donenfeld
@ 2016-12-16 21:31     ` Andy Lutomirski
  0 siblings, 0 replies; 6+ messages in thread
From: Andy Lutomirski @ 2016-12-16 21:31 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: Netdev, kernel-hardening, LKML, linux-crypto, David Laight,
	Ted Tso, Hannes Frederic Sowa, Linus Torvalds, Eric Biggers,
	Tom Herbert, George Spelvin, Vegard Nossum, Andi Kleen,
	David S. Miller, Jean-Philippe Aumasson

On Thu, Dec 15, 2016 at 7:03 PM, Jason A. Donenfeld <Jason@zx2c4.com> wrote:
> -static DEFINE_PER_CPU(__u32 [MD5_DIGEST_WORDS], get_random_int_hash)
> -               __aligned(sizeof(unsigned long));
> +static DEFINE_PER_CPU(u64, get_random_int_chaining);
>

[...]

>  unsigned long get_random_long(void)
>  {
> -       __u32 *hash;
>         unsigned long ret;
> +       u64 *chaining;
>
>         if (arch_get_random_long(&ret))
>                 return ret;
>
> -       hash = get_cpu_var(get_random_int_hash);
> -
> -       hash[0] += current->pid + jiffies + random_get_entropy();
> -       md5_transform(hash, random_int_secret);
> -       ret = *(unsigned long *)hash;
> -       put_cpu_var(get_random_int_hash);
> -
> +       chaining = &get_cpu_var(get_random_int_chaining);
> +       ret = *chaining = siphash_3u64(*chaining, jiffies, random_get_entropy() +
> +                                      current->pid, random_int_secret);
> +       put_cpu_var(get_random_int_chaining);
>         return ret;
>  }

I think it would be nice to try to strenghen the PRNG construction.
FWIW, I'm not an expert in PRNGs, and there's fairly extensive
literature, but I can at least try.  Here are some properties I'd
like:

1. A one-time leak of memory contents doesn't ruin security until
reboot.  This is especially value across suspend and/or hibernation.

2. An attack with a low work factor (2^64?) shouldn't break the scheme
until reboot.


This is effectively doing:

output = H(prev_output, weak "entropy", per-boot secret);

One unfortunately downside is that, if used in a context where an
attacker can see a single output, the attacker learns the chaining
value.  If the attacker can guess the entropy, then, with 2^64 work,
they learn the secret, and they can predict future outputs.

I would advocate adding two types of improvements.  First, re-seed it
every now and then (every 128 calls?) by just replacing both the
chaining value and the percpu secret with fresh CSPRNG output.
Second, change the mode so that an attacker doesn't learn so much
internal state.  For example:

output = H(old_chain, entropy, secret);
new_chain = old_chain + entropy + output;

This increases the effort needed to brute-force the internal state
from 2^64 to 2^128 (barring any weaknesses in the scheme).

Also, can we not call this get_random_int()?  get_random_int() sounds
too much like get_random_bytes(), and the latter is intended to be a
real CSPRNG.  Can we call it get_weak_random_int() or similar?

--Andy

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH v6 3/5] random: use SipHash in place of MD5
  2016-12-16  3:03 ` [PATCH v6 0/5] " Jason A. Donenfeld
@ 2016-12-16  3:03   ` Jason A. Donenfeld
  2016-12-16 21:31     ` Andy Lutomirski
  0 siblings, 1 reply; 6+ messages in thread
From: Jason A. Donenfeld @ 2016-12-16  3:03 UTC (permalink / raw)
  To: Netdev, kernel-hardening, LKML, linux-crypto, David Laight,
	Ted Tso, Hannes Frederic Sowa, Linus Torvalds, Eric Biggers,
	Tom Herbert, George Spelvin, Vegard Nossum, ak, davem, luto
  Cc: Jason A. Donenfeld, Jean-Philippe Aumasson

This duplicates the current algorithm for get_random_int/long, but uses
siphash instead. This comes with several benefits. It's certainly
faster and more cryptographically secure than MD5. This patch also
separates hashed fields into three values instead of one, in order to
increase diffusion.

The previous MD5 algorithm used a per-cpu MD5 state, which caused
successive calls to the function to chain upon each other. While it's
not entirely clear that this kind of chaining is absolutely necessary
when using a secure PRF like siphash, it can't hurt, and the timing of
the call chain does add a degree of natural entropy. So, in keeping with
this design, instead of the massive per-cpu 64-byte MD5 state, there is
instead a per-cpu previously returned value for chaining.

The speed benefits are substantial:

                | siphash | md5    | speedup |
		------------------------------
get_random_long | 137130  | 415983 | 3.03x   |
get_random_int  | 86384   | 343323 | 3.97x   |

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com>
Cc: Ted Tso <tytso@mit.edu>
---
 drivers/char/random.c | 32 +++++++++++++-------------------
 1 file changed, 13 insertions(+), 19 deletions(-)

diff --git a/drivers/char/random.c b/drivers/char/random.c
index d6876d506220..a51f0ff43f00 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -262,6 +262,7 @@
 #include <linux/syscalls.h>
 #include <linux/completion.h>
 #include <linux/uuid.h>
+#include <linux/siphash.h>
 #include <crypto/chacha20.h>
 
 #include <asm/processor.h>
@@ -2042,7 +2043,7 @@ struct ctl_table random_table[] = {
 };
 #endif 	/* CONFIG_SYSCTL */
 
-static u32 random_int_secret[MD5_MESSAGE_BYTES / 4] ____cacheline_aligned;
+static siphash_key_t random_int_secret;
 
 int random_int_secret_init(void)
 {
@@ -2050,8 +2051,7 @@ int random_int_secret_init(void)
 	return 0;
 }
 
-static DEFINE_PER_CPU(__u32 [MD5_DIGEST_WORDS], get_random_int_hash)
-		__aligned(sizeof(unsigned long));
+static DEFINE_PER_CPU(u64, get_random_int_chaining);
 
 /*
  * Get a random word for internal kernel use only. Similar to urandom but
@@ -2061,19 +2061,16 @@ static DEFINE_PER_CPU(__u32 [MD5_DIGEST_WORDS], get_random_int_hash)
  */
 unsigned int get_random_int(void)
 {
-	__u32 *hash;
 	unsigned int ret;
+	u64 *chaining;
 
 	if (arch_get_random_int(&ret))
 		return ret;
 
-	hash = get_cpu_var(get_random_int_hash);
-
-	hash[0] += current->pid + jiffies + random_get_entropy();
-	md5_transform(hash, random_int_secret);
-	ret = hash[0];
-	put_cpu_var(get_random_int_hash);
-
+	chaining = &get_cpu_var(get_random_int_chaining);
+	ret = *chaining = siphash_3u64(*chaining, jiffies, random_get_entropy() +
+				       current->pid, random_int_secret);
+	put_cpu_var(get_random_int_chaining);
 	return ret;
 }
 EXPORT_SYMBOL(get_random_int);
@@ -2083,19 +2080,16 @@ EXPORT_SYMBOL(get_random_int);
  */
 unsigned long get_random_long(void)
 {
-	__u32 *hash;
 	unsigned long ret;
+	u64 *chaining;
 
 	if (arch_get_random_long(&ret))
 		return ret;
 
-	hash = get_cpu_var(get_random_int_hash);
-
-	hash[0] += current->pid + jiffies + random_get_entropy();
-	md5_transform(hash, random_int_secret);
-	ret = *(unsigned long *)hash;
-	put_cpu_var(get_random_int_hash);
-
+	chaining = &get_cpu_var(get_random_int_chaining);
+	ret = *chaining = siphash_3u64(*chaining, jiffies, random_get_entropy() +
+				       current->pid, random_int_secret);
+	put_cpu_var(get_random_int_chaining);
 	return ret;
 }
 EXPORT_SYMBOL(get_random_long);
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2016-12-16 22:24 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-12-16 21:45 [PATCH v6 3/5] random: use SipHash in place of MD5 Jason A. Donenfeld
2016-12-16 22:12 ` Jason A. Donenfeld
2016-12-16 22:13 ` Andy Lutomirski
2016-12-16 22:23   ` Jason A. Donenfeld
  -- strict thread matches above, loose matches on Subject: below --
2016-12-15 20:29 [PATCH v5 0/4] The SipHash Patchset Jason A. Donenfeld
2016-12-16  3:03 ` [PATCH v6 0/5] " Jason A. Donenfeld
2016-12-16  3:03   ` [PATCH v6 3/5] random: use SipHash in place of MD5 Jason A. Donenfeld
2016-12-16 21:31     ` Andy Lutomirski

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).