linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: [PATCH 2/6] random: use xor for mixing
       [not found] ` <3.753618428@selenic.com>
@ 2007-12-09  0:01   ` Theodore Tso
  2007-12-09  0:40     ` Matt Mackall
  0 siblings, 1 reply; 19+ messages in thread
From: Theodore Tso @ 2007-12-09  0:01 UTC (permalink / raw)
  To: Matt Mackall; +Cc: Andrew Morton, linux-kernel

On Sat, Dec 08, 2007 at 05:20:16PM -0600, Matt Mackall wrote:
> random: use xor for mixing
> 
> With direct assignment, we can determine the twist table element used
> for mixing (the high 3 bits of the table are unique) and reverse a
> single step of mixing. Instead, use xor, which should also help
> preserve entropy in a given pool slot.
> 
> Signed-off-by: Matt Mackall <mpm@selenic.com>
> 
> diff -r bc336762cfdb drivers/char/random.c
> --- a/drivers/char/random.c	Wed Dec 05 17:20:02 2007 -0600
> +++ b/drivers/char/random.c	Sat Dec 08 13:27:34 2007 -0600
> @@ -496,7 +496,7 @@ static void __add_entropy_words(struct e
>  		w ^= r->pool[(i + tap4) & wordmask];
>  		w ^= r->pool[(i + tap5) & wordmask];
>  		w ^= r->pool[i];
> -		r->pool[i] = (w >> 3) ^ twist_table[w & 7];
> +		r->pool[i] ^= (w >> 3) ^ twist_table[w & 7];
>  	}

In the original design of add_entropy_words(), in order to provably
not lose any entropy, you want add_entropy_words() to be reversible if
you mix in all zero's.  So the fact that you can determine the twist
table element used from looking at the high bits was deliberate.  The
mixing done in add_entropy_words() is *not* intended to be
cryptographic, but merely to smear the bits around as they are added
and then to permute the pool so that when you use SHA in the output
stage, enough bits are changing that even if there are weaknesses
discovered in the crypto hash algorithm, it won't help the attacker.

The internals of the pool are never exposed, so an attacker should
never gain direct access to the entropy pool; hence worry about
whether someone can "reverse" the mixing isn't particuarly a worry;
indeed, in order to make sure we preserve entropy, the whole *point*
of the mixing algorithm is that it is reversible.

(note: credit for this design should go to Colin Plumb, who worked
with me on this aspect of the design.  Colin was responsible for the
original random number generator in PGP....)

					- Ted

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/6] random: do extraction before mixing
       [not found] ` <4.753618428@selenic.com>
@ 2007-12-09  0:35   ` Theodore Tso
  2007-12-09  7:12     ` Matt Mackall
  0 siblings, 1 reply; 19+ messages in thread
From: Theodore Tso @ 2007-12-09  0:35 UTC (permalink / raw)
  To: Matt Mackall; +Cc: Andrew Morton, linux-kernel

On Sat, Dec 08, 2007 at 05:20:17PM -0600, Matt Mackall wrote:
> random: do extraction before mixing
> 
> If an attacker manages to capture the current pool state, she can
> determine the last 10 bytes extracted from the pool. 

That's not true; we aren't just extracting data in the
__add_entropy_words() call.  In fact, above that, the bulk of the
extraction comes form when we hash the entire pool, feeding back a
portion of the hash into the pool here:

	for (i = 0; i < r->poolinfo->poolwords; i += 16) {
		/* hash blocks of 16 words = 512 bits */
		sha_transform(buf, (__u8 *)(r->pool + i), buf + 5);
		/* feed back portion of the resulting hash */
		add_entropy_words(r, &buf[i % 5], 1);
	}

So the buf[0..5] contains a hash of the entire pool, and every 16
words, we're already mixing 32 bits into the pool.  So even if the
attacker captures the current pool state, she's not going to be able
to undo the intermediate SHA values that had been mixed into the pool.

> By mixing after
> the extraction, this is made substantially harder.

Not that much harder; as I mentioned as comments in my last patch, we
are doing a linear polynomial mixing for speed purposes, so relying on
the mixing to obscure the extraction isn't going to help much.

But that's OK, we don't need to depend on that.  Note the amount of
feedback that we do in the above loop.

					- Ted

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/6] random: use xor for mixing
  2007-12-09  0:01   ` [PATCH 2/6] random: use xor for mixing Theodore Tso
@ 2007-12-09  0:40     ` Matt Mackall
  2007-12-09  2:08       ` Theodore Tso
  0 siblings, 1 reply; 19+ messages in thread
From: Matt Mackall @ 2007-12-09  0:40 UTC (permalink / raw)
  To: Theodore Tso, Andrew Morton, linux-kernel

On Sat, Dec 08, 2007 at 07:01:04PM -0500, Theodore Tso wrote:
> On Sat, Dec 08, 2007 at 05:20:16PM -0600, Matt Mackall wrote:
> > random: use xor for mixing
> > 
> > With direct assignment, we can determine the twist table element used
> > for mixing (the high 3 bits of the table are unique) and reverse a
> > single step of mixing. Instead, use xor, which should also help
> > preserve entropy in a given pool slot.
> > 
> > Signed-off-by: Matt Mackall <mpm@selenic.com>
> > 
> > diff -r bc336762cfdb drivers/char/random.c
> > --- a/drivers/char/random.c	Wed Dec 05 17:20:02 2007 -0600
> > +++ b/drivers/char/random.c	Sat Dec 08 13:27:34 2007 -0600
> > @@ -496,7 +496,7 @@ static void __add_entropy_words(struct e
> >  		w ^= r->pool[(i + tap4) & wordmask];
> >  		w ^= r->pool[(i + tap5) & wordmask];
> >  		w ^= r->pool[i];
> > -		r->pool[i] = (w >> 3) ^ twist_table[w & 7];
> > +		r->pool[i] ^= (w >> 3) ^ twist_table[w & 7];
> >  	}
> 
> In the original design of add_entropy_words(), in order to provably
> not lose any entropy, you want add_entropy_words() to be reversible if
> you mix in all zero's.

Ahh, yes, that's true. I'd somehow missed that we effectively had a
tap at 0, so I thought this was actually improving the situation. I'll
replace this with a comment explaining the situation.

> The internals of the pool are never exposed, so an attacker should
> never gain direct access to the entropy pool; hence worry about
> whether someone can "reverse" the mixing isn't particuarly a worry;
> indeed, in order to make sure we preserve entropy, the whole *point*
> of the mixing algorithm is that it is reversible.

I'm working on strengthening forward security for cases where the
internals are exposed to an attacker. There are any number of
situations where this can and does happen, and ensuring we don't
expose ourselves to backtracking is a worthwhile theoretical concern.

---
random: document revisibility of mixing function.

Signed-off-by: Matt Mackall <mpm@selenic.com>

diff -r bc336762cfdb drivers/char/random.c
--- a/drivers/char/random.c	Wed Dec 05 17:20:02 2007 -0600
+++ b/drivers/char/random.c	Sat Dec 08 18:38:15 2007 -0600
@@ -447,6 +447,12 @@ static struct entropy_store nonblocking_
  * degree, and then twisted.  We twist by three bits at a time because
  * it's cheap to do so and helps slightly in the expected case where
  * the entropy is concentrated in the low-order bits.
+ *
+ * This function is intentionally reversible, meaning that if we mix
+ * in known values, we can run the algorithm backwards and recover the
+ * earlier state. Because we can recover the original state, we are
+ * provably not destroying any existing entropy content in the pool
+ * when we mix.
  */
 static void __add_entropy_words(struct entropy_store *r, const __u32 *in,
 				int nwords, __u32 out[16])

-- 
Mathematics is the supreme nostalgia of our time.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 4/6] random: make backtracking attacks harder
       [not found] ` <5.753618428@selenic.com>
@ 2007-12-09  1:45   ` Theodore Tso
  2007-12-09  5:43     ` Matt Mackall
  0 siblings, 1 reply; 19+ messages in thread
From: Theodore Tso @ 2007-12-09  1:45 UTC (permalink / raw)
  To: Matt Mackall; +Cc: Andrew Morton, linux-kernel

On Sat, Dec 08, 2007 at 05:20:38PM -0600, Matt Mackall wrote:
> random: make backtracking attacks harder
> 
> At each extraction, we change (poolbits / 16) + 32 bits in the pool,
> or 96 bits in the case of the secondary pools. Thus, a brute-force
> backtracking attack on the pool state is less difficult than breaking
> the hash. In certain cases, this difficulty may be is reduced to 2^64
> iterations.

OK, a backtracking attack assumes a fairly catastrophic case where the
attacker has managed to compromise the internal pool state.  In Linux,
if an attacker has this much access, in most scenarios the attacker
probably has the ability to gain write access to kernel memory as
well, at which point you have far worse problems.

The definition of a backtracking attack is when the attacker uses the
current pool state to try to recover the last entropy extraction.  As
you point out, at each extraction, 96 bits are changed in the pool.
But at each extraction, only 80 bits are extracted.

If you are trying to find the value of the 80 bits that were
extracted, and you know the current state of the pool, yes, you can
take the 96 bits that were changed after the last extraction, and try
all possible 2**96 combinations of the bits; but you probably won't
rule anything out, since after you iterate over all of the 2**96
combinations, you'll probably be able to generate all of the 2**80
possible output bits.  So you won't gain anything by trying to do a
backtracking attack.  So I don't think there's anything to worry about
here.

What you describe in your comments:

> +	 * We mix the hash back into the pool to prevent backtracking
> +	 * attacks (where the attacker knows the state of the pool
> +	 * plus the current outputs, and attempts to find previous
> +	 * ouputs), unless the hash function can be inverted. By
> +	 * mixing at least a SHA1 worth of hash data back, we make
> +	 * brute-forcing the feedback as hard as brute-forcing the
> +	 * hash.
>  	 */

.... sounds like not a backtracking attack, but an Iterative guessing
attack, where "knowledge of internal state at some point and the
intervening PRNG outputs, to learn internal state at a later point
when the inputs collected during this span of time are guessable by
the attacker."  (Definition taken from:
http://www.ee.oulu.fi/research/ouspg/frontier/sota/whitepaper-prng/)

But note first of all, all this lets you do is unwind to an earlier
state just before the known PRNG outputs.  In order to use this to
guess PRNG output, you still have to solve the above mentioned
backtracking attack --- where you have 96 bits that changed, and 80
possible extraction bits.  

Secondly, the fact that we use catastrophic reseeding means that even
if the state was compromised, at some point in time, every so often
when we do a "catastrophic reseed", this acts as a firewall to limit
the damage caused by a internal state exposure, both in the past and
the future.

						- Ted

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 5/6] random: step more rapidly through the pool when adding and extracting
       [not found] ` <6.753618428@selenic.com>
@ 2007-12-09  1:48   ` Theodore Tso
  2007-12-09  4:00     ` Andrew Morton
  0 siblings, 1 reply; 19+ messages in thread
From: Theodore Tso @ 2007-12-09  1:48 UTC (permalink / raw)
  To: Matt Mackall; +Cc: Andrew Morton, linux-kernel

On Sat, Dec 08, 2007 at 05:20:39PM -0600, Matt Mackall wrote:
> random: step more rapidly through the pool when adding and extracting
> 
> Second, this patch spreads the mixing across the pool more rapidly by
> using a larger step. For our secondary pools, we'll be sure to touch
> both blocks with each mix, and for our primary pool, we'll change 5 of
> 8 blocks.

Yeah, that's a good idea.

Acked-by: "Theodore Ts'o" <tytso@mit.edu>

							- Ted

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 6/6] random: improve variable naming, clear extract buffer
       [not found] ` <7.753618428@selenic.com>
@ 2007-12-09  1:51   ` Theodore Tso
  2007-12-09  5:08     ` Matt Mackall
  0 siblings, 1 reply; 19+ messages in thread
From: Theodore Tso @ 2007-12-09  1:51 UTC (permalink / raw)
  To: Matt Mackall; +Cc: Andrew Morton, linux-kernel

On Sat, Dec 08, 2007 at 05:21:00PM -0600, Matt Mackall wrote:
> random: improve variable naming, clear extract buffer
> 
> - split the SHA variables apart into hash and workspace
> - rename data to extract
> - wipe extract and workspace after hashing
> 
> diff -r 924f9a441236 drivers/char/random.c
> --- a/drivers/char/random.c	Sat Dec 08 16:22:29 2007 -0600
> +++ b/drivers/char/random.c	Sat Dec 08 16:32:31 2007 -0600
> @@ -461,7 +461,7 @@ static void __add_entropy_words(struct e
>  	unsigned long flags;
>  	/* rotate the add pointer more rapidly to span more of the
>  	 * pool on a given add */
> -	const int step = 5;
> +	const int step = 13;
>  
>  	/* Taps are constant, so we can load them without holding r->lock.  */
>  	tap1 = r->poolinfo->tap1;

This change has nothing to do with the patch comment.  If you want to
change the step size from 5 to 13, why not just change patch 5/6 to
just use a step size of 13 from the beginning?

Otherwise, yeah, this patch does make sense.

Acked-by: "Theodore Ts'o" <tytso@mit.edu>

								- Ted

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/6] random: use xor for mixing
  2007-12-09  0:40     ` Matt Mackall
@ 2007-12-09  2:08       ` Theodore Tso
  2007-12-09 13:33         ` Alan Cox
  2007-12-19 21:09         ` Bill Davidsen
  0 siblings, 2 replies; 19+ messages in thread
From: Theodore Tso @ 2007-12-09  2:08 UTC (permalink / raw)
  To: Matt Mackall; +Cc: Andrew Morton, linux-kernel

On Sat, Dec 08, 2007 at 06:40:17PM -0600, Matt Mackall wrote:
> I'm working on strengthening forward security for cases where the
> internals are exposed to an attacker. There are any number of
> situations where this can and does happen, and ensuring we don't
> expose ourselves to backtracking is a worthwhile theoretical concern.

See my other comments; I don't think we are currently vulnerable to
backtracking.

I tend to view backtracking as largely a theoretical concern, as most
of the time, if the attacker has read access to kernel memory in order
to compromise the internal state of /dev/random, the attacker very
likely has *write* access to kernel memory, at which point you have
much bigger problems to worry about, at least going forward.  

I suppose if you had *just* generated an 80-bit AES session key, right
before the internal state was compromised, this might be interesting,
but if the attacker can compromise arbitrary kernel memory, then
attacker might as well just grab keying information from the userspace
process --- such as perhaps the long-term private key of the user, or
the data to be encrypted.

So my personal take on it is that protecting against backtracking
attacks is mainly useful in silencing academics who like to come up
with, well, largely academic and theoretical scenario.  If it doesn't
take much effort, sure, let's try to protect against it (and I think
we're OK already).

But a much better use of our time would be spent creating userspace
support so we can more efficiently pull entropy from TPM modules, or
the noise from a sound/microphone input, etc., or other hardware
sources of entropy.  That would provide a much more practical
improvement to the /dev/random subsystem than worry about what I feel
are largely academic concerns.

Regards,

						- Ted

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 5/6] random: step more rapidly through the pool when adding and extracting
  2007-12-09  1:48   ` [PATCH 5/6] random: step more rapidly through the pool when adding and extracting Theodore Tso
@ 2007-12-09  4:00     ` Andrew Morton
  2007-12-09  4:22       ` Matt Mackall
  0 siblings, 1 reply; 19+ messages in thread
From: Andrew Morton @ 2007-12-09  4:00 UTC (permalink / raw)
  To: Theodore Tso; +Cc: Matt Mackall, linux-kernel

On Sat, 8 Dec 2007 20:48:20 -0500 Theodore Tso <tytso@thunk.org> wrote:

> On Sat, Dec 08, 2007 at 05:20:39PM -0600, Matt Mackall wrote:
> > random: step more rapidly through the pool when adding and extracting
> > 
> > Second, this patch spreads the mixing across the pool more rapidly by
> > using a larger step. For our secondary pools, we'll be sure to touch
> > both blocks with each mix, and for our primary pool, we'll change 5 of
> > 8 blocks.
> 
> Yeah, that's a good idea.
> 
> Acked-by: "Theodore Ts'o" <tytso@mit.edu>
> 

I cannot find the emails to which you are replying.  Not in inbox, not on
lkml, not in spam folders.  And lkml.org records your emails and not the
emails to which you are replying.

I think something went wrong with Matt's outgoing.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 5/6] random: step more rapidly through the pool when adding and extracting
  2007-12-09  4:00     ` Andrew Morton
@ 2007-12-09  4:22       ` Matt Mackall
  2007-12-09 12:55         ` Theodore Tso
  0 siblings, 1 reply; 19+ messages in thread
From: Matt Mackall @ 2007-12-09  4:22 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Theodore Tso, linux-kernel

On Sat, Dec 08, 2007 at 08:00:55PM -0800, Andrew Morton wrote:
> On Sat, 8 Dec 2007 20:48:20 -0500 Theodore Tso <tytso@thunk.org> wrote:
> 
> > On Sat, Dec 08, 2007 at 05:20:39PM -0600, Matt Mackall wrote:
> > > random: step more rapidly through the pool when adding and extracting
> > > 
> > > Second, this patch spreads the mixing across the pool more rapidly by
> > > using a larger step. For our secondary pools, we'll be sure to touch
> > > both blocks with each mix, and for our primary pool, we'll change 5 of
> > > 8 blocks.
> > 
> > Yeah, that's a good idea.
> > 
> > Acked-by: "Theodore Ts'o" <tytso@mit.edu>
> > 
> 
> I cannot find the emails to which you are replying.  Not in inbox, not on
> lkml, not in spam folders.  And lkml.org records your emails and not the
> emails to which you are replying.
> 
> I think something went wrong with Matt's outgoing.

It did. My laptop didn't relay them through my smarthost and my domain
has an SPF record.

-- 
Mathematics is the supreme nostalgia of our time.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 6/6] random: improve variable naming, clear extract buffer
  2007-12-09  1:51   ` [PATCH 6/6] random: improve variable naming, clear extract buffer Theodore Tso
@ 2007-12-09  5:08     ` Matt Mackall
  0 siblings, 0 replies; 19+ messages in thread
From: Matt Mackall @ 2007-12-09  5:08 UTC (permalink / raw)
  To: tytso, Andrew Morton, linux-kernel

On Sat, Dec 08, 2007 at 08:51:54PM -0500, Theodore Tso wrote:
> On Sat, Dec 08, 2007 at 05:21:00PM -0600, Matt Mackall wrote:
> > random: improve variable naming, clear extract buffer
> > 
> > - split the SHA variables apart into hash and workspace
> > - rename data to extract
> > - wipe extract and workspace after hashing
> > 
> > diff -r 924f9a441236 drivers/char/random.c
> > --- a/drivers/char/random.c	Sat Dec 08 16:22:29 2007 -0600
> > +++ b/drivers/char/random.c	Sat Dec 08 16:32:31 2007 -0600
> > @@ -461,7 +461,7 @@ static void __add_entropy_words(struct e
> >  	unsigned long flags;
> >  	/* rotate the add pointer more rapidly to span more of the
> >  	 * pool on a given add */
> > -	const int step = 5;
> > +	const int step = 13;
> >  
> >  	/* Taps are constant, so we can load them without holding r->lock.  */
> >  	tap1 = r->poolinfo->tap1;
> 
> This change has nothing to do with the patch comment.  If you want to
> change the step size from 5 to 13, why not just change patch 5/6 to
> just use a step size of 13 from the beginning?

Patch refresh gone wrong.
 
> Otherwise, yeah, this patch does make sense.

Twice, in fact. This patch as sent won't compile due to an extra &.

-- 
Mathematics is the supreme nostalgia of our time.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 4/6] random: make backtracking attacks harder
  2007-12-09  1:45   ` [PATCH 4/6] random: make backtracking attacks harder Theodore Tso
@ 2007-12-09  5:43     ` Matt Mackall
  2007-12-09 13:33       ` Theodore Tso
  0 siblings, 1 reply; 19+ messages in thread
From: Matt Mackall @ 2007-12-09  5:43 UTC (permalink / raw)
  To: Theodore Tso, Andrew Morton, linux-kernel

On Sat, Dec 08, 2007 at 08:45:50PM -0500, Theodore Tso wrote:
> On Sat, Dec 08, 2007 at 05:20:38PM -0600, Matt Mackall wrote:
> > random: make backtracking attacks harder
> > 
> > At each extraction, we change (poolbits / 16) + 32 bits in the pool,
> > or 96 bits in the case of the secondary pools. Thus, a brute-force
> > backtracking attack on the pool state is less difficult than breaking
> > the hash. In certain cases, this difficulty may be is reduced to 2^64
> > iterations.
> 
> OK, a backtracking attack assumes a fairly catastrophic case where the
> attacker has managed to compromise the internal pool state.  In Linux,
> if an attacker has this much access, in most scenarios the attacker
> probably has the ability to gain write access to kernel memory as
> well, at which point you have far worse problems.
> 
> The definition of a backtracking attack is when the attacker uses the
> current pool state to try to recover the last entropy extraction.  As
> you point out, at each extraction, 96 bits are changed in the pool.
> But at each extraction, only 80 bits are extracted.

Agreed.
 
> If you are trying to find the value of the 80 bits that were
> extracted, and you know the current state of the pool, yes, you can
> take the 96 bits that were changed after the last extraction, and try
> all possible 2**96 combinations of the bits; but you probably won't
> rule anything out, since after you iterate over all of the 2**96
> combinations, you'll probably be able to generate all of the 2**80
> possible output bits.  So you won't gain anything by trying to do a
> backtracking attack.  So I don't think there's anything to worry about
> here.

Two observations:

- 2**96 << 2**160 so our feedback is much weaker than our hash so we
  should improve it on general principle
- there's a way to improve this attack to 2**64 in some situations!

I presume you've seen this paper:

 http://eprint.iacr.org/2006/086.pdf

It was fairly obsolete at printing and makes a bunch of mistakes. But
they do observe that at certain points in our feedback, the first
512-bit block of the pool is not overwritten by the feedback and thus
32 bits of feedback can be immediately discovered. Then an attacker
can run 2**64 hashes to recover the remaining bits with excellent odds
of having a unique preimage. It can't usually be extended multiple
iterations because we move the add_ptr too much, so it's fairly weak
as a backtracking attack.

They precede this with a more general description of a 2**96 attack
which doesn't work for precisely the reason you describe (2**16
possible preimages for each output). But their 2**64 attack does seem
like it works (in the world of cryptanalysis, and not real hardware of
course!).

Simply feeding back all five words of our hash rather than three in
the secondary pools hardens this all right up. This patch also makes
everything in here much easier to read and analyze.

> What you describe in your comments:
> 
> > +	 * We mix the hash back into the pool to prevent backtracking
> > +	 * attacks (where the attacker knows the state of the pool
> > +	 * plus the current outputs, and attempts to find previous
> > +	 * ouputs), unless the hash function can be inverted. 

That part of the comment is not new, but I don't remember which of us
wrote it.. 

> > ...By
> > +	 * mixing at least a SHA1 worth of hash data back, we make
> > +	 * brute-forcing the feedback as hard as brute-forcing the
> > +	 * hash.
> >  	 */

That part is new and I think is accurate.

> Secondly, the fact that we use catastrophic reseeding means that even
> if the state was compromised, at some point in time, every so often
> when we do a "catastrophic reseed", this acts as a firewall to limit
> the damage caused by a internal state exposure, both in the past and
> the future.

Absolutely. Having some depth in the design is quite valuable. But
that doesn't mean we shouldn't shore up individual weaknesses when we
find them.

-- 
Mathematics is the supreme nostalgia of our time.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/6] random: do extraction before mixing
  2007-12-09  0:35   ` [PATCH 3/6] random: do extraction before mixing Theodore Tso
@ 2007-12-09  7:12     ` Matt Mackall
  0 siblings, 0 replies; 19+ messages in thread
From: Matt Mackall @ 2007-12-09  7:12 UTC (permalink / raw)
  To: Theodore Tso, Andrew Morton, linux-kernel

On Sat, Dec 08, 2007 at 07:35:54PM -0500, Theodore Tso wrote:
> On Sat, Dec 08, 2007 at 05:20:17PM -0600, Matt Mackall wrote:
> > random: do extraction before mixing
> > 
> > If an attacker manages to capture the current pool state, she can
> > determine the last 10 bytes extracted from the pool. 
> 
> That's not true; we aren't just extracting data in the
> __add_entropy_words() call.  In fact, above that, the bulk of the
> extraction comes form when we hash the entire pool, feeding back a
> portion of the hash into the pool here:
> 
> 	for (i = 0; i < r->poolinfo->poolwords; i += 16) {
> 		/* hash blocks of 16 words = 512 bits */
> 		sha_transform(buf, (__u8 *)(r->pool + i), buf + 5);
> 		/* feed back portion of the resulting hash */
> 		add_entropy_words(r, &buf[i % 5], 1);
> 	}

Ok, yes, I'd forgotten we were chaining in the final sha_transform. I
plead too many bufs and buf+5s, which I fix up in 6/6. Funny thing is
that I'd convinced myself that this attack didn't work (correct) last
year when I read the paper I mentioned earlier. But yesterday and
today I couldn't spot the problem with it. So again, I'll add an
explicit comment for future hackers and researchers.

-- 
Mathematics is the supreme nostalgia of our time.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 5/6] random: step more rapidly through the pool when adding and extracting
  2007-12-09  4:22       ` Matt Mackall
@ 2007-12-09 12:55         ` Theodore Tso
  2007-12-09 17:08           ` Matt Mackall
  0 siblings, 1 reply; 19+ messages in thread
From: Theodore Tso @ 2007-12-09 12:55 UTC (permalink / raw)
  To: Matt Mackall; +Cc: Andrew Morton, linux-kernel

On Sat, Dec 08, 2007 at 10:22:04PM -0600, Matt Mackall wrote:
> 
> It did. My laptop didn't relay them through my smarthost and my domain
> has an SPF record.

Ah, yes:

http://homepages.tesco.net/~J.deBoynePollard/FGA/smtp-spf-is-harmful.html

							- Ted


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 4/6] random: make backtracking attacks harder
  2007-12-09  5:43     ` Matt Mackall
@ 2007-12-09 13:33       ` Theodore Tso
  2007-12-09 17:05         ` Matt Mackall
  0 siblings, 1 reply; 19+ messages in thread
From: Theodore Tso @ 2007-12-09 13:33 UTC (permalink / raw)
  To: Matt Mackall; +Cc: Andrew Morton, linux-kernel

On Sat, Dec 08, 2007 at 11:43:37PM -0600, Matt Mackall wrote:
> > If you are trying to find the value of the 80 bits that were
> > extracted, and you know the current state of the pool, yes, you can
> > take the 96 bits that were changed after the last extraction, and try
> > all possible 2**96 combinations of the bits; but you probably won't
> > rule anything out, since after you iterate over all of the 2**96
> > combinations, you'll probably be able to generate all of the 2**80
> > possible output bits.  So you won't gain anything by trying to do a
> > backtracking attack.  So I don't think there's anything to worry about
> > here.
> 
> Two observations:
> 
> - 2**96 << 2**160 so our feedback is much weaker than our hash so we
>   should improve it on general principle

Well, no.  You're comparing apples and oranges here.  You can iterate
over 2**96 choices, but given that even if you are doing an iterative
guessing attack, you have 2**80 knowns, and 2**96 unknowns, it doesn't
buy you anything.  So saying that there is a successful attack with
O(2**96) simply isn't true!  Secondly, a 160-bit hash is actually only
80 bits strong (because of the birthday attack); when SHA was
originally published by the US government, it was designed to be
paired with the Clipper chip, which was an 80 bit symmetric key cipher
with a backdoor.

> - there's a way to improve this attack to 2**64 in some situations!

Yeah, but again, what does this "attack" buy you?  If you know the
output, and the current state of the pool, after doing 2**64
iterations you can get the state of the pool before the output.
Big-di-whoopdie-do.  You can't actually *do* anything with it.  

If you want to use that to obtain the unknown 80-bit output at that
point in time, you're back to the 96-bits unknown problem, since in
the case where you don't know the output, the shortcut attack doesn't
work.

> I presume you've seen this paper:
> 
>  http://eprint.iacr.org/2006/086.pdf
> 
> It was fairly obsolete at printing and makes a bunch of mistakes. 

Oh yes, I'm familiar with the paper; I wrote a rebuttal to it to LKML
at the time, and cc'ed the authors, who never got back to me.  I also
made a bunch of changes to strengthen /dev/random against some of
their observations that described weaknesses, even though I didn't and
still don't think you could leverage them into a practical attack,
even *if* you accept their premise of a state compromise that didn't
involve even worse consequences than enabling theoretical attacks
against /dev/random.

> Simply feeding back all five words of our hash rather than three in
> the secondary pools hardens this all right up. This patch also makes
> everything in here much easier to read and analyze.

It's fixing a non-problem, but I don't think changing actually hurts
anything.  My preferred way of dealing with this problem is simply to
increase the output pool size, since that increases the number of bits
that change each time from 96 bits to 160 bits.

(This is a private paranoia patch I've been carrying around for a
while.)

     	    	      	      	      - Ted

commit 9c6c0954b3d56de04f03225e16d7f0d31c823c92
Author: Theodore Ts'o <tytso@mit.edu>
Date:   Sat Oct 13 00:03:26 2007 -0400

    randomness-fix-increase_output_pool

diff --git a/drivers/char/random.c b/drivers/char/random.c
index 5fee056..731eb15 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -249,7 +249,7 @@
  * Configuration information
  */
 #define INPUT_POOL_WORDS 128
-#define OUTPUT_POOL_WORDS 32
+#define OUTPUT_POOL_WORDS 64
 #define SEC_XFER_SIZE 512
 
 /*
@@ -288,8 +288,8 @@ static struct poolinfo {
 } poolinfo_table[] = {
 	/* x^128 + x^103 + x^76 + x^51 +x^25 + x + 1 -- 105 */
 	{ 128,	103,	76,	51,	25,	1 },
-	/* x^32 + x^26 + x^20 + x^14 + x^7 + x + 1 -- 15 */
-	{ 32,	26,	20,	14,	7,	1 },
+	/* x^64 + x^52 + x^39 + x^26 + x^14 + x + 1 -- 15 */
+	{ 64,	52,	39,	26,	14,	1 },
 #if 0
 	/* x^2048 + x^1638 + x^1231 + x^819 + x^411 + x + 1  -- 115 */
 	{ 2048,	1638,	1231,	819,	411,	1 },
@@ -314,8 +314,8 @@ static struct poolinfo {
 	/* x^128 + x^103 + x^78 + x^51 + x^27 + x^2 + 1 -- 70 */
 	{ 128,	103,	78,	51,	27,	2 },
 
-	/* x^64 + x^52 + x^39 + x^26 + x^14 + x + 1 -- 15 */
-	{ 64,	52,	39,	26,	14,	1 },
+	/* x^32 + x^26 + x^20 + x^14 + x^7 + x + 1 -- 15 */
+	{ 32,	26,	20,	14,	7,	1 },
 #endif
 };
 

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/6] random: use xor for mixing
  2007-12-09  2:08       ` Theodore Tso
@ 2007-12-09 13:33         ` Alan Cox
  2007-12-19 21:09         ` Bill Davidsen
  1 sibling, 0 replies; 19+ messages in thread
From: Alan Cox @ 2007-12-09 13:33 UTC (permalink / raw)
  To: Theodore Tso; +Cc: Matt Mackall, Andrew Morton, linux-kernel

> So my personal take on it is that protecting against backtracking
> attacks is mainly useful in silencing academics who like to come up
> with, well, largely academic and theoretical scenario.  If it doesn't
> take much effort, sure, let's try to protect against it (and I think
> we're OK already).

That problem seems to arise here because we have an interface to add
'real' entropy to the pool but not one to add randomness to be used
solely for urandom. If we had both then the user could choose to add some
degree of randomness solely for urandom usage.

Alan

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 4/6] random: make backtracking attacks harder
  2007-12-09 13:33       ` Theodore Tso
@ 2007-12-09 17:05         ` Matt Mackall
  2007-12-10 13:37           ` Theodore Tso
  0 siblings, 1 reply; 19+ messages in thread
From: Matt Mackall @ 2007-12-09 17:05 UTC (permalink / raw)
  To: Theodore Tso, Andrew Morton, linux-kernel

On Sun, Dec 09, 2007 at 08:33:00AM -0500, Theodore Tso wrote:
> On Sat, Dec 08, 2007 at 11:43:37PM -0600, Matt Mackall wrote:
> > > If you are trying to find the value of the 80 bits that were
> > > extracted, and you know the current state of the pool, yes, you can
> > > take the 96 bits that were changed after the last extraction, and try
> > > all possible 2**96 combinations of the bits; but you probably won't
> > > rule anything out, since after you iterate over all of the 2**96
> > > combinations, you'll probably be able to generate all of the 2**80
> > > possible output bits.  So you won't gain anything by trying to do a
> > > backtracking attack.  So I don't think there's anything to worry about
> > > here.
> > 
> > Two observations:
> > 
> > - 2**96 << 2**160 so our feedback is much weaker than our hash so we
> >   should improve it on general principle
> 
> Well, no.  You're comparing apples and oranges here.  You can iterate
> over 2**96 choices, but given that even if you are doing an iterative
> guessing attack, you have 2**80 knowns, and 2**96 unknowns, it doesn't
> buy you anything.  So saying that there is a successful attack with
> O(2**96) simply isn't true!

Again, I agree that there's not any sort of useful attack here.

> > - there's a way to improve this attack to 2**64 in some situations!
> 
> Yeah, but again, what does this "attack" buy you?  If you know the
> output, and the current state of the pool, after doing 2**64
> iterations you can get the state of the pool before the output.
> Big-di-whoopdie-do.  You can't actually *do* anything with it.  

Ahh, but this an attack on the output! You know only:

 pool' (including a fortuitously placed add_ptr)

and from that, you can get all but 64 bits of pool. By 2**64 iterations of:

 pool = pool' + guess
 hash = SHA1(init, pool) # simplified
 pool^ = add(pool, hash)
 if pool^ = pool':
  success!

Now you've recovered pool. Then you can recover the *output* with:

 hash = SHA1(init, pool)
 pool' = add(pool, hash)
 entropy = extract(pool)
 output = SHA1(hash, entropy)

Depending on where the add_ptr lands, we might even be able to go back
another iteration. 

By the way, I think the paper missed out on some corner case attacks
which extend the range to greater than half of the possible add_ptr
positions. Whenever less than 96 bits are changed in the first block,
we can discover the pool in O(2**64).

> It's fixing a non-problem, but I don't think changing actually hurts
> anything.  My preferred way of dealing with this problem is simply to
> increase the output pool size, since that increases the number of bits
> that change each time from 96 bits to 160 bits.

Let's do that. But let's also do this because it makes the code more
readable and somewhat stronger (for instance the first word fed back
will be cascaded from the whole pool rather than the first 512 bits).
We also only need to grab the lock once.

-- 
Mathematics is the supreme nostalgia of our time.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 5/6] random: step more rapidly through the pool when adding and extracting
  2007-12-09 12:55         ` Theodore Tso
@ 2007-12-09 17:08           ` Matt Mackall
  0 siblings, 0 replies; 19+ messages in thread
From: Matt Mackall @ 2007-12-09 17:08 UTC (permalink / raw)
  To: Theodore Tso, Andrew Morton, linux-kernel

On Sun, Dec 09, 2007 at 07:55:41AM -0500, Theodore Tso wrote:
> On Sat, Dec 08, 2007 at 10:22:04PM -0600, Matt Mackall wrote:
> > 
> > It did. My laptop didn't relay them through my smarthost and my domain
> > has an SPF record.
> 
> Ah, yes:
> 
> http://homepages.tesco.net/~J.deBoynePollard/FGA/smtp-spf-is-harmful.html

Yep, I temporarily set it up to mitigate a spate of From: forgeries
against my domain, then it bit me later.

-- 
Mathematics is the supreme nostalgia of our time.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 4/6] random: make backtracking attacks harder
  2007-12-09 17:05         ` Matt Mackall
@ 2007-12-10 13:37           ` Theodore Tso
  0 siblings, 0 replies; 19+ messages in thread
From: Theodore Tso @ 2007-12-10 13:37 UTC (permalink / raw)
  To: Matt Mackall; +Cc: Andrew Morton, linux-kernel

On Sun, Dec 09, 2007 at 11:05:20AM -0600, Matt Mackall wrote:
> Ahh, but this an attack on the output! You know only:
> 
>  pool' (including a fortuitously placed add_ptr)
> 
> and from that, you can get all but 64 bits of pool. By 2**64 iterations of:
> 
>  pool = pool' + guess
>  hash = SHA1(init, pool) # simplified
>  pool^ = add(pool, hash)
>  if pool^ = pool':
>   success!

Yes, you're right.  I thought O(2**64) shortcut attack required
knowledge of the output to unwind an interation, but looking again at
the paper, I was incorrect.  You can indeed roughly half of the time,
be able to go backwards in O(2**64) operations, instead of O(2**96)
operations.  Note BTW, that if you assume that you can do SHA in 11
cycles per second (source: http://www.cryptopp.com/benchmarks.html)
and this requires you to hash approximatly half the pool, 11 cycles *
64 bytes * 2^64 operations is approximately 190.7 *centuries* assuming
a 2 gigahertz processor.  Obviously that's much better than 7 million
*millenia* (i.e., 7 billion years) for O(2**96), but 190,669 years is
quite a long time. 

I suppose if the NSA had 20,000 2Ghz processors in the basement
cranking for 10 years, then 50% of the time *after* they did a black
bag job to crack the random pool state, they could get the last 80
bits generated from /dev/random, but it just seems that if you are
assuming the power to grab the pool plus add_ptr, there would be much
more useful things you could --- like for example having the black bag
job trojaning the software to grab the private key directly.

So, I still view this as an academic weakness, and not a real-life
threat.  :-)

> Let's do that. But let's also do this because it makes the code more
> readable and somewhat stronger (for instance the first word fed back
> will be cascaded from the whole pool rather than the first 512 bits).
> We also only need to grab the lock once.

Yeah, fair enough.  Mixing in the whole hash doesn't actually cost
that much more.  It might have a theoretical chance of potentially
losing slightly more entropy, but that is really a theoretical issue
as well --- and the fact that we are mixing in the time each time we
extract more than makes up for that.  Oh, wait.  Right, I forgot, you
took that out at some point after you took over maintainership.

Can you add this patch, too?  (Compile tested, not boot tested, but
it's pretty obviously correct.)

							- Ted


commit 499aa1b63923ff8a2309b3a1eccacb54db159c58
Author: Theodore Ts'o <tytso@mit.edu>
Date:   Mon Dec 10 08:35:53 2007 -0500

    random: Mix in the time of extraction into the entropy pool
    
    We don't actually add any entropy credit, but we do mix in the time of
    extraction to add further jitter and make life harder for an attacker
    by making the extraction process time-dependent.  This means that any
    attacks on the /dev/random pool can't use a static analysis.
    
    Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

diff --git a/drivers/char/random.c b/drivers/char/random.c
index 5fee056..b05f05c 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -767,6 +767,14 @@ static void extract_buf(struct entropy_store *r, __u8 *out)
 {
 	int i;
 	__u32 data[16], buf[5 + SHA_WORKSPACE_WORDS];
+	struct {
+		cycles_t cycles;
+		long jiffies;
+	} sample;
+
+	sample.jiffies = jiffies;
+	sample.cycles = get_cycles();
+	add_entropy_words(&input_pool, (u32 *)&sample, sizeof(sample)/4);
 
 	sha_init(buf);
 	/*

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/6] random: use xor for mixing
  2007-12-09  2:08       ` Theodore Tso
  2007-12-09 13:33         ` Alan Cox
@ 2007-12-19 21:09         ` Bill Davidsen
  1 sibling, 0 replies; 19+ messages in thread
From: Bill Davidsen @ 2007-12-19 21:09 UTC (permalink / raw)
  To: Theodore Tso, Matt Mackall, Andrew Morton, linux-kernel

Theodore Tso wrote:
> On Sat, Dec 08, 2007 at 06:40:17PM -0600, Matt Mackall wrote:
>> I'm working on strengthening forward security for cases where the
>> internals are exposed to an attacker. There are any number of
>> situations where this can and does happen, and ensuring we don't
>> expose ourselves to backtracking is a worthwhile theoretical concern.
> 
> See my other comments; I don't think we are currently vulnerable to
> backtracking.
> 
> I tend to view backtracking as largely a theoretical concern, as most
> of the time, if the attacker has read access to kernel memory in order
> to compromise the internal state of /dev/random, the attacker very
> likely has *write* access to kernel memory, at which point you have
> much bigger problems to worry about, at least going forward.  
> 
> I suppose if you had *just* generated an 80-bit AES session key, right
> before the internal state was compromised, this might be interesting,
> but if the attacker can compromise arbitrary kernel memory, then
> attacker might as well just grab keying information from the userspace
> process --- such as perhaps the long-term private key of the user, or
> the data to be encrypted.
> 
> So my personal take on it is that protecting against backtracking
> attacks is mainly useful in silencing academics who like to come up
> with, well, largely academic and theoretical scenario.  If it doesn't
> take much effort, sure, let's try to protect against it (and I think
> we're OK already).
> 
> But a much better use of our time would be spent creating userspace
> support so we can more efficiently pull entropy from TPM modules, or
> the noise from a sound/microphone input, etc., or other hardware
> sources of entropy.  That would provide a much more practical
> improvement to the /dev/random subsystem than worry about what I feel
> are largely academic concerns.

That doesn't actually sound too hard, and the sounds of passing traffic 
are not likely to be replicable in any case. Lots of sensor data might 
be used as well, fan rpm, etc. That sounds so obvious I can't believe 
there isn't a reason it's not being done.

-- 
Bill Davidsen <davidsen@tmr.com>
   "We have more to fear from the bungling of the incompetent than from
the machinations of the wicked."  - from Slashdot

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2007-12-19 20:50 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <2.753618428@selenic.com>
     [not found] ` <3.753618428@selenic.com>
2007-12-09  0:01   ` [PATCH 2/6] random: use xor for mixing Theodore Tso
2007-12-09  0:40     ` Matt Mackall
2007-12-09  2:08       ` Theodore Tso
2007-12-09 13:33         ` Alan Cox
2007-12-19 21:09         ` Bill Davidsen
     [not found] ` <4.753618428@selenic.com>
2007-12-09  0:35   ` [PATCH 3/6] random: do extraction before mixing Theodore Tso
2007-12-09  7:12     ` Matt Mackall
     [not found] ` <5.753618428@selenic.com>
2007-12-09  1:45   ` [PATCH 4/6] random: make backtracking attacks harder Theodore Tso
2007-12-09  5:43     ` Matt Mackall
2007-12-09 13:33       ` Theodore Tso
2007-12-09 17:05         ` Matt Mackall
2007-12-10 13:37           ` Theodore Tso
     [not found] ` <6.753618428@selenic.com>
2007-12-09  1:48   ` [PATCH 5/6] random: step more rapidly through the pool when adding and extracting Theodore Tso
2007-12-09  4:00     ` Andrew Morton
2007-12-09  4:22       ` Matt Mackall
2007-12-09 12:55         ` Theodore Tso
2007-12-09 17:08           ` Matt Mackall
     [not found] ` <7.753618428@selenic.com>
2007-12-09  1:51   ` [PATCH 6/6] random: improve variable naming, clear extract buffer Theodore Tso
2007-12-09  5:08     ` Matt Mackall

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).