linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Matt Mackall <mpm@selenic.com>
To: Andrew Morton <akpm@osdl.org>
Cc: tytso@mit.edu, jmorris@intercode.com.au, jamie@shareable.org,
	linux-kernel@vger.kernel.org, davem@redhat.com
Subject: Re: [RFC][PATCH] Make cryptoapi non-optional?
Date: Fri, 15 Aug 2003 23:38:16 -0500	[thread overview]
Message-ID: <20030816043816.GC325@waste.org> (raw)
In-Reply-To: <20030815170532.06e14e89.akpm@osdl.org>

On Fri, Aug 15, 2003 at 05:05:32PM -0700, Andrew Morton wrote:
> Matt Mackall <mpm@selenic.com> wrote:
> >
> > I'm pretty sure there was never a time when entropy
> > accounting wasn't racy let alone wrong, SMP or no (fixed in -mm, thank
> > you).
> 
> Well is has been argued that the lack of locking in the random driver is a
> "feature", adding a little more unpredictability.

> Now I don't know if that makes sense or not, but the locking certainly has
> a cost.  If it doesn't actually fix anything then that cost becomes a
> waste.
> 
> IOW: what bug does that locking fix?

Ok, the brief summary is that we're keeping a count of the number of
bits of unguessable environmental data (aka entropy) we've collected.
If we never give out more bits than we take in, we are immune to even
brute force attacks on the state of the pool and therefore the outputs
are unguessable as well (which is why /dev/random blocks where
/dev/urandom does not). If we fail to account for outputs properly, we
lose this property.

a) extract_entropy (pool->lock)

 This can be called simultaneously from different contexts and a race
 will result in extracting more bits from the pool than are accounted
 for or available. This could also underflow. I use the new pool lock
 here to cover the accounting - we rely on the hashing and mixing to
 evenly distribute bits in the extraction itself and we have feedback
 during the extraction as well.

 For nitpickers, there remains a vanishingly small possibility that
 two readers could get identical results from the pool by hitting a
 window immediately after reseeding, after accounting, _and_ after
 feedback mixing. I have another patch I'm still playing with that
 addresses this by hashing a different subset of the pool on each call
 rather than the entire pool, with the position covered by the pool
 lock during accounting. (This has the controversial side effect of
 doubling bandwidth.)

 Note that I added a flag here to limit the amount of entropy pulled
 to what's available. Random_read() formerly would check for the
 number of available bits and then ask for them, and extract_entropy
 would copy out that many bits even if an intervening caller raced and
 depleted the pool.

 The possible contention issue here is that /dev/random and
 /dev/urandom+get_random_bytes() currently share the same pool and
 get_random_bytes gets called moderately frequently by things like
 sequence number generation by numerous places in the network layer
 and various things in filesystems. I suspect this won't ever be a
 noticeable lock but see below.

 [There was also a cute sleeping problem here with random_read.
 random_read used old-style open-coded sleeping, marked itself
 TASK_INTERRUPTIBLE, then called extract_entropy, which would do a
 conditional reschedule, and fall asleep until the next wake up,
 despite having enough entropy to fulfill the request.]

b) add_entropy_words (pool->lock)

 This function folds new words into the pool. The pool lock is taken
 to prevent a race whereby both callers read the same value of for the
 add_ptr and mix on top of each other, again causing an overestimate
 of the entropy in the pool. The lock could be taken and released
 inside the loop, but this tends to get called for a small number of
 words (usually <=8).

c) credit_entropy_store (pool->lock)

 Locking here actually prevents underaccounting, but it also prevents
 overflow. Short and sweet.

d) RNDGETPOOL ioctl (pool->lock)

 The intent of this ioctl is to take an atomic snapshot of the entropy
 pool, entropy count, etc., presumably for debugging purposes. For
 purpose of getting random data, one should simply read(). With the
 addition of a second pool, it's no longer useful for debugging, but
 could be useful for attackers and I'd it to just return -ENOTTY. For
 now, it takes the pool lock. Should be no users, shouldn't increase
 contention.

e) batch_entropy_store && batch_entropy_process (batch_lock)

 This gets really messy without locking. This is a circular buffer
 that gets filled typically from interrupts and emptied by a
 workqueue. Lack of locking here could cause tail to pass head, etc.,
 dropping samples on the ground or replaying a whole buffer worth old
 samples. Rather than hold the lock while we mix the queue, we copy
 the sample ring. (I have a better patch to do this with a flip
 buffer.)

 [By the way, whoever did the workqueue conversion for 2.5 changed this
 code to wakeup the processing worker when the sample pool was half
 full rather on every sample but got the test not quite right. I had
 to stare at this for a bit:

        new = (batch_head+1) & (batch_max-1);
        if ((unsigned)(new - batch_tail) >= (unsigned)(batch_max / 2))
        {

 There was a further problem where extract_entropy was adding zero
 entropy timing samples and filling the sample buffer with useless
 samples. Occassionally the above bug would keep it from scheduling
 batch processing in this case and the code could block permanently,
 throwing out all new samples. Ouch.]

f) change_poolsize (queued for resend)

 The /proc/sys/kernel/random sysctl is writable and lets you change
 the pointer to the input pool out from under callers (boom). Short of
 expanding the pool lock to cover all operations on the pool, there's
 no clean fix for this. And it's not really worth the trouble. As it takes
 quite a number of transactions to empty the 4k+1kbits of entropy we can
 currently hold (enough for dozens of strong keys), if we end up
 waiting for entropy regularly, queueing theory tells us we're
 exceeding our input rate and we're going to lose no matter how big
 the buffer is. 

 On the other hand, the input pool is only 4kbits and rather than keep
 this feature I can save most of those 512 bytes for everyone by deleting the
 resizing code.

g) urandom starves/races random (queued for resend)

 Readers of /dev/urandom and get_random_bytes (both nonblock) pull from
 the same pool as /dev/random readers and without limit. As there are
 numerous users of get_random_bytes as pointed out above, /dev/random
 readers can easily be starved (and previously, race on wakeup), even
 by remote readers. This is rather a problem for the classic
 entropy-source-limited headless web server which may very well be
 trying to use both in, for example, a departmental certificate
 authority.

 My solution is to clean up the pool creation code and add a second
 output pool for nonblocking users. The pool reseeding logic is
 cleaned up to address a bunch of corner cases and has a low watermark
 parameter so that the nonblocking users can avoid draining the input
 pool entirely. The current default is to not let nonblocking readers
 draw the input pool below the point where blocking readers can do two
 catastrophic reseeds.

 The cleanup of the pool code lets this easily become per_cpu output
 pools for the non-blocking readers with about 10 lines of code if the
 above-mentioned contention is an issue. I haven't tried this yet, but
 I already did per_cpu for the cryptoapi stuff and it should be about
 the same.

 We could go completely lockless for the nonblocking pool also, but
 that would require some code duplication.

I can detail some of the non-race accounting issues I fixed but that
message would be similarly long and more tedious.

-- 
Matt Mackall : http://www.selenic.com : of or relating to the moon

  parent reply	other threads:[~2003-08-16  4:38 UTC|newest]

Thread overview: 120+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2003-08-09  7:44 [RFC][PATCH] Make cryptoapi non-optional? Matt Mackall
2003-08-09  8:04 ` David S. Miller
2003-08-09 14:05   ` Matt Mackall
2003-08-09 17:39     ` David S. Miller
2003-08-09 19:46       ` Matt Mackall
2003-08-09 20:17         ` David S. Miller
2003-08-10  8:15           ` Matt Mackall
2003-08-10  8:32             ` virt_to_offset() (Re: [RFC][PATCH] Make cryptoapi non-optional?) YOSHIFUJI Hideaki / 吉藤英明
2003-08-10  8:30               ` David S. Miller
2003-08-10  9:02                 ` virt_to_offset() YOSHIFUJI Hideaki / 吉藤英明
2003-08-11 18:21                   ` virt_to_offset() David Mosberger
2003-08-12  2:46                     ` virt_to_offset() David S. Miller
2003-08-10  9:05                 ` virt_to_offset() (Re: [RFC][PATCH] Make cryptoapi non-optional?) Matt Mackall
2003-08-10  9:04                   ` David S. Miller
2003-08-10 11:00                     ` [PATCH 1/9] introduce virt_to_pagoff() YOSHIFUJI Hideaki / 吉藤英明
2003-08-10 11:00                     ` [PATCH 2/9] convert crypto to virt_to_pageoff() YOSHIFUJI Hideaki / 吉藤英明
2003-08-10 11:05                     ` [PATCH 3/9] convert net " YOSHIFUJI Hideaki / 吉藤英明
2003-08-10 11:07                     ` [PATCH 4/9] convert drivers/block " YOSHIFUJI Hideaki / 吉藤英明
2003-08-10 11:09                     ` [PATCH 5/9] convert drivers/ide " YOSHIFUJI Hideaki / 吉藤英明
2003-08-10 11:10                     ` [PATCH 6/9] convert drivers/net " YOSHIFUJI Hideaki / 吉藤英明
2003-08-10 11:10                     ` [PATCH 7/9] convert drivers/scsi " YOSHIFUJI Hideaki / 吉藤英明
2003-08-10 11:31                       ` Christoph Hellwig
2003-08-10 11:51                         ` David S. Miller
2003-08-10 12:03                           ` YOSHIFUJI Hideaki / 吉藤英明
2003-08-10 14:54                             ` Christoph Hellwig
2003-08-10 14:51                           ` Christoph Hellwig
2003-08-10 13:54                         ` Russell King
2003-08-10 13:55                         ` Russell King
2003-08-10 14:55                           ` Christoph Hellwig
2003-08-11  5:21                           ` David S. Miller
2003-08-10 11:10                     ` [PATCH 8/9] convert drivers/usb " YOSHIFUJI Hideaki / 吉藤英明
2003-08-10 11:10                     ` [PATCH 9/9] convert fs/jbd " YOSHIFUJI Hideaki / 吉藤英明
2003-08-11  2:15             ` [RFC][PATCH] Make cryptoapi non-optional? Jamie Lokier
2003-08-11  2:38               ` Matt Mackall
2003-08-11  4:54               ` David S. Miller
2003-08-11  5:17                 ` Jamie Lokier
2003-08-13  5:01                 ` [Numbers][PATCH] " Matt Mackall
2003-08-10 14:46           ` [RFC][PATCH] " James Morris
2003-08-09 14:33 ` Matt Mackall
2003-08-09 17:13   ` Jamie Lokier
2003-08-09 17:33     ` Matt Mackall
2003-08-10 13:18       ` James Morris
2003-08-10 17:45         ` Matt Mackall
2003-08-11  2:09           ` Jamie Lokier
2003-08-11  2:35             ` Matt Mackall
2003-08-11  4:59               ` Jamie Lokier
2003-08-11  5:04                 ` Matt Mackall
2003-08-11  5:20                   ` Jamie Lokier
2003-08-11  5:54                     ` Matt Mackall
2003-08-11  6:24                       ` Jamie Lokier
2003-08-11  4:58             ` David Wagner
2003-08-11  5:36               ` Jamie Lokier
2003-08-11 19:21                 ` David Wagner
2003-08-13 19:37                   ` Jamie Lokier
2003-08-13  3:52             ` Theodore Ts'o
2003-08-13 15:44               ` i810_rng.o on various Dell models Jim Carter
2003-08-13 16:15                 ` Jeff Garzik
2003-08-13 18:43                   ` Jamie Lokier
2003-08-13 18:36               ` [RFC][PATCH] Make cryptoapi non-optional? Jamie Lokier
2003-08-15  0:16                 ` Network Card Entropy? was: " Mike Fedyk
2003-08-15  0:22                   ` Robert Love
2003-08-13  3:20           ` Theodore Ts'o
2003-08-13  4:06             ` Matt Mackall
2003-08-14 16:53               ` Val Henson
2003-08-14 19:40                 ` David Wagner
2003-08-14 20:07                   ` Chris Friesen
2003-08-14 21:36                   ` Jamie Lokier
2003-08-15  0:25                     ` Val Henson
2003-08-15 11:47                       ` Jamie Lokier
2003-08-15  0:17                   ` Val Henson
2003-08-15  1:45                     ` David Wagner
2003-08-15  2:21                       ` Matt Mackall
2003-08-15  7:30                     ` Andries Brouwer
2003-08-15  7:40                       ` David S. Miller
2003-08-15  7:55                         ` Andries Brouwer
2003-08-15  8:06                           ` Måns Rullgård
2003-08-15  8:11                             ` Nick Piggin
2003-08-15 15:11                             ` Matt Mackall
2003-08-15 22:16                               ` Jamie Lokier
2003-08-15 20:22                           ` Val Henson
2003-08-16  6:27                             ` David Wagner
2003-08-18  4:25                               ` Val Henson
2003-08-15  8:09                         ` Nick Piggin
2003-08-15 15:03                       ` Matt Mackall
2003-08-15 17:04                         ` Andries Brouwer
2003-08-15 22:05                           ` Jamie Lokier
2003-08-15 22:02                         ` Jamie Lokier
2003-08-15 12:48                     ` Jamie Lokier
2003-08-15 22:34                     ` Theodore Ts'o
2003-08-15 22:12               ` Theodore Ts'o
2003-08-15 23:35                 ` James Morris
2003-08-16 15:51                   ` Matt Mackall
2003-08-17 14:37                     ` James Morris
2003-08-17 15:30                       ` Matt Mackall
2003-08-15 23:55                 ` Matt Mackall
2003-08-16  0:05                   ` Andrew Morton
2003-08-16  0:58                     ` Jamie Lokier
2003-08-16  4:57                       ` Matt Mackall
2003-08-16  4:38                     ` Matt Mackall [this message]
2003-08-16  5:03                       ` Andrew Morton
2003-08-16  5:39                         ` Matt Mackall
2003-08-18  6:43                       ` Andreas Dilger
2003-08-18  6:55                         ` David Lang
2003-08-18 11:59                           ` Jamie Lokier
2003-08-18 12:11                             ` Måns Rullgård
2003-08-18 13:33                               ` Jamie Lokier
2003-08-18 17:03                             ` David Lang
2003-08-18 17:51                               ` Jamie Lokier
2003-08-22  4:28                           ` David Wagner
2003-08-25  4:29                             ` Jamie Lokier
2003-08-18 15:20                         ` Matt Mackall
2003-08-18  3:23                   ` Theodore Ts'o
2003-08-18 15:46                     ` Matt Mackall
2003-08-10  2:07   ` Robert Love
2003-08-10  3:14     ` Matt Mackall
2003-08-10  3:49       ` David S. Miller
2003-08-10  4:01         ` Robert Love
2003-08-10  4:07           ` Robert Love
2003-08-16 20:40 Adam J. Richter
2003-08-17  4:28 ` Matt Mackall

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20030816043816.GC325@waste.org \
    --to=mpm@selenic.com \
    --cc=akpm@osdl.org \
    --cc=davem@redhat.com \
    --cc=jamie@shareable.org \
    --cc=jmorris@intercode.com.au \
    --cc=linux-kernel@vger.kernel.org \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).