On Sun, 2018-05-13 at 23:48 +0300, Adrian Bunk wrote:
> On Wed, May 09, 2018 at 11:46:00PM +0100, Ben Hutchings wrote:
[...]
> > # Options for a new fix
> > 
> > It is unlikely that any further fix will be forthcoming on the kernel
> > side, so I believe that we need to do one of:
> > 
> > 1. Add entropy to the kernel during boot; either:
> >    a. Improve systemd-random-seed
> >    b. Recommend use of haveged
> 
> I don't see any solution above that both always works and never results
> in new CVEs.

Indeed.

> As an example, what happens if I debootstrap and deploy the resulting
> filesystem to a large number of identical embedded systems without
> entropy sources?

Then it is your fault when they turn into a botnet. :-)  Availability
of randomness must be considered in the design of embedded systems.

[...]
> /dev/urandom is documented in a very misleading way, quoting random(4):
>    When read during early boot time, /dev/urandom may return data prior to
>    the entropy pool being initialized.  If this  is  of  concern  in  your
>    application, use getrandom(2) or /dev/random instead.
> 
> What is the worst case for "early boot time" here? "always"?

No, I don't think so.
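
Whether a system is still in that "early boot" window can be probed from
user space. A minimal sketch in Python, assuming Linux 3.17 or later and
Python 3.6 or later; the helper name is made up for illustration and does
not come from any package discussed here:

```python
import os


def urandom_pool_initialised():
    """Return True if the kernel's urandom pool has been initialised.

    Hypothetical helper: asks getrandom(2) for one byte without blocking.
    """
    try:
        os.getrandom(1, os.GRND_NONBLOCK)
    except BlockingIOError:
        # EAGAIN: the pool is not yet initialised, so a read from
        # /dev/urandom at this moment could return predictable data.
        return False
    return True
```

If this returns False, anything read from /dev/urandom at that point falls
under the caveat quoted from random(4).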

> Due to the gdm bugs mentioned above we know that there are real-life 
> situations where gdm currently uses "random" data that might be 
> predictable.
> 
> grep tells me:
> daemon/gdm-x-session.c:        auth_entry.data = gdm_generate_random_bytes (auth_entry.data_length, &error);
> daemon/gdm-display-access-file.c:        *cookie = gdm_generate_random_bytes (GDM_DISPLAY_ACCESS_COOKIE_SIZE,
> 
> Repeat the same for every package that uses /dev/urandom.

This is certainly undesirable, but it's exploitable only by local users.
(If you let the X server listen to the network, all authentication
cookies are sent in the clear so you've already lost.  If you use ssh X
forwarding, it generates a new authentication cookie for use with the X
proxy on the remote machine.)

> 
> >    b. Tolerate a longer wait for getrandom() to return
> 
> I suspect there might be no guaranteed upper bound for the waiting time.

Interrupt timing feeds into the RNG, and as long as there's at least
one interrupt per second then I think the RNG will reach the fully
initialised state after a few minutes.  I just started a VM with a
serial console and only a shell running as pid 1, which is about as
idle a system as I can imagine, and it was seeing more than one
interrupt per second.  However, other architectures (e.g. s390x) might
achieve greater idleness.
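
Since an upper bound cannot be guaranteed, a caller that tolerates a wait
would probably want its own deadline. A rough sketch, again in Python and
again with made-up names and defaults, that polls getrandom(2) without
blocking and gives up after a caller-chosen timeout:

```python
import os
import time


def wait_for_rng(timeout, poll_interval=1.0):
    """Poll getrandom(2) until the pool is initialised or timeout expires.

    Returns True if the RNG became ready, False on timeout.
    Hypothetical helper; the name and defaults are illustrative only.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            os.getrandom(1, os.GRND_NONBLOCK)
            return True
        except BlockingIOError:
            # Pool not initialised yet: sleep, but never past the deadline.
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                return False
            time.sleep(min(poll_interval, remaining))
```

What to do when this returns False is a policy decision that the sketch
deliberately leaves to the caller.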

> > ...
> > The libbsd maintainer (Guillem Jover) favours option 2a.
> > 
> > One of the krb5 maintainers (Benjamin Kaduk) favours option 2b, and
> > also proposed that systemd could provide a wait-for-rng-ready unit to
> > support this.
> 
> I don't see any general solution that is both correct and easy.

Indeed.

> The proper way forward might be to deprecate /dev/urandom and add a 
> third option GRND_UNSAFE_RANDOM to getrandom() that is documented to 
> never block but might return predictable data in some cases.

This doesn't solve anything for us.  (It does help with the original
problem of device nodes possibly being absent from a minimal container
or chroot.)

> It would then be up to the application to decide whether predictable
> data is acceptable, and what to do in entropy-starved situations.
> 
> Regarding the suggested wait-for-rng-ready systemd unit for others to 
> wait on, this only makes sense for cases where "do not start at all"
> is the best handling for a "no entropy" situation.

Yes.
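
The "application decides" policy can be approximated with the interfaces
that exist today. GRND_UNSAFE_RANDOM is only a proposal in this thread, so
the sketch below does not use it; instead it falls back to reading the
/dev/urandom device node, which means it does not help where that node is
absent. The function name, the allow_predictable parameter and the
256-byte limit are assumptions made for the example:

```python
import os


def random_bytes(n, allow_predictable=False):
    """Return n random bytes, letting the caller choose the policy.

    Hypothetical helper. With allow_predictable=False it refuses to
    return data before the pool is initialised (raises BlockingIOError).
    With allow_predictable=True it falls back to /dev/urandom, which
    never blocks but may be predictable during early boot.
    """
    if not 0 < n <= 256:
        # Assumption: keep requests small so that short reads are not
        # a concern in this sketch.
        raise ValueError("n must be between 1 and 256")
    try:
        return os.getrandom(n, os.GRND_NONBLOCK)
    except BlockingIOError:
        if not allow_predictable:
            raise
        # Requires the device node to exist in this container or chroot.
        with open("/dev/urandom", "rb", buffering=0) as f:
            return f.read(n)
```

An X authentication cookie generator would presumably call this with the
default, while something like hash table randomisation might accept
allow_predictable=True.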

Ben.

> > Ben.
> 
> cu
> Adrian
> 
-- 
Ben Hutchings
For every action, there is an equal and opposite criticism. - Harrison